Image processing apparatus, system, image processing method, and image processing program
Abstract
An image processing apparatus includes a first reception section, a second reception section, an association processing section, an object detection section, and a process execution section. The first reception section receives image information acquired by an image sensor. The second reception section receives sound information that is acquired by one or plural directional microphones and that is generated for at least a partial region in a field of the image sensor. The association processing section associates the sound information with a pixel address of the image information indicating a position in the field. The object detection section detects, from the image information, at least a part of an object that is present in the field. The process execution section executes a predetermined process on the object on the basis of a result of the association performed by the association processing section.
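As a minimal, non-authoritative sketch of the flow summarized above, the following Python fragment associates sound readings with pixel addresses and chooses a result per detected object according to whether a reading overlaps the object's region. The threshold comparison follows the distinction drawn in claim 1. All names (`SoundSample`, `DetectedObject`, `classify_motion`), the rectangular bounding-box representation, and the result labels are assumptions introduced for illustration only; they do not appear in the patent text.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SoundSample:
    """Sound information already associated with a pixel address (hypothetical type)."""
    x: int          # pixel column the sound was mapped to
    y: int          # pixel row the sound was mapped to
    level: float    # acoustic level, arbitrary units


@dataclass(frozen=True)
class DetectedObject:
    """Region of the image in which an object was detected (assumed rectangular)."""
    left: int
    top: int
    right: int
    bottom: int


def overlaps(obj: DetectedObject, sample: SoundSample) -> bool:
    """True if the sample's pixel address lies inside the object's region (inclusive)."""
    return obj.left <= sample.x <= obj.right and obj.top <= sample.y <= obj.bottom


def classify_motion(obj: DetectedObject,
                    samples: Iterable[SoundSample],
                    threshold: float) -> str:
    """Classify an object that appears to be moving in the image.

    Returns "actually_moving" if any overlapping sample exceeds the threshold,
    "apparently_moving" if overlapping samples exist but none exceeds it,
    and "no_association" if no sample overlaps the object's region.
    """
    hits = [s for s in samples if overlaps(obj, s)]
    if not hits:
        return "no_association"
    if any(s.level > threshold for s in hits):
        return "actually_moving"
    return "apparently_moving"
```

A caller could, for example, run a tracking process only on objects classified as "actually_moving" and use the remaining objects as static references; that choice is left to the process execution step and is not fixed by this sketch.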
Claims
exact text as granted — not AI-modified
The invention claimed is:
1. An image processing apparatus comprising:
a first reception section configured to receive image information acquired by an image sensor;
a second reception section configured to receive sound information that is acquired by one or plural directional microphones and that is generated for at least a partial region in a field of the image sensor;
an association processing section configured to associate the sound information with a pixel address of the image information indicating a position in the field;
an object detection section configured to detect, from the image information, at least a part of an object that is present in the field; and
a process execution section configured to execute a predetermined process on the object on a basis of a result of the association performed by the association processing section,
wherein the predetermined process is a function of whether or not the position in the field at which the sound information is associated overlaps with an area within the field at which the object is located, and
wherein the process execution section operates to distinguish between determinations that:
(i) the object appears to be moving and is actually moving in the field based on a determination that the sound information indicates an acoustic level above a predetermined threshold and the position in the field that is associated overlaps with the area within the field at which the object is located; and
(ii) the object appears to be moving and is not actually moving in the field based on a determination that the sound information indicates an acoustic level not above the predetermined threshold and the position in the field that is associated overlaps with the area within the field at which the object is located.
2. The image processing apparatus according to claim 1 , wherein
the second reception section receives the sound information associated with position information indicating the position in the field, and
the association processing section associates the sound information with the pixel address by using a result of calibration on the position information and the pixel address.
3. The image processing apparatus according to claim 1 , wherein
the association processing section associates the sound information with the pixel address corresponding to a region in an image in which the object that has been detected by the object detection section is present, and
the process execution section executes the predetermined process on the object corresponding to the pixel address with which the sound information has been associated.
4. The image processing apparatus according to claim 1 , wherein
the object detection section detects the object in a region in an image that is determined according to the sound information associated with the pixel address, and
the process execution section executes the predetermined process on the object that has been detected by the object detection section.
5. The image processing apparatus according to claim 3 , further comprising:
an object classification section that classifies the object detected by the object detection section into a first object or a second object, according to the sound information associated with the pixel address corresponding to the region in the image in which the object is present, wherein
the process execution section executes the predetermined process on the object that is classified as the first object.
6. The image processing apparatus according to claim 1 , wherein
the sound information includes information indicating whether or not sound has been detected, and
the process execution section executes the predetermined process on the object detected with the pixel address associated with the sound information indicating that sound has been detected.
7. The image processing apparatus according to claim 6 , wherein the process execution section executes a tracking process with use of a result of time-series detection of the object.
8. The image processing apparatus according to claim 1 , wherein
the sound information includes information indicating whether or not sound has been detected, and
the process execution section executes the predetermined process on the object detected with the pixel address that has been associated with the sound information indicating that no sound has been detected or that has not been associated with the sound information indicating that sound has been detected.
9. The image processing apparatus according to claim 8 , wherein the process execution section executes a self-position estimation process for an apparatus on which the image sensor is mounted, with use of a result of time-series detection of the object.
10. The image processing apparatus according to claim 8 , wherein the process execution section executes a motion cancelling process on an apparatus on which the image sensor is mounted, with use of a result of time-series detection of the object.
11. The image processing apparatus according to claim 1 , wherein the process execution section extracts, from the image information, image information including only the object.
12. The image processing apparatus according to claim 1 , wherein
the image sensor is an event-driven vision sensor that generates an event signal upon detecting an intensity change in incident light, for each pixel, and
the image information includes the event signal.
13. A system comprising:
an image sensor configured to acquire image information;
one or plural directional microphones configured to acquire sound information that is generated for at least a partial region in a field of the image sensor; and
a terminal apparatus including
a first reception section configured to receive the image information,
a second reception section configured to receive the sound information,
an association processing section configured to associate the sound information with a pixel address of the image information indicating a position in the field,
an object detection section configured to detect, from the image information, at least a part of an object that is present in the field, and
a process execution section configured to execute a predetermined process on the object, on a basis of a result of the association performed by the association processing section,
wherein the predetermined process is a function of whether or not the position in the field at which the sound information is associated overlaps with an area within the field at which the object is located, and
wherein the process execution section operates to distinguish between determinations that:
(i) the object appears to be moving and is actually moving in the field based on a determination that the sound information indicates an acoustic level above a predetermined threshold and the position in the field that is associated overlaps with the area within the field at which the object is located; and
(ii) the object appears to be moving and is not actually moving in the field based on a determination that the sound information indicates an acoustic level not above the predetermined threshold and the position in the field that is associated overlaps with the area within the field at which the object is located.
14. An image processing method comprising:
receiving image information acquired by an image sensor;
receiving sound information that is acquired by one or plural directional microphones and that is generated for at least a partial region in a field of the image sensor;
associating the sound information with a pixel address of the image information indicating a position in the field;
detecting, from the image information, at least a part of an object that is present in the field; and
executing a predetermined process on the object on a basis of a result of the association,
wherein the predetermined process is a function of whether or not the position in the field at which the sound information is associated overlaps with an area within the field at which the object is located, and
wherein the executing includes distinguishing between determinations that:
(i) the object appears to be moving and is actually moving in the field based on a determination that the sound information indicates an acoustic level above a predetermined threshold and the position in the field that is associated overlaps with the area within the field at which the object is located; and
(ii) the object appears to be moving and is not actually moving in the field based on a determination that the sound information indicates an acoustic level not above the predetermined threshold and the position in the field that is associated overlaps with the area within the field at which the object is located.
15. A non-transitory, computer readable storage medium containing a program, which when executed by a computer, causes the computer to perform an image processing method by carrying out actions, comprising:
receiving image information acquired by an image sensor;
receiving sound information that is acquired by one or plural directional microphones and that is generated for at least a partial region in a field of the image sensor;
associating the sound information with a pixel address of the image information indicating a position in the field;
detecting, from the image information, at least a part of an object that is present in the field; and
executing a predetermined process on the object on a basis of a result of the association,
wherein the predetermined process is a function of whether or not the position in the field at which the sound information is associated overlaps with an area within the field at which the object is located, and
wherein the executing includes distinguishing between determinations that:
(i) the object appears to be moving and is actually moving in the field based on a determination that the sound information indicates an acoustic level above a predetermined threshold and the position in the field that is associated overlaps with the area within the field at which the object is located; and
(ii) the object appears to be moving and is not actually moving in the field based on a determination that the sound information indicates an acoustic level not above the predetermined threshold and the position in the field that is associated overlaps with the area within the field at which the object is located.