US11636652B2 · Active · Utility · PatentIndex 95

Periocular and audio synthesis of a full face image

Assignee: MAGIC LEAP INC · Priority: Nov 11, 2016 · Filed: Nov 5, 2021 · Granted: Apr 25, 2023

Est. expiry: Nov 11, 2036 (~10.4 yrs left) · nominal 20-yr term from priority

G06F 3/012 · G06F 3/013 · G06V 40/171 · G06T 13/40 · G06V 40/169 · G06T 17/20 · G10L 2015/025 · G02B 27/0172 · G02B 2027/0138 · G10L 21/10 · G02B 27/017 · G02B 27/0093 · G02B 2027/0178 · G06V 40/193 · G02B 2027/014 · G06F 3/011 · G10L 2021/105 · G06T 19/006 · G06F 2203/011


Abstract

Systems and methods for synthesizing an image of the face by a head-mounted device (HMD) are disclosed. The HMD may not be able to observe a portion of the face. The systems and methods described herein can generate a mapping from a conformation of the portion of the face that is not imaged to a conformation of the portion of the face observed. The HMD can receive an image of a portion of the face and use the mapping to determine a conformation of the portion of the face that is not observed. The HMD can combine the observed and unobserved portions to synthesize a full face image.

Claims

exact text as granted — not AI-modified

What is claimed is:

1. A computing system comprising:
a hardware processor programmed to:
access an image of a periocular region of a user wherein features of a lower portion of the users' face are unobservable in the image;
generate, based at least partly on the image of the periocular region of the user, periocular face parameters encoding a periocular conformation of at least the periocular region of the user;
access a base model that was generated using images associated with a group of people not including the user;
customize a mapping based at least in part on the base model and the image of the periocular region of the user,
wherein an input of the mapping comprises at least the image of the periocular region of the user, and
wherein an output of the mapping comprises lower face parameters that encode a conformation of the lower face of the user that are deduced from an analysis of the image of the periocular region of the user;
apply the mapping to the image of the periocular region of the user to generate the lower face parameters; and
combine the periocular face parameters and the lower face parameters to generate full face parameters associated with a three-dimensional (3D) face model.
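
Illustrative note (not part of the claim text): the steps of claim 1 can be sketched in a few lines of Python. This is only a sketch under loud assumptions. The claim does not say what form the mapping takes or how it is customized, so the linear `Mapping` class, the `encode_periocular` stand-in, and the re-centering rule in `customize_mapping` are all hypothetical. For simplicity the mapping here consumes parameters already derived from the periocular image rather than raw pixels.

```python
from dataclasses import dataclass
from typing import List, Sequence

Vector = List[float]


@dataclass
class Mapping:
    """Hypothetical linear mapping: lower = weights * periocular + bias."""
    weights: List[Vector]  # one row per lower face parameter
    bias: Vector           # lower face parameters for a neutral input

    def apply(self, periocular: Sequence[float]) -> Vector:
        return [
            sum(w * p for w, p in zip(row, periocular)) + b
            for row, b in zip(self.weights, self.bias)
        ]


def encode_periocular(image: Sequence[float]) -> Vector:
    # Assumption: the "image" is already a short list of measurements
    # (e.g. brow height, eyelid opening); a real system would extract these.
    return [float(x) for x in image]


def customize_mapping(base: Mapping, user_neutral: Sequence[float]) -> Mapping:
    # Assumption: customization re-centers the base mapping so that the
    # user's own neutral periocular parameters map to the base model's
    # neutral lower face (base.bias). The claim does not prescribe this rule.
    shift = Mapping(base.weights, [0.0] * len(base.bias)).apply(user_neutral)
    return Mapping(base.weights, [b - s for b, s in zip(base.bias, shift)])


def full_face_parameters(image: Sequence[float], mapping: Mapping) -> Vector:
    periocular = encode_periocular(image)   # periocular face parameters
    lower = mapping.apply(periocular)       # deduced lower face parameters
    return periocular + lower               # combined full face parameters


# Base model built from other people (values are made up for illustration).
base = Mapping(weights=[[1.0, 0.0], [0.0, 2.0]], bias=[0.5, 0.5])
custom = customize_mapping(base, user_neutral=[0.2, 0.1])
params = full_face_parameters([0.4, 0.1], custom)
```

Concatenation is the simplest way to combine the two parameter sets; a system in which both sets index the same 3D face model could merge them differently.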

2. The computing system of claim 1, wherein the 3D face model comprises a deformable linear model and wherein the periocular face parameters and the lower face parameters describe a deformation of the face when the user is speaking.
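
Illustrative note (not part of the claim text): a deformable linear model is commonly written as a mean shape plus a weighted sum of basis deformations. The sketch below assumes that form; the function name `deform`, the flattened coordinate layout, and the two basis directions are hypothetical and chosen only to make the arithmetic visible.

```python
from typing import List, Sequence


def deform(mean_shape: Sequence[float],
           basis: Sequence[Sequence[float]],
           params: Sequence[float]) -> List[float]:
    """Return mean_shape + sum over k of params[k] * basis[k]."""
    if len(basis) != len(params):
        raise ValueError("need exactly one parameter per basis direction")
    shape = list(mean_shape)
    for weight, direction in zip(params, basis):
        for i, delta in enumerate(direction):
            shape[i] += weight * delta
    return shape


# Two 2D vertices flattened as [x0, y0, x1, y1] (made-up values).
mean_shape = [0.0, 0.0, 1.0, 0.0]
basis = [
    [0.0, 1.0, 0.0, 1.0],   # hypothetical "jaw open" direction
    [1.0, 0.0, -1.0, 0.0],  # hypothetical "lip stretch" direction
]
shape = deform(mean_shape, basis, params=[0.5, 0.0])
```

Because the model is linear in the parameters, reflecting an updated parameter value only requires evaluating `deform` again with the new vector.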

3. The computing system of claim 2, wherein to generate the full face parameters, the hardware processor is programmed to update the 3D face model to reflect an update to at least one of the lower face parameters or the periocular face parameters.

4. The computing system of claim 1, wherein the input of the mapping further comprises at least one of eye specific information, a body movement, or a heart rate.

5. The computing system of claim 4, wherein the eye specific information comprises at least one of: an eye pose, a pupil dilation state, an eye color, or an eyelid state of the user.

6. The computing system of claim 1, wherein the lower face parameters encode visemes which visually describe phonemes in an audio stream.
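
Illustrative note (not part of the claim text): a viseme is the visible mouth shape that accompanies a phoneme, and several phonemes usually share one viseme (for example, the bilabials /p/, /b/, and /m/ all close the lips). The lookup below is a minimal sketch of that many-to-one relationship. The table entries, viseme labels, and the `visemes_for` helper are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Sequence

# Hypothetical phoneme-to-viseme table; labels are illustrative only.
PHONEME_TO_VISEME: Dict[str, str] = {
    "p": "bilabial_closed",
    "b": "bilabial_closed",
    "m": "bilabial_closed",
    "f": "labiodental",
    "v": "labiodental",
    "aa": "open_jaw",
    "uw": "rounded_lips",
}


def visemes_for(phonemes: Sequence[str]) -> List[str]:
    """Map each phoneme to its viseme, falling back to a neutral mouth."""
    return [PHONEME_TO_VISEME.get(p, "neutral") for p in phonemes]


sequence = visemes_for(["m", "aa", "p"])
```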

7. The computing system of claim 1, wherein to customize the mapping, the hardware processor is programmed to infer a skin texture of the face of the user based at least partly on the image of the periocular region.

8. The computing system of claim 1, further comprising:
an imaging system comprising an eye camera, wherein the image of the periocular region acquired by the imaging system comprises an image of the periocular region for a first eye.

9. The computing system of claim 8, wherein to generate the full face parameters, the hardware processor is programmed to:
determine periocular face parameters for a second eye based on the image of the periocular region acquired by the imaging system; and
incorporate the periocular face parameters for the second eye into the full face parameters.

10. The computing system of claim 1, wherein the base model comprises a 3D deformable linear model generated based on images of the group of people.

11. A method comprising:
accessing an image of the periocular region of a user acquired by an imaging system comprising one or more cameras positioned such that a periocular region of the user's face is observable by the imaging system and the user's lower face is unobservable by the imaging system;
determining, based at least partly on the image, periocular face parameters encoding a periocular conformation of at least the periocular region of the user;
accessing a base model that was generated using images associated with a group of people not including the user;
customizing a mapping based at least in part on the base model and the image of the periocular region of the user,
wherein an input of the mapping comprises at least the image of the periocular region of the user, and
wherein an output of the mapping comprises lower face parameters that encode a conformation of the lower face of the user, and that are deduced from an analysis of the image of the periocular region of the user;
applying the mapping to the image to generate the lower face parameters;
combining the periocular face parameters and the lower face parameters to generate full face parameters associated with a three-dimensional (3D) face model; and
generating a full face image based at least partly on the full face parameters.

12. The method of claim 11, wherein the 3D face model comprises a deformable linear model and wherein the periocular face parameters and the lower face parameters describe a deformation of the face when the user is speaking.

13. The method of claim 12, further comprising updating the 3D face model to reflect an update to at least one of the lower face parameters or the periocular face parameters.

14. The method of claim 11, wherein the full face parameters are combined with eye specific information to determine an animation associated with the user's face.

15. The method of claim 14, wherein the eye specific information comprises at least one of an eye pose, a pupil dilation state, an eye color, or an eyelid state of the user.

16. The method of claim 11, wherein the full face image further incorporates skin textures of the user which are determined based at least partly on the image acquired by the imaging system.

17. The method of claim 11, wherein the mapping comprises a likelihood that a periocular face parameter is associated with a lower face parameter, and the lower face parameters are selected to generate the full face image in response to a determination that they pass threshold criteria.
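
Illustrative note (not part of the claim text): one way to read claim 17 is that each candidate set of lower face parameters carries a likelihood, and a candidate is used only if it clears a threshold. The sketch below assumes the threshold criterion is a simple minimum likelihood, which the claim does not state; `select_lower_face` and the candidate values are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Candidate = Tuple[float, List[float]]  # (likelihood, lower face parameters)


def select_lower_face(candidates: Sequence[Candidate],
                      threshold: float) -> Optional[List[float]]:
    """Return the most likely candidate's parameters if its likelihood
    meets the threshold; otherwise return None so the caller can fall back
    (for example, to the previous frame's parameters)."""
    if not candidates:
        return None
    likelihood, params = max(candidates, key=lambda c: c[0])
    return params if likelihood >= threshold else None


candidates = [
    (0.15, [0.0, 0.1]),  # made-up values for illustration
    (0.80, [0.6, 0.2]),
]
chosen = select_lower_face(candidates, threshold=0.5)
rejected = select_lower_face(candidates, threshold=0.9)
```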

18. The method of claim 11, wherein the image comprise at least one of a still image or a video frame.

19. The method of claim 11, wherein the base model comprises a 3D deformable linear model generated based on images of the group of people.
