System and method for autonomous navigation using visual sparse map
Abstract
A system and method for autonomous navigation using a visual sparse map. The system includes a robotic device having an RGB-D camera, a processor and a storage device storing computer executable code. The computer executable code is configured to: obtain the visual sparse map based on captured RGB-D images; capture an RGB image; acquire a current pose of the robotic device; find a keyframes nearest to the current pose of the robotic device; find a target waypoint that is ahead of the nearest keyframe at about a pre-defined distance; compute transition velocity and rotation velocity of the robotic device based on relative location between the robotic device and the target waypoint; and control operation of the robotic device using the computed transition velocity and rotation velocity to achieve autonomous navigation.
Claims
exact text as granted — not AI-modifiedWhat is claimed is:
1. A system for autonomous navigation using a visual sparse map, comprising a robotic device, wherein the robotic device comprises a visual sensor, a processor and a storage device storing computer executable code, and the computer executable code, when executed at the processor, is configured to:
obtain the visual sparse map, the visual sparse map comprising a plurality of visual feature points and a plurality of keyframes;
capture an image using the visual sensor;
acquire a current pose of the robotic device based on the image and the visual sparse map;
find a nearest keyframe, wherein the nearest keyframe is one of the plurality of keyframes that is nearest to the current pose of the robotic device;
find a target waypoint, wherein the target waypoint is one of the plurality of keyframes that is ahead of the nearest keyframe and has a pre-determined distance to the nearest keyframe;
compute transition velocity and rotation velocity of the robotic device based on relative location between the robotic device and the target waypoint; and
control operation of the robotic device using the computed transition velocity and rotation velocity to achieve autonomous navigation,
wherein the plurality of keyframes are listed as {x 0 , x 1 , . . . , x k-1 , x k }, x k is defined by three dimensional coordinates of the k th keyframe, k is an index of x k , and the target waypoint has an index greater than an index of the nearest keyframe.
2. The system of claim 1 , wherein the pre-defined distance is in a range of 30 centimeter (cm) to 50 cm.
3. The system of claim 1 , wherein the computer executable code is configured to compute the transition velocity and the rotation velocity based on a transition difference and an angular difference between the current pose of the robotic device and the target waypoint, and the transition difference t D and the angular difference θ D are calculated by:
t D =∥t T −t R ∥, and
θ D =|θ T −θ R |,
wherein t T is a location of the target waypoint in the visual sparse map, t R is a location of the robotic device in the visual sparse map, and t D is the transition difference between the location of the robotic device and the location of the target waypoint; and
wherein θ T is an orientation of the target waypoint in 2D space of the visual sparse map, θ R is an orientation of the robotic device in the 2D space of the visual sparse map, and θ D is the angular difference between the orientation of the robotic device and the orientation of the target waypoint.
4. The system of claim 3 , wherein the computer executable code is configured to compute the transition velocity V T and the rotation velocity V θ by:
V
T
=
{
V
m
if
θ
D
≤
θ
h
V
m
2
if
θ
D
>
θ
h
,
and
V
θ
=
θ
D
α
,
wherein V m is a desired maximum translation speed of the robotic device, θ h is a threshold of angular difference for reducing V T , and α is an empirical coefficient.
5. The system of claim 4 , wherein θ h is about 55-60 degrees, and a is in a range of 3-12.
6. The system of claim 1 , wherein the visual sensor is an RGB-D camera, the visual sparse map is obtained based on collected RGB-D images by the visual sensor, and the image captured is RGB image.
7. The system of claim 6 , wherein the computer executable code is configured to obtain the visual sparse map by:
collecting each of the RGB-D images;
extracting feature points from each of the RGB-D images; and
for each of the RGB-D images:
predicting a relative pose between the current image and a local map based on the extracted feature points and feature points in the local map;
determining if the current image is a new keyframe by comparing the extracted feature points with feature points in a last keyframe;
optimizing the local map using the new keyframe;
determining loop closure based on the new keyframe; and
storing the extracted feature points and the new keyframe to obtain the visual sparse map.
8. The system of claim 7 , wherein the step of predicting a relative pose is performed using at least one of motion model, visual odometry, and relocalization.
9. The system of claim 7 , wherein the current image is determined to be the new keyframe if a number of matched points between the current image and the last keyframe is smaller than a threshold.
10. The system of claim 7 , wherein the computer executable code is configured to delete an old keyframe when a number of keyframes in the local map is greater than a threshold.
11. The system of claim 7 , wherein the step of determining loop closure uses a place recognition database consisting of a visual vocabulary.
12. A method for autonomous navigation using a visual sparse map, comprising:
obtaining the visual sparse map, the visual sparse map comprising a plurality of visual feature points and a plurality of keyframes;
capturing an image using the visual sensor;
acquiring, by a processor of the robotic device, a current pose of the robotic device based on the image and the visual sparse map;
finding, by the processor, a nearest keyframe, wherein the nearest keyframe is one of the plurality of keyframes that is nearest to the current pose of the robotic device;
finding, by the processor, a target waypoint, wherein the target waypoint is one of the plurality of keyframes that is ahead of the nearest keyframe and has a pre-determined distance to the nearest keyframe;
computing, by the processor, transition velocity and rotation velocity of the robotic device based on relative location between the robotic device and the target waypoint; and
controlling, by the processor, operation of the robotic device using the computed transition velocity and rotation velocity to achieve autonomous navigation,
wherein the plurality of keyframes are listed as {x 0 , x 1 , . . . , x k-1 , x k }, x k is defined by three dimensional coordinates of the k th keyframe, k is an index of x k , and the target waypoint has an index greater than an index of the nearest keyframe.
13. The method of claim 12 , wherein the pre-defined distance is in a range of 30 cm to 50 cm.
14. The method of claim 12 , wherein the step of computing the transition velocity and the rotation velocity is performed based on a transition difference and an angular difference between the current pose of the robotic device and the target waypoint, and the transition difference t D and the angular difference θ D are calculated by:
t D =∥t T −t R ∥, and
θ D =|θ T −θ R |,
wherein t T is a location of the target waypoint in the visual sparse map, t R is a location of the robotic device in the visual sparse map, and t D is the transition difference between the location of the robotic device and the location of the target waypoint; and
wherein θ T is an orientation of the target waypoint in 2D space of the visual sparse map, θ R is an orientation of the robotic device in the 2D space of the visual sparse map, and θ D is the angular difference between the orientation of the robotic device and the orientation of the target waypoint.
15. The method of claim 14 , wherein the transition velocity V T and the rotation velocity V θ are computed by:
V
T
=
{
V
m
if
θ
D
≤
θ
h
V
m
2
if
θ
D
>
θ
h
,
and
V
θ
=
θ
D
α
,
wherein V m is a desired maximum translation speed of the robotic device, θ h is a threshold of angular difference for reducing V T , and a is an empirical coefficient.
16. The method of claim 15 , wherein θ h is about 55-60 degrees, a is in a range of 3-12, the visual sensor is an RGB-D camera, the visual sparse map is obtained based on RGB-D images collected by the visual sensor, and the image captured is an RGB image.
17. A non-transitory computer readable medium storing computer executable code, wherein the computer executable code, when executed at a processor of a robotic device, is configured to:
obtain the visual sparse map, the visual sparse map comprising a plurality of visual feature points and a plurality of keyframes;
capture an image using the visual sensor;
acquire a current pose of the robotic device based on the image and the visual sparse map;
find a nearest keyframe, wherein the nearest keyframe is one of the plurality of keyframes that is nearest to the current pose of the robotic device;
find a target waypoint, wherein the target waypoint is one of the plurality of keyframes that is ahead of the nearest keyframe and has a pre-determined distance to the nearest frame;
compute transition velocity and rotation velocity of the robotic device based on relative location between the robotic device and the target waypoint; and
control operation of the robotic device using the computed transition velocity and rotation velocity to achieve autonomous navigation,
wherein the plurality of keyframes are listed as {x 0 , x 1 , . . . , x k-1 , x k }, x k is defined by three dimensional coordinates of the k th keyframe, k is an index of x k , and the target waypoint has an index greater than an index of the nearest keyframe.
18. The non-transitory computer readable medium of claim 17 , wherein the computer executable code is configured to compute the transition velocity V T and the rotation velocity V θ by:
V
T
=
{
V
m
if
θ
D
≤
θ
h
V
m
2
if
θ
D
>
θ
h
,
and
V
θ
=
θ
D
α
,
wherein V m is a desired maximum translation speed of the robotic device, θ D is the angular difference between the orientation of the robotic device and the orientation of the target waypoint, θ h is a threshold of angular difference for reducing V T , and a is an empirical coefficient.
19. The non-transitory computer readable medium of claim 17 , wherein the pre-defined distance is in a range of 30 cm to 50 cm, θ h is about 55-60 degrees, and α is in a range of 3-12.Cited by (0)
No later patents cite this yet.
References (0)
No backward citations on record.