US11354542B2 · Active · Utility · PatentIndex 71

On-the-fly deep learning in machine learning at autonomous machines

Assignee: INTEL CORP · Priority: May 5, 2017 · Filed: Feb 6, 2020 · Granted: Jun 7, 2022

Est. expiry: May 5, 2037 (~10.8 yrs left) · nominal 20-yr term from priority

Inventors: YEHEZKEL ROHEKAR RAANAN YONATAN

G06V 10/774, G06V 10/82, G06V 10/764, G06F 18/2148, G06N 3/063, G06F 18/2413, G06F 18/2411, G06N 3/045, G06N 3/044, G06F 18/214, G06V 20/00, G06V 10/454, G06N 3/08, G06N 3/09, G06N 3/0895, G06N 3/0464, G06N 3/084, G06F 9/505, G06V 10/955, G06V 40/174, G06V 2201/06, G06F 9/46, G06V 10/95, G06N 3/04, G06N 3/0445, G06K 9/6256, G06K 9/6257, G06N 3/0454, G06K 9/6269, G06K 9/627, G06T 1/20, G06T 1/60, G06F 9/5027, G06F 9/3887

Abstract

A mechanism is described for facilitating on-the-fly deep learning in machine learning for autonomous machines. A method of embodiments, as described herein, includes detecting an output associated with a first deep network serving as a user-independent model associated with learning of one or more neural networks at a computing device having a processor coupled to memory. The method may further include automatically generating training data for a second deep network serving as a user-dependent model, where the training data is generated based on the output. The method may further include merging the user-independent model with the user-dependent model into a single joint model.
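Illustrative sketch (not part of the patent text): the flow in the abstract can be pictured with a toy example. A fixed "user-independent" model produces output, training data for a second "user-dependent" model is generated from that output, the second model is trained without touching the first, and the two are then combined into one joint model. Everything named here (`base_model`, `generate_training_data`, `train_user_model`, `make_joint_model`, the weights, and the sample data) is hypothetical. A single logistic unit stands in for a deep network, and merging by simple composition is an assumption, since the abstract does not say how the merge is performed.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def base_model(x):
    """Stand-in for the pre-trained, user-independent model (fixed, hypothetical)."""
    features = [x[0] + x[1], x[0] - x[1]]   # "learned" features, hard-coded here
    generic_score = sigmoid(features[0])    # context-independent prediction
    return features, generic_score


def generate_training_data(inputs, user_labels):
    """Build training pairs for the second model from the first model's output."""
    return [(base_model(x)[0], y) for x, y in zip(inputs, user_labels)]


def train_user_model(data, epochs=300, lr=0.1):
    """Train a logistic unit on the extracted features; the base model is not updated."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for feats, y in data:
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, feats)) + b)
            grad = p - y                    # gradient of log loss w.r.t. the logit
            w = [wi - lr * grad * fi for wi, fi in zip(w, feats)]
            b -= lr * grad
    return w, b


def make_joint_model(w, b):
    """Combine both models into one callable (assumed merge strategy: composition)."""
    def joint(x):
        feats, generic_score = base_model(x)
        personal_score = sigmoid(sum(wi * fi for wi, fi in zip(w, feats)) + b)
        return generic_score, personal_score
    return joint


# Hypothetical user-specific labels: 1 when the first input component dominates.
inputs = [(1, 0), (0, 1), (2, 1), (1, 2), (3, 0), (0, 3)]
user_labels = [1, 0, 1, 0, 1, 0]

data = generate_training_data(inputs, user_labels)
w, b = train_user_model(data)
joint = make_joint_model(w, b)

generic_a, personal_a = joint((3, 0))
generic_b, personal_b = joint((0, 3))
```

In this toy setup the generic score is the same for `(3, 0)` and `(0, 3)`, while the user-dependent score separates them, which is the kind of added distinction the second model is meant to supply.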

Claims

exact text as granted — not AI-modified

What is claimed is: 
     
       1. A server device comprising: a storage device to store a graphics execution environment, the graphics execution environment including a deep learning framework to accelerate deep learning operations via one or more general-purpose graphics processors, the deep learning framework to cause the one or more general-purpose graphics processors to perform operations to: generate output via a first deep neural network (DNN) model, wherein the first DNN model is a pre-trained DNN model for computer vision to enable context-independent classification of an object within an input video frame; extract a feature learned by the first DNN model based on the generated output; generate training data for a second DNN model based on the extracted feature; and train a second DNN model based on the extracted feature, the second DNN model a context-dependent extension of the first DNN model, wherein the deep learning framework is to provide a library of machine learning primitives, the machine learning primitives accelerated via instructions executed by the one or more general-purpose graphics processors and to train the second DNN model includes to train the second DNN model via one or more primitives provided by the deep learning framework, the one or more primitives to implement linear algebra subprograms associated with respective layers of the second DNN model, the respective layers including a fully connected layer. 
     
     
       2. The server device as in  claim 1 , the deep learning framework to cause the one or more general-purpose graphics processors to perform operations to:
 detect an output associated with the first DNN model; 
 generate training data based on the output associated with the first DNN; and 
 train the second DNN model based on the training data independently of the first DNN model. 
 
     
     
       3. The server device as in  claim 2 , the deep learning framework to cause the one or more general-purpose graphics processors to merge the first DNN model with the second DNN model into a joint model. 
     
     
       4. The server device as in  claim 3 , the deep learning framework to cause the one or more general-purpose graphics processors to perform joint model tuning of one or more parameters of the joint model. 
     
     
       5. The server device as in  claim 1 , wherein the library of machine learning primitives includes primitives to perform tensor convolution, at least one activation function, and a pooling operation. 
     
     
       6. The server device as in  claim 1 , wherein the one or more primitives to implement the linear algebra subprograms include primitives to perform matrix operations. 
     
     
       7. A non-transitory machine-readable medium storing instructions which, when executed by one or more processors, cause the one or more processors to perform operations comprising: generating output via a first deep neural network (DNN) model via a deep learning framework accelerated via the one or more processors, wherein the first DNN model is a pre-trained DNN model for computer vision that enables context-independent classification of an object within an input video frame and the one or more processors include a general-purpose graphics processor; extracting, via the deep learning framework, a feature learned by the first DNN model; and training, via the deep learning framework, a second DNN model for computer vision based on the extracted feature, the second DNN model a context-dependent extension of the first DNN model, wherein the deep learning framework is to provide a library of machine learning primitives, the machine learning primitives accelerated via instructions executed by the one or more general-purpose graphics processors and training the second DNN model includes training the second DNN model via one or more primitives provided by the deep learning framework, the one or more primitives to implement linear algebra subprograms associated with respective layers of the second DNN model, the respective layers including a fully connected layer. 
     
     
       8. The non-transitory machine-readable medium as in  claim 7 , the operations additionally comprising:
 detecting an output associated with the first DNN model; 
 generating training data based on the output associated with the first DNN; and 
 training the second DNN model based on the training data independently of training performed for the first DNN model. 
 
     
     
       9. The non-transitory machine-readable medium as in  claim 8 , wherein the deep learning framework causes the general-purpose graphics processor to perform operations to merge the first DNN model with the second DNN model into a joint model. 
     
     
       10. The non-transitory machine-readable medium as in  claim 9 , wherein the deep learning framework causes the general-purpose graphics processor to perform joint model tuning of one or more parameters of the joint model. 
     
     
       11. The non-transitory machine-readable medium as in  claim 7 , wherein the library of machine learning primitives includes primitives to perform tensor convolution, at least one activation function, and a pooling operation. 
     
     
       12. The non-transitory machine-readable medium as in  claim 7 , wherein the one or more primitives to implement the linear algebra subprograms include primitives to perform matrix operations. 
     
     
       13. A data processing system on a server device, the data processing system included within a graphics execution environment stored on a server device, the data processing system comprising instructions to provide a deep learning framework to accelerate deep learning operations via one or more general-purpose graphics processors of a computing device configured to host the graphics execution environment, the deep learning framework to cause the one or more general-purpose graphics processors to perform operations comprising: generating output via a first deep neural network (DNN) model via a deep learning framework accelerated via the one or more general-purpose graphics processors, wherein the first DNN model is a pre-trained DNN model for computer vision to enable context-independent classification of an object within an input video frame and the one or more processors include a general-purpose graphics processor; via the deep learning framework, detecting an output associated with the first DNN model to extract a feature learned by the first DNN model; generating training data based on the output associated with the first DNN; and training, via the deep learning framework, a second DNN model for computer vision using the training data to enable the second DNN model to learn the extracted feature, the second DNN model a context-dependent extension of the first DNN model, wherein the deep learning framework is to provide a library of machine learning primitives, the machine learning primitives accelerated via instructions executed by the one or more general-purpose graphics processors and training the second DNN model includes training the second DNN model via one or more primitives provided by the deep learning framework, the one or more primitives to implement linear algebra subprograms associated with respective layers of the second DNN model, the respective layers including a fully connected layer. 
     
     
       14. The data processing system as in  claim 13 , wherein the second DNN model is trained independently of the first DNN model and the deep learning framework is configured to cause the general-purpose graphics processor to perform operations to merge the first DNN model with the second DNN model into a joint model. 
     
     
       15. The data processing system as in  claim 14 , wherein the deep learning framework causes the general-purpose graphics processor to perform joint model tuning of one or more parameters of the joint model. 
     
     
       16. The data processing system as in  claim 13 , wherein the library of machine learning primitives includes primitives to perform tensor convolution, at least one activation function, and a pooling operation. 
     
     
       17. The data processing system as in  claim 13 , wherein the one or more primitives to implement the linear algebra subprograms include primitives to perform matrix operations. 
     
     
       18. The data processing system as in  claim 17 , wherein the one or more primitives to implement the linear algebra subprograms include primitives to perform vector operations. 
     
     
       19. The server device as in  claim 6 , wherein the one or more primitives to implement the linear algebra subprograms include primitives to perform vector operations. 
     
     
       20. The non-transitory machine-readable medium as in  claim 12 , wherein the one or more primitives to implement the linear algebra subprograms include primitives to perform vector operations.
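
Illustrative sketch (not part of the claims): the claims refer to framework primitives that implement linear algebra subprograms for a fully connected layer, along with activation functions, matrix operations, and vector operations. A minimal pure-Python picture of that idea follows, assuming a fully connected layer is computed as a matrix multiply, a bias add, and an activation. The function names (`gemm`, `add_bias`, `relu`, `fully_connected`) and the sample values are hypothetical and do not correspond to any particular framework's API; a real framework would dispatch these operations to GPU-accelerated routines.

```python
def gemm(a, b):
    """Matrix operation: multiply an (m x k) matrix by a (k x n) matrix."""
    k, n = len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


def add_bias(y, bias):
    """Vector operation: add the bias vector to every row of y."""
    return [[v + bj for v, bj in zip(row, bias)] for row in y]


def relu(y):
    """Activation primitive: element-wise max(0, v)."""
    return [[max(0.0, v) for v in row] for row in y]


def fully_connected(x, weights, bias):
    """Fully connected layer expressed through the primitives above."""
    return relu(add_bias(gemm(x, weights), bias))


# One input row with two features, mapped to two output units.
x = [[1.0, 2.0]]
weights = [[1.0, -1.0],
           [0.5, 0.5]]
bias = [0.0, -1.0]

out = fully_connected(x, weights, bias)
```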

Cited by (0)

No later patents cite this yet.

References (0)

No backward citations on record.