US11599774B2 · Active · Utility · PatentIndex 56

Training machine learning model

Assignee: IBM · Priority: Mar 29, 2019 · Filed: Mar 29, 2019 · Granted: Mar 7, 2023
Est. expiry: Mar 29, 2039 (~12.7 yrs left) · nominal 20-yr term from priority
Inventors: ZHAO SHIWAN, Wu bing zhe, SU ZHONG
Classifications: G06N 3/09, G06N 3/0464, G06N 3/047, G06V 10/774, G06N 3/084, G06T 2207/20081, G06V 10/776, G06V 10/764, G06F 17/18, G06V 10/82, G06N 3/044, G06V 2201/03, G06N 20/00, G06T 2207/20084, G06T 7/0012, G06F 18/214, G06N 3/045, G06K 9/6256, G06N 3/0472
PatentIndex Score: 56 · Cited by: 0 · References: 22 · Claims: 20

Abstract

Techniques are provided for training a machine learning model. According to one aspect, training data is received by one or more processing units. The machine learning model is trained based on the training data, wherein the training comprises: optimizing the machine learning model based on stochastic gradient descent (SGD) by adding a dynamic noise to a gradient of a model parameter of the machine learning model calculated by the SGD.
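
As a rough illustration of the idea in the abstract, the following sketch adds zero-mean Gaussian noise to each gradient before the SGD update. It is not the patented implementation. The toy linear model, the function names (`noisy_sgd_step`, `mse_gradient`, `train`), the hyperparameters, and in particular the decaying schedule `sigma0 / (1 + t)` used to make the noise "dynamic" are all assumptions introduced here for illustration; the abstract does not specify how the noise varies.

```python
import random


def noisy_sgd_step(params, grads, lr, sigma, rng):
    """One SGD update with zero-mean Gaussian noise of scale sigma added to each gradient."""
    return [p - lr * (g + rng.gauss(0.0, sigma)) for p, g in zip(params, grads)]


def mse_gradient(params, batch):
    """Gradient of the mean squared error for the toy model y = w * x + b."""
    w, b = params
    n = len(batch)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in batch) / n
    return [grad_w, grad_b]


def train(data, steps=500, lr=0.05, sigma0=0.5, batch_size=4, seed=0):
    """Mini-batch SGD in which the noise scale changes at every step."""
    rng = random.Random(seed)
    params = [0.0, 0.0]
    for t in range(steps):
        batch = rng.sample(data, batch_size)
        grads = mse_gradient(params, batch)
        # Assumed schedule (not from the patent): the noise shrinks as training proceeds.
        sigma = sigma0 / (1.0 + t)
        params = noisy_sgd_step(params, grads, lr, sigma, rng)
    return params


# Toy data drawn from y = 2x + 1 on the interval [-1, 1].
data = [(x / 10.0, 2.0 * x / 10.0 + 1.0) for x in range(-10, 11)]
w, b = train(data)
print(f"w = {w:.3f}, b = {b:.3f}")
```

Setting `sigma` to zero reduces `noisy_sgd_step` to an ordinary SGD update, which makes it easy to compare the two behaviors on the same data.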

Claims

exact text as granted — not AI-modified
What is claimed is: 
     
       1. A method for training a machine learning model, comprising:
 acquiring, by one or more processing units, a training data; and 
 training, by one or more processing units, the machine learning model based on the training data, the training comprising: 
 optimizing, by one or more processing units, the machine learning model based on stochastic gradient descent (SGD), and 
 minimizing privacy leakage by adding a dynamic noise to a gradient of a model parameter of the machine learning model calculated by the SGD. 
 
     
     
       2. The method of  claim 1 , wherein the machine learning model is a convolutional neural networks (CNN) or a recurrent neural network (RNN). 
     
     
       3. The method of  claim 1 , wherein the training data is selected from the group consisting of: pathological data; autopilot data; medical experimental data; biological data; internet of things (IoT) data; social network data; e-commerce data. 
     
     
       4. The method of  claim 1 , wherein the optimizing further comprises minimizing a loss function of the machine learning model. 
     
     
       5. The method of  claim 4 , wherein the added dynamic noise is selected from a predefined noise set. 
     
     
       6. The method of  claim 5 , further comprising assigning a corresponding probability to each of the noises according to the loss function, wherein each of the noises is with a different scale from each other. 
     
     
       7. The method of  claim 6 , wherein the added dynamic noise is selected based on the probability assigned. 
     
     
       8. The method of  claim 5 , wherein the machine learning model is a CNN, and the predefined noise set comprises noises with three different scales and the training data are labeled pathological images. 
     
     
       9. The method of  claim 1 , wherein the noise is a Gaussian noise. 
     
     
       10. A computer system, comprising: a processor;
 a non-transitory computer-readable memory coupled to the processor, the memory comprising instructions that when executed by the processor perform actions of: 
 acquiring, by one or more processing units, a training data; and 
 training, by one or more processing units, the machine learning model based on the training data, the training comprising: 
 optimizing, by one or more processing units, the machine learning model based on stochastic gradient descent (SGD), and 
 minimizing privacy leakage by adding a dynamic noise to a gradient of a model parameter of the machine learning model calculated by the SGD. 
 
     
     
       11. The system of  claim 10 , wherein the machine learning model is a convolutional neural networks (CNN) or a recurrent neural network (RNN). 
     
     
       12. The system of  claim 10 , wherein the training data is selected from the group consisting of: pathological data; autopilot data; medical experimental data; biological data; internet of things (IoT) data; social network data; e-commerce data. 
     
     
       13. The system of  claim 10 , wherein the optimizing further comprises minimizing a loss function of the machine learning model. 
     
     
       14. The system of  claim 13 , wherein the added dynamic noise is selected from a predefined noise set. 
     
     
       15. The system of  claim 14 , further comprising assigning a corresponding probability to each of the noises according to the loss function, wherein each of the noises is with a different scale from each other. 
     
     
       16. The system of  claim 15 , wherein the added dynamic noise is selected based on the probability assigned. 
     
     
       17. The system of  claim 14 , wherein the machine learning model is a CNN, and the predefined noise set comprises noises with three different scales and the training data are labeled pathological images. 
     
     
       18. The system of  claim 10 , wherein the noise is a Gaussian noise. 
     
     
       19. A computer program product for training a machine learning model, comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to:
 acquiring, by one or more processing units, a training data; 
 training, by one or more processing units, the machine learning model based on the training data, the training comprising:
 optimizing, by one or more processing units, the machine learning model based on stochastic gradient descent (SGD) by adding a dynamic noise, selected from a predefined noise set, to a gradient of a model parameter of the machine learning model calculated by the SGD wherein the optimizing further comprises minimizing a loss function of the machine learning model, and 
 assigning a corresponding probability to each of the noises in the predefined noise set according to the loss function, wherein each of the noises is with a different scale from each other. 
 
 
     
     
       20. The computer program product of  claim 19 , wherein the training data is selected from the group consisting of: pathological data; autopilot data; medical experimental data; biological data; internet of things (IoT) data; social network data; e-commerce data.
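
Claims 5 to 8 describe choosing the noise from a predefined set of differently scaled noises, with a probability assigned to each according to the loss function. The claims do not state how the loss maps to a probability, so the sketch below should be read as one possible interpretation only: it tries each scale, evaluates the loss of the resulting candidate update, and turns the losses into probabilities with a softmax over negative loss. The three scale values, the softmax rule, the `temperature` parameter, the single-parameter toy model, and all function names are assumptions made here, and the sketch makes no claim about any privacy guarantee.

```python
import math
import random

# Hypothetical predefined noise set with three different scales (compare claim 8).
NOISE_SCALES = [0.01, 0.1, 1.0]


def loss(w, data):
    """Mean squared error of the toy model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def gradient(w, data):
    """Derivative of the mean squared error with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)


def scale_probabilities(losses, temperature=1.0):
    """Assumed rule: softmax over negative losses, so a lower loss gets a higher probability."""
    lowest = min(losses)  # subtract the minimum for numerical stability
    weights = [math.exp(-(value - lowest) / temperature) for value in losses]
    total = sum(weights)
    return [weight / total for weight in weights]


def dynamic_noise_step(w, data, lr, rng):
    """One update: build a candidate per noise scale, then sample one by its probability."""
    g = gradient(w, data)
    candidates = [w - lr * (g + rng.gauss(0.0, s)) for s in NOISE_SCALES]
    probs = scale_probabilities([loss(c, data) for c in candidates])
    idx = rng.choices(range(len(NOISE_SCALES)), weights=probs, k=1)[0]
    return candidates[idx], NOISE_SCALES[idx], probs


rng = random.Random(0)
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = 0.0
for _ in range(20):
    w, chosen_scale, probs = dynamic_noise_step(w, data, 0.05, rng)
print(f"w = {w:.3f}, last chosen scale = {chosen_scale}")
```

Because the scale is re-sampled at every step, the noise actually applied varies over training, which is one way to read "dynamic" in claim 1.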

Cited by (0)

No later patents cite this yet.

References (0)

No backward citations on record.