US11599774B2 · Active · Utility · PatentIndex 56
Training machine learning model
Est. expiry: Mar 29, 2039 (~12.7 yrs left) · nominal 20-yr term from priority
G06N 3/09, G06N 3/0464, G06N 3/047, G06V 10/774, G06N 3/084, G06T 2207/20081, G06V 10/776, G06V 10/764, G06F 17/18, G06V 10/82, G06N 3/044, G06V 2201/03, G06N 20/00, G06T 2207/20084, G06T 7/0012, G06F 18/214, G06N 3/045, G06K 9/6256, G06N 3/0472
PatentIndex Score: 56 · Cited by: 0 · References: 22 · Claims: 20
Abstract
Techniques are provided for training machine learning model. According to one aspect, a training data is received by one or more processing units. The machine learning model is trained based on the training data, wherein the training comprises: optimizing the machine learning model based on stochastic gradient descent (SGD) by adding a dynamic noise to a gradient of a model parameter of the machine learning model calculated by the SGD.
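The update described in the abstract can be illustrated with a minimal sketch: compute a stochastic gradient, add noise to it, then take the usual descent step. The function names, the linear model, the learning rate, and the decaying noise schedule below are all assumptions made for illustration; the abstract does not say how the noise varies over time, and this sketch makes no claim about any privacy property.

```python
import random


def noisy_sgd_step(w, grad, lr, noise_scale, rng):
    """One SGD update with zero-mean Gaussian noise added to each gradient component."""
    noisy_grad = [g + rng.gauss(0.0, noise_scale) for g in grad]
    return [wi - lr * gi for wi, gi in zip(w, noisy_grad)]


def squared_error_grad(w, x, y):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w for a single example."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [err * xi for xi in x]


def train(data, steps=2000, lr=0.05, seed=0):
    """Fit a two-parameter linear model with noisy SGD (illustrative only)."""
    rng = random.Random(seed)
    w = [0.0, 0.0]
    for t in range(steps):
        x, y = rng.choice(data)  # stochastic: one training example per step
        # Hypothetical schedule: the noise scale changes at every step.
        noise_scale = 0.1 / (1 + t) ** 0.5
        w = noisy_sgd_step(w, squared_error_grad(w, x, y), lr, noise_scale, rng)
    return w


# Toy data generated by y = 2*x0 - 1*x1 (an assumption for the demo).
data = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
w = train(data)
```

With this toy data the learned weights should end up close to (2, -1); the noise perturbs each step, but its scale shrinks as training proceeds under the assumed schedule.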
Claims
Exact text as granted, not AI-modified.
What is claimed is:
1. A method for training a machine learning model, comprising:
acquiring, by one or more processing units, a training data; and
training, by one or more processing units, the machine learning model based on the training data, the training comprising:
optimizing, by one or more processing units, the machine learning model based on stochastic gradient descent (SGD), and
minimizing privacy leakage by adding a dynamic noise to a gradient of a model parameter of the machine learning model calculated by the SGD.
2. The method of claim 1 , wherein the machine learning model is a convolutional neural networks (CNN) or a recurrent neural network (RNN).
3. The method of claim 1 , wherein the training data is selected from the group consisting of: pathological data; autopilot data; medical experimental data; biological data; internet of things (IoT) data; social network data; e-commerce data.
4. The method of claim 1 , wherein the optimizing further comprises minimizing a loss function of the machine learning model.
5. The method of claim 4 , wherein the added dynamic noise is selected from a predefined noise set.
6. The method of claim 5 , further comprising assigning a corresponding probability to each of the noises according to the loss function, wherein each of the noises is with a different scale from each other.
7. The method of claim 6 , wherein the added dynamic noise is selected based on the probability assigned.
8. The method of claim 5 , wherein the machine learning model is a CNN, and the predefined noise set comprises noises with three different scales and the training data are labeled pathological images.
9. The method of claim 1 , wherein the noise is a Gaussian noise.
10. A computer system, comprising: a processor;
a non-transitory computer-readable memory coupled to the processor, the memory comprising instructions that when executed by the processor perform actions of:
acquiring, by one or more processing units, a training data; and
training, by one or more processing units, the machine learning model based on the training data, the training comprising:
optimizing, by one or more processing units, the machine learning model based on stochastic gradient descent (SGD), and
minimizing privacy leakage by adding a dynamic noise to a gradient of a model parameter of the machine learning model calculated by the SGD.
11. The system of claim 10 , wherein the machine learning model is a convolutional neural networks (CNN) or a recurrent neural network (RNN).
12. The system of claim 10 , wherein the training data is selected from the group consisting of: pathological data; autopilot data; medical experimental data; biological data; internet of things (IoT) data; social network data; e-commerce data.
13. The system of claim 10 , wherein the optimizing further comprises minimizing a loss function of the machine learning model.
14. The system of claim 13 , wherein the added dynamic noise is selected from a predefined noise set.
15. The system of claim 14 , further comprising assigning a corresponding probability to each of the noises according to the loss function, wherein each of the noises is with a different scale from each other.
16. The system of claim 15 , wherein the added dynamic noise is selected based on the probability assigned.
17. The system of claim 14 , wherein the machine learning model is a CNN, and the predefined noise set comprises noises with three different scales and the training data are labeled pathological images.
18. The system of claim 10 , wherein the noise is a Gaussian noise.
19. A computer program product for training a machine learning model, comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to:
acquiring, by one or more processing units, a training data;
training, by one or more processing units, the machine learning model based on the training data, the training comprising:
optimizing, by one or more processing units, the machine learning model based on stochastic gradient descent (SGD) by adding a dynamic noise, selected from a predefined noise set, to a gradient of a model parameter of the machine learning model calculated by the SGD wherein the optimizing further comprises minimizing a loss function of the machine learning model, and
assigning a corresponding probability to each of the noises in the predefined noise set according to the loss function, wherein each of the noises is with a different scale from each other.
20. The computer program product of claim 19 , wherein the training data is selected from the group consisting of: pathological data; autopilot data; medical experimental data; biological data; internet of things (IoT) data; social network data; e-commerce data.
Cited by (0)
No later patents cite this yet.
References (0)
No backward citations on record.
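Illustrative sketch (not part of the patent text). Claims 5 to 7 recite selecting the added noise from a predefined noise set, with a probability assigned to each noise according to the loss function. The claims do not state the rule that maps loss to probability, so the version below assumes one: each candidate noise is tried, and a softmax over the negative trial losses gives lower-loss candidates a higher chance of being picked. The three scales, the temperature parameter, the 1-D linear model, and all function names are hypothetical.

```python
import math
import random

NOISE_SCALES = [0.01, 0.1, 1.0]  # hypothetical predefined noise set, three scales


def loss(w, data):
    """Mean squared error of the 1-D linear model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def grad(w, data):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)


def assign_probabilities(trial_losses, temperature=1.0):
    """Assumed rule: softmax over negative losses (lower loss, higher probability)."""
    lowest = min(trial_losses)  # subtract the minimum for numerical stability
    weights = [math.exp(-(l - lowest) / temperature) for l in trial_losses]
    total = sum(weights)
    return [wt / total for wt in weights]


def dynamic_noise_step(w, data, lr, rng):
    """One update: draw a Gaussian noise per scale, weight by trial loss, pick one."""
    g = grad(w, data)
    candidates = [rng.gauss(0.0, s) for s in NOISE_SCALES]
    # Loss each candidate would produce if it were added to the gradient.
    trial_losses = [loss(w - lr * (g + n), data) for n in candidates]
    probs = assign_probabilities(trial_losses)
    # Select one noise according to the assigned probabilities.
    chosen = rng.choices(candidates, weights=probs, k=1)[0]
    return w - lr * (g + chosen), probs


rng = random.Random(0)
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # toy data from y = 3 * x
w = 0.0
for _ in range(200):
    w, probs = dynamic_noise_step(w, data, lr=0.05, rng=rng)
```

Evaluating one trial loss per candidate costs three extra loss computations per step here; other readings of "according to the loss function" are possible, and this sketch is only one of them.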