US11501008B2ActiveUtilityPatentIndex 59
Differential privacy using a multibit histogram
Est. expiryJun 4, 2037(~10.9 yrs left)· nominal 20-yr term from priority
G06F 17/18H04L 2209/42H04L 9/3239H04L 2209/34G06F 16/26G06F 17/16G06F 21/6254G06F 21/606G06F 17/145
59
PatentIndex Score
0
Cited by
85
References
20
Claims
Abstract
Embodiments described herein ensure differential privacy when transmitting data to a server that estimates a frequency of such data amongst a set of client devices. The differential privacy mechanism may provide a predictable degree of variance for frequency estimations of data. The system may use a multibit histogram model or Hadamard multibit model for the differential privacy mechanism, both of which provide a predictable degree of accuracy of frequency estimations while still providing mathematically provable levels of privacy.
Claims
exact text as granted — not AI-modifiedWhat is claimed is:
1. A non-transitory machine-readable medium storing instructions which, when executed by one or more processors of a computing device, cause the computing device to perform operations comprising:
selecting a value of user data to transmit to a server, the value selected from a set of user data values collected on a client device;
encoding the selected value using a vector of a set of bit values, wherein the encoding updates a bit value of the vector at a bit position corresponding to the value of user data;
generating a privatized vector by selectively flipping a sign, in accordance with a predefined probability based on a privacy parameter, of one or more bit values of the vector on the client device; and
transmitting the privatized vector to the server, wherein the server performs a summation operation with the privatized vector to estimate a frequency of the value of user data amongst a set of different client devices.
2. The non-transitory machine-readable medium of claim 1 , wherein the privatized vector is locally differentially private.
3. The non-transitory machine-readable medium of claim 1 , wherein the set of user data values is an indexed set of data values.
4. The non-transitory machine-readable medium of claim 3 , wherein encoding the selected value using a vector of bit values includes initializing the vector of bit values and encoding updates to the bit value at the bit position corresponding to the value of user data includes flipping the sign of the value of user data.
5. The non-transitory machine-readable medium of claim 1 , wherein the summation operation performed by the server includes summing a privatized vector value from each client device in a set of multiple client devices to determine a frequency of a user data value from the set of multiple client devices.
6. The non-transitory machine-readable medium of claim 5 , wherein a sum of a particular value of user data maps to a Gaussian distribution and wherein each bit of the privatized vector is independent and identically distributed.
7. The non-transitory machine-readable medium of claim 1 , wherein the value of user data represents information related to one or more device features used by a user associated with the client device.
8. The non-transitory machine-readable medium of claim 1 , wherein the value of user data represents health data for a user that has been collected by a user device.
9. The non-transitory machine-readable medium of claim 1 , wherein the value of user data represents an activity type performed by a user.
10. A device, comprising:
a processor; and
a memory coupled to the processor, the memory storing instructions, which when executed by the processor, cause the processor to perform operations to:
select a value of user data to transmit to a server, the value selected from a set of user data values collected on a client device;
encode the selected value using a vector of a set of bit values, wherein the encoding updates a bit value of the vector at a bit position corresponding to the value of user data;
generate a privatized vector by selectively flipping a sign, in accordance with a predefined probability based on a privacy parameter, of one or more bit values of the vector on the client device; and
transmit the privatized vector to the server, wherein the server performs a summation operation with the privatized vector to estimate a frequency of the value of user data amongst a set of different client devices.
11. The device as in claim 10 , wherein the privatized vector is locally differentially private.
12. The device of claim 10 , wherein to update the value in the vector includes to flip the sign of the value of the user data encoded into the vector.
13. The device of claim 10 , wherein the summation operation performed by the server is to determine a frequency of a user data value from the set of different client devices and a sum of a particular value of user data maps to a Gaussian distribution.
14. The device of claim 10 , wherein the value of user data represents information related to one or more device features used by a user associated with the client device.
15. The device of claim 10 , wherein the value of user data represents health data for a user that has been collected by a user device.
16. The device of claim 10 , wherein the value of user data represents an activity type performed by a user.
17. A method, the method comprising:
selecting a value of user data to transmit to a server, the value selected from a set of user data values collected on a client device;
encoding the selected value using a vector of a set of bit values, wherein the encoding updates a bit value of the vector at a bit position corresponding to the value of user data;
generating a privatized vector by selectively flipping a sign, in accordance with a predefined probability based on a privacy parameter, of one or more bit values of the vector on the client device; and
transmitting the privatized vector to the server, wherein the server performs a summation
operation with the privatized vector to estimate a frequency of the value of user data amongst a set of different client devices.
18. The method of claim 17 , wherein the set of user data values is an indexed set of data values.
19. The method of claim 18 , wherein encoding the selected value using a vector of bit values includes initializing the vector of bit values and encoding updates to the bit value at the bit position corresponding to the value of user data includes flipping the sign of the value of user data.
20. The method of claim 17 , wherein the summation operation performed by the server includes summing a privatized vector value from each client device in the set of different client devices to determine a frequency of a user data value from the set of different client devices.Cited by (0)
No later patents cite this yet.
References (0)
No backward citations on record.