US12411911B1 · Active · Utility · PatentIndex 37

Entity segmentation by event rate optimization

Assignee: INTUIT INC
Priority: May 30, 2025
Filed: May 30, 2025
Granted: Sep 9, 2025
Est. expiry: May 30, 2045 (~18.9 yrs left) · nominal 20-yr term from priority
Inventors: PERKINS RAYMOND; YEH STEVEN; ROY ATANU; MCINTYRE BRENDAN; SAH SHUBHAM; SHI CHENYU
Classifications: G06N 20/10; G06N 5/01; G06N 3/08; G06N 20/20; G06N 7/01; G06N 20/00; G06F 18/241; G06F 9/5027
PatentIndex Score: 37 · Cited by: 0 · References: 5 · Claims: 20

Abstract

Aspects of the present disclosure relate to machine learning-based techniques for segmenting entity populations. Embodiments include approximating distributions for outputs generated by a classification machine learning model and ground truth occurrences of a targeted event. A first distribution may relate to the distribution of classification model scores that indicate the likelihood of the targeted event occurring with respect to entities. A second distribution may relate to the distribution of actual occurrences of the targeted event. Based on the distributions, thresholds may be generated by minimizing the values of the thresholds as a function of the distributions and a targeted rate of occurrence for the targeted entity with respect to different segments that are included within the thresholds. Entities may then be segmented based on the thresholds, and interventions (such as resource allocations) may be applied based on the segmentation.
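One possible reading of the optimization described in the abstract can be written out as follows. This is an interpretive sketch, not a formula taken from the patent: the symbols are all assumptions introduced here. S is the classification score in [0, 1], f_S is the approximated score density (the first distribution), \pi(s) is the approximated probability that the targeted event occurs at score s (derived from the second distribution), and \rho is the targeted rate of occurrence for the segment below the threshold.

```latex
F_S(t) = \int_0^{t} f_S(s)\,ds,
\qquad
r_{\mathrm{below}}(t) = \frac{\int_0^{t} \pi(s)\, f_S(s)\,ds}{F_S(t)},
\qquad
t^{*} = \arg\min_{t}\ \bigl|\, r_{\mathrm{below}}(t) - \rho \,\bigr|
```

Under this reading, r_below(t) is the expected event rate among entities scored below t, and the threshold t* is the cut point at which that rate is closest to the target.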

Claims

exact text as granted — not AI-modified
What is claimed is: 
     
       1. A machine learning-based method, comprising:
 generating, for a set of training entities using a classification machine learning model, outputs that indicate likelihoods of a particular event occurring with respect to the set of training entities, wherein each respective training entity of the set of training entities is associated with a respective label indicating whether the particular event occurred with respect to the respective training entity; 
 generating, using a given machine learning model, one or more distribution thresholds based on:
 approximating a first distribution for the outputs generated by the classification machine learning model; 
 approximating a second distribution for occurrences of the particular event with respect to the set of training entities; and 
 generating a given distribution threshold based on minimizing a value for the given distribution threshold, wherein the value for the given distribution threshold is generated as a function of the first distribution, the second distribution, and a targeted rate of occurrence for the particular event with respect to entities having corresponding likelihoods of the particular event occurring that are below the given distribution threshold; 
 
 generating, for a received entity using the classification machine learning model, a given output that indicates a likelihood of the particular event occurring with respect to the received entity; and 
 performing a given intervention with respect to the received entity based on the likelihood for the received entity exceeding the given distribution threshold. 
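
The steps of claim 1 can be illustrated with a minimal Python sketch. It rests on several assumptions that are not in the claim: scores lie in [0, 1], both distributions are approximated by fixed-width histograms, "minimizing a value" is read as minimizing the gap between the below-threshold event rate and the targeted rate, and a plain scan over bin edges stands in for the "given machine learning model". The function names are hypothetical.

```python
def approximate_distributions(scores, labels, bins=10):
    """Histogram approximations of the two distributions.

    score_hist: number of entities per score bin (first distribution).
    event_hist: number of observed events per score bin (second distribution).
    Assumes scores are in [0, 1] and labels are 0 or 1.
    """
    score_hist = [0] * bins
    event_hist = [0] * bins
    for s, y in zip(scores, labels):
        b = min(int(s * bins), bins - 1)
        score_hist[b] += 1
        event_hist[b] += y
    return score_hist, event_hist


def lower_threshold(score_hist, event_hist, target_rate):
    """Return the bin edge whose below-edge event rate is closest to target_rate.

    Ties are resolved in favor of the smallest edge (an assumption).
    """
    bins = len(score_hist)
    best_edge, best_gap = None, float("inf")
    n_below = e_below = 0
    for b in range(bins):
        n_below += score_hist[b]
        e_below += event_hist[b]
        if n_below == 0:
            continue  # no entities below this edge yet
        gap = abs(e_below / n_below - target_rate)
        if gap < best_gap:
            best_edge, best_gap = (b + 1) / bins, gap
    return best_edge


def intervene(score, threshold):
    """Apply the given intervention only when the score exceeds the threshold."""
    return "given_intervention" if score > threshold else "no_action"


# Toy training data: ten entities, one per score bin.
train_scores = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
train_labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

score_hist, event_hist = approximate_distributions(train_scores, train_labels)
threshold = lower_threshold(score_hist, event_hist, target_rate=0.2)
```

With this toy data, a received entity scored 0.8 would receive the intervention and one scored 0.3 would not.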
 
     
     
       2. The method of claim 1, further comprising performing a second intervention with respect to entities having respective likelihoods of the particular event occurring that are below the given distribution threshold.
     
     
       3. The method of claim 2, wherein the given intervention comprises allocating more processor resources than are allocated in the second intervention.
     
     
       4. The method of claim 1, further comprising generating a particular distribution threshold that is higher than the given distribution threshold, wherein the particular distribution threshold is generated based on minimizing a value for the particular distribution threshold, wherein the value for the particular distribution threshold is generated as a function of the first distribution, the second distribution, and a targeted rate of occurrence for the particular event with respect to entities having respective likelihoods of the particular event occurring that are above the particular distribution threshold.
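
The higher threshold of claim 4 can be sketched in the same spirit. The sketch assumes the candidate thresholds are the observed scores above the lower threshold, and that "minimizing a value" means minimizing the gap between the above-threshold event rate and its target. The function name and the toy data are hypothetical.

```python
def upper_threshold(scores, labels, target_rate_above, lower):
    """Pick a threshold above `lower` whose above-threshold event rate
    is closest to target_rate_above (smallest candidate wins ties)."""
    pairs = list(zip(scores, labels))
    best_t, best_gap = None, float("inf")
    for t in sorted({s for s in scores if s > lower}):
        above = [y for s, y in pairs if s > t]
        if not above:
            continue  # no entities above this candidate
        gap = abs(sum(above) / len(above) - target_rate_above)
        if gap < best_gap:
            best_t, best_gap = t, gap
    return best_t


scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
upper = upper_threshold(scores, labels, target_rate_above=1.0, lower=0.3)
```

Restricting candidates to scores above `lower` is one simple way to keep the particular threshold higher than the given one.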
     
     
       5. The method of claim 4, wherein a third intervention is performed with respect to entities having associated likelihoods of the particular event occurring that are above the particular distribution threshold.
     
     
       6. The method of claim 1, wherein generating the given distribution threshold is further based on the rate of occurrence of the particular event with respect to the entities having the corresponding likelihoods of the particular event occurring that are below the given distribution threshold being less than a recall rate of the classification machine learning model.
     
     
       7. The method of claim 1, wherein generating the given distribution threshold is further based on a number of the entities having the corresponding likelihoods of the particular event occurring that are below the given distribution threshold being less than a number of entities having respective likelihoods of the particular event occurring that are above the given distribution threshold.
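
The two conditions in claims 6 and 7 can be read as feasibility checks on a candidate threshold. The sketch below assumes the recall rate of the classification model is computed elsewhere and passed in, and it treats scores equal to the threshold as "above", a boundary case the claims do not address. The function name is hypothetical.

```python
def satisfies_constraints(scores, labels, threshold, model_recall):
    """Check a candidate threshold against the claim 6 and claim 7 conditions.

    Claim 6 condition: event rate below the threshold < model recall.
    Claim 7 condition: number of entities below < number of entities above.
    """
    below = [y for s, y in zip(scores, labels) if s < threshold]
    n_below = len(below)
    n_above = len(scores) - n_below  # scores equal to the threshold count as above
    rate_below = sum(below) / n_below if n_below else 0.0
    return rate_below < model_recall and n_below < n_above


scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
ok_low = satisfies_constraints(scores, labels, threshold=0.35, model_recall=0.8)
ok_high = satisfies_constraints(scores, labels, threshold=0.75, model_recall=0.8)
```

In this toy data the threshold 0.35 passes both checks, while 0.75 fails because more entities fall below it than above it.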
     
     
       8. A machine learning-based method, comprising:
 generating, for each respective entity of a set of entities using a classification machine learning model, a respective output that indicates a likelihood of a particular event occurring with respect to the respective entity; 
 generating, using a given machine learning model, one or more distribution thresholds based on:
 approximating a first distribution for the outputs generated by the classification machine learning model; 
 approximating a second distribution for occurrences of the particular event based on occurrences of the particular event with respect to a training subset of the set of entities, wherein each respective training entity of the subset of training entities is associated with a respective label indicating whether the particular event occurred with respect to the respective training entity, wherein the second distribution is approximated based on outputs generated by the classification machine learning model with respect to each entity of the training set of entities; and 
 generating a given distribution threshold based on minimizing a value for the given distribution threshold, wherein the value for the given distribution threshold is generated as a function of the first distribution, the second distribution, and a targeted rate of occurrence for the particular event with respect to entities having corresponding likelihoods of the particular event occurring that are below the given distribution threshold; 
 
 performing a first intervention protocol with respect to entities having generated likelihoods of the particular event occurring that are below the given distribution threshold; and 
 performing a second intervention protocol with respect to entities having generated likelihoods of the particular event occurring that are above the given distribution threshold. 
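
Claim 8 differs from claim 1 in that the first distribution covers every scored entity, while the second is approximated from a labeled training subset. A hedged sketch of one way this could work: learn a per-bin event rate from the labeled subset, then apply it to the scores of all entities to estimate the event rate below a candidate threshold. The binning scheme, the function name, and the toy data are assumptions, not details from the patent.

```python
def estimate_event_rate_below(all_scores, train_scores, train_labels,
                              threshold, bins=5):
    """Estimate the event rate among all entities scored below `threshold`.

    Per-bin event rates come from the labeled training subset; the score
    distribution comes from all entities. Scores are assumed to be in [0, 1].
    """
    def bin_of(s):
        return min(int(s * bins), bins - 1)

    n = [0] * bins
    e = [0] * bins
    for s, y in zip(train_scores, train_labels):
        n[bin_of(s)] += 1
        e[bin_of(s)] += y
    bin_rate = [e[b] / n[b] if n[b] else 0.0 for b in range(bins)]

    below = [s for s in all_scores if s < threshold]
    if not below:
        return 0.0
    expected_events = sum(bin_rate[bin_of(s)] for s in below)
    return expected_events / len(below)


train_scores = [0.1, 0.1, 0.3, 0.3, 0.5, 0.5, 0.7, 0.7, 0.9, 0.9]
train_labels = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
all_scores = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
rate = estimate_event_rate_below(all_scores, train_scores, train_labels,
                                 threshold=0.4)
```

The estimate can then be compared against the targeted rate when searching for the given distribution threshold.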
 
     
     
       9. The method of claim 8, further comprising generating a particular distribution threshold that is higher than the given distribution threshold, wherein the particular distribution threshold is generated based on minimizing a value for the particular distribution threshold, wherein the value for the particular distribution threshold is generated as a function of the first distribution, the second distribution, and a targeted rate of occurrence for the particular event with respect to entities having likelihoods that are above the particular distribution threshold.
     
     
       10. The method of claim 9, further comprising performing a third intervention protocol with respect to entities having likelihoods above the particular distribution threshold.
     
     
       11. The method of claim 9, wherein the second intervention protocol is performed with respect to entities having likelihoods below the particular distribution threshold and above the given distribution threshold.
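
Taken together, claims 8 through 11 describe a three-way segmentation by two thresholds. A minimal sketch follows, with hypothetical protocol labels and an assumed treatment of scores that exactly equal a threshold (the claims speak only of "below" and "above").

```python
def assign_protocol(score, given_threshold, particular_threshold):
    """Map a likelihood score to one of three intervention protocols.

    Assumes given_threshold < particular_threshold. Scores equal to a
    threshold are assigned to the higher segment (an assumption).
    """
    if score < given_threshold:
        return "first_protocol"
    if score < particular_threshold:
        return "second_protocol"
    return "third_protocol"


segments = [assign_protocol(s, 0.5, 0.7) for s in (0.2, 0.6, 0.9)]
```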
     
     
       12. The method of claim 8, wherein generating the given distribution threshold is further based on the rate of occurrence of the particular event with respect to entities having likelihoods below the given distribution threshold being less than a recall rate of the classification machine learning model.
     
     
       13. The method of claim 8, wherein generating the given distribution threshold is further based on a number of entities having likelihoods below the given distribution threshold being less than a number of entities having likelihoods above the given distribution threshold.
     
     
       14. The method of claim 8, wherein the second intervention protocol comprises allocating more processor resources than are allocated in the first intervention protocol.
     
     
       15. A system, comprising:
 one or more processors; and 
 a memory comprising instructions that, when executed by the one or more processors, cause the system to: 
 generate, for a set of training entities using a classification machine learning model, outputs that indicate likelihoods of a particular event occurring with respect to the set of training entities, wherein each respective training entity of the set of training entities is associated with a respective label indicating whether the particular event occurred with respect to the respective training entity; 
 generate, using a given machine learning model, one or more distribution thresholds based on:
 approximating a first distribution for the outputs generated by the classification machine learning model; 
 approximating a second distribution for occurrences of the particular event with respect to the set of training entities; and 
 generating a given distribution threshold based on minimizing a value for the given distribution threshold, wherein the value for the given distribution threshold is generated as a function of the first distribution, the second distribution, and a targeted rate of occurrence for the particular event with respect to entities having corresponding likelihoods of the particular event occurring that are below the given distribution threshold; 
 
 generate, for a received entity using the classification machine learning model, a given output that indicates a likelihood of the particular event occurring with respect to the received entity; and 
 perform a given intervention with respect to the received entity based on the likelihood for the received entity exceeding the given distribution threshold. 
 
     
     
       16. The system of claim 15, wherein the instructions further cause the system to perform a second intervention with respect to entities having respective likelihoods of the particular event occurring that are below the given distribution threshold.
     
     
       17. The system of claim 16, wherein the given intervention comprises allocating more processor resources than are allocated in the second intervention.
     
     
       18. The system of claim 15, wherein the instructions further cause the system to generate a particular distribution threshold that is higher than the given distribution threshold, wherein the particular distribution threshold is generated based on minimizing a value for the particular distribution threshold, wherein the value for the particular distribution threshold is generated as a function of the first distribution, the second distribution, and a targeted rate of occurrence for the particular event with respect to entities having respective likelihoods of the particular event occurring that are above the particular distribution threshold.
     
     
       19. The system of claim 18, wherein a third intervention is performed with respect to entities having associated likelihoods of the particular event occurring that are above the particular distribution threshold.
     
     
       20. The system of claim 15, wherein generating the given distribution threshold is further based on the rate of occurrence of the particular event with respect to the entities having the corresponding likelihoods of the particular event occurring that are below the given distribution threshold being less than a recall rate of the classification machine learning model.

Cited by (0)

No later patents cite this yet.

References (0)

No backward citations on record.