US11537844B2 · Active · Utility · PatentIndex 48

Systems and methods of business categorization and service recommendation

Assignee: INTUIT INC · Priority: Feb 3, 2020 · Filed: Feb 3, 2020 · Granted: Dec 27, 2022

Est. expiry: Feb 3, 2040 (~13.6 yrs left) · nominal 20-yr term from priority

Inventors: KATZENELSON EREZ; SROR ELIK; MEDALION SHLOMI; SHAHAR SHIMON; LADOR SHIR MEIR; BECHLER SIGALIT; ZHICHAREVICH ALEXANDER; BAR ONN

CPC: G06Q 30/04, G06N 3/045, G06Q 40/12, G06F 16/9535, G06N 3/048, G06N 20/00, G06N 3/044, G06N 3/042, G06N 3/084, G06N 3/0445, G06N 3/0427, G06N 3/0454, G06N 3/08, G06N 3/0442, G06N 3/09, G06N 3/0455, G06N 3/0464

Abstract

A method for recommending offerings to a business may include: receiving a request for recommended business offerings from a device; receiving business data associated with a business from the device, the business data comprising invoice data associated with the business; embedding the business data to a vector space to obtain a business vector, the vector space comprising a plurality of other vectors associated with other businesses; calculating a relation metric between the business vector and a vector of the plurality of other vectors, the vector being associated with a second business, the relation metric representing a degree of relation between the business and the second business; determining that the relation metric is above a pre-defined threshold value; and responsive to the determining, sending business data associated with the second business to the device, the business data associated with the second business comprising invoice data associated with the second business.

Claims

exact text as granted — not AI-modified

The invention claimed is: 
     
       1. A method for recommending offerings to a business performed by a server, said method comprising:
 receiving a request for recommended business offerings from a device; 
 receiving business data associated with a business from the device, the business data comprising invoice data associated with the business; 
 embedding the business data to a vector space to obtain a business vector, the vector space comprising a plurality of other vectors associated with other businesses; 
 calculating a relation metric between the business vector and a vector of the plurality of other vectors, the vector being associated with a second business, the relation metric representing a degree of relation between the business and the second business; 
 determining that the relation metric is above a pre-defined threshold value; and 
 responsive to the determining, sending business data associated with the second business to the device, the business data associated with the second business comprising invoice data associated with the second business. 
 
     
     
       2. The method of claim 1 further comprising:
 detecting stop words in the business data associated with the business from a pre-defined list of stop words; 
 removing the detected stop words from the business data associated with the business; 
 lemmatizing the business data associated with the business; and 
 embedding the lemmatized business data to the vector space to obtain the business vector. 
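
The preprocessing recited in claim 2 can be illustrated with a minimal sketch. It is not taken from the patent: the stop-word list, the lemma table, and the function name `preprocess` are hypothetical stand-ins, and a real system would likely use a full lemmatizer rather than a lookup table.

```python
import re

# Hypothetical stop-word list and lemma table, for illustration only.
STOP_WORDS = {"a", "an", "and", "for", "of", "the", "to"}
LEMMAS = {
    "repairs": "repair",
    "installing": "install",
    "pipes": "pipe",
    "heaters": "heater",
}


def preprocess(text: str) -> list[str]:
    """Lowercase and tokenize, drop stop words, then map tokens to lemmas."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    kept = [t for t in tokens if t not in STOP_WORDS]
    # Tokens missing from the toy lemma table are passed through unchanged.
    return [LEMMAS.get(t, t) for t in kept]


print(preprocess("Repairs of pipes and installing the water heaters"))
# → ['repair', 'pipe', 'install', 'water', 'heater']
```

The resulting token list is what would be handed to the embedding step.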
 
     
     
       3. The method of claim 2 further comprising:
 calculating a term frequency-inverse document frequency (TFIDF) value for each word of the lemmatized business data; 
 ranking the calculated TFIDF values; 
 identifying a top pre-defined number of words with a highest rank; and 
 embedding the identified words to obtain the business vector. 
 
     
     
       4. The method of claim 2 further comprising:
 calculating a term frequency-inverse document frequency (TFIDF) value for each word of the lemmatized business data; 
 identifying a pre-defined number of words with TFIDF values above a certain threshold; and 
 embedding the identified words to obtain the business vector. 
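
Claims 3 and 4 describe two ways of selecting words by TFIDF value: by rank and by threshold. The sketch below shows both under stated assumptions: the claims do not fix a TFIDF formula, so the common `tf * log(N / df)` form is assumed, the document being scored is assumed to be part of the corpus, and the names `tfidf`, `top_k_words`, and `words_above` are hypothetical.

```python
import math
from collections import Counter


def tfidf(doc: list[str], corpus: list[list[str]]) -> dict[str, float]:
    """TFIDF per word of `doc`; assumes `doc` is one of the corpus documents."""
    counts = Counter(doc)
    scores = {}
    for word, count in counts.items():
        df = sum(1 for d in corpus if word in d)  # document frequency
        scores[word] = (count / len(doc)) * math.log(len(corpus) / df)
    return scores


def top_k_words(scores: dict[str, float], k: int) -> list[str]:
    """Claim 3 style: rank by TFIDF and keep the k highest-ranked words."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def words_above(scores: dict[str, float], threshold: float, limit: int) -> list[str]:
    """Claim 4 style: keep up to `limit` words whose TFIDF exceeds a threshold."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [w for w in ranked if scores[w] > threshold][:limit]


corpus = [
    ["plumbing", "repair", "pipe", "repair"],
    ["repair", "roof", "tile"],
    ["pipe", "fitting", "invoice"],
]
scores = tfidf(corpus[0], corpus)
print(top_k_words(scores, 2))        # → ['plumbing', 'repair']
print(words_above(scores, 0.15, 5))  # → ['plumbing', 'repair']
```

In this toy corpus "plumbing" outranks "repair" even though "repair" occurs twice, because "repair" also appears in a second document and so carries less weight.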
 
     
     
       5. The method of claim 1, wherein embedding the business data associated with the business comprises:
 applying a word2vec model, the applying comprising:
 creating a bag-of-words representing the business data, the bag-of-words including each word in the business data and an associated multiplicity of each word; and 
 converting each word in the business data into a vector based on the bag of words and not based on grammar and word order. 
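
As a rough illustration of the bag-of-words step in claim 5, the sketch below counts each word with its multiplicity and shows that word order does not affect the result. The two-dimensional word vectors are hypothetical placeholders for a trained word2vec model, and the multiplicity-weighted average used to pool them into one vector is an assumption, not something the claim specifies.

```python
from collections import Counter

# Hypothetical 2-dimensional word vectors; a trained model would supply these.
WORD_VECTORS = {"pipe": [1.0, 0.0], "repair": [0.0, 1.0]}


def bag_of_words(tokens: list[str]) -> Counter:
    """Each distinct word with its multiplicity; word order is discarded."""
    return Counter(tokens)


def embed_bag(bag: Counter) -> list[float]:
    """Average of known word vectors weighted by multiplicity (assumed pooling)."""
    total = sum(n for w, n in bag.items() if w in WORD_VECTORS)
    vec = [0.0, 0.0]
    for word, n in bag.items():
        if word in WORD_VECTORS:
            for i, x in enumerate(WORD_VECTORS[word]):
                vec[i] += n * x / total
    return vec


bag = bag_of_words(["repair", "pipe", "repair"])
print(bag["repair"], bag["pipe"])  # → 2 1
print(embed_bag(bag))
```

Because only the counts are kept, `["pipe", "repair", "repair"]` produces the same bag and the same vector.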
 
 
     
     
       6. The method of claim 1, wherein calculating the relation metric comprises calculating an inner product between the business vector and the vector of the plurality of other vectors. 
     
     
       7. The method of claim 6 further comprising using a sigmoid function on the inner product to keep the relation metric between zero and one, wherein a one corresponds to a related pair of vectors and a zero corresponds to an unrelated pair of vectors. 
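
The relation metric of claims 6 and 7, combined with the threshold test of claim 1, can be sketched as follows. The vectors, the business names, the threshold of 0.8, and the function names are all illustrative assumptions; the claims only require an inner product, a sigmoid, and a pre-defined threshold.

```python
import math


def relation_metric(u: list[float], v: list[float]) -> float:
    """Sigmoid of the inner product, so the metric lies between zero and one."""
    dot = sum(a * b for a, b in zip(u, v))
    # Numerically stable sigmoid: never exponentiates a large positive number.
    if dot >= 0:
        return 1.0 / (1.0 + math.exp(-dot))
    e = math.exp(dot)
    return e / (1.0 + e)


def related_businesses(business_vec, others, threshold=0.8):
    """Names of businesses whose relation metric exceeds the threshold."""
    return [
        name
        for name, vec in others.items()
        if relation_metric(business_vec, vec) > threshold
    ]


# Hypothetical embedded businesses.
others = {"bakery": [1.0, 1.5], "law_firm": [-2.0, 0.5]}
print(related_businesses([1.0, 2.0], others))  # → ['bakery']
```

Here the inner products are 4.0 and -1.0, giving metrics of roughly 0.98 and 0.27, so only the first business passes the threshold.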
     
     
       8. The method of claim 7, wherein the server comprises a neural network trained on related and non-related pairs of invoices and business descriptions to:
 embed related business descriptions to similar regions in the vector space; and 
 embed unrelated business descriptions to different regions in the vector space. 
 
     
     
       9. The method of claim 8, wherein the neural network is a first neural network, wherein the server comprises a second neural network trained on related and non-related pairs of invoices and business descriptions to:
 embed related invoices to similar regions in the vector space; and 
 embed unrelated invoices to different regions in the vector space. 
 
     
     
       10. A method of training a network to identify related business-invoice pairs performed by a server comprising a first encoder and a second encoder, said method comprising:
 embedding, by the first encoder, a plurality of business descriptions to a vector space to obtain a plurality of business vectors, each business description corresponding to an invoice; 
 embedding, by the second encoder, each corresponding invoice to the vector space to obtain a plurality of invoice vectors; 
 calculating a relation metric for each description-invoice pair; 
 training a neural network with the description-invoice pairs and the corresponding relation metrics to predict whether a new invoice and a new business description are related; 
 training the first encoder to embed related business descriptions to similar regions in the vector space; and 
 training the second encoder to embed related invoices to similar regions in the vector space. 
 
     
     
       11. The method of claim 10 further comprising training the neural network, the first encoder, and the second encoder jointly in an end-to-end process, the process including:
 receiving a plurality of description-invoice pairs and a plurality of relation metrics as labelled training data, each description-invoice pair including an associated relation metric and a label indicating whether the description-invoice pair is related; and 
 applying a back-propagation algorithm to train each of the neural network, first encoder, and second encoder based on the received description-invoice pairs and associated relation metrics. 
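
The joint, end-to-end training of claims 10 and 11 can be illustrated with a deliberately small sketch. It makes several assumptions not found in the claims: each encoder is reduced to a single linear layer, the separate neural network is omitted so the sigmoid of the inner product is the prediction, the loss is binary cross-entropy, and the inputs are toy one-hot features. All names and values are hypothetical.

```python
import math


def matvec(W, x):
    """Apply a linear encoder: multiply matrix W by feature vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def predict(W_d, W_i, x_d, x_i):
    """Relation metric: sigmoid of the inner product of the two embeddings."""
    z = sum(a * b for a, b in zip(matvec(W_d, x_d), matvec(W_i, x_i)))
    return 1.0 / (1.0 + math.exp(-z))


def train(pairs, W_d, W_i, lr=0.5, epochs=200):
    """Jointly update both encoders from labelled description-invoice pairs."""
    for _ in range(epochs):
        for x_d, x_i, label in pairs:
            e_d, e_i = matvec(W_d, x_d), matvec(W_i, x_i)
            g = predict(W_d, W_i, x_d, x_i) - label  # dLoss/d(inner product)
            # Back-propagate through the inner product into each encoder,
            # using the embeddings computed before this update.
            for r in range(len(W_d)):
                for c in range(len(x_d)):
                    W_d[r][c] -= lr * g * e_i[r] * x_d[c]
                for c in range(len(x_i)):
                    W_i[r][c] -= lr * g * e_d[r] * x_i[c]
    return W_d, W_i


# Toy labelled data: (description features, invoice features, related?).
pairs = [
    ([1, 0], [1, 0], 1), ([0, 1], [0, 1], 1),
    ([1, 0], [0, 1], 0), ([0, 1], [1, 0], 0),
]
W_d = [[0.1, -0.1], [0.05, 0.1]]  # first encoder (descriptions)
W_i = [[0.1, 0.05], [-0.1, 0.1]]  # second encoder (invoices)
train(pairs, W_d, W_i)
print(predict(W_d, W_i, [1, 0], [1, 0]), predict(W_d, W_i, [1, 0], [0, 1]))
```

After training, the related pair should score near one and the unrelated pair near zero, which corresponds to related items being embedded in similar regions of the vector space.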
 
     
     
       12. The method of claim 10 further comprising:
 detecting stop words in the plurality of business descriptions from a pre-defined list of stop words; 
 removing the detected stop words from the plurality of business descriptions; 
 lemmatizing the plurality of business descriptions to obtain lemmatized business descriptions; and 
 embedding, by the first encoder, the lemmatized business descriptions to the vector space to obtain the plurality of business vectors. 
 
     
     
       13. The method of claim 12 further comprising:
 calculating a term frequency-inverse document frequency (TFIDF) value for each word of each lemmatized business description; 
 ranking the calculated TFIDF values; 
 identifying a top pre-defined number of words with a highest rank; and 
 embedding, by the first encoder, the identified words to obtain the business vector. 
 
     
     
       14. The method of claim 12 further comprising:
 calculating a term frequency-inverse document frequency (TFIDF) value for each word of each lemmatized business description; 
 identifying a pre-defined number of words with TFIDF values above a certain threshold; and 
 embedding, by the first encoder, the identified words to obtain the business vector. 
 
     
     
       15. The method of claim 10, wherein embedding, by the second encoder, each corresponding invoice to the vector space comprises:
 embedding each word of each line item of the invoice to create a plurality of word vectors for each line item; 
 feeding the plurality of word vectors for each line item to a long short-term memory (LSTM) layer; 
 combining, via the LSTM layer, the plurality of word vectors to obtain a line item vector representing each invoice line item; and 
 combining, using a neural network, the plurality of line item vectors to obtain an invoice vector; 
 wherein combining includes at least one of vector addition, vector subtraction, scalar multiplication, sigmoid function multiplication, or hyperbolic function multiplication. 
 
     
     
       16. The method of claim 10, wherein calculating the relation metric comprises calculating an inner product between a business vector and an associated invoice vector. 
     
     
       17. The method of claim 16 further comprising using a sigmoid function on the inner product to keep the relation metric between zero and one, wherein a one corresponds to a related pair of vectors and a zero corresponds to an unrelated pair of vectors. 
     
     
       18. A system for recommending offerings to a business comprising:
 a user device; 
 one or more processors; 
 a server; and 
 a non-transitory computer-readable medium for recommending offerings to a business comprising instructions stored on the server that, when executed by the one or more processors, cause the server to perform a process operable to:
 receive a request for recommended business offerings from the user device; 
 receive business data associated with a business from the user device, the business data comprising invoice data associated with the business; 
 embed the business data to a vector space to obtain a business vector, the vector space comprising a plurality of other vectors associated with other businesses; 
 calculate a relation metric between the business vector and a vector of the plurality of other vectors, the vector being associated with a second business, the relation metric representing a relation between the business and the second business; 
 analyze the relation metric with a neural network trained to determine whether the business vector and the vector of the plurality of other vectors are related; and 
 responsive to the determining, send business data associated with the second business to the user device, the business data associated with the second business comprising invoice data associated with the second business. 
 
 
     
     
       19. The system of claim 18, wherein the server is configured to apply a word2vec model, the applying comprising:
 creating a bag-of-words representing the business data associated with the business, the bag-of-words including each word in the business data associated with the business and an associated multiplicity of each word; and 
 converting each word in the business data associated with the business into a vector based on the bag of words and not based on grammar and word order. 
 
     
     
       20. The system of claim 18, wherein sending business data associated with the second business to the user device comprises:
 identifying an invoice associated with the second business; 
 anonymizing text of the invoice to create anonymized text; 
 removing stop words from the anonymized text; 
 lemmatizing the anonymized text to create lemmatized text; and 
 sending the lemmatized and anonymized text to the user device.
