Bootstrapping topic detection in conversations
Abstract
A computer system and method identify topics in conversations, such as a conversation between a doctor and patient during a medical examination. The system and method generate, based on first text (such as a document corpus including previous clinical documentation), a plurality of sentence embeddings representing a plurality of semantic representations of a plurality of sentences in the first text. The system and method generate a classifier based on second text, which includes a plurality of sections associated with a plurality of topics, and the plurality of sentence embeddings. The system and method generate, based on a sentence (such as a sentence in a doctor-patient conversation) and the classifier, an identifier of a topic to associate with the sentence. The system and method may also insert the sentence into a section, associated with the identified topic, in a document (such as a clinical note).
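The flow summarized above may be illustrated by the following minimal sketch, which is not the patented implementation. It assumes a deliberately simple stand-in for each component: IDF-weighted bag-of-words vectors take the place of learned sentence embeddings, and a nearest-centroid rule takes the place of the classifier. All names (`fit_embedder`, `fit_classifier`, `first_text`, `second_text`, `note`) and the sample sentences are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def fit_embedder(first_text):
    """Stand-in for sentence embedding generation: learn IDF weights from
    unlabeled sentences and return a function mapping a sentence to a
    unit-length sparse vector (ASSUMPTION: not a learned neural embedding)."""
    n = len(first_text)
    doc_freq = Counter()
    for sentence in first_text:
        doc_freq.update(set(tokenize(sentence)))
    idf = {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in doc_freq.items()}

    def embed(sentence):
        counts = Counter(w for w in tokenize(sentence) if w in idf)
        vec = {w: c * idf[w] for w, c in counts.items()}
        norm = math.sqrt(sum(v * v for v in vec.values()))
        return {w: v / norm for w, v in vec.items()} if norm else {}

    return embed


def fit_classifier(second_text, embed):
    """Stand-in for classifier generation: build one centroid per topic from
    sectioned documents (ASSUMPTION: nearest-centroid, chosen for brevity)."""
    sums = defaultdict(lambda: defaultdict(float))
    for document in second_text:
        for topic, sentences in document.items():
            for sentence in sentences:
                for w, v in embed(sentence).items():
                    sums[topic][w] += v
    centroids = {}
    for topic, vec in sums.items():
        norm = math.sqrt(sum(v * v for v in vec.values()))
        centroids[topic] = {w: v / norm for w, v in vec.items()}

    def classify(sentence):
        e = embed(sentence)
        # Cosine similarity between the sentence and each topic centroid.
        return max(
            centroids,
            key=lambda t: sum(v * centroids[t].get(w, 0.0) for w, v in e.items()),
        )

    return classify


# Hypothetical second text: documents whose sections are keyed by topic.
second_text = [
    {"History": ["Patient reports chest pain for two days.",
                 "Pain worsens with exertion."],
     "Plan": ["Prescribed ibuprofen twice daily.",
              "Follow up in two weeks."]},
    {"History": ["Patient denies fever or cough."],
     "Plan": ["Started lisinopril for blood pressure."]},
]
# Hypothetical first text; here it simply includes the second text.
first_text = [s for doc in second_text for sents in doc.values() for s in sents]

embed = fit_embedder(first_text)
classify = fit_classifier(second_text, embed)

# Route each conversation sentence into the matching section of a note.
note = {"History": [], "Plan": []}
for sentence in ["My chest pain gets worse when I walk.",
                 "Let us follow up after the ibuprofen."]:
    note[classify(sentence)].append(sentence)
```

In this sketch the sectioned documents supply the topic labels without manual annotation of conversations, which is the sense in which topic detection is bootstrapped. A practical system would likely substitute a trained embedding model and a stronger classifier for the stand-ins used here.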
Claims
The invention claimed is:
1. A method, for identifying a first topic represented by a first sentence, performed by at least one computer processor executing computer program instructions stored on at least one non-transitory computer readable medium, the method comprising:
(A) generating, based on first text, a plurality of sentence embeddings representing a plurality of semantic representations of a plurality of sentences in the training text;
(B) generating, based on second text and the plurality of sentence embeddings, the second text comprising a plurality of sections associated with a plurality of topics, a classifier;
(C) generating, based on the first sentence and the classifier, a first identifier of the first topic to associate with the first sentence; and
(D) inserting the first sentence into a first section of a first document, the first section being associated with the first topic.
2. The method of claim 1, further comprising:
(E) generating, based on a second sentence and the classifier, a second identifier of a second topic to associate with the second sentence.
3. The method of claim 2, further comprising:
(F) inserting the second sentence into a second section of the first document, the second section being associated with the second topic.
4. The method of claim 1:
wherein the second text comprises a plurality of documents;
wherein the plurality of documents comprises a first document comprising a first section in the plurality of sections, wherein the first section is associated with a first one of the plurality of topics; and
wherein the plurality of documents comprises a second document comprising a second section in the plurality of sections, wherein the second section is associated with the first one of the plurality of topics.
5. The method of claim 4:
wherein the first document comprises a third section in the plurality of sections, wherein the third section is associated with a second one of the plurality of topics; and
wherein the second document comprises a fourth section in the plurality of sections, wherein the fourth section is associated with the second one of the plurality of topics.
6. The method of claim 1, further comprising:
(E) generating, based on the classifier and data representing an utterance, an identifier of a topic to associate with the utterance.
7. The method of claim 1, wherein the first text includes the second text.
8. A system comprising a non-transitory computer-readable medium having computer-readable instructions stored thereon, wherein the computer-readable instructions are executable by at least one computer processor to perform a method for identifying a first topic represented by a first sentence, the method comprising:
(A) generating, based on first text, a plurality of sentence embeddings representing a plurality of semantic representations of a plurality of sentences in the training text;
(B) generating, based on second text and the plurality of sentence embeddings, the second text comprising a plurality of sections associated with a plurality of topics, a classifier;
(C) generating, based on the first sentence and the classifier, a first identifier of the first topic to associate with the first sentence; and
(D) inserting the first sentence into a first section of a first document, the first section being associated with the first topic.
9. The system of claim 8, wherein the method further comprises:
(E) generating, based on a second sentence and the classifier, a second identifier of a second topic to associate with the second sentence.
10. The system of claim 9, wherein the method further comprises:
(F) inserting the second sentence into a second section of the first document, the second section being associated with the second topic.
11. The system of claim 8:
wherein the second text comprises a plurality of documents;
wherein the plurality of documents comprises a first document comprising a first section in the plurality of sections, wherein the first section is associated with a first one of the plurality of topics; and
wherein the plurality of documents comprises a second document comprising a second section in the plurality of sections, wherein the second section is associated with the first one of the plurality of topics.
12. The system of claim 11:
wherein the first document comprises a third section in the plurality of sections, wherein the third section is associated with a second one of the plurality of topics; and
wherein the second document comprises a fourth section in the plurality of sections, wherein the fourth section is associated with the second one of the plurality of topics.
13. The system of claim 8, wherein the method further comprises:
(E) generating, based on the classifier and data representing an utterance, an identifier of a topic to associate with the utterance.
14. The system of claim 8, wherein the first text includes the second text.