US9864746B2 · Active · Utility · PatentIndex 68

Association of entity records based on supplemental temporal information

Assignee: IBM
Priority: Jan 5, 2016
Filed: Jan 5, 2016
Granted: Jan 9, 2018
Est. expiry: Jan 5, 2036 (~9.5 yrs left; nominal 20-yr term from priority)
Inventors: GILDER JASON R; JAIN ANIL K; MILLER JACOB O; POHLMAN MATTHEW M
Classifications: G16H 10/60; G06F 16/93; G06Q 10/06; G06F 16/215; G06F 17/30011; G06F 19/322
PatentIndex Score: 68
Cited by: 3
References: 12
Claims: 19

Abstract

A system links data objects associated with a common entity and includes at least one processor. The system compares data objects within one or more source systems to identify candidate data objects associated with a corresponding common entity based on information within those data objects. The candidate data objects are analyzed based on supplemental temporal information within the one or more source systems pertaining to the candidate data objects to determine resulting data objects associated with the corresponding common entity. The resulting data objects are linked to form a set of data objects for the common entity. Embodiments of the present invention further include a method and computer program product for linking data objects associated with a common entity.
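The candidate-identification stage described in the abstract can be illustrated with a small sketch. It is not the patented implementation: the record fields, the two blocking keys, and the `min_matches` rule are all hypothetical choices made for illustration. It shows records being assigned to more than one block and compared only against records that share a block.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records; field names and values are illustrative only.
RECORDS = [
    {"id": "a1", "last": "Smith", "dob": "1980-02-14", "zip": "44106"},
    {"id": "b7", "last": "Smith", "dob": "1980-02-14", "zip": "44120"},
    {"id": "c3", "last": "Smyth", "dob": "1980-02-14", "zip": "44106"},
    {"id": "d9", "last": "Jones", "dob": "1975-07-01", "zip": "10001"},
]

def block_keys(rec):
    """Return every block a record belongs to (assumed criteria; one record, several blocks)."""
    return [("dob", rec["dob"]), ("last+zip", rec["last"].lower(), rec["zip"])]

def build_blocks(records):
    """Group records by each of their block keys."""
    blocks = defaultdict(list)
    for rec in records:
        for key in block_keys(rec):
            blocks[key].append(rec)
    return blocks

def candidate_pairs(records, fields=("last", "dob", "zip"), min_matches=2):
    """Compare records only within a shared block; keep pairs with enough equal fields."""
    pairs = set()
    for members in build_blocks(records).values():
        for r1, r2 in combinations(members, 2):
            matches = sum(r1[f] == r2[f] for f in fields)
            if matches >= min_matches:
                pairs.add(tuple(sorted((r1["id"], r2["id"]))))
    return sorted(pairs)

print(candidate_pairs(RECORDS))
```

Because comparisons are confined to blocks, record `d9` is never compared with the other three records in this example.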

Claims

exact text as granted — not AI-modified
What is claimed is: 
     
       1. A computer-implemented method of linking data objects associated with a common entity comprising:
 receiving, at a data center, healthcare data objects stored within one or more computing source systems operatively coupled to the data center, wherein the data center includes a cluster of computer systems to perform parallel processing; 
 arranging the healthcare data objects into blocks by comparing one or more fields of the healthcare data objects to corresponding criteria for the blocks, wherein at least one healthcare data object is assigned to a plurality of different blocks and each block corresponds to a different entity and contains healthcare data objects associated with that different entity; 
 comparing within the blocks, via parallel processing jobs of the cluster of computer systems, the fields of the healthcare data objects to each other to identify candidate healthcare data objects with one or more matching fields and associated with a corresponding common entity; 
 analyzing supplemental temporal information included within the candidate data objects by identifying one or more patterns of changes in physical characteristics in the supplemental temporal information over a period of time to confirm that the candidate data objects are associated with the common entity; 
 linking one or more pairs of the candidate data objects that are confirmed to be associated with the common entity; 
 representing the linked pairs of the candidate data objects graphically and determining disjoint subgraphs of the candidate data objects in the graphical representation to form groups of healthcare data objects respectively associated with a corresponding common entity; 
 comparing fields of the candidate healthcare data objects of the groups to each other and splitting one or more groups into a plurality of sub-groups based on matching fields between the candidate healthcare data objects, wherein the groups and sub-groups each correspond to a different entity; 
 transforming the groups and sub-groups into respective sets of data objects; and 
 processing a query for an entity to retrieve data from the corresponding set of data objects for that entity. 
 
     
     
       2. The computer-implemented method of claim 1, wherein the supplemental temporal information includes healthcare information.
     
     
       3. The computer-implemented method of claim 2, wherein the healthcare information includes clinical information pertaining to one or more from a group of diagnoses, medications, vitals, lab tests, genomics, and medical procedures.
     
     
       4. The computer-implemented method of claim 1, wherein the entity includes a patient.
     
     
       5. The computer-implemented method of claim 1, further comprising:
 removing one or more candidate data objects when the supplemental temporal information disassociates the one or more candidate data objects from the common entity. 
 
     
     
       6. The computer-implemented method of claim 1, further comprising:
 determining a first matching level for a first pattern of the one or more patterns; 
 determining a second matching level for a second pattern of the one or more patterns; 
 applying a first weight to the first matching level and a second weight to the second matching level based on at least a quality of the one or more computing source systems; and 
 aggregating the weighted first matching level and the weighted second matching level to generate a likelihood score. 
 
     
     
       7. The computer-implemented method of claim 6, wherein the likelihood score confirms that the candidate data objects are associated with the common entity when the likelihood score exceeds a predetermined threshold.
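Illustrative note, not part of the claim text: the weighting and thresholding recited in claims 6 and 7 might be sketched as below. The claims say only that the weighted matching levels are "aggregated"; the weighted average used here, the example weights, and the `threshold` value are assumptions, and the function names are hypothetical.

```python
def likelihood_score(matching_levels, weights):
    """Weighted average of per-pattern matching levels, each assumed to lie in [0, 1].

    `weights` is assumed to reflect the quality of the source system that
    supplied each pattern's data; how such weights are derived is not shown.
    """
    total_weight = sum(weights[name] for name in matching_levels)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted = sum(matching_levels[name] * weights[name] for name in matching_levels)
    return weighted / total_weight

def is_confirmed(matching_levels, weights, threshold=0.75):
    """Confirm the candidate pair when the score exceeds the (assumed) threshold."""
    return likelihood_score(matching_levels, weights) > threshold

levels = {"height": 0.9, "weight": 0.6}    # hypothetical matching levels
quality = {"height": 1.0, "weight": 0.5}   # hypothetical source-quality weights
print(likelihood_score(levels, quality), is_confirmed(levels, quality))
```

A weighted average keeps the score on the same scale as the individual matching levels, which makes a fixed threshold easier to interpret; a weighted sum would serve equally well with a rescaled threshold.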
     
     
       8. The computer-implemented method of claim 1, wherein the physical characteristics comprise at least one of height and weight.
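Illustrative note, not part of the claim text: one way to read "patterns of changes in physical characteristics over a period of time" is to merge the time-stamped measurements from two candidate records and check whether the combined timeline is physically plausible. The sketch below does this for body weight. The plausibility rule, the `max_change_per_day` limit, and the sample readings are assumptions, not details taken from the patent.

```python
from datetime import date

def matching_level(series_a, series_b, max_change_per_day):
    """Fraction of consecutive steps in the merged timeline with a plausible rate of change.

    Each series is a list of (date, value) tuples from one candidate record.
    """
    merged = sorted(series_a + series_b)  # tuples sort by date first
    steps = list(zip(merged, merged[1:]))
    if not steps:
        return 0.0
    plausible = 0
    for (d1, v1), (d2, v2) in steps:
        days = max((d2 - d1).days, 1)  # treat same-day readings as one day apart
        if abs(v2 - v1) / days <= max_change_per_day:
            plausible += 1
    return plausible / len(steps)

# Hypothetical weight readings in kilograms.
rec_a = [(date(2015, 1, 10), 80.0), (date(2015, 6, 10), 82.0)]
rec_b = [(date(2015, 3, 10), 81.0)]   # fits between rec_a's readings
rec_c = [(date(2015, 3, 10), 55.0)]   # implies an implausible swing

print(matching_level(rec_a, rec_b, max_change_per_day=0.2))
print(matching_level(rec_a, rec_c, max_change_per_day=0.2))
```

A low matching level, as for `rec_a` and `rec_c`, is the kind of signal that could disassociate a candidate from the common entity, in the spirit of claim 5.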
     
     
       9. A system for linking data objects associated with a common entity comprising:
 at least one data center configured to receive healthcare data objects stored within one or more computing source systems, wherein the at least one data center includes a cluster of computer systems to perform parallel processing and 
 at least one processor configured to:
 arrange the healthcare data objects into blocks by comparing one or more fields of the healthcare data objects to corresponding criteria for the blocks, wherein at least one healthcare data object is assigned to a plurality of different blocks and each block corresponds to a different entity and contains healthcare data objects associated with that different entity; 
 compare within the blocks, via parallel processing jobs of the cluster of computer systems, the fields of the healthcare data objects to each other to identify candidate healthcare data objects with one or more matching fields and associated with a corresponding common entity; 
 analyze supplemental temporal information included within the candidate data objects by identifying one or more patterns of changes in physical characteristics in the supplemental temporal information over a period of time to confirm that the candidate data objects are associated with the common entity; 
 link one or more pairs of the candidate data objects that are confirmed to be associated with the common entity; 
 represent the linked pairs of the candidate data objects graphically and determine disjoint subgraphs of the candidate data objects in the graphical representation to form group of healthcare data objects respectively associated with a corresponding common entity; 
 compare fields of the candidate healthcare data objects of the groups to each other and split one or more groups into a plurality of sub-groups based on matching fields between the candidate healthcare data objects, wherein the groups and sub-groups each correspond to a different entity; 
 transform the groups and sub-groups into respective sets of data objects; and 
 process a query for an entity to retrieve data from the corresponding set of data objects for that entity. 
 
 
     
     
       10. The system of claim 9, wherein the supplemental temporal information includes healthcare information.
     
     
       11. The system of claim 10, wherein the healthcare information includes clinical information pertaining to one or more from a group of diagnoses, medications, vitals, lab tests, genomics, and medical procedures.
     
     
       12. The system of claim 9, wherein the entity includes a patient.
     
     
       13. The system of claim 9, wherein analyzing the supplemental temporal information further comprises:
 removing one or more candidate data objects when the supplemental temporal information disassociates the one or more candidate data objects from the common entity. 
 
     
     
       14. A computer program product for linking data objects associated with a common entity, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by at least one processor of a data center, including a cluster of computer systems to perform parallel processing, to cause the at least one processor to:
 receive healthcare data objects stored within one or more computing source systems; 
 arrange the healthcare data objects into blocks by comparing one or more fields of the healthcare data objects to corresponding criteria for the blocks, wherein at least one healthcare data object is assigned to a plurality of different blocks and each block corresponds to a different entity and contains healthcare data objects associated with that different entity; 
 compare within the blocks, via parallel processing jobs of the cluster of computer systems, the fields of the healthcare data objects to each other to identify candidate healthcare data objects with one or more matching fields and associated with a corresponding common entity; 
 analyze supplemental temporal information included within the candidate data objects by identifying one or more patterns of changes in physical characteristics in the supplemental temporal information over a period of time to confirm that the candidate data objects are associated with the common entity; 
 link one or more pairs of the candidate data objects that are confirmed to be associated with the common entity; 
 represent the linked pairs of the candidate data objects graphically and determine disjoint subgraphs of the candidate data objects in the graphical representation to form groups of healthcare data objects respectively associated with a corresponding common entity; 
 compare fields of the candidate healthcare data objects of the groups to each other and split one or more groups into a plurality of sub-groups based on matching fields between the candidate healthcare data objects, wherein the groups and sub-groups each correspond to a different entity; 
 transform the groups and sub-groups into respective sets of data objects; and 
 process a query for an entity to retrieve data from the corresponding set of data objects for that entity. 
 
     
     
       15. The computer program product of claim 14, wherein the supplemental temporal information includes healthcare information and the entity includes a patient.
     
     
       16. The computer program product of claim 15, wherein the healthcare information includes clinical information pertaining to one or more from a group of diagnoses, medications, vitals, lab tests, genomics, and medical procedures.
     
     
       17. The computer program product of claim 14, wherein the program instructions are further configured to cause the at least one processor to:
 remove one or more candidate data objects when the supplemental temporal information disassociates the one or more candidate data objects from the common entity. 
 
     
     
       18. The computer-implemented method of claim 1, wherein the method further comprises:
 reducing processing time by performing the method by parallel processing jobs of the cluster of computer systems and reducing a number of comparisons by limiting performance of the comparisons to be between healthcare data objects within a same block. 
 
     
     
       19. The computer-implemented method of claim 18, wherein receiving the healthcare data objects further comprises:
 acquiring the healthcare data objects from the one or more computing source systems via a health data gateway configured to retrieve data from the computing source systems; 
 receiving the healthcare data objects from the health data gateway over a network at a gateway controller of the data center; 
 storing the healthcare data objects in a data model of the computing source systems in a staging grid of the data center comprising a plurality of computer systems; and 
 transferring the healthcare data objects from the data model of the staging grid to a second different data model of a factory grid of the data center for access by the parallel processing jobs.
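
Illustrative note, not part of the claim text: the grouping step of claim 1, which represents linked pairs as a graph and determines its disjoint subgraphs, corresponds to finding connected components. The claim does not name an algorithm; the union-find approach below is one common choice, and the function name and record ids are hypothetical.

```python
def connected_groups(node_ids, linked_pairs):
    """Group record ids into connected components of the link graph (union-find)."""
    parent = {n: n for n in node_ids}

    def find(n):
        # Walk to the root, shortening the path along the way (path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in linked_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two components

    groups = {}
    for n in node_ids:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())

ids = ["a1", "b7", "c3", "d9"]
links = [("a1", "b7"), ("a1", "c3")]   # hypothetical confirmed links
print(connected_groups(ids, links))
```

Note that `b7` and `c3` end up in the same group only through their shared link to `a1`. Transitive grouping of this kind is why the claim follows it with a further comparison that can split a group into sub-groups.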
