US11216739B2 · Active · Utility · PatentIndex 50
System and method for automated analysis of ground truth using confidence model to prioritize correction options
Est. expiry: Jul 25, 2038 (~12.1 yrs left) · nominal 20-yr term from priority
G06N 3/042 · G06N 20/00 · G06N 3/006 · G06N 5/048 · G06N 5/022 · G06N 5/02 · G06N 5/041
PatentIndex Score: 50 · Cited by: 0 · References: 23 · Claims: 15
Abstract
A method, system and computer-usable medium are disclosed for automated analysis of ground truth using confidence model to prioritize correction options. In certain embodiments, the ground truth data is analyzed to identify review-candidates. A confidence level may be assigned to each of the identified review-candidates and the review-candidates are prioritized, at least in part, using the assigned confidence levels. The review-candidates are electronically presented in prioritized order to solicit verification or correction feedback for updating the ground truth data.
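The flow described in the abstract (identify review-candidates, assign each a confidence level, present them in prioritized order) can be illustrated with a minimal Python sketch. The `ReviewCandidate` class, its field names, the 0.0 to 1.0 confidence scale, and the choice to show the highest-confidence candidates first are all assumptions made for illustration; the patent does not prescribe any of them.

```python
from dataclasses import dataclass


@dataclass
class ReviewCandidate:
    entry_id: str      # hypothetical identifier of a ground truth entry
    reason: str        # why the entry was flagged for review
    confidence: float  # assumed scale: 0.0 (unsure) to 1.0 (very likely an error)


def prioritize(candidates):
    """Order candidates so the highest-confidence ones come first (an assumed rule)."""
    return sorted(candidates, key=lambda c: c.confidence, reverse=True)


candidates = [
    ReviewCandidate("q17", "answer type differs from peers", 0.9),
    ReviewCandidate("q03", "rare attribute name", 0.4),
    ReviewCandidate("q42", "near-duplicate attribute name", 0.7),
]

# Present the candidates in prioritized order to a reviewer.
for c in prioritize(candidates):
    print(f"{c.entry_id}: {c.reason} (confidence {c.confidence:.1f})")
```

In this sketch a reviewer would see `q17` first, then `q42`, then `q03`; a real system could combine confidence with other signals when ordering.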
Claims
Exact text as granted (not AI-modified)
What is claimed is:
1. A computer-implemented method for automated analysis of ground truth using an information processing system having a processor and a memory, the method comprising:
receiving, by the information processing system, ground truth data;
analyzing, by the information processing system, the ground truth data to identify review-candidates;
assigning, by the information processing system, a confidence level to each of the identified review-candidates;
prioritizing, by the information processing system, the review-candidates based at least on the assigned confidence levels;
electronically presenting, by the information processing system, the review-candidates in prioritized order to solicit corrective feedback for updating the ground truth data;
generating, by the information processing system, suggested fixes for the review-candidates; and
grouping identified review candidates having the same suggested fixes;
electronically presenting the grouped review-candidates in prioritized order along with the suggested fixes to solicit corrective feedback for updating the ground truth data using the suggested fixes; and,
training a question answer (QA) system using the suggested fixes.
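As an informal illustration of the grouping step in claim 1, the sketch below collects review-candidates that share the same suggested fix and orders the groups for presentation. The tuple layout, the fix strings, and the rule of ranking a group by its highest member confidence are hypothetical; the claim only requires grouping candidates with the same suggested fix and presenting the groups in prioritized order.

```python
from collections import defaultdict


def group_by_fix(candidates):
    """Group (entry_id, suggested_fix, confidence) tuples by their suggested fix."""
    groups = defaultdict(list)
    for entry_id, fix, confidence in candidates:
        groups[fix].append((entry_id, confidence))
    # Assumed ranking rule: a group is as urgent as its most confident member.
    return sorted(
        groups.items(),
        key=lambda item: max(conf for _, conf in item[1]),
        reverse=True,
    )


candidates = [
    ("e1", "rename 'colour' to 'color'", 0.8),
    ("e2", "cast '12' to int", 0.6),
    ("e3", "rename 'colour' to 'color'", 0.5),
]

grouped = group_by_fix(candidates)
for fix, members in grouped:
    print(fix, [entry_id for entry_id, _ in members])
```

Grouping lets a reviewer accept or reject one fix for several entries at once instead of answering the same question repeatedly.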
2. The computer-implemented method of claim 1, wherein prioritizing the review-candidates further comprises:
prioritizing a review-candidate based on an impact of changing the review-candidate in the ground truth data using one or more of the respective suggested fixes.
3. The computer-implemented method of claim 2, wherein
the impact of changing the review-candidate in the ground truth data is based, at least in part, on a number of ground truth data entries that would be changed using the respective suggested fixes.
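One way to read claims 2 and 3 is that a fix touching more ground truth entries carries more impact. The sketch below counts the entries each fix would change and ranks fixes by that count. The `(attribute, old_value, new_value)` fix format, the sample data, and the decision to rank larger impact first are assumptions for illustration only.

```python
def impact(ground_truth, fix):
    """Count the ground truth entries that applying `fix` would change."""
    attribute, old_value, _new_value = fix
    return sum(1 for entry in ground_truth if entry.get(attribute) == old_value)


ground_truth = [
    {"unit": "kgs"},
    {"unit": "kg"},
    {"unit": "kgs"},
    {"unit": "lbs."},
]

# Hypothetical suggested fixes: (attribute, old value, new value).
fixes = [("unit", "lbs.", "lb"), ("unit", "kgs", "kg")]

# Assumed ordering: fixes that change more entries are reviewed first.
ranked = sorted(fixes, key=lambda f: impact(ground_truth, f), reverse=True)
print(ranked)
```

Here the `kgs` fix would change two entries and the `lbs.` fix one, so the `kgs` fix is ranked first.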
4. The computer-implemented method of claim 1, further comprising:
identifying, by the information processing system, review-candidates based on similarities between different attribute names; and
assigning, by the information processing system, a high confidence level to review-candidates having different attribute names within a predetermined edit distance.
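The attribute-name check in claim 4 can be sketched with a standard edit distance. The claim does not name a specific metric; Levenshtein distance is used here as an assumption, and the threshold of 2, the `"high"` label, and the function names are likewise hypothetical.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance via the classic row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similar_attribute_names(names, max_distance=2):
    """Flag pairs of distinct attribute names within the assumed threshold."""
    pairs = []
    for a, b in combinations(sorted(set(names)), 2):
        if edit_distance(a, b) <= max_distance:
            pairs.append((a, b, "high"))  # assumed confidence label
    return pairs


flagged = similar_attribute_names(["birth_date", "birthdate", "height"])
print(flagged)
```

Names such as `birth_date` and `birthdate` differ by a single edit, which suggests they may be the same attribute spelled two ways.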
5. The computer-implemented method of claim 1, further comprising:
identifying, by the information processing system, review-candidates based on differences in data types in ground truth entries for a given attribute; and
assigning, by the information processing system, a high confidence level to review-candidates having different data types for the given attribute.
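The data-type check in claim 5 can be illustrated in a similar way. The sketch below records which Python types appear for each attribute and flags entries whose type differs from the most common one. Treating the minority type as the review-candidate, the `"high"` label, and the dictionary-based entry format are assumptions; the claim only requires that differing data types for a given attribute yield high-confidence review-candidates.

```python
from collections import Counter, defaultdict


def mixed_type_candidates(entries):
    """Return (entry index, attribute, confidence) for values of a minority type."""
    types_by_attr = defaultdict(Counter)
    for entry in entries:
        for attr, value in entry.items():
            types_by_attr[attr][type(value).__name__] += 1

    candidates = []
    for attr, counts in types_by_attr.items():
        if len(counts) > 1:  # more than one data type seen for this attribute
            majority_type, _ = counts.most_common(1)[0]
            for idx, entry in enumerate(entries):
                if attr in entry and type(entry[attr]).__name__ != majority_type:
                    candidates.append((idx, attr, "high"))  # assumed label
    return candidates


entries = [
    {"age": 34, "name": "Ann"},
    {"age": "forty", "name": "Bo"},
    {"age": 29, "name": "Cy"},
]

flagged = mixed_type_candidates(entries)
print(flagged)
```

In this example only the second entry is flagged, because its `age` is a string while the other entries store integers.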
6. A system comprising:
a processor;
a data bus coupled to the processor; and
a non-transitory, computer-readable storage medium embodying computer program code, the non-transitory, computer-readable storage medium being coupled to the data bus, the computer program code interacting with a plurality of computer operations and comprising instructions executable by the processor and configured for:
receiving ground truth data;
analyzing the ground truth data to identify review-candidates;
assigning a confidence level to each of the identified review-candidates;
prioritizing the review-candidates based at least on the assigned confidence levels;
electronically presenting the review-candidates in prioritized order to solicit corrective feedback for updating the ground truth data;
generating, by the information processing system, suggested fixes for the review-candidates; and
grouping identified review candidates having the same suggested fixes;
electronically presenting the grouped review-candidates in prioritized order along with the suggested fixes to solicit corrective feedback for updating the ground truth data using the suggested fixes; and,
training a question answer (QA) system using the suggested fixes.
7. The system of claim 6, wherein prioritizing the review-candidates further comprises:
prioritizing a review-candidate based on an impact of changing the review-candidate in the ground truth data using one or more of the respective suggested fixes.
8. The system of claim 7, wherein:
the impact of changing the review-candidate in the ground truth data is based, at least in part, on a number of ground truth data entries that would be changed using the respective suggested fixes.
9. The system of claim 6, wherein the instructions are further configured for:
identifying review-candidates based on similarities between different attribute names; and
assigning a high confidence level to review-candidates having different attribute names within a predetermined edit distance.
10. The system of claim 6, wherein the instructions are further configured for:
identifying review-candidates based on differences in data types in ground truth entries for a given attribute; and
assigning a high confidence level to review-candidates having different data types for the given attribute.
11. A non-transitory, computer-readable storage medium embodying computer program code, the computer program code comprising computer executable instructions configured for:
receiving ground truth data;
analyzing the ground truth data to identify review-candidates;
assigning a confidence level to each of the identified review-candidates;
prioritizing the review-candidates based at least on the assigned confidence levels;
electronically presenting the review-candidates in prioritized order to solicit corrective feedback for updating the ground truth data;
generating, by the information processing system, suggested fixes for the review-candidates; and
grouping identified review candidates having the same suggested fixes;
electronically presenting the grouped review-candidates in prioritized order along with the suggested fixes to solicit corrective feedback for updating the ground truth data using the suggested fixes; and,
training a question answer (QA) system using the suggested fixes.
12. The non-transitory, computer-readable storage medium of claim 11, wherein prioritizing the review-candidates further comprises:
prioritizing a review-candidate based on an impact of changing the review-candidate in the ground truth data using one or more of the respective suggested fixes.
13. The non-transitory, computer-readable storage medium of claim 12, wherein
the impact of changing the review-candidate in the ground truth data is based, at least in part, on a number of ground truth data entries that would be changed using the respective suggested fixes.
14. The non-transitory, computer-readable storage medium of claim 11, wherein the instructions are further configured for:
identifying review-candidates based on similarities between different attribute names; and
assigning a high confidence level to review-candidates having different attribute names within a predetermined edit distance.
15. The non-transitory, computer-readable storage medium of claim 11, wherein the instructions are further configured for:
identifying review-candidates based on differences in data types in ground truth entries for a given attribute; and
assigning a high confidence level to review-candidates having different data types for the given attribute.
Cited by (0)
No later patents cite this yet.
References (0)
No backward citations on record.