Collecting and scoring online references
Abstract
One example embodiment includes a method for indexing online references of an entity. The method includes identifying one or more channels of the Internet to be searched for references to an entity and identifying one or more signals to be evaluated within each of the one or more channels. The method also includes crawling the Internet for online references to the entity, wherein crawling the Internet comprises searching the one or more channels of the Internet for references to the entity and evaluating the one or more signals. The method further includes constructing a reverse index of the references, wherein the reverse index is based on each channel in which a reference is found and the one or more signals evaluated for the reference.
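As a rough illustration of the reverse index described above, the following Python sketch groups references by entity and then by the channel in which each was found, keeping the evaluated signals with each reference. The names `Reference` and `build_reverse_index`, the channel labels, and the example URLs are assumptions made for illustration; the signal fields (rank, title) follow the signals named in the dependent claims. This is a sketch under those assumptions, not the patented implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Reference:
    """One online reference found while crawling (hypothetical structure)."""
    entity_id: str                                # entity the reference points to
    channel: str                                  # e.g. "organic", "paid", "social"
    url: str                                      # where the reference was found
    signals: dict = field(default_factory=dict)   # evaluated signals, e.g. rank, title


def build_reverse_index(references):
    """Build a mapping entity_id -> channel -> list of references."""
    index = defaultdict(lambda: defaultdict(list))
    for ref in references:
        index[ref.entity_id][ref.channel].append(ref)
    return index


# Example data (invented for illustration only).
refs = [
    Reference("acme", "organic", "https://example.com/a", {"rank": 1, "title": "Acme review"}),
    Reference("acme", "social", "https://example.org/post", {"rank": 3}),
    Reference("globex", "organic", "https://example.com/a", {"rank": 2}),
]
index = build_reverse_index(refs)
organic_urls = [r.url for r in index["acme"]["organic"]]
```

Keying first by entity and then by channel means a query about one entity can read its organic, paid, and social references without rescanning the crawled pages.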
Claims
What is claimed is:
1. A method of indexing online references of an entity, the method comprising:
identifying an entity, wherein the entity is an individual, corporation, brand, or product;
crawling the Internet for online references to the entity after identifying the entity, wherein crawling the Internet comprises:
by a computing device, determining a keyword to search;
by the computing device, querying a plurality of search engines across a plurality of online channels for search engine results pages using the keyword, wherein the search engine results pages comprise one or more search results identified by the respective search engine as responsive to the keyword;
by the computing device, parsing the search engine results pages to determine organic search results and paid search results, wherein the organic search results appear in the search engine results pages based on relevance of webpages corresponding to the organic search results to the keyword;
by the computing device, identifying organic online references to entities in the organic search results from the search engine results pages and paid online references to entities in the paid search results from the search engine results pages;
by the computing device, parsing the organic search results to identify one or more first signals to be evaluated that include information about the organic online references and parsing the paid search results to identify one or more second signals to be evaluated that include information about the paid online references;
searching social networks for social network search results that refer to the entity;
identifying a social online reference to the entity based on the social network search results;
parsing the social network search results to identify one or more third signals to be evaluated that include information about the social online reference to the entity;
by the computing device, evaluating the third signals, the first signals, and the second signals;
by the computing device, constructing, using a processor, a reverse index of the online references based on the evaluated third signals, the evaluated first signals, and the evaluated second signals;
by the computing device, in response to receiving a query related to the entity, computing, using the reverse index, statistics regarding organic online references to the identified entity, wherein computing the statistics comprises attributing an organic online reference to the identified entity identified via a first search engine of the plurality of search engines with a weight greater than an organic online reference to the entity identified via a second search engine; and
by the computing device, presenting the computed statistics regarding organic online references to the identified entity.
2. The method of claim 1, wherein computing statistics regarding organic online references to the identified entity further comprises computing a search engine optimization score for the identified entity based on the organic online references in the reverse index.
3. The method of claim 1, wherein the reverse index lists the social online reference and the organic online reference.
4. The method of claim 1, wherein the reverse index lists the online references by entity identifier.
5. The method of claim 1 3, wherein the third signals comprise a rank, a uniform resource locator (URL), a title, or a description of the social online references.
6. The method of claim 1, wherein the third signals are evaluated to identify the relevance of the social online references to entities in the social network search results.
7. The method of claim 1, wherein the first signals comprise a rank, a uniform resource locator (URL), a title, or a description of the organic online references.
8. The method of claim 1, wherein the first signals are evaluated to identify the relevance of the organic online references to entities in the organic search results.
9. The method of claim 1 3, wherein the social networks comprise a social structure of nodes that are tied by one or more specific types of interdependency.
10. The method of claim 1, further comprising identifying backlinks to the organic online references based on the search engine results pages.
11. The method of claim 1, further comprising:
determining, using the reverse index, paid online references to the identified entity.
12. The method of claim 1, further comprising determining a popularity of the organic online references based on the search engine results pages.
13. The method of claim 1, wherein the first signals are evaluated to assess relevance of the online channels to the identified entity.
14. The method of claim 1, wherein parsing the search engine results pages to determine organic search results and paid search results further comprises parsing the search results pages to identify which search results are organic results and which search results are paid results.
15. A method comprising:
receiving information identifying an entity, wherein the entity is an individual, corporation, brand, or product;
determining a plurality of keywords based on the received information;
retrieving search engine results pages from a plurality of search engines across a plurality of online channels using the plurality of keywords, wherein the search engine results pages comprise one or more search results identified by the respective search engine as responsive to the plurality of keywords;
parsing the search engine results pages to determine organic search results identified by the search engine, wherein the organic search results appear in the search engine results pages based on relevance of web pages corresponding to the organic search results to the plurality of keywords;
identifying organic online references to entities in the organic search results from the search engine results pages;
parsing the organic search results to identify one or more first signals to be evaluated that include information about the organic online references;
constructing, using a processor, a reverse index of online references on the web pages corresponding to the organic search results based on evaluating the identified organic online references and the first signals;
computing, using the reverse index, statistics regarding organic online references to the identified entity, wherein computing the statistics comprises attributing an organic online reference to the identified entity identified via a first search engine of the plurality of search engines with a weight greater than an organic online reference to the entity identified via a second search engine; and
presenting the computed statistics regarding organic online references to the identified entity.
16. The method of claim 15, wherein computing statistics regarding organic online references to the identified entity further comprises:
computing a score for the identified entity based on the organic online references in the reverse index.
17. The method of claim 15, wherein the reverse index lists the online references by entity identifier.
18. The method of claim 15, wherein the first signals comprise a rank, a uniform resource locator (URL), a title, or a description of the organic online references.
19. The method of claim 15, wherein the first signals are evaluated to identify relevance of the organic online references to entities in the organic search results.
20. The method of claim 15, further comprising identifying backlinks to the organic online references based on the search engine results pages.
21. The method of claim 15, further comprising determining a popularity of the organic online references based on the search engine results pages.
22. The method of claim 15, wherein the first signals are evaluated to assess relevance of the online channels to the identified entity.
23. The method of claim 15, wherein the parsing the search engine results pages to determine organic search results further comprises parsing the search results pages to identify which search results are organic results.
24. The method of claim 15, further comprising:
parsing the search engine results pages to determine paid search results;
identifying paid online references to entities in the paid search results;
parsing the paid search results to identify one or more second signals to be evaluated that include information about the paid online references;
evaluating the second signals; and
computing, using the reverse index, statistics regarding paid online references to the identified entity, wherein the reverse index of online references is further constructed based on the identified paid online references and the evaluated second signals.
25. The method of claim 24, wherein the parsing the search engine results pages to determine paid search results further comprises parsing the search engine results pages to identify which search results are paid results.
26. The method of claim 24, wherein computing statistics regarding paid online references to the identified entity further comprises:
computing a score for the identified entity based on the paid online references in the reverse index.
27. The method of claim 15, further comprising:
searching social networks for social network search results;
identifying social online references to entities in the social network search results based on the social network search results;
parsing the social network search results to identify one or more third signals to be evaluated that include information about the social online references to entities;
evaluating the third signals; and
computing, using the reverse index, statistics regarding social online references to the identified entity, wherein the reverse index of online references is further constructed based on the identified social online references and the evaluated third signals.
28. The method of claim 27, wherein the third signals are evaluated to identify relevance of the social online references to entities in the social network search results.
29. The method of claim 27, wherein the third signals comprise a rank, a uniform resource locator (URL), a title, or a description of the social online references.
30. The method of claim 27, wherein the social networks comprise a social structure of nodes that are tied by one or more specific types of interdependency.
31. A system comprising:
one or more processors; and
a memory coupled to the processors comprising instructions executable by the processors, the processors being operable when executing the instructions to:
receive information identifying an entity, wherein the entity is an individual, corporation, brand, or product;
determine a plurality of keywords based on the received information;
retrieve search engine results pages from a plurality of search engines across a plurality of online channels using the plurality of keywords, wherein the search engine results pages comprise one or more search results identified by the respective search engine as responsive to the plurality of keywords;
parse the search engine results pages to determine organic search results identified by the search engine, wherein the organic search results appear in the search engine results pages based on relevance of web pages corresponding to the organic search results to the plurality of keywords;
identify organic online references to entities in the organic search results from the search engine results pages;
parse the organic search results to identify one or more first signals to be evaluated that include information about the organic online references;
construct, using a processor, a reverse index of organic online references on the web pages corresponding to the organic search results based on evaluating the identified organic online references and the first signals;
compute, using the reverse index, statistics regarding organic online references to the identified entity, wherein computing the statistics comprises attributing an organic online reference to the identified entity identified via a first search engine of the plurality of search engines with a weight greater than an organic online reference to the entity identified via a second search engine; and
present the computed statistics regarding organic online references to the identified entity.
32. The system of claim 31, the processors being further operable when executing the instructions to compute statistics regarding organic online references to the identified entity to:
compute a score for the identified entity based on the organic online references in the reverse index.
33. The system of claim 31, wherein the reverse index lists the online references by entity identifier.
34. The system of claim 31, wherein the first signals comprise a rank, a uniform resource locator (URL), a title, or a description of the organic online references.
35. The system of claim 31, wherein the first signals are evaluated to identify relevance of the organic online references to entities in the organic search results.
36. The system of claim 31, wherein the processors are further operable when executing the instructions to identify backlinks to the organic online references based on the search engine results pages.
37. The system of claim 31, wherein the processors are further operable when executing the instructions to determine a popularity of the organic online references based on the search engine results pages.
38. The system of claim 31, wherein the first signals are evaluated to assess relevance of the online channels to the identified entity.
39. The system of claim 31, wherein the processors being operable when executing the instructions to parse the search engine results pages to determine organic search results further comprise the processors being operable when executing the instructions to parse the search results pages to identify which search results are organic results.
40. The system of claim 31, the processors being further operable when executing the instructions to:
parse the search engine results pages to determine paid search results;
identify paid online references to entities in the paid search results;
parse the paid search results to identify one or more second signals to be evaluated that include information about the paid online references;
evaluate the second signals; and
compute, using the reverse index, statistics regarding paid online references to the identified entity, wherein the reverse index of online references is further constructed based on the identified paid online references and the evaluated second signals.
41. The system of claim 40, wherein the processors being operable when executing the instructions to parse the search engine results pages to determine paid search results further comprise the processors being operable when executing the instructions to parse the search engine results pages to identify which search results are paid results.
42. The system of claim 40, the processors being further operable when executing the instructions to compute statistics regarding paid online references to the identified entity to:
compute a score for the identified entity based on the paid online references in the reverse index.
43. The system of claim 31, the processors being further operable when executing the instructions to:
search social networks for social network search results;
identify social online references to entities in the social network search results based on the social network search results;
parse the social network search results to identify one or more third signals to be evaluated that include information about the social online references to entities;
evaluate the third signals; and
compute, using the reverse index, statistics regarding social online references to the identified entity, wherein the reverse index of online references is further constructed based on the identified social online references and the evaluated third signals.
44. The system of claim 43, wherein the third signals are evaluated to identify relevance of the social online references to entities in the social network search results.
45. The system of claim 43, wherein the third signals comprise a rank, a uniform resource locator (URL), a title, or a description of the social online references.
46. The system of claim 43, wherein the social networks comprise a social structure of nodes that are tied by one or more specific types of interdependency.
47. The method of claim 1, wherein the querying, parsing the search engine results pages, identifying, parsing the organic search results, evaluating, constructing, and computing steps are each performed by a respective worker node of the computing device, wherein each respective worker node is assigned one or more of the querying, parsing the search engine results pages, identifying, parsing the organic search results, evaluating, constructing, and computing steps via a job queue maintained by a deep index engine of the computing device.
48. The method of claim 15, wherein the receiving, retrieving, parsing the search engine results pages, identifying, parsing the organic search results, constructing, and computing steps are each performed by a respective worker node of a deep index engine, wherein each respective worker node is assigned one or more of the receiving, retrieving, parsing the search engine results pages, identifying, parsing the organic search results, constructing, and computing steps via a job queue maintained by the deep index engine.
49. The system of claim 31, wherein the receiving, retrieving, parsing the search engine results pages, identifying, parsing the organic search results, constructing, and computing steps are each performed by a respective worker node of the system, wherein each respective worker node is assigned one or more of the receiving, retrieving, parsing the search engine results pages, identifying, parsing the organic search results, constructing, and computing steps via a job queue maintained by a deep index engine of the system.
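The independent claims recite computing statistics in which an organic reference found via a first search engine is given a greater weight than one found via a second search engine. A minimal Python sketch of one way such weighting could work follows. The engine names, the weight values, and the choice to divide each weight by the result's rank are assumptions made for illustration; the claims require only that one engine's references carry more weight than another's.

```python
from collections import defaultdict

# Hypothetical per-engine weights (assumed values, not from the claims).
ENGINE_WEIGHTS = {"engine_a": 2.0, "engine_b": 1.0}


def weighted_reference_stats(organic_refs, weights, default_weight=1.0):
    """Return {entity_id: {"count": n, "weighted_score": s}}.

    organic_refs is an iterable of (entity_id, engine, rank) tuples.
    Each reference adds weight / rank to the entity's score, so references
    from a more heavily weighted engine, and higher-ranked results, count more.
    """
    stats = defaultdict(lambda: {"count": 0, "weighted_score": 0.0})
    for entity_id, engine, rank in organic_refs:
        if rank < 1:
            raise ValueError("rank must be 1 or greater")
        weight = weights.get(engine, default_weight)
        entry = stats[entity_id]
        entry["count"] += 1
        entry["weighted_score"] += weight / rank
    return dict(stats)


# Example data (invented for illustration only).
organic_refs = [
    ("acme", "engine_a", 1),    # contributes 2.0 / 1
    ("acme", "engine_b", 2),    # contributes 1.0 / 2
    ("globex", "engine_b", 1),  # contributes 1.0 / 1
]
stats = weighted_reference_stats(organic_refs, ENGINE_WEIGHTS)
```

In this example the two references to `acme` yield a weighted score of 2.5, while the single `engine_b` reference to `globex` yields 1.0.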