System and method for creating search index on cloud database
Abstract
A method for creating a search index on a cloud database is provided. The method accepts inputs for creating multiple indexes on documents stored in the cloud database. One of the inputs may be a first value representing the number of documents to be assigned to a single index. The method determines the total number of documents stored in the cloud database, which is represented by a second value, and estimates the total number of indexes to be created based on the first value and the second value. The method further comprises executing a loop to create the multiple indexes for a predetermined number of iterations corresponding to the estimated value, and indexing the documents to create the multiple indexes. Finally, the method comprises merging the multiple indexes into a single index that lets a user search the documents stored in the cloud database.
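The flow summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes an in-memory dictionary in place of the cloud database, a simple inverted index (term to document keys) in place of a real indexing engine, and ceiling division as the estimate of the number of indexes, since the abstract does not state the formula. All function names are hypothetical.

```python
import math


def estimate_index_count(docs_per_index, total_docs):
    """Estimate how many partial indexes are needed.

    Assumption: ceiling division of the second value (total documents)
    by the first value (documents per index).
    """
    if docs_per_index <= 0:
        raise ValueError("docs_per_index must be positive")
    return math.ceil(total_docs / docs_per_index)


def build_partial_index(docs):
    """Build a toy inverted index (term -> set of keys) for one batch."""
    index = {}
    for key, text in docs:
        for term in text.lower().split():
            index.setdefault(term, set()).add(key)
    return index


def merge_indexes(partials):
    """Merge the partial indexes into a single index."""
    merged = {}
    for partial in partials:
        for term, keys in partial.items():
            merged.setdefault(term, set()).update(keys)
    return merged


def create_search_index(database, docs_per_index):
    """Run the estimate, the indexing loop, and the final merge.

    `database` is a hypothetical stand-in: a dict of key -> document text.
    """
    keys = sorted(database)                 # keys ordered so each batch is a key range
    total_docs = len(keys)                  # the "second value"
    iterations = estimate_index_count(docs_per_index, total_docs)

    partials = []
    for i in range(iterations):
        # Each iteration handles the documents between a start and an end key.
        batch_keys = keys[i * docs_per_index:(i + 1) * docs_per_index]
        batch = [(k, database[k]) for k in batch_keys]
        partials.append(build_partial_index(batch))

    return merge_indexes(partials)


if __name__ == "__main__":
    db = {"d1": "cloud search", "d2": "search index", "d3": "cloud index"}
    single_index = create_search_index(db, docs_per_index=2)
    print(sorted(single_index["search"]))
```

With three documents and two documents per index, the loop runs twice and the merged index maps each term to every document that contains it.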
Claims
exact text as granted — not AI-modified
The invention claimed is:
1. A computer implemented method for creating a search index on cloud database, the method comprising the steps of:
providing one or more inputs for creating a plurality of indexes on documents stored in the cloud database, wherein the one or more inputs include at least in part a first value representing number of documents to be assigned a single index;
determining total number of documents stored in the cloud database, wherein the total number of documents is represented by a second value;
estimating total number of indexes to be created based on the first value and the second value;
executing a loop to create plurality of indexes on documents for a predetermined number of iterations, wherein the predetermined number of iterations correspond to the estimated value; and
merging the plurality of indexes to create a single index, wherein the single index facilitates a user to search the documents stored in the cloud database.
2. The computer implemented method of claim 1 further comprising:
retrieving one or more documents from the cloud database based on the iterations; and
indexing one or more documents for creating the plurality of indexes.
3. The computer implemented method of claim 1 further comprising providing one or more inputs for creating at least one index on documents stored in the cloud database.
4. The computer implemented method of claim 1 , wherein the first value representing number of documents to be assigned a single index is determined based on identifying at least one of total number of documents stored in the database or total number of subsets available in the database or the total number of threads to be initiated by a thread pool or combinations thereof.
5. The computer implemented method of claim 1 , wherein the step of providing one or more inputs comprises the steps of:
providing a database name corresponding to least one database for which a plurality of indexes are to be created; and
providing a directory path corresponding to a specific location on a cloud database for storing the plurality of indexed documents.
6. The computer implemented method of claim 5 , wherein the step of determining total number of documents stored in the cloud database comprises accessing the database using the database name and calculating number of documents stored in the database.
7. The computer implemented method of claim 1 , wherein the loop includes one or more inputs, the one or more inputs being the database name representing the database, start key and end key associated with the documents stored in the database.
8. The computer implemented method of claim 2 , wherein the step of retrieving documents from the cloud database based on the iteration comprises retrieving all the documents tagged with keys that are stored between start key and end key, each time the loop is executed.
9. The computer implemented method of claim 2 , wherein the step of indexing the retrieved documents for creating a plurality of indexes comprises:
reading content of each of the documents; and
processing the documents in parallel for carrying out indexing of the documents to create the plurality of indexed documents; and
storing the plurality of indexed documents in a specific location in a database using the directory path information.
10. A system for creating a search index on cloud database, the system comprising:
a cloud database comprising one or more databases;
an indexing engine in communication with the cloud database via a processing engine and configured to facilitate searching and indexing documents stored in the cloud database; and
an index generator in communication with the cloud database and the indexing engine via the processing engine and configured to creating a single index on documents stored in the cloud database, wherein the single index facilitates a user to search the documents stored in the database, the index generator comprising:
a pre-processing module configured to facilitate a user to provide one or more inputs for creating a plurality of indexes on documents stored in the cloud database via an interface, the one or more inputs comprising a first value representing number of documents to be assigned a single index;
a fetching module configured to retrieve all the documents stored in the database and calculate the total number of documents, wherein the total number of documents is represented by a second value;
an estimation module configured to receive the first value from the pre-processing module, receive the second value from the fetching module and estimate the total number of indexes to be created based on the first value and the second value;
an execution module configured to receive the estimated value from the estimation module, execute a loop for creating the plurality of indexes on documents for a predetermined number of iterations, the predetermined number of iterations corresponding to the estimated value; and
a generation module configured to merge the indexed documents to create a single index.
11. The system of claim 10 , wherein the index generator is configured to:
create a plurality of indexes on the documents stored in the cloud database using the indexing engine;
merge the plurality of indexes into a single index; and
store the single index in a specific location in a database.
12. The system of claim 10 , wherein the indexing engine comprises a lucene engine.
13. The system of claim 10 , wherein the execution module is configured to:
retrieve the documents from the documents corresponding to each iteration;
index the documents, using the indexing engine, resulting in creation of the plurality of indexes; and
store the plurality of indexed documents.
14. A computer program product comprising a non-transitory computer-readable medium having computer-executable instructions stored thereon, that, when executed by a processor, cause the processor to perform a method, the method comprising:
providing one or more inputs for creating a plurality of indexes on documents stored in the cloud database, wherein the one or more inputs include at least in part a first value representing number of documents to be assigned a single index;
determining total number of documents stored in the cloud database, wherein the total number of documents is represented by a second value;
estimating total number of indexes to be created based on the first value and the second value;
executing a loop to create plurality of indexes on documents for a predetermined number of iterations, wherein the predetermined number of iterations correspond to the estimated value; and
merging the plurality of indexes to create a single index, wherein the single index facilitates a user to search the documents stored in the cloud database.