US9846732B2 · Active · Utility · PatentIndex 68

Communicating with data storage systems

Assignee: FARVER JENNIFER M · Priority: Feb 13, 2009 · Filed: Feb 12, 2010 · Granted: Dec 19, 2017
Est. expiry: Feb 13, 2029 (~2.6 yrs left) · nominal 20-yr term from priority
Inventors: FARVER JENNIFER M, THOMAS BEN, VIGNEAU JOYCE L, FOURNIER DAVID, FISHER BEN, FERNANDEZ GARY
Classifications: G06F 17/30572, H04L 67/36, H04L 67/75, G06F 16/26, G06F 3/147, G06F 3/14, G06F 3/0481
PatentIndex Score: 68 · Cited by: 2 · References: 107 · Claims: 59

Abstract

In some aspects, a method includes connecting over a network to a data storage system, the data storage system storing data objects. A dataflow graph, which includes nodes representing data processing components connected by links that represent flows of data, accesses an interface of the data storage system. The interface provides functions for accessing the data objects. At least one of the data processing components performs operations on a received input flow of data that enable the functions provided by the interface to modify one or more stored data objects, and performs operations in response to functions provided by the interface to generate an output flow of data.

Claims

exact text as granted — not AI-modified
What is claimed is: 
     
       1. A method, including:
 connecting, over a network interface, to a data storage system, the data storage system storing data objects; 
 accessing, by a computer system, an interface of the data storage system, the interface providing functions that include (1) a first function that provides access to the data objects already stored in the data storage system, and (2) a second function that creates new data objects to be stored in the data storage system; 
 selecting, by the computer system, fields based on one or more rules associated with at least one of the functions provided by the interface, the rules being received from the data storage system and the rules including at least one of (1) a first rule that prevents a field from being selected if that field does not satisfy a validity constraint on that field or on a data object associated with that field, or (2) a second rule that prevents or allows a field to be selected according to a type of a data object associated with that field; 
 displaying at least one of the selected fields in a graphical user interface; 
 receiving an identification of at least one of the displayed fields from a user of the graphical user interface; and 
 processing data records in a data processing environment that is in communication with the data storage system, the processing including:
 accessing, by one or more data processing components, the first function via response messages from the interface of the data storage system, at least some response messages each including multiple data objects representing data of the identified fields, at least some response messages including one or more indicators of successful function execution, and at least some response messages including one or more indicators of failed function execution; 
 processing batches of partial results where each batch of partial results includes a response message that includes a different set of multiple data objects; 
 transforming each response message including an indicator of successful function execution into multiple data records, by at least one data processing component of the one or more data processing components, where each data record includes data from a respective data object; and 
 transferring at least some of the data records between an output port of a data processing component and an input port of a data processing components, the transferring including forwarding data associated with failed function execution to a first input port and forwarding data associated with successful function execution to a second input port. 
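Illustrative note (not part of the claim text): the rule-based field selection recited in claim 1 might be sketched as follows. This is a minimal sketch only. The `Field` class, the two rule helpers, and the sample field names are hypothetical; the patent does not specify how rules are represented or transmitted by the data storage system.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set


@dataclass(frozen=True)
class Field:
    name: str
    object_type: str  # type of the data object that owns the field (assumed attribute)
    valid: bool       # whether the field satisfies its validity constraint (assumed attribute)


# A rule returns True when the field may be selected.
Rule = Callable[[Field], bool]


def validity_rule(field: Field) -> bool:
    """First kind of rule: block fields that fail a validity constraint."""
    return field.valid


def type_rule(allowed_types: Set[str]) -> Rule:
    """Second kind of rule: allow or block a field by its object's type."""
    return lambda field: field.object_type in allowed_types


def select_fields(fields: Iterable[Field], rules: Iterable[Rule]) -> List[Field]:
    """Keep only the fields that every rule allows."""
    rules = list(rules)
    return [f for f in fields if all(rule(f) for rule in rules)]


fields = [
    Field("Name", "Account", True),
    Field("LegacyCode", "Account", False),
    Field("Subject", "Task", True),
]
selected = select_fields(fields, [validity_rule, type_rule({"Account"})])
selected_names = [f.name for f in selected]
```

In this sketch only `Name` survives both rules; a graphical user interface would then display the surviving fields for the user to pick from.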
 
 
     
     
       2. The method of  claim 1 , wherein the graphical user interface receives input from the user and provides formatting information to at least one of the data processing components, the formatting information defining a format of data records according to one or more fields associated with the data objects, where data records formatted according to the defined format are compatible with the operations performed by the data processing components. 
     
     
       3. The method of  claim 1 , wherein the graphical user interface displays only data objects or fields that satisfy a validity constraint defined by the rules. 
     
     
       4. The method of  claim 1 , wherein one or more of the data objects or one or more of the fields displayed by the graphical user interface are selectable. 
     
     
       5. The method of  claim 4 , wherein data objects or fields that satisfy a validity constraint defined by the rules are automatically selected by the graphical user interface. 
     
     
       6. The method of  claim 5 , wherein the graphical user interface prevents a user from de-selecting data objects and fields that have been automatically displayed as selected. 
     
     
       7. The method of  claim 1 , wherein at least one of the data processing components generates a request message to be sent to the data storage system. 
     
     
       8. The method of  claim 7 , wherein at least one of the data processing components transforms an input having multiple input data records into a single request message. 
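Illustrative note (not part of the claim text): one way a component could fold several input data records into a single request message is shown below. The JSON message layout, the `function` and `objects` keys, and the batch size are assumptions made for this sketch, not details taken from the patent.

```python
import json
from itertools import islice
from typing import Iterable, Iterator


def batched_requests(records: Iterable[dict], function: str,
                     batch_size: int) -> Iterator[str]:
    """Yield one request message per group of up to batch_size records."""
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        # Hypothetical message layout: function name plus the batched objects.
        yield json.dumps({"function": function, "objects": batch})


records = [{"Name": f"acct-{i}"} for i in range(5)]
request_messages = list(batched_requests(records, "create", batch_size=2))
```

Five input records with a batch size of two produce three request messages, the last carrying a single object.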
     
     
       9. The method of  claim 7 , wherein at least one of the data processing components generates the request message based on input parameters representing flows of data. 
     
     
       10. The method of  claim 9 , wherein a user alters the input parameters of the data processing components through a metadata browser. 
     
     
       11. The method of  claim 1 , wherein at least one of the data processing components separates data associated with successful function execution and data associated with failed function execution. 
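Illustrative note (not part of the claim text): separating data by success or failure indicator, and turning each successful response into one record per data object, might look like the sketch below. The `Response` class and its attributes are hypothetical stand-ins for whatever response message format the data storage system actually returns.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Response:
    success: bool                                       # indicator of function execution outcome
    objects: List[dict] = field(default_factory=list)   # data objects in the message
    error: str = ""                                     # assumed error description on failure


def route(responses: List[Response]) -> Tuple[List[dict], List[dict]]:
    """Split responses into records for a success port and a failure port."""
    success_port: List[dict] = []
    failure_port: List[dict] = []
    for resp in responses:
        if resp.success:
            # One data record per data object in the response message.
            success_port.extend(dict(obj) for obj in resp.objects)
        else:
            failure_port.append({"error": resp.error})
    return success_port, failure_port


responses = [
    Response(True, [{"Id": 1}, {"Id": 2}]),
    Response(False, error="INVALID_FIELD"),
    Response(True, [{"Id": 3}]),
]
ok_records, failed_records = route(responses)
```

The two returned lists play the role of the two input ports named in claim 1: downstream components can handle rejected data without blocking the main flow.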
     
     
       12. The method of  claim 1 , wherein connecting to the data storage system includes:
 transmitting a login request from at least one of the data processing components to the data storage system; 
 logging in to the data storage system to obtain session credentials; 
 storing the session credentials; and 
 encoding the stored session credentials into one or more login requests. 
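Illustrative note (not part of the claim text): the log-in, store, and encode sequence of claim 12 could be sketched as below. The `Session` class, the `fake_login` transport function, and the `session_id` key are all hypothetical; a real data storage system would define its own login protocol and credential format.

```python
from typing import Callable, Optional


class Session:
    """Logs in once, stores the credentials, and encodes them into later messages."""

    def __init__(self, login: Callable[[str, str], str]):
        self._login = login                     # hypothetical transport function
        self._session_id: Optional[str] = None  # stored session credentials
        self.login_calls = 0

    def connect(self, user: str, password: str) -> None:
        self._session_id = self._login(user, password)
        self.login_calls += 1

    def encode(self, body: dict) -> dict:
        """Attach the stored credentials to an outgoing message body."""
        if self._session_id is None:
            raise RuntimeError("connect() must be called first")
        return {"session_id": self._session_id, **body}


def fake_login(user: str, password: str) -> str:
    # Stand-in for a network round trip; returns a made-up session token.
    return f"session-for-{user}"


session = Session(fake_login)
session.connect("etl_user", "not-a-real-password")
first = session.encode({"function": "query"})
second = session.encode({"function": "create"})
```

Because the credentials are stored after a single login, several messages (including concurrent ones, as in claim 14) can reuse them without logging in again.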
 
     
     
       13. The method of  claim 12 , wherein the login request is transmitted to an internal gateway. 
     
     
       14. The method of  claim 12 , wherein the stored session credentials are encoded into a plurality of concurrent login requests. 
     
     
       15. The method of  claim 1 , wherein processing batches of partial results further includes:
 receiving, by a first data processing component, the batches of partial results from the data storage system; and 
 providing, by the first data processing component while the first data processing component continues to receive batches of partial results, at least some of the partial results to a second data processing component. 
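Illustrative note (not part of the claim text): passing partial results downstream while more batches are still arriving can be sketched with Python generators. The function names and the in-memory `pages` list are assumptions standing in for batches received over a network.

```python
from typing import Iterable, Iterator, List


def fetch_batches(pages: Iterable[List[int]]) -> Iterator[List[int]]:
    """First component: hand each batch downstream as soon as it arrives."""
    for page in pages:
        yield page


def to_records(batches: Iterable[List[int]], log: List[str]) -> Iterator[dict]:
    """Second component: turn each received batch into data records."""
    for batch in batches:
        log.append(f"received batch of {len(batch)}")
        for obj in batch:
            yield {"id": obj}


log: List[str] = []
pages = [[1, 2], [3], [4, 5]]
stream = to_records(fetch_batches(pages), log)

first_record = next(stream)                  # produced after only one batch arrived
batches_seen_after_first_record = len(log)
remaining = list(stream)                     # drains the other batches
```

The first record is available after a single batch has been received, which is the point of processing partial results rather than waiting for the complete result set.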
 
     
     
       16. The method of  claim 15 , wherein the second processing component generates an output data flow of data records based at least in part on the received partial results. 
     
     
       17. The method of  claim 1 , wherein at least one of the data processing components includes one or more ports. 
     
     
       18. The method of  claim 17 , wherein the one or more ports include at least one of: an input port configured to receive an input flow of data, and an output port configured to pass an output flow of data. 
     
     
       19. The method of  claim 18 , wherein the input flow of data includes a plurality of data records. 
     
     
       20. The method of  claim 18 , wherein the output flow of data includes a plurality of data records. 
     
     
       21. The method of  claim 18 , wherein one or both of the input flow of data and the output flow of data includes a plurality of data records. 
     
     
       22. The method of  claim 1 , wherein the one or more data processing components are represented by nodes in a dataflow graph that are connected by links that represent flows of data. 
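Illustrative note (not part of the claim text): a dataflow graph of components (nodes) and data flows (links) can be represented as a plain adjacency mapping. The component names below are hypothetical, and the standard-library `graphlib.TopologicalSorter` (Python 3.9+) is used here only to show that links determine which components are upstream of which.

```python
from graphlib import TopologicalSorter

# Each link is (upstream component, downstream component).
links = [
    ("read", "transform"),
    ("transform", "write"),
    ("read", "log_errors"),
]

# TopologicalSorter expects a mapping of node -> set of predecessors.
graph = {}
for src, dst in links:
    graph.setdefault(dst, set()).add(src)
    graph.setdefault(src, set())

order = list(TopologicalSorter(graph).static_order())
```

Any order returned places `read` before `transform` and `transform` before `write`; the relative position of `log_errors` among the downstream nodes is not fixed.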
     
     
       23. The method of  claim 22 , wherein the data processing environment provides pipeline parallelism to enable downstream data processing components to execute in parallel with upstream data processing components, according to the dataflow graph. 
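Illustrative note (not part of the claim text): pipeline parallelism, where a downstream component runs while its upstream component is still producing, can be sketched with two threads joined by a bounded queue. The queue plays the role of a link in the dataflow graph; the doubling operation and the sentinel convention are assumptions of this sketch.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the data flow


def upstream(out_q: queue.Queue, items) -> None:
    """Upstream component: emit records onto the link, then signal completion."""
    for item in items:
        out_q.put(item)
    out_q.put(SENTINEL)


def downstream(in_q: queue.Queue, results: list) -> None:
    """Downstream component: consume records as they become available."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            break
        results.append(item * 2)


link = queue.Queue(maxsize=2)  # bounded, so the producer cannot run far ahead
results = []
producer = threading.Thread(target=upstream, args=(link, range(5)))
consumer = threading.Thread(target=downstream, args=(link, results))
producer.start()
consumer.start()
producer.join()
consumer.join()
```

With a single producer and a single consumer, record order is preserved, so `results` holds the doubled inputs in their original order.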
     
     
       24. The method of  claim 1 , wherein the data processing environment comprises a data flow processing environment. 
     
     
       25. The method of  claim 24 , wherein at least one data processing component of the one or more data processing components processes at least some of the multiple data records in parallel. 
     
     
       26. The method of  claim 24 , wherein at least one of the data processing components generates multiple request messages in parallel, at least some of which are sent concurrently to the data storage system. 
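Illustrative note (not part of the claim text): sending several request messages concurrently might be sketched with the standard-library `concurrent.futures.ThreadPoolExecutor`. The `send_request` function is a hypothetical stand-in for a network call, and the worker count is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def send_request(request: dict) -> dict:
    # Hypothetical stand-in for a network round trip to the data storage system.
    return {"request_id": request["id"], "success": True}


request_messages = [{"id": i} for i in range(4)]

# Up to four requests are in flight at once; map() returns responses
# in the same order as the submitted requests.
with ThreadPoolExecutor(max_workers=4) as pool:
    responses = list(pool.map(send_request, request_messages))
```

Threads suit this case because the work is dominated by waiting on the network rather than by computation.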
     
     
       27. The method of  claim 24 , wherein the data flow processing environment provides pipeline parallelism to enable downstream data processing components to execute in parallel with upstream data processing components. 
     
     
       28. The method of  claim 18 , wherein generating the output flow of data includes performing operations on one or more batches of partial results received from the data storage system. 
     
     
       29. A system, including:
 a network interface including circuitry that connects to a data storage system, the data storage system storing data objects; 
 a computer system rendering a graphical user interface and configured to:
 access an interface of the data storage system, the interface providing functions that include (1) a first function that provides access to the data objects already stored in the data storage system, and (2) a second function that creates new data objects to be stored in the data storage system; 
 select fields based on one or more rules associated with at least one of the functions provided by the interface, the rules being received from the data storage system and the rules including at least one of (1) a first rule that prevents a field from being selected if that field does not satisfy a validity constraint on that field or on a data object associated with that field, or (2) a second rule that prevents or allows a field to be selected according to a type of a data object associated with that field; 
 display at least one of the selected fields in the graphical user interface; and 
 receive an identification of at least one of the displayed fields from a user of the graphical user interface; and 
 
 a data processing environment for processing data records in communication with the data storage system, the data processing environment including at least one processor configured to:
 access, by one or more data processing components, the first function via response messages from the interface of the data storage system, at least some response messages each including multiple data objects representing data of the identified fields, at least some response messages including one or more indicators of successful function execution, and at least some response messages including one or more indicators of failed function execution; and 
 process batches of partial results where each batch of partial results includes a response message that includes a different set of multiple data objects; 
 transform each response message including an indicator of successful function execution into multiple data records, by at least one data processing component of the one or more data processing components, where each data record includes data from a respective data object; and 
 transfer at least some of the data records between an output port of a data processing component and an input port of a data processing components, the transferring including forwarding data associated with failed function execution to a first input port and forwarding data associated with successful function execution to a second input port. 
 
 
     
     
       30. The system of  claim 29 , wherein the graphical user interface displays only data objects or fields that satisfy a validity constraint defined by the rules. 
     
     
       31. The system of  claim 29 , wherein one or more of the data objects or one or more of the fields displayed by the graphical user interface are selectable. 
     
     
       32. The system of  claim 31 , wherein data objects or fields that satisfy a validity constraint defined by the rules are automatically selected by the graphical user interface. 
     
     
       33. The system of  claim 32 , wherein the graphical user interface prevents a user from de-selecting data objects and fields that have been automatically displayed as selected. 
     
     
       34. The system of  claim 29 , wherein at least one of the data processing components generates a request message to be sent to the data storage system. 
     
     
       35. The system of  claim 34 , wherein at least one of the data processing components transforms an input having multiple input data records into a single request message. 
     
     
       36. The system of  claim 34 , wherein at least one of the data processing components generates the request message based on input parameters representing flows of data. 
     
     
       37. The system of  claim 29 , wherein at least one of the data processing components separates data associated with successful function execution and data associated with of failed function execution. 
     
     
       38. The system of  claim 29 , wherein processing batches of partial results further includes:
 receiving, by a first data processing component, the batches of partial results from the data storage system; and 
 providing, by the first data processing component while the first data processing component continues to receive batches of partial results, at least some of the partial results to a second data processing component. 
 
     
     
       39. The system of  claim 38 , wherein the second processing component generates an output data flow of data records based at least in part on the received partial results. 
     
     
       40. The system of  claim 29 , wherein the data processing environment comprises a data flow processing environment. 
     
     
       41. The system of  claim 40 , wherein at least one data processing component of the one or more data processing components processes at least some of the multiple data records in parallel. 
     
     
       42. The system of  claim 40 , wherein at least one of the data processing components generates multiple request messages in parallel, at least some of which are sent concurrently to the data storage system. 
     
     
       43. The system of  claim 40 , wherein the data flow processing environment provides pipeline parallelism to enable downstream data processing components to execute in parallel with upstream data processing components. 
     
     
       44. A system, including:
 means for connecting over a network to a data storage system, the data storage system storing data objects; 
 means for managing a graphical user interface, the managing including:
 accessing an interface of the data storage system, the interface providing functions that include (1) a first function that provides access to the data objects already stored in the data storage system, and (2) a second function that creates new data objects to be stored in the data storage system; 
 selecting fields based on one or more rules associated with at least one of the functions provided by the interface, the rules being received from the data storage system and the rules including at least one of (1) a first rule that prevents a field from being selected if that field does not satisfy a validity constraint on that field or on a data object associated with that field, or (2) a second rule that prevents or allows a field to be selected according to a type of a data object associated with that field; 
 displaying at least one of the selected fields in the graphical user interface; and 
 receiving an identification of at least one of the displayed fields from a user of the graphical user interface; and 
 
 means for processing data records in a data processing environment that is in communication with the data storage system, the processing including:
 accessing, by one or more data processing components, the first function via response messages from the interface of the data storage system, at least some response messages each including multiple data objects representing data of the identified fields, at least some response messages including one or more indicators of successful function execution, and at least some response messages including one or more indicators of failed function execution; 
 processing batches of partial results where each batch of partial results includes a response message that includes a different set of multiple data objects; 
 transforming each response message including an indicator of successful function execution into multiple data records, by at least one data processing component of the one or more data processing components, where each data record includes data from a respective data object; and 
 transferring at least some of the data records between an output port of a data processing component and an input port of a data processing components, the transferring including forwarding data associated with failed function execution to a first input port and forwarding data associated with successful function execution to a second input port. 
 
 
     
     
       45. A non-transitory computer-readable storage medium storing a computer program, the computer program including instructions for causing a computer system to:
 connect, over a network interface, to a data storage system, the data storage system storing data objects; 
 access an interface of the data storage system, the interface providing functions that include (1) a first function that provides access to the data objects already stored in the data storage system, and (2) a second function that creates new data objects to be stored in the data storage system; 
 select fields based on one or more rules associated with at least one of the functions provided by the interface, the rules being received from the data storage system and the rules including at least one of (1) a first rule that prevents a field from being selected if that field does not satisfy a validity constraint on that field or on a data object associated with that field, or (2) a second rule that prevents or allows a field to be selected according to a type of a data object associated with that field; 
 display at least one of the selected fields in a graphical user interface; 
 receive an identification of at least one of the displayed fields from a user of the graphical user interface; and 
 process data records in a data processing environment, the processing including:
 accessing, by one or more data processing components, the first function via response messages from the interface of the data storage system, at least some response messages each including multiple data objects representing data of the identified fields, at least some response messages including one or more indicators of successful function execution, and at least some response messages including one or more indicators of failed function execution; 
 processing batches of partial results where each batch of partial results includes a response message that includes a different set of multiple data objects; 
 transforming each response message including an indicator of successful function execution into multiple data records, by at least one data processing component of the one or more data processing components, where each data record includes data from a respective data object; and 
 transferring at least some of the data records between an output port of a data processing component and an input port of a data processing components, the transferring including forwarding data associated with failed function execution to a first input port and forwarding data associated with successful function execution to a second input port. 
 
 
     
     
       46. The non-transitory computer-readable storage medium of  claim 45 , wherein the graphical user interface displays only data objects or fields that satisfy a validity constraint defined by the rules. 
     
     
       47. The non-transitory computer-readable storage medium of  claim 45 , wherein one or more of the data objects or one or more of the fields displayed by the graphical user interface are selectable. 
     
     
       48. The non-transitory computer-readable storage medium of  claim 47 , wherein data objects or fields that satisfy a validity constraint defined by the rules are automatically selected by the graphical user interface. 
     
     
       49. The non-transitory computer-readable storage medium of  claim 48 , wherein the graphical user interface prevents a user from de-selecting data objects and fields that have been automatically displayed as selected. 
     
     
       50. The non-transitory computer-readable storage medium of  claim 45 , wherein at least one of the data processing components generates a request message to be sent to the data storage system. 
     
     
       51. The non-transitory computer-readable storage medium of  claim 50 , wherein at least one of the data processing components transforms an input having multiple input data records into a single request message. 
     
     
       52. The non-transitory computer-readable storage medium of  claim 50 , wherein at least one of the data processing components generates the request message based on input parameters representing flows of data. 
     
     
       53. The non-transitory computer-readable storage medium of  claim 45 , wherein at least one of the data processing components separates data associated with successful function execution and data associated with of failed function execution. 
     
     
       54. The non-transitory computer-readable storage medium of  claim 45 , wherein processing batches of partial results further includes:
 receiving, by a first data processing component, the batches of partial results from the data storage system; and 
 providing, by the first data processing component while the first data processing component continues to receive batches of partial results, at least some of the partial results to a second data processing component. 
 
     
     
       55. The non-transitory computer-readable storage medium of  claim 54 , wherein the second processing component generates an output data flow of data records based at least in part on the received partial results. 
     
     
       56. The non-transitory computer-readable storage medium of  claim 45 , wherein the data processing environment comprises a data flow processing environment. 
     
     
       57. The non-transitory computer-readable storage medium of  claim 56 , wherein at least one data processing component of the one or more data processing components processes at least some of the multiple data records in parallel. 
     
     
       58. The non-transitory computer-readable storage medium of  claim 56 , wherein at least one of the data processing components generates multiple request messages in parallel, at least some of which are sent concurrently to the data storage system. 
     
     
       59. The non-transitory computer-readable storage medium of  claim 56 , wherein the data flow processing environment provides pipeline parallelism to enable downstream data processing components to execute in parallel with upstream data processing components.

Cited by (0)

No later patents cite this yet.

References (0)

No backward citations on record.