US9762505B2 · Active · Utility · PatentIndex 50

Collaborative route reservation and ranking in high performance computing fabrics

Assignee: IBM · Priority: Jan 7, 2014 · Filed: Jan 7, 2014 · Granted: Sep 12, 2017
Est. expiry: Jan 7, 2034 (~7.5 yrs left) · nominal 20-yr term from priority
Inventors: CUDAK GARY D; HARDEE CHRISTOPHER J; JOHNSON JARROD B; REESE BRYAN M
H04L 45/44 · H04L 45/16 · H04L 47/72 · H04L 47/826
PatentIndex Score: 50 · Cited by: 0 · References: 7 · Claims: 13

Abstract

Embodiments of the present invention provide a method, system and computer program product for collaborative route reservation in an HPC fabric. A method for collaborative route reservation in an HPC fabric includes selecting a target node in a cluster of nodes to receive a payload from a source node of the cluster over an HPC fabric and computing a route over the HPC fabric for transferring the payload from the source node to the target node, and also a duration of time requisite to transferring the payload. The method also includes notifying other nodes in the cluster of a reservation of the computed route for the duration of time and utilizing the computed route during the duration of time to transfer the payload. Finally, the method includes responding to completing transfer of the payload by notifying the other nodes that the computed path is no longer reserved.
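
The reserve, transfer, release sequence described in the abstract can be illustrated with a small in-memory simulation. This is a minimal sketch, not the patented implementation: the `Node` class, its `on_reserve`/`on_release` handlers, and the `transfer` callback are hypothetical names, and peer notification is modeled as a direct method call rather than a message over the fabric.

```python
from dataclasses import dataclass, field
from typing import Callable

Route = tuple[str, ...]  # ordered hops, e.g. ("A", "B", "C") (assumed representation)


@dataclass
class Node:
    name: str
    peers: list["Node"] = field(default_factory=list)
    # route -> (owner name, reserved duration in seconds), as learned from peers
    reserved: dict[Route, tuple[str, float]] = field(default_factory=dict)

    def on_reserve(self, route: Route, owner: str, duration_s: float) -> None:
        """Record that `owner` holds `route` for `duration_s` seconds."""
        self.reserved[route] = (owner, duration_s)

    def on_release(self, route: Route, owner: str) -> None:
        """Forget the reservation, but only if `owner` is the one holding it."""
        if self.reserved.get(route, (None, 0.0))[0] == owner:
            del self.reserved[route]

    def send(self, route: Route, payload: bytes, duration_s: float,
             transfer: Callable[[Route, bytes], None]) -> None:
        # 1. Notify the other nodes that the route is reserved.
        for peer in self.peers:
            peer.on_reserve(route, self.name, duration_s)
        try:
            # 2. Use the route to move the payload.
            transfer(route, payload)
        finally:
            # 3. Tell the other nodes the route is no longer reserved.
            for peer in self.peers:
                peer.on_release(route, self.name)
```

In this sketch the release is sent in a `finally` block so that peers are told the route is free even if the transfer raises; the source does not say how failures are handled, so that is a design assumption.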

Claims

exact text as granted — not AI-modified
We claim: 
     
       1. A method for collaborative route reservation comprising:
 selecting a target node in a cluster of nodes in an high performance computing (HPC) data processing system to receive a payload from a source node of the cluster over an HPC fabric for the HPC data processing system; 
 computing a route over the HPC fabric for transferring the payload from the source node to the target node and also a duration of time requisite to transferring the payload from the source node to the target node, first by determining a selection of routes available to transfer the payload from the source node to the target node and second by selecting a route from amongst the selection of routes that has already been reserved by another of the nodes of the cluster, but has a shortest delay before the selected route is available to be reserved; 
 notifying others of the nodes in the cluster of a reservation of the computed route to the exclusion of other payloads of others of the nodes for the duration of time; 
 utilizing the computed route during the duration of time to transfer the payload from the source node to the target node; and, 
 responsive to completing transfer of the payload, notifying others of the nodes of the cluster that the computed path is no longer reserved. 
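
Illustrative sketch (not part of the granted claim text): the two-stage route computation recited in claim 1 can be read as "enumerate candidate routes, then among those already reserved by another node pick the one that frees up soonest". The function name `pick_route`, the string route identifiers, and the `reserved_until` mapping of route to release timestamp are all assumptions made for the example.

```python
from typing import Mapping, Optional, Sequence


def pick_route(candidates: Sequence[str],
               reserved_until: Mapping[str, float],
               now: float) -> Optional[tuple[str, float]]:
    """Return (route, delay_s) for the reserved candidate with the shortest wait.

    `candidates` is the first-stage selection of routes from source to target.
    `reserved_until` maps a route to the time its current reservation ends.
    Candidates with no entry in `reserved_until` are skipped here, because the
    claim language concerns routes already reserved by another node.
    Returns None if no candidate is currently reserved.
    """
    waits = [
        (max(0.0, reserved_until[route] - now), route)
        for route in candidates
        if route in reserved_until
    ]
    if not waits:
        return None
    delay, route = min(waits)  # ties broken by route identifier
    return route, delay
```

For example, with candidates `["r1", "r2", "r3"]`, reservations ending at `{"r1": 15.0, "r2": 12.0}`, and `now=10.0`, the sketch selects `r2` with a 2.0 second wait.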
 
     
     
       2. The method of  claim 1 , wherein the duration of time is computed according to a size of the payload. 
     
     
       3. The method of  claim 1 , wherein the duration of time is computed according to a priority assigned to the payload. 
     
     
       4. The method of  claim 1 , wherein the duration of time is computed according to a known historical time required to transfer the payload. 
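
Illustrative sketch (not part of the granted claim text): claims 2 through 4 name three inputs for the reservation duration, namely payload size, payload priority, and known historical transfer time. The claims do not give formulas, so the ones below are assumptions: a size divided by bandwidth estimate, a padding factor that grows with priority (the direction of the priority effect is a guess), and a plain average of past transfer times. All function and parameter names are hypothetical.

```python
from statistics import mean


def duration_from_size(size_bytes: int, bandwidth_bytes_per_s: float) -> float:
    """Estimate transfer time from payload size and an assumed link bandwidth."""
    return size_bytes / bandwidth_bytes_per_s


def duration_from_priority(base_s: float, priority: int,
                           pad_per_level: float = 0.1) -> float:
    """Pad a base estimate by an assumed fraction per priority level."""
    return base_s * (1.0 + pad_per_level * priority)


def duration_from_history(past_durations_s: list[float]) -> float:
    """Use the mean of previously observed transfer times for this payload."""
    return mean(past_durations_s)
```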
     
     
       5. A high performance computing (HPC) data processing system configured for collaborative route reservation, the system comprising:
 a cluster of host computers coupled to one another over a data communications fabric, each of the host computers including memory and at least one processor and acting as a node of the cluster; and, 
 a route reservation module comprising program code executing in the memory of a corresponding one of the host computers, the program code during execution in the memory of the corresponding one of the host computers selecting a target node in the cluster for receiving a payload from a source node of the cluster over the fabric, computing a route over the fabric for transferring the payload from the source node to the target node and also a duration of time requisite to transferring the payload from the source node to the target node, first by determining a selection of routes available to transfer the payload from the source node to the target node and second by selecting a route from amongst the selection of routes that has already been reserved by another of the nodes of the cluster, but has a shortest delay before the selected route is available to be reserved, notifying others of the nodes in the cluster of a reservation of the computed route to the exclusion of other payloads of others of the nodes for the duration of time, utilizing the computed route during the duration of time to transfer the payload from the source node to the target node, and responding to completing transfer of the payload by notifying others of the nodes of the cluster that the computed path is no longer reserved. 
 
     
     
       6. The system of  claim 5 , wherein the fabric is a mesh network coupling each of the host computers to one another without the benefit of an intelligent switch or router. 
     
     
       7. The system of  claim 5 , wherein the duration of time is computed according to a size of the payload. 
     
     
       8. The system of  claim 5 , wherein the duration of time is computed according to a priority assigned to the payload. 
     
     
       9. The system of  claim 5 , wherein the duration of time is computed according to a known historical time required to transfer the payload. 
     
     
       10. A computer program product for collaborative route reservation comprising:
 a non-transitory computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising: 
 computer readable program code for selecting a target node in a cluster of nodes in an high performance computing (HPC) data processing system to receive a payload from a source node of the cluster over an HPC fabric for the HPC data processing system; 
 computer readable program code for computing a route over the HPC fabric for transferring the payload from the source node to the target node and also a duration of time requisite to transferring the payload from the source node to the target node, first by determining a selection of routes available to transfer the payload from the source node to the target node and second by selecting a route from amongst the selection of routes that has already been reserved by another of the nodes of the cluster, but has a shortest delay before the selected route is available to be reserved; 
 computer readable program code for notifying others of the nodes in the cluster of a reservation of the computed route to the exclusion of other payloads of others of the nodes for the duration of time; 
 computer readable program code for utilizing the computed route during the duration of time to transfer the payload from the source node to the target node; and, 
 computer readable program code for responding to completing transfer of the payload by notifying others of the nodes of the cluster that the computed path is no longer reserved. 
 
     
     
       11. The computer program product of  claim 10 , wherein the duration of time is computed according to a size of the payload. 
     
     
       12. The computer program product of  claim 10 , wherein the duration of time is computed according to a priority assigned to the payload. 
     
     
       13. The computer program product of  claim 10 , wherein the duration of time is computed according to a known historical time required to transfer the payload.

Cited by (0)

No later patents cite this yet.

References (0)

No backward citations on record.