US10216587B2 · Active · Utility · PatentIndex 61

Scalable fault tolerant support in a containerized environment

Assignee: IBM · Priority: Oct 21, 2016 · Filed: Oct 21, 2016 · Granted: Feb 26, 2019

Est. expiry: Oct 21, 2036 (~10.3 yrs left) · nominal 20-yr term from priority

Inventors: HASANOV KHALID; LEMARINIER PIERRE; RAFIQUE MUHAMMAD M; VENUGOPAL SRIKUMAR

G06F 11/1471 · G06F 2201/82 · G06F 2201/84 · G06F 2201/805 · G06F 11/142 · G06F 11/1482 · G06F 11/1438

Abstract

Embodiments for providing failure tolerance to containerized applications by one or more processors. A layered filesystem is initialized to maintain checkpoint information of stateful processes in separate and exclusive layers on individual containers. A most recent checkpoint layer is transferred from a main container exclusively to an additional node to maintain an additional, shadow container.
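The layering described in the abstract can be illustrated with a minimal in-memory sketch. It assumes each checkpoint is an opaque blob with an increasing sequence number, and it models a container's layered filesystem as a simple list. The names `CheckpointLayer`, `LayerStack`, and `sync_to_shadow` are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class CheckpointLayer:
    """One checkpoint of a stateful process, kept in its own layer."""
    seq: int      # increasing checkpoint number (assumption for this sketch)
    state: bytes  # opaque serialized process state (assumption)


@dataclass
class LayerStack:
    """Checkpoint layers of one container; index 0 is the topmost layer."""
    layers: List[CheckpointLayer] = field(default_factory=list)

    def push(self, layer: CheckpointLayer) -> None:
        # The newest checkpoint is placed on top of the older layers.
        self.layers.insert(0, layer)

    def topmost(self) -> Optional[CheckpointLayer]:
        return self.layers[0] if self.layers else None


def sync_to_shadow(main: LayerStack, shadow: LayerStack) -> bool:
    """Copy only the main container's most recent layer to the shadow.

    Returns True if a layer was transferred, False if nothing was needed.
    """
    newest = main.topmost()
    if newest is None:
        return False
    current = shadow.topmost()
    if current is not None and current.seq >= newest.seq:
        return False  # shadow already holds this checkpoint
    shadow.push(newest)
    return True


main, shadow = LayerStack(), LayerStack()
main.push(CheckpointLayer(1, b"state-1"))
sync_to_shadow(main, shadow)
main.push(CheckpointLayer(2, b"state-2"))
sync_to_shadow(main, shadow)
```

Calling `sync_to_shadow` on a timer would approximate the regular-interval maintenance schedule recited in the claims. A real system would move filesystem layers between nodes rather than Python objects.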

Claims

exact text as granted — not AI-modified

The invention claimed is: 
     
       1. A method for providing failure tolerance to containerized applications by one or more processors, comprising:
 initializing a layered filesystem to maintain checkpoint information of stateful processes in separate and exclusive layers on individual containers; 
 transferring a most recent checkpoint layer from a main container exclusively to an additional node to maintain an additional, shadow container; 
 implementing a maintenance schedule for the main and shadow containers, including transferring additional checkpoint layers at regular intervals; and 
 organizing the most recent checkpoint layer and additional layers such that the most recent checkpoint layer is a topmost layer. 
 
     
     
       2. The method of claim 1, further including starting a failed process from the most recent checkpoint layer on the shadow container.
     
     
       3. The method of claim 1, further including upon starting one of the containerized applications, determining whether one of the most recent checkpoint layer or additional checkpoint layers exists locally on the main container, otherwise loading the most recent checkpoint layer from the shadow container on the additional node.
     
     
       4. The method of claim 1, further including initializing a filesystem layer service (FLS) that:
 determines, following a failure of the main container, which node to execute the shadow container, or 
 signals the availability of a new checkpoint layer to the additional node. 
 
     
     
       5. The method of claim 4, further including, subsequent to executing the shadow container, orchestrating a local copy of the most recent checkpoint layer on the node in which the shadow container is executed.
     
     
       6. A system for providing failure tolerance to containerized applications, comprising:
 one or more processors, that:
 initialize a layered filesystem to maintain checkpoint information of stateful processes in separate and exclusive layers on individual containers, 
 transfer a most recent checkpoint layer from a main container exclusively to an additional node to maintain an additional, shadow container, 
 implement a maintenance schedule for the main and shadow containers, including transferring additional checkpoint layers at regular intervals, and 
 organize the checkpoint layer and additional layers such that the most recent checkpoint layer is a topmost layer. 
 
 
     
     
       7. The system of claim 6, wherein the one or more processors start a failed process from the most recent checkpoint layer on the shadow container.
     
     
       8. The system of claim 6, wherein the one or more processors, upon starting one of the containerized applications, determining whether one of the most recent checkpoint layer or additional checkpoint layers exists locally on the main container, otherwise loading the most recent checkpoint layer from the shadow container on the additional node.
     
     
       9. The system of claim 6, wherein the one or more processors initialize a filesystem layer service (FLS) that:
 determines, following a failure of the main container, which node to execute the shadow container, or 
 signals the availability of a new checkpoint layer to the additional node. 
 
     
     
       10. The system of claim 9, wherein the one or more processors, subsequent to executing the shadow container, orchestrate a local copy of the most recent checkpoint layer on the node in which the shadow container is executed.
     
     
       11. A computer program product for providing failure tolerance to containerized applications by one or more processors, the computer program product comprising a non-transitory computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising:
 an executable portion that initializes a layered filesystem to maintain checkpoint information of stateful processes in separate and exclusive layers on individual containers; 
 an executable portion that transfers a most recent checkpoint layer from a main container exclusively to an additional node to maintain an additional, shadow container; 
 an executable portion that implements a maintenance schedule for the main and shadow containers, including transferring additional checkpoint layers at regular intervals; and 
 an executable portion that organizes the most recent checkpoint layer and additional layers such that the most recent checkpoint layer is a topmost layer. 
 
     
     
       12. The computer program product of claim 11, further including an executable portion that starts a failed process from the stored checkpoint layer on the shadow container.
     
     
       13. The computer program product of claim 11, further including an executable portion that, upon starting one of the containerized applications, determines whether one of the most recent checkpoint layer or the additional checkpoint layers exists locally on the main container, otherwise loading the checkpoint layer from the shadow container on the additional node.
     
     
       14. The computer program product of claim 11, further including an executable portion that initializes a filesystem layer service (FLS) that:
 determines, following a failure of the main container, which node to execute the shadow container, or 
 signals the availability of a new checkpoint layer to the additional node. 
 
     
     
       15. The computer program product of claim 15 is not restated here; see text below.
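
Illustrative note, not part of the granted claims: the restart lookup of claims 3, 8 and 13 and the two filesystem layer service (FLS) roles of claims 4, 9 and 14 could be sketched roughly as follows. The class and function names, the use of integer sequence numbers, and the "pick the node holding the newest checkpoint" policy are all assumptions made for this example; the claims do not specify how the FLS chooses a node.

```python
from typing import Dict, List, Optional, Tuple


class FilesystemLayerService:
    """Toy stand-in for the FLS: tracks which node holds which checkpoint."""

    def __init__(self) -> None:
        # container name -> {node name -> newest checkpoint number on that node}
        self._latest: Dict[str, Dict[str, int]] = {}

    def signal_new_layer(self, container: str, node: str, seq: int) -> None:
        """Record that `node` now holds checkpoint `seq` for `container`."""
        nodes = self._latest.setdefault(container, {})
        nodes[node] = max(seq, nodes.get(node, -1))

    def pick_shadow_node(self, container: str, failed_node: str) -> Optional[str]:
        """After a failure, choose a surviving node with the newest checkpoint."""
        candidates = {
            node: seq
            for node, seq in self._latest.get(container, {}).items()
            if node != failed_node
        }
        if not candidates:
            return None
        # Hypothetical policy: newest checkpoint wins, node name breaks ties.
        return max(candidates, key=lambda node: (candidates[node], node))


def choose_restart_source(
    local_seqs: List[int], shadow_seq: Optional[int]
) -> Tuple[str, Optional[int]]:
    """Prefer any locally available checkpoint layer, else fall back to the shadow."""
    if local_seqs:
        return ("local", max(local_seqs))
    if shadow_seq is not None:
        return ("shadow", shadow_seq)
    return ("none", None)  # no checkpoint anywhere: start from scratch


fls = FilesystemLayerService()
fls.signal_new_layer("db", "node-a", 3)
fls.signal_new_layer("db", "node-b", 2)
fls.signal_new_layer("db", "node-c", 3)
target = fls.pick_shadow_node("db", failed_node="node-a")
```

With the sample data above, `node-a` is excluded as the failed node and `node-c` is selected because it holds the newer checkpoint. After the shadow container starts there, a real implementation would also arrange a local copy of the most recent layer on that node, as recited in claims 5, 10 and 15.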

Cited by (0)

No later patents cite this yet.

References (0)

No backward citations on record.