Updating a neural network model on a computation device
Abstract
A method for updating a neural network model on a computation device. The method includes estimating a bandwidth for data download from a server device to the computation device; estimating a time point of available computation capacity of the computation device; computing a maximum partition size as a function of the bandwidth and the time point; and causing download of a selected partition of the neural network model from the server device to the computation device, the selected partition being determined based on the maximum partition size. The method enables the selected partition to be executed by the computation device upon downloading and thereby provides for seamless updating of the neural network model while it is being executed on the computation device.
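For illustration only, the size computation summarized above can be sketched in Python. The sketch assumes the product form recited in claim 4 (bandwidth multiplied by the time interval until computation capacity becomes available) and assumes the selected partition is the largest candidate that fits. The names `Partition`, `max_partition_size`, and `select_partition`, the byte and second units, and the "largest fitting" rule are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Partition:
    """Hypothetical descriptor of a downloadable model partition."""
    name: str
    size_bytes: int


def max_partition_size(bandwidth_bytes_per_s: float,
                       now_s: float,
                       available_at_s: float) -> int:
    """Upper bound on the bytes that can arrive before capacity is available.

    Assumed form: bandwidth * (time point - selected time), as in claim 4.
    """
    interval_s = max(0.0, available_at_s - now_s)  # never a negative interval
    return int(bandwidth_bytes_per_s * interval_s)


def select_partition(candidates: Iterable[Partition],
                     limit_bytes: int) -> Optional[Partition]:
    """Pick the largest candidate whose size does not exceed the limit.

    Returns None when no candidate fits (an assumed policy).
    """
    fitting = [p for p in candidates if p.size_bytes <= limit_bytes]
    return max(fitting, key=lambda p: p.size_bytes, default=None)
```

With an estimated bandwidth of 2,000,000 bytes per second and capacity expected in 0.5 seconds, the limit is 1,000,000 bytes, so a 900,000-byte candidate would be chosen over a 1,500,000-byte one.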
Claims
Exact text as granted; not AI-modified.
What is claimed is:
1 . A method of updating a neural network model on a computation device, said method comprising:
estimating a bandwidth for data download from a server device to the computation device; estimating a time point of available computation capacity of the computation device; computing a maximum partition size as a function of the bandwidth and the time point; and causing download of a selected partition of the neural network model from the server device to the computation device, said selected partition being determined based on the maximum partition size; and updating and executing the selected partition of the neural network model, downloaded from the server, on the computation device, wherein updating the selected partition includes replacing a corresponding partition of the neural network model on the computation device with the selected partition downloaded from the server device, and wherein the method is performed while the neural network model is being executed on the computation device.
2 . The method of claim 1 , wherein the selected partition is determined to have a size substantially equal to or less than the maximum partition size.
3 . The method of claim 1 , wherein execution of the selected partition is initiated at the time point of available computation capacity.
4 . The method of claim 1 , wherein the maximum partition size is computed as a function of a product of the bandwidth and a time interval from a selected time to the time point.
5 . The method of claim 1 , wherein one or more existing partitions of the neural network model have been executed on the computation device at said time point.
6 . The method of claim 5 , wherein the selected partition is determined as a function of the one or more existing partitions.
7 . The method of claim 5 , wherein the selected partition is determined as a function of a dependence within the neural network model on output generated by the one or more existing partitions.
8 . The method of claim 5 , wherein the selected partition is determined so as to operate on the output generated by the one or more existing partitions.
9 . The method of claim 5 , further comprising: evaluating the output generated by the one or more existing partitions to identify one or more partitions to be excluded from execution, wherein the selected partition is determined while excluding the one or more partitions.
10 . The method of claim 5 , wherein said causing the download comprises: transmitting, by the computation device to the server device, size data indicative of the maximum partition size and status data indicative of the one or more existing partitions that have been executed at the time point.
11 . The method of claim 1 , further comprising: determining the selected partition by the computation device, wherein said causing the download comprises: transmitting, by the computation device to the server device, data indicative of the selected partition.
12 . The method of claim 1 , wherein the selected partition is determined as a function of the available computation capacity of the computation device at the time point.
13 . The method of claim 1 , wherein the selected partition is determined among a plurality of predefined partitions of the neural network model and based on a predefined dependence between the predefined partitions.
14 . The method of claim 1 , wherein the selected partition is determined by dynamically partitioning the neural network model on demand.
15 . The method of claim 1 , which is performed by the computation device.
16 . The method of claim 1 , which is repeatedly performed at consecutive current time points to update the neural network model on the computation device, and wherein said causing, at a respective current time point, results in download of partition data of a size substantially equal to the maximum partition size estimated at the respective current time point, said partition data comprising the selected partition, and optionally one or more further selected partitions.
17 . A computation device comprising a communication circuit for communicating with a server device, and logic to control the computation device to perform the method in accordance with claim 1 .
18 . A non-transitory computer-readable medium comprising computer instructions which, when executed by a processing system, cause the processing system to perform the method in accordance with claim 1 .
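As a non-authoritative sketch of the selection described in claims 5 to 8, 13, and 16, the following Python fragment picks predefined partitions whose dependencies have already been executed and then fills the download up to the maximum partition size. The dependency map, the function names `ready_partitions` and `fill_budget`, and the greedy fill in name order are assumptions made for this example; the claims do not prescribe any particular data structure or fill strategy.

```python
from typing import Dict, List, Set


def ready_partitions(dependencies: Dict[str, Set[str]],
                     executed: Set[str]) -> List[str]:
    """Names of not-yet-executed partitions whose inputs are all available.

    `dependencies` maps each partition to the partitions whose output it
    operates on (an assumed representation of the predefined dependence).
    """
    return sorted(
        name for name, deps in dependencies.items()
        if name not in executed and deps <= executed
    )


def fill_budget(ready: List[str],
                sizes: Dict[str, int],
                limit_bytes: int) -> List[str]:
    """Greedily add ready partitions while they fit in the remaining budget."""
    chosen: List[str] = []
    remaining = limit_bytes
    for name in ready:
        if sizes[name] <= remaining:
            chosen.append(name)
            remaining -= sizes[name]
    return chosen
```

For example, if `p2` and `p3` both depend only on an already executed `p1`, and their sizes are 600 and 300 bytes, a 1000-byte limit would select both, while a 400-byte limit would select only `p3`.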