USRE49461E · Active · Utility · PatentIndex 59

Graphic processor based accelerator system and method

Assignee: NEURALA INC · Priority: Sep 25, 2006 · Filed: Dec 29, 2020 · Granted: Mar 14, 2023

Est. expiry: Sep 25, 2026 (~0.2 yrs left) · nominal 20-yr term from priority

Inventors: GORCHETCHNIKOV ANATOLI; AMES HEATHER MARIE; VERSACE MASSIMILIANO; SANTINI FABRIZIO

G06F 9/5027 · G06N 3/063 · G06N 20/00 · G06T 1/60 · G06T 1/20 · G06F 2209/509

Abstract

An accelerator system is implemented on an expansion card comprising a printed circuit board having (a) one or more graphics processing units (GPUs), (b) two or more associated memory banks (logically or physically partitioned), (c) a specialized controller, and (d) a local bus providing signal coupling compatible with the PCI industry standards. The controller handles most of the primitive operations to set up and control GPU computation. Thus, the computer's central processing unit (CPU) can be dedicated to other tasks. In this case, a few controls (simulation start and stop signals from the CPU and the simulation completion signal back to the CPU), GPU programs, and input/output data are exchanged between the CPU and the expansion card. Moreover, since on every time step of the simulation the results from the previous time step are used but not changed, the results are preferably transferred back to the CPU in parallel with the computation.
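
The overlap described in the last sentence of the abstract can be illustrated with a small double-buffering sketch. This is only a software analogy under stated assumptions: Python lists stand in for accelerator memory banks, a thread stands in for the controller's transfer, and `run_simulation`, `step_fn`, and `host_results` are hypothetical names that do not appear in the patent.

```python
import threading


def run_simulation(initial_state, step_fn, n_steps):
    """Double-buffered time stepping with overlapped result transfer.

    Assumption: two Python lists play the role of two memory banks, and
    `host_results` plays the role of CPU-side main memory.
    """
    buffers = [list(initial_state), [0.0] * len(initial_state)]
    host_results = []

    def transfer(previous_results):
        # Copy the previous step's results out; they are only read here.
        host_results.append(list(previous_results))

    for t in range(n_steps):
        prev, curr = buffers[t % 2], buffers[(t + 1) % 2]
        # The previous results are used but not changed during this step,
        # so copying them out can overlap with computing the new step.
        copier = threading.Thread(target=transfer, args=(prev,))
        copier.start()
        for i in range(len(prev)):
            curr[i] = step_fn(prev, i)
        # The copy must finish before `prev` is overwritten on the next step.
        copier.join()

    # Transfer the results of the final step.
    host_results.append(list(buffers[n_steps % 2]))
    return host_results
```

The `join()` before the next iteration is the design point: the buffer holding the previous results becomes the write target of the following step, so the transfer has to complete first.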

Claims

exact text as granted — not AI-modified

What is claimed is:

1. A computer system, comprising:
a central processing unit to receive input data;
main memory, operably coupled to the central processing unit via a bus, to store the input data received by the central processing unit;
an accelerator, operably coupled to the central processing unit and the first memory via the bus, to receive at least a portion of the input data from the main memory, the accelerator comprising:
at least one graphics processing unit to perform a sequence of computations on the at least a portion of the input data so as to generate output data, intermediate computations in the sequence of computations yielding intermediate results; and
accelerator memory, operably coupled to the graphic processing unit, to store the results of the plurality of sequential computations; and
a controller, operably coupled to the at least one graphics processing unit and the accelerator memory, to transfer the at least a portion of the input data into the accelerator memory, and to transfer at least a portion of the output data from the accelerator memory to the main memory during performance of the sequence of computations by the at least one graphic processing unit.

2. The computer system of claim 1, wherein the central processing unit is configured to receive the input data in response to a user interaction.

3. The computer system of claim 1, wherein:
the central processing unit is configured to receive the input data at a first rate; and
the at least one graphics processing unit is configured to perform the sequence of computations at a second rate different than the first rate.

4. The computer system of claim 1, wherein the main memory is configured to store a copy of the output data stored in the accelerator memory.

5. The computer system of claim 1, wherein an output of at least one computation in the sequence of computations represents an output of at least one neuron in an artificial neural network.

6. The computer system of claim 1, wherein accelerator memory comprises:
a first memory bank to store parameters common to all of the computations in the sequence of computations; and
a second memory bank to store data specific to at least one computation in the sequence of computations.
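
As a rough illustration of the two-bank layout in claim 6, the sketch below separates parameters shared by every computation from data specific to individual computations. The `AcceleratorMemory` class, the `gain` parameter, and `run_sequence` are hypothetical names chosen for this example; the claim does not prescribe any particular data structure.

```python
from dataclasses import dataclass, field


@dataclass
class AcceleratorMemory:
    # First bank (assumed layout): parameters common to all computations.
    shared_params: dict = field(default_factory=dict)
    # Second bank (assumed layout): data specific to one computation,
    # keyed here by the index of that computation in the sequence.
    per_step_data: dict = field(default_factory=dict)


def run_sequence(memory, n_steps):
    """Apply one shared parameter to each computation's own data."""
    gain = memory.shared_params["gain"]      # read once from the first bank
    outputs = []
    for step in range(n_steps):
        x = memory.per_step_data[step]       # read per step from the second bank
        outputs.append(gain * x)
    return outputs
```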

7. The computer system of claim 1, wherein the controller is configured to transfer the output data from the accelerator memory to the main memory without transferring any of the intermediate results from the accelerator memory to the main memory so as to reduce data transfer via the bus.
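
The bus-traffic point in claim 7 can be sketched as follows: intermediate results stay in accelerator-side storage and only the final output crosses to the host. The function name `run_on_accelerator`, the `stages` list, and the dictionary standing in for accelerator memory are assumptions made for illustration.

```python
def run_on_accelerator(x, stages):
    """Run a sequence of stages, sending only the final output over the 'bus'.

    Assumption: `accelerator_memory` models on-card storage and
    `bus_transfers` records everything copied back to main memory.
    """
    accelerator_memory = {"intermediate": []}
    value = x
    for stage in stages:
        value = stage(value)
        # Every result, intermediate or final, is first stored on the card.
        accelerator_memory["intermediate"].append(value)
    # Only the last result is the output; it alone is transferred.
    output = accelerator_memory["intermediate"].pop()
    bus_transfers = [output]
    return output, bus_transfers, accelerator_memory
```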

8. The computer system of claim 1, wherein the controller is configured to transfer at least a portion of the output data from the accelerator memory to the main memory after the at least one graphics processing unit has begun to perform another sequence of computations.

9. The computer system of claim 8, wherein the controller is configured to initiate transfer of the at least a portion of the input data and to transfer the at least a portion of the output data in parallel with performance of at least one computation in the other sequence of computations by the at least one graphics processing unit.

10. The computer system of claim 1, wherein the controller is configured to control execution of the sequence of computations by the at least one graphics processing unit.

11. The computer system of claim 1, further comprising:
at least one of a video camera, a microphone, or a cell recording electrode, operably coupled to the central processor unit, to acquire the input data in real time.

12. A method of performing a sequence of computations on a computer system comprising a central processing unit (CPU), a main memory operably coupled to the central processing unit via a bus, an accelerator operably coupled to the CPU and the main memory via the bus, the accelerator comprising a graphics processing unit (GPU) and an accelerator memory, the method comprising:
(A) performing, by the GPU, the sequence of computations on a first portion of the input data so as to generate a first portion of the output data, intermediate computations in the sequence of computations yielding intermediate results;
(B) in parallel with performing the sequence of computations by the GPU in (A), transferring a second portion of the input data from the main memory to the accelerator via the bus; and
(C) in parallel with performing the sequence of computations by the GPU in (A), transferring a second portion of the output data from the accelerator memory to the main memory via the bus.
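
Steps (A) through (C) of claim 12 describe a three-way overlap: compute on one portion while the next input is uploaded and an earlier output is downloaded. A minimal software sketch, assuming a thread pool stands in for the bus and with `pipeline`, `upload`, and `download` as hypothetical names, might look like this:

```python
from concurrent.futures import ThreadPoolExecutor


def pipeline(chunks, compute):
    """Overlap compute on chunk k with upload of chunk k+1 and download of output k-1."""
    main_memory_out = []

    def upload(chunk):
        # (B) main memory -> accelerator: modeled as a copy of the chunk.
        return list(chunk)

    def download(result):
        # (C) accelerator memory -> main memory.
        main_memory_out.append(result)

    if not chunks:
        return main_memory_out

    with ThreadPoolExecutor(max_workers=2) as bus:
        staged = upload(chunks[0])
        pending_output = None
        for k in range(len(chunks)):
            # Start both transfers before computing so they run alongside (A).
            next_upload = bus.submit(upload, chunks[k + 1]) if k + 1 < len(chunks) else None
            out_transfer = bus.submit(download, pending_output) if pending_output is not None else None
            result = compute(staged)  # (A) the computation on the current portion
            if out_transfer is not None:
                out_transfer.result()  # keep outputs in order
            if next_upload is not None:
                staged = next_upload.result()
            pending_output = result
        download(pending_output)  # flush the last output
    return main_memory_out
```

For example, `pipeline([[1, 2], [3, 4], [5]], sum)` downloads the sum of the first chunk while the second chunk is being summed.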

13. The method of claim 12, further comprising:
storing the input data in the main memory in response to a user interaction.

14. The method of claim 12, further comprising:
receiving the input data at a first rate; and
wherein (A) comprises performing the sequence of computations at a second rate different than the first rate.

15. The method of claim 12, wherein (A) comprises:
generating an output representative of an output of at least one neuron in an artificial neural network.

16. The method of claim 12, wherein (C) comprises:
transferring the second portion of the output data from the accelerator memory to the main memory without transferring any of the intermediate results of the plurality of sequential computations from the accelerator memory to the main memory so as to reduce data transfer via the bus.

17. The method of claim 12, wherein (C) comprises:
transferring the second portion of the output data from the accelerator memory to the main memory after the GPU has begun to perform another sequence of computations.

18. The method of claim 17, wherein (C) further comprises:
initiating transfer of the second portion of the output data in parallel with performance of at least one computation in the other sequence of computations.

19. The method of claim 12, further comprising:
acquiring the input data in real time with at least one of a video camera, a microphone, or a cell recording electrode operably coupled to the CPU.

20. The method of claim 12, further comprising:
storing parameters common to all of the computations in the sequence of computations in a first memory bank in the accelerator memory; and
storing data specific to at least one computation in the sequence of computations in a second memory bank in the accelerator memory.

21. A method of executing computations representing an artificial neural network on a computer system comprising at least one central processing unit (CPU), a processing unit, a first memory partition, and a second memory partition, the method comprising:
executing, by the at least one CPU, a user interaction stream, the user interaction stream controlling transfer of inputs to the artificial neural network to the first memory partition and the second memory partition;
executing, by the processing unit, a computational stream, the computational stream controlling data exchange between the user interaction stream and the computational stream during execution of the computations representing the artificial neural network;
shifting control of a data exchange between the user interaction stream and the computational stream to the computational stream in response to starting execution of the computations representing the artificial neural network;
shifting control of the data exchange between the user interaction stream and the computational stream to the user interaction stream in response to completion or interruption of the computations representing the artificial neural network;
queueing a user command received by the user interaction stream during execution of the computations representing the artificial neural network; and
executing the user command during execution of the computations representing the artificial neural network at times determined by the computational stream.
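
The control hand-off and command queueing recited in claim 21 can be sketched in software as below. This is a single-threaded illustration under stated assumptions: the `Simulation` class, its `gain` attribute, and the `on_step` hook (which simulates a user acting while the computation runs) are hypothetical and not taken from the patent.

```python
import queue


class Simulation:
    def __init__(self):
        self.controller = "user"       # which stream controls data exchange
        self.commands = queue.Queue()  # user commands queued during execution
        self.gain = 1.0
        self.log = []

    def submit(self, command):
        """User interaction stream: act immediately when idle, otherwise queue."""
        if self.controller == "user":
            command(self)
        else:
            self.commands.put(command)

    def _drain(self):
        # Execute every queued command; called only at points chosen by the caller.
        while True:
            try:
                command = self.commands.get_nowait()
            except queue.Empty:
                break
            command(self)

    def run(self, inputs, on_step=None):
        self.controller = "compute"    # control shifts when execution starts
        try:
            for step, x in enumerate(inputs):
                self._drain()          # time determined by the computational stream
                self.log.append(self.gain * x)
                if on_step is not None:
                    on_step(self, step)
        finally:
            self.controller = "user"   # control returns on completion or interruption
            self._drain()              # commands that arrived during the last step
        return self.log
```

Draining the queue only between steps means a queued command never changes `gain` in the middle of a step, which is the property the claim's timing language appears to protect.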

22. The method of claim 21, wherein the user interaction stream controls the data exchange between the user interaction stream and the computational stream outside of execution of the computations representing the artificial neural network.

23. The method of claim 21, wherein executing the user interaction stream comprises:
controlling setting and editing of computational elements of the computations representing the artificial neural network.

24. The method of claim 21, wherein executing the user interaction stream comprises:
controlling setting and editing of parameters of the computations representing the artificial neural network.

25. The method of claim 21, wherein executing the user interaction stream comprises:
controlling setting and editing of parameters of the inputs to the artificial neural network.

26. The method of claim 21, wherein executing the user interaction stream comprises:
specifying an output to be saved to disk and/or displayed on a screen.

27. The method of claim 21, wherein executing the user interaction stream comprises:
parsing elements to be used in the computations representing the artificial neural network.

28. The method of claim 27, wherein the processing unit comprises a graphics processing unit (GPU) and executing the user interaction stream further comprises:
converting the elements into GPU programs.

29. The method of claim 28, wherein executing the user interaction stream comprises:
compiling the GPU programs.

30. The method of claim 29, wherein executing the user interaction stream comprises:
transferring the GPU programs to the second memory partition.
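
Claims 27 through 30 chain four steps in the user interaction stream: parse elements, convert them to GPU programs, compile the programs, and transfer them to the second memory partition. The sketch below mirrors that chain using Python's built-in `compile` as a stand-in for a GPU compiler. The `name = expression` element format, the `(x, w)` program signature, and all function names are assumptions made for illustration.

```python
def parse_elements(spec):
    """Parse 'name = expression' lines into (name, expression) pairs."""
    elements = []
    for line in spec.splitlines():
        line = line.strip()
        if not line:
            continue
        name, expr = line.split("=", 1)
        elements.append((name.strip(), expr.strip()))
    return elements


def to_program_source(name, expr):
    """Convert one element into program source (a Python function here)."""
    return f"def {name}(x, w):\n    return {expr}\n"


def compile_program(name, source):
    """Compile the source; Python bytecode stands in for a GPU binary."""
    return compile(source, f"<{name}>", "exec")


def load_programs(spec, second_partition):
    """Parse, convert, compile, then place each program in the partition."""
    for name, expr in parse_elements(spec):
        code = compile_program(name, to_program_source(name, expr))
        namespace = {}
        exec(code, namespace)          # materialize the compiled function
        second_partition[name] = namespace[name]
    return second_partition
```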

31. The method of claim 21, further comprising:
executing, by the at least one CPU, a data output stream, the data output stream controlling transfer of outputs of the computations representing the artificial neural network to disk.

32. The method of claim 21, further comprising:
generating the inputs with a video camera during execution of the computations.

33. A system for executing computations representing an artificial neural network, the system comprising:
a camera to acquire input data for the artificial neural network;
a first memory partition;
a second memory partition;
at least one central processing unit (CPU), operably coupled to the camera, the first memory partition, and the second memory partition, to execute a user interaction stream, the user interaction stream controlling transfer of the input data acquired by the camera to the first memory partition and the second memory partition during execution of the computations representing the artificial neural network;
a processing unit, operably coupled to the first memory partition, the second memory partition, and the at least one CPU, to execute a computational stream, the computational stream controlling transfer of the input data from the first memory partition and the second memory partition during execution of the computations representing the artificial neural network, the execution of the computations representing the artificial neural network occurring while the camera is acquiring the input data; and
a controller, operably coupled to the at least one CPU and the processing unit, to queue user interactions received by the user interaction stream during the execution of the computations representing the artificial neural network for performance at times selected to avoid data corruption.

34. The system of claim 33, wherein the user interactions cause interruption of the computations representing the artificial neural network.

35. The system of claim 33, wherein the user interactions cause a change in inputs to the artificial neural network.

36. The system of claim 33, wherein the user interactions cause a change in display properties of an output of the computations representing the artificial neural network.

37. The system of claim 33, wherein the controller is configured to request the input data during the execution of the computations representing the artificial neural network.

38. The method of claim 21, wherein the user command causes interruption of the computations representing the artificial neural network.

39. The method of claim 21, wherein the user command causes a change in the inputs to the artificial neural network.

40. The method of claim 21, wherein the user command causes a change in display properties of an output of the computations representing the artificial neural network.

41. The method of claim 21, wherein, during execution of the computations representing the artificial neural network, the computational stream controls the data exchange between the user interaction stream and the computational stream by requesting the inputs to the artificial neural network.
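
Claim 41 has the computational stream request its inputs rather than having them pushed to it. One way to picture that pull model is a generator that yields a frame only when asked. Here `camera_frames`, `run_network`, and the leaky-accumulator update are hypothetical stand-ins for a real input device and a real network.

```python
def camera_frames():
    """Stand-in input source: yields frame values 0, 1, 2, ... on request."""
    frame = 0
    while True:
        yield frame
        frame += 1


def run_network(input_source, n_steps, weight=0.5):
    """Computational stream that pulls one input per step when it is ready."""
    state = 0.0
    trace = []
    for _ in range(n_steps):
        x = next(input_source)         # the computation requests the next input
        state = weight * state + x     # toy update standing in for the network
        trace.append(state)
    return trace
```

Because the source only advances on `next`, the computation sets the pace of the exchange, and no input arrives while a step is in progress.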

Cited by (0)

No later patents cite this yet.

References (0)

No backward citations on record.