US11087760B2 · Active · Utility · PatentIndex 73

Multimodal transmission of packetized data

Assignee: GOOGLE LLC · Priority: Dec 30, 2016 · Filed: Nov 26, 2019 · Granted: Aug 10, 2021

Est. expiry: Dec 30, 2036 (~10.5 yrs left) · nominal 20-yr term from priority

Inventors: BHAYA GAURAV; STETS ROBERT

H04L 47/25 · G10L 15/1822 · G06F 3/167 · G10L 15/26 · G10L 2015/088 · G10L 15/30 · G10L 15/14 · G06F 3/165 · G10L 15/22

Abstract

A system of multi-modal transmission of packetized data in a voice activated data packet based computer network environment is provided. A natural language processor component can parse an input audio signal to identify a request and a trigger keyword. Based on the input audio signal, a direct action application programming interface can generate a first action data structure, and a content selector component can select a content item. An interface management component can identify first and second candidate interfaces, and respective resource utilization values. The interface management component can select, based on the resource utilization values, the first candidate interface to present the content item. The interface management component can provide the first action data structure to the client computing device for rendering as audio output, and can transmit the content item converted for a first modality to deliver the content item for rendering from the selected interface.
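The selection step in the abstract, where two candidate interfaces are compared by their resource utilization values, can be illustrated with a minimal sketch. The class name `CandidateInterface`, the 0.0 to 1.0 score scale, and the "lowest value wins" rule are assumptions made for illustration; the patent text does not prescribe a scale or a comparison rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateInterface:
    name: str                    # illustrative label, e.g. "speaker"
    modality: str                # "audio", "text", or "visual"
    resource_utilization: float  # hypothetical cost score in [0.0, 1.0]; lower is cheaper


def select_interface(candidates):
    """Return the candidate with the lowest resource utilization value (assumed rule)."""
    if not candidates:
        raise ValueError("at least one candidate interface is required")
    return min(candidates, key=lambda c: c.resource_utilization)


first = CandidateInterface("speaker", "audio", 0.20)
second = CandidateInterface("display", "visual", 0.65)
chosen = select_interface([first, second])
print(chosen.name)  # speaker
```

The content item would then be converted for the chosen interface's modality before transmission.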

Claims

exact text as granted — not AI-modified

What is claimed is:

1. A system to transmit data in a voice-based computing environment, comprising:
a data processing system comprising one or more processors and memory to:
receive, via an interface of the data processing system, data packets comprising an input audio signal detected by a sensor of a client computing device;
parse the input audio signal to identify a request;
generate, based on the request, a first action data structure;
select a content item responsive to the request;
identify a plurality of interfaces of the client computing device;
determine a characteristic of each of the plurality of interfaces;
select, based on the characteristic of each of the plurality of interfaces, a first interface of the plurality of interfaces having a first characteristic; and
provide the first action data structure and the content item to the client computing device for presentation as audio output via the first interface of the client computing device.
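The sequence of operations recited in claim 1 can be traced with a short sketch. Everything named here is hypothetical: the input is assumed to be already transcribed to text, `parse_request` stands in for real natural language processing, the content catalog is a plain dictionary, and the "characteristic" is reduced to a modality label. It shows one possible reading of the data flow, not the claimed system itself.

```python
from dataclasses import dataclass, field


@dataclass
class Interface:
    name: str
    characteristic: str  # hypothetical label, e.g. "audio" or "visual"


@dataclass
class ActionDataStructure:
    request: str
    parameters: dict = field(default_factory=dict)


def parse_request(transcript: str) -> str:
    # Stand-in for parsing the input audio signal; assumes transcription is done.
    return transcript.strip().lower()


def handle_input(transcript, interfaces, content_catalog):
    request = parse_request(transcript)                      # identify a request
    action = ActionDataStructure(request=request)            # first action data structure
    content_item = next(                                     # content item responsive to the request
        (item for keyword, item in content_catalog.items() if keyword in request),
        None,
    )
    characteristics = {i.name: i.characteristic for i in interfaces}  # per-interface characteristic
    first = next((i for i in interfaces if characteristics[i.name] == "audio"), None)
    if first is None:
        raise ValueError("no interface with the required characteristic")
    # Both payloads go to the client for presentation via the selected interface.
    return {"interface": first.name, "action": action, "content_item": content_item}


result = handle_input(
    "Book me a taxi to the airport",
    [Interface("display", "visual"), Interface("speaker", "audio")],
    {"taxi": "Offer from a hypothetical ride service"},
)
print(result["interface"])  # speaker
```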

2. The system of claim 1, comprising:
the data processing system to provide the content item in a modality compatible with the first interface.

3. The system of claim 1, comprising the data processing system to:
determine a capability of the first interface; and
convert the content item to a modality compatible with the capability of the first interface.
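Claim 3 pairs a capability check with a modality conversion. The sketch below shows one way that pairing might look, assuming a content item is a dictionary with `modality` and `body` keys and that conversion to audio simply marks the text for later speech synthesis. The function name, dictionary keys, and the `render` hint are illustrative assumptions, not terms from the patent.

```python
def convert_for_capability(content: dict, capability: str) -> dict:
    """Return a copy of the content item in a modality the interface can present."""
    if content["modality"] == capability:
        return dict(content)  # already compatible; no conversion needed
    if capability == "audio":
        # Stand-in for text-to-speech: flag the body for synthesis downstream.
        return {"modality": "audio", "body": content["body"], "render": "text_to_speech"}
    if capability == "text":
        return {"modality": "text", "body": content["body"], "render": "plain"}
    raise ValueError(f"no conversion to {capability!r} in this sketch")


item = {"modality": "text", "body": "Your ride arrives in 5 minutes."}
converted = convert_for_capability(item, "audio")
print(converted["modality"])  # audio
```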

4. The system of claim 1, wherein the first interface comprises an audio interface, comprising:
the data processing system to provide the content item for presentation via the audio interface.

5. The system of claim 1, comprising the data processing system to:
select a second content item based on the first characteristic of the first interface; and
provide the second content item to the client computing device for presentation via the first interface.

6. The system of claim 1, comprising the data processing system to:
parse the input audio signal to identify a keyword corresponding to the request; and
select the content item based at least on the keyword.

7. The system of claim 1, comprising the data processing system to:
select a second content item; and
provide the second content item to the client computing device for presentation via a second interface of the client computing device that has a different characteristic than the first characteristic.

8. The system of claim 1, comprising the data processing system to:
select a second content item comprising visual output;
select a second interface comprising a display device based on the second content item comprising visual output; and
provide the second content item to the client computing device for presentation via the second interface of the client computing device.

9. The system of claim 1, wherein the characteristic of each of the plurality of interfaces comprises a resource utilization value, comprising:
the data processing system to select the first interface based on the resource utilization value to reduce resource utilization associated with presentation of the content item.

10. The system of claim 9, wherein the resource utilization value comprises at least one of a battery status, a processor utilization, a memory utilization, or a network bandwidth utilization.
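Claim 10 lists four possible inputs to a resource utilization value without saying how they combine. As a purely illustrative sketch, the function below folds all four into one score with a weighted sum. The weights, the 0.0 to 1.0 normalization, and the treatment of low battery as high cost are assumptions, and the claim only requires "at least one" of the inputs.

```python
def resource_utilization_value(battery_level, cpu, memory, bandwidth,
                               weights=(0.4, 0.2, 0.2, 0.2)):
    """Combine four normalized metrics into a single cost score (assumed formula).

    battery_level is the fraction of charge remaining; the other three are
    fractions of capacity in use. All inputs must lie in [0.0, 1.0].
    """
    inputs = (battery_level, cpu, memory, bandwidth)
    if any(not 0.0 <= value <= 1.0 for value in inputs):
        raise ValueError("all metrics must be normalized to [0.0, 1.0]")
    # A nearly empty battery should raise the cost, so invert the battery level.
    costs = (1.0 - battery_level, cpu, memory, bandwidth)
    return sum(w * c for w, c in zip(weights, costs))


phone = resource_utilization_value(0.15, 0.5, 0.5, 0.5)    # low battery, busy device
speaker = resource_utilization_value(1.0, 0.1, 0.1, 0.1)   # mains powered, mostly idle
print(speaker < phone)  # True
```

Under these assumed weights the mains-powered speaker scores lower, so a selector that prefers the lowest value would route the content item there.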

11. The system of claim 1, comprising:
the data processing system to deliver the content item to the client computing device subsequent to transmission of the first action data structure to the client computing device.

12. The system of claim 1, wherein the plurality of interfaces include at least one of a display screen, an audio interface, a vibration interface, an email interface, a push notification interface, a mobile computing device interface, a portable computing device application, a content slot on an online document, a chat application, mobile computing device application, a laptop, a watch, a virtual reality headset, and a speaker.

13. A method of transmitting data in a voice-based computing environment, comprising:
receiving, by a data processing system comprising one or more processors and memory, data packets comprising an input audio signal detected by a sensor of a client computing device;
parsing, by the data processing system, the input audio signal to identify a request;
generating, by the data processing system based on the request, a first action data structure;
selecting, by the data processing system, a content item responsive to the request;
identifying, by the data processing system, a plurality of interfaces of the client computing device;
determining, by the data processing system, a characteristic of each of the plurality of interfaces;
selecting, by the data processing system based on the characteristic of each of the plurality of interfaces, a first interface of the plurality of interfaces having a first characteristic; and
providing, by the data processing system, the first action data structure and the content item to the client computing device for presentation as audio output via the first interface of the client computing device.

14. The method of claim 13, comprising:
providing the content item in a modality compatible with the first interface.

15. The method of claim 13, comprising:
determining a capability of the first interface; and
converting the content item to a modality compatible with the capability of the first interface.

16. The method of claim 13, wherein the first interface comprises an audio interface, comprising:
providing the content item for presentation via the audio interface.

17. The method of claim 13, comprising:
selecting a second content item based on the first characteristic of the first interface; and
providing the second content item to the client computing device for presentation via the first interface.

18. The method of claim 13, comprising:
parsing the input audio signal to identify a keyword corresponding to the request; and
selecting the content item based at least on the keyword.

19. The method of claim 13, comprising:
selecting a second content item; and
providing the second content item to the client computing device for presentation via a second interface of the client computing device that has a different characteristic than the first characteristic.

20. The method of claim 13, comprising:
selecting a second content item comprising visual output;
selecting a second interface comprising a display device based on the second content item comprising visual output; and
providing the second content item to the client computing device for presentation via the second interface of the client computing device.

Cited by (0)

No later patents cite this yet.

References (0)

No backward citations on record.