Freshub, Inc. et al v. Amazon.Com Inc. et al

Filing 1

COMPLAINT ( Filing fee $ 400 receipt number 0542-12267080), filed by Freshub, Inc., Freshub, Ltd.. (Attachments: # 1 Exhibit 1, # 2 Exhibit 2, # 3 Exhibit 3, # 4 Exhibit 4, # 5 Exhibit 5, # 6 Exhibit 6, # 7 Exhibit 7, # 8 Exhibit 8, # 9 Exhibit 9, # 10 Exhibit 10, # 11 Exhibit 11, # 12 Exhibit 12, # 13 Exhibit 13, # 14 Exhibit 14, # 15 Exhibit 15, # 16 Exhibit 16, # 17 Exhibit 17, # 18 Exhibit 18, # 19 Exhibit 19, # 20 Exhibit 20, # 21 Exhibit 21, # 22 Exhibit 22, # 23 Exhibit 23, # 24 Exhibit 24, # 25 Exhibit 25, # 26 Exhibit 26, # 27 Exhibit 27, # 28 Exhibit 28, # 29 Exhibit 29, # 30 Exhibit 30, # 31 Exhibit 31, # 32 Exhibit 32, # 33 Exhibit 33, # 34 Exhibit 34, # 35 Exhibit 35, # 36 Exhibit 36, # 37 Exhibit 37, # 38 Exhibit 38, # 39 Exhibit 39, # 40 Civil Cover Sheet)(Palmer, John)

EXHIBIT 29

6/6/2019 SpeechRecognizer Interface | Alexa Voice Service
https://developer.amazon.com/docs/alexa-voice-service/speechrecognizer.html

SpeechRecognizer Interface

version 2.0

Table of Contents
- Version Changes
- State Diagram
- Capabilities API
- SpeechRecognizer Context
- Recognize Event
- StopCapture Directive
- ExpectSpeech Directive
- ExpectSpeechTimedOut Event

Every user utterance leverages SpeechRecognizer. It is the core interface of the Alexa Voice Service (AVS). It exposes directives and events for capturing user speech and prompting a client when Alexa needs additional speech input.

Additionally, this interface allows your client to inform AVS of how an interaction with Alexa was initiated (press and hold, tap and release, voice-initiated/wake word enabled (/docs/alexa-voice-service/audio-hardware-configurations.html#applications)), and choose the appropriate Automatic Speech Recognition (ASR) profile (/docs/alexa-voice-service/audio-hardware-configurations.html#asr) for your product, which allows Alexa to understand user speech and respond with precision.

Important: Cloud-based wake word verification is required for voice-initiated products. It improves wake word accuracy by reducing false wakes that are caused by utterances that sound similar to the wake word. See Enable Cloud-based Wake Word Verification (/docs/alexa-voice-service/enable-cloud-based-wake-word-verification.html) for implementation details.

Version Changes

Opus (http://opus-codec.org/) is now a supported format for captured audio. For more details, see the specification under the Recognize event.

State Diagram

The following diagram illustrates state changes driven by SpeechRecognizer components. Boxes represent SpeechRecognizer states and the connectors indicate state transitions.

SpeechRecognizer has the following states:

IDLE: Prior to capturing user speech, SpeechRecognizer should be in an idle state. SpeechRecognizer should also return to an idle state after a speech interaction with AVS has concluded. This can occur when a speech request has been successfully processed or when the timeout window has elapsed and an ExpectSpeechTimedOut event has been sent. Additionally, SpeechRecognizer may return to an idle state during a multi-turn interaction, at which point, if additional speech is required from the user, it should transition from the idle state to the expecting speech state without a user starting a new interaction.

RECOGNIZING: When a user begins interacting with your client, specifically when captured audio is streamed to AVS, SpeechRecognizer should transition from the idle state to the recognizing state. It should remain in the recognizing state until the client stops recording speech (or streaming is complete), at which point your SpeechRecognizer component should transition from the recognizing state to the busy state.

BUSY: While processing the speech request, SpeechRecognizer should be in the busy state. You cannot start another speech request until the component transitions out of the busy state. From the busy state, SpeechRecognizer will transition to the idle state if the request is successfully processed (completed) or to the expecting speech state if Alexa requires additional speech input from the user.

EXPECTING SPEECH: SpeechRecognizer should be in the expecting speech state when additional audio input is required from a user. From expecting speech, SpeechRecognizer should transition to the recognizing state when a user interaction occurs or the interaction is automatically started on the user's behalf. It should transition to the idle state if no user interaction is detected within the specified timeout window.
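
The four states and their transitions described above can be read as a small lookup table. The following is a minimal sketch of that reading in Python, not code from AVS: the trigger names (`audio_streaming_started`, `capture_stopped`, and so on) are hypothetical labels chosen for illustration, and only the transitions stated in the text are encoded.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    RECOGNIZING = auto()
    BUSY = auto()
    EXPECTING_SPEECH = auto()


# Transitions as read from the state descriptions above.
# The trigger strings are hypothetical names, not AVS identifiers.
TRANSITIONS = {
    (State.IDLE, "audio_streaming_started"): State.RECOGNIZING,
    (State.IDLE, "additional_speech_required"): State.EXPECTING_SPEECH,
    (State.RECOGNIZING, "capture_stopped"): State.BUSY,
    (State.BUSY, "request_completed"): State.IDLE,
    (State.BUSY, "additional_speech_required"): State.EXPECTING_SPEECH,
    (State.EXPECTING_SPEECH, "audio_streaming_started"): State.RECOGNIZING,
    (State.EXPECTING_SPEECH, "timed_out"): State.IDLE,
}


def next_state(current: State, trigger: str) -> State:
    """Return the state that follows `current` when `trigger` occurs."""
    try:
        return TRANSITIONS[(current, trigger)]
    except KeyError:
        raise ValueError(
            f"no transition from {current.name} on {trigger!r}"
        ) from None
```

Under these assumptions, `next_state(State.BUSY, "audio_streaming_started")` raises `ValueError`, which mirrors the rule that another speech request cannot start until the component leaves the busy state.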
State diagram: https://images-na.ssl-images-amazon.com/images/G/01/mobile-apps/dex/alexa/alexa-voice-service/docs/speechrecognizer-state.png

Capabilities API

To use version 2.0 of the SpeechRecognizer interface, you must declare it in your call to the Capabilities API. For additional details, see Capabilities API (../alexa-voice-service/capabilities-api.html).

Sample Object

    {
        "type": "AlexaInterface",
        "interface": "SpeechRecognizer",
        "version": "2.0"
    }

SpeechRecognizer Context

Alexa expects all clients to report the currently set wake word, if the client is wake word enabled.

To learn more about reporting Context, see Context Overview (../alexa-voice-service/context.html).

Sample Message

    {
        "header": {
            "namespace": "SpeechRecognizer",
            "name": "RecognizerState"
        },
        "payload": {
            "wakeword": "ALEXA"
        }
    }

Payload Parameters

Parameter | Description | Type
wakeword | Identifies the current wake word. Accepted Value: "ALEXA" | string

Recognize Event

The Recognize event is used to send user speech to AVS and translate that speech into one or more directives. This event must be sent as a multipart message, consisting of two parts:
- A JSON-formatted object
- The binary audio captured by the product's microphone.

Captured audio that is streamed to AVS should be chunked to reduce latency. The stream should contain 10ms of captured audio per chunk (320 bytes).

After an interaction with Alexa is initiated, the microphone must remain open until:
- A StopCapture directive is received.
- The stream is closed by the Alexa service.
- The user manually closes the microphone. For example, a press and hold implementation (/docs/alexa-voice-service/audio-hardware-configurations.html#applications).

The profile parameter and initiator object tell Alexa which ASR profile should be used to best understand the captured audio, and how the interaction was initiated.
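
The 320-byte figure quoted above for a 10 ms chunk is consistent with 16-bit, 16 kHz, single-channel PCM: 16,000 samples per second x 2 bytes x 0.010 s = 320 bytes. The sketch below works through that arithmetic and splits a buffer accordingly. It assumes the captured audio is already raw PCM in that format, and `iter_chunks` is a hypothetical helper name, not part of any AVS library.

```python
from typing import Iterator

# Assumed capture format: 16-bit linear PCM, 16 kHz, one channel.
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2
CHANNELS = 1
CHUNK_MS = 10

# 16000 samples/s * 2 bytes * 1 channel * 10 ms / 1000 = 320 bytes
CHUNK_BYTES = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * CHUNK_MS // 1000


def iter_chunks(pcm: bytes, chunk_bytes: int = CHUNK_BYTES) -> Iterator[bytes]:
    """Yield consecutive slices of `pcm`; the last one may be shorter."""
    for offset in range(0, len(pcm), chunk_bytes):
        yield pcm[offset:offset + chunk_bytes]
```

For a 700-byte buffer this yields chunks of 320, 320, and 60 bytes. Whether a short trailing chunk should be sent as-is or padded is not addressed by the text, so the sketch simply passes it through.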
All captured audio must be sent to AVS in either PCM or Opus, and adhere to the following specifications:

PCM | Opus
16bit Linear PCM | 16bit Opus
16kHz sample rate | 16kHz sample rate
Single channel | 32k bit rate
Little endian byte order | Little endian byte order

Important: If your product is voice-initiated it must adhere to the Requirements for Cloud-Based Wake Word Verification (/docs/alexa-voice-service/streaming-requirements-for-cloud-based-wake-word-verification.html).

For a protocol specific example, see Structuring an HTTP/2 Request (/docs/alexa-voice-service/structure-http2-request.html#examples).

Sample Message

    {
        "context": [
            // This is an array of context objects that are used to communicate the
            // state of all client components to Alexa. See Context for details.
        ],
        "event": {
            "header": {
                "namespace": "SpeechRecognizer",
                "name": "Recognize",
                "messageId": "{{STRING}}",
                "dialogRequestId": "{{STRING}}"
            },
            "payload": {
                "profile": "{{STRING}}",
                "format": "{{STRING}}",
                "initiator": {
                    "type": "{{STRING}}",
                    "payload": {
                        "wakeWordIndices": {
                            "startIndexInSamples": {{LONG}},
                            "endIndexInSamples": {{LONG}}
                        },
                        "token": "{{STRING}}"
                    }
                }
            }
        }
    }

Binary Audio Attachment

Each Recognize event requires a corresponding binary audio attachment as one part of the multipart message. The following headers are required for each binary audio attachment:

    Content-Disposition: form-data; name="audio"
    Content-Type: application/octet-stream

    {{BINARY AUDIO ATTACHMENT}}

Context

This event requires your product to report the status of all client component states to Alexa in the context object. For additional information see Context (/docs/alexa-voice-service/context.html).

Header Parameters

Parameter | Description | Type
messageId | A unique ID used to represent a specific message. | string
dialogRequestId | A unique identifier that your client must create for each Recognize event sent to Alexa. This parameter is used to correlate directives sent in response to a specific Recognize event. | string

Payload Parameters

Parameter | Description | Type
profile | Identifies the Automatic Speech Recognition (ASR) profile associated with your product. AVS supports three distinct ASR profiles optimized for user speech from varying distances. Accepted values: CLOSE_TALK, NEAR_FIELD, FAR_FIELD. | string
format | Identifies the format of captured audio. Accepted values: AUDIO_L16_RATE_16000_CHANNELS_1 (PCM), OPUS. | string
initiator | Lets Alexa know how an interaction was initiated. This object is required when an interaction is originated by the end user (wake word, tap, push and hold). If initiator is present in an ExpectSpeech directive then it must be returned in the following Recognize event. If initiator is absent from the ExpectSpeech directive, then it should not be included in the following Recognize event. | object
initiator.type | Represents the action taken by a user to initiate an interaction with Alexa. Accepted values: PRESS_AND_HOLD, TAP, and WAKEWORD. If an initiator.type is provided in an ExpectSpeech directive, that string must be returned as initiator.type in the following Recognize event. | string
initiator.payload | Includes information about the initiator. | object
initiator.payload.wakeWordIndices | This object is required when initiator.type is set to WAKEWORD. wakeWordIndices includes the startIndexInSamples and endIndexInSamples. For additional details, see Requirements for Cloud-Based Wake Word Verification (/docs/alexa-voice-service/streaming-requirements-for-cloud-based-wake-word-verification.html). | object
initiator.payload.wakeWordIndices.startIndexInSamples | Represents the index in the audio stream where the wake word starts (in samples). The start index should be accurate to within 50ms of wake word detection. | long
initiator.payload.wakeWordIndices.endIndexInSamples | Represents the index in the audio stream where the wake word ends (in samples). The end index should be accurate to within 150ms of the end of the detected wake word. | long
initiator.payload.token | An opaque string. This value is only required if present in the payload of a preceding ExpectSpeech (/docs/alexa-voice-service/speechrecognizer.html#expectspeech) directive. | string

Profiles

ASR profiles are tuned for different products, form factors, acoustic environments and use cases. Use the table below to learn more about accepted values for the profile parameter.

Value | Optimal Listening Distance
CLOSE_TALK | 0 to 2.5 ft.
NEAR_FIELD | 0 to 5 ft.
FAR_FIELD | 0 to 20+ ft.

Note: See Audio Hardware Configurations (/docs/alexa-voice-service/audio-hardware-configurations.html) to determine the appropriate ASR Profile for your Alexa-enabled product.

Initiator

The initiator parameter tells AVS how an interaction with Alexa was triggered, and determines two things:
1. If StopCapture will be sent to your client when the end of speech is detected in the cloud.
2. If cloud-based wake word verification will be performed on the stream.

initiator must be included in the payload of each SpeechRecognizer.Recognize event. The following values are accepted:

Value | Description | Supported Profile(s) | StopCapture Enabled | Wake Word Verification Enabled | Wake Word Indices Required
PRESS_AND_HOLD | Audio stream initiated by pressing a button (physical or GUI) and terminated by releasing it. | CLOSE_TALK | N | N | N
TAP | Audio stream initiated by the tap and release of a button (physical or GUI) and terminated when a StopCapture directive is received. | NEAR_FIELD, FAR_FIELD | Y | N | N
WAKEWORD | Audio stream initiated by the use of a wake word and terminated when a StopCapture directive is received. | NEAR_FIELD, FAR_FIELD | Y | Y | Y

StopCapture Directive

This directive instructs your client to stop capturing a user’s speech after AVS has identified the user’s intent or when end of speech is detected. When this directive is received, your client must immediately close the microphone and stop listening for the user’s speech.

Note: StopCapture is sent to your client on the downchannel stream and may be received while speech is still being streamed to AVS. To receive the StopCapture directive, you must use a profile in your Recognize event that supports cloud-endpointing, such as NEAR_FIELD or FAR_FIELD.

Sample Message

    {
        "directive": {
            "header": {
                "namespace": "SpeechRecognizer",
                "name": "StopCapture",
                "messageId": "{{STRING}}",
                "dialogRequestId": "{{STRING}}"
            },
            "payload": {
            }
        }
    }

Header Parameters

Parameter | Description | Type
messageId | A unique ID used to represent a specific message. | string
dialogRequestId | A unique ID used to correlate directives sent in response to a specific Recognize event. | string

ExpectSpeech Directive

ExpectSpeech is sent when Alexa requires additional information to fulfill a user's request. It instructs your client to open the microphone and begin streaming user speech. If the microphone is not opened within the specified timeout window, an ExpectSpeechTimedOut event must be sent from your client to AVS.

During a multi-turn interaction with Alexa, your device will receive at least one ExpectSpeech directive instructing your client to start listening for user speech. If present, the initiator object included in the payload of the ExpectSpeech directive must be passed back to Alexa as the initiator object in the following Recognize event. If initiator is absent from the payload, the following Recognize event should not include initiator.

For information on the rules that govern audio prioritization, please review the Interaction Model (/docs/alexa-voice-service/interaction-model.html).

Sample Message

    {
        "directive": {
            "header": {
                "namespace": "SpeechRecognizer",
                "name": "ExpectSpeech",
                "messageId": "{{STRING}}",
                "dialogRequestId": "{{STRING}}"
            },
            "payload": {
                "timeoutInMilliseconds": {{LONG}},
                "initiator": {
                    "type": "{{STRING}}",
                    "payload": {
                        "token": "{{STRING}}"
                    }
                }
            }
        }
    }

Header Parameters

Parameter | Description | Type
messageId | A unique ID used to represent a specific message. | string
dialogRequestId | A unique ID used to correlate directives sent in response to a specific Recognize event. | string

Payload Parameters

Parameter | Description | Type
timeoutInMilliseconds | Specifies, in milliseconds, how long your client should wait for the microphone to open and begin streaming user speech to AVS. If the microphone is not opened within the specified timeout window, then the ExpectSpeechTimedOut event must be sent. The primary use case for this behavior is a PRESS_AND_HOLD implementation. | long
initiator | Contains information about the interaction. If present it must be sent back to Alexa in the following Recognize event. | object
initiator.type | An opaque string. If present it must be sent back to Alexa in the following Recognize event. | string
initiator.payload | Includes information about the initiator. | object
initiator.payload.token | An opaque string. If present it must be sent back to Alexa in the following Recognize event. | string

ExpectSpeechTimedOut Event

This event must be sent to AVS if an ExpectSpeech directive was received, but was not satisfied within the specified timeout window.

Sample Message

    {
        "event": {
            "header": {
                "namespace": "SpeechRecognizer",
                "name": "ExpectSpeechTimedOut",
                "messageId": "{{STRING}}"
            },
            "payload": {
            }
        }
    }

Header Parameters

Parameter | Description | Type
messageId | A unique ID used to represent a specific message. | string

Payload Parameters

An empty payload should be sent.
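
As an illustration of the ExpectSpeechTimedOut message shape described above, the following sketch assembles the event as a Python dictionary and serializes it to JSON. It is only a sketch under stated assumptions: `build_expect_speech_timed_out` is a hypothetical helper name, and generating `messageId` with `uuid.uuid4()` is an assumption, since the text only requires that the ID be unique.

```python
import json
import uuid
from typing import Optional


def build_expect_speech_timed_out(message_id: Optional[str] = None) -> dict:
    """Build an ExpectSpeechTimedOut event with an empty payload."""
    return {
        "event": {
            "header": {
                "namespace": "SpeechRecognizer",
                "name": "ExpectSpeechTimedOut",
                # Assumption: a random UUID is an acceptable unique ID.
                "messageId": message_id or str(uuid.uuid4()),
            },
            # The text states that an empty payload should be sent.
            "payload": {},
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_expect_speech_timed_out(), indent=4))
```

Because the message is built as a dictionary and serialized with `json.dumps`, the output is always strictly valid JSON, which hand-written templates do not guarantee.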


