Speech Recognition

SkillfulAPI.SpeechRecognition

Overview

This page gives a practical example of calling SkillfulAPI.SpeechRecognition to convert spoken audio into text through the SkillfulAI API. You can use it to implement voice commands, build dynamic dialogue systems, and otherwise enhance game interaction.

Function Example

using SkillfulAI.API;
using System.Collections;
using System.Collections.Generic;
using UnityEngine;

public class SpeechRecognition : MonoBehaviour
{
    /// <summary>
    /// Unity's Update method, called once per frame. Checks for user input to start and stop recording.
    /// </summary>
    public void Update()
    {
        // Start recording when the space key is pressed down
        if (Input.GetKeyDown(KeyCode.Space))
        {
            Debug.Log("Recording...");
            AudioRecorder.StartRecording();
        }

        // Stop recording when the space key is released
        if (Input.GetKeyUp(KeyCode.Space))
        {
            Debug.Log("Stopped!");
            var bytes = AudioRecorder.StopRecordingAndGetBytes();
            Debug.Log("Sending Bytes");
            VoiceToText(bytes);
        }
    }

    /// <summary>
    /// Sends the recorded audio bytes to the SkillfulAPI for speech recognition.
    /// </summary>
    /// <param name="bytes">The recorded audio data in WAV format.</param>
    public void VoiceToText(byte[] bytes)
    {
        SkillfulAPI.SpeechRecognition(bytes, response =>
        {
            Debug.Log(response);
        });
    }
}

Detailed Explanation

Update Method

Description: Unity's Update method is called once per frame. This method checks for user input to start and stop audio recording.

Code:

public void Update()
{
    // Start recording when the space key is pressed down
    if (Input.GetKeyDown(KeyCode.Space))
    {
        Debug.Log("Recording...");
        AudioRecorder.StartRecording();
    }

    // Stop recording when the space key is released
    if (Input.GetKeyUp(KeyCode.Space))
    {
        Debug.Log("Stopped!");
        var bytes = AudioRecorder.StopRecordingAndGetBytes();
        Debug.Log("Sending Bytes");
        VoiceToText(bytes);
    }
}

Functionality:

  • Starts recording audio when the space key is pressed down.

  • Stops recording audio when the space key is released.

  • Retrieves the recorded audio as a byte array with AudioRecorder.StopRecordingAndGetBytes() and passes it to VoiceToText for speech recognition.

VoiceToText Method

Description: The VoiceToText method sends the recorded audio bytes to the SkillfulAI API for speech recognition and logs the result.

Code:

public void VoiceToText(byte[] bytes)
{
    SkillfulAPI.SpeechRecognition(bytes, response =>
    {
        Debug.Log(response);
    });
}

Functionality:

  • Passes the recorded audio bytes to SkillfulAPI.SpeechRecognition.

  • Logs the response that the API passes to the callback.
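The bytes parameter is documented above as WAV data, so a cheap sanity check before sending could catch an empty or malformed recording. The following is a minimal sketch in plain C#. WavCheck and LooksLikeWav are hypothetical names, not part of the SkillfulAI SDK, and the check only inspects the standard RIFF/WAVE header signature rather than validating the whole file. It also assumes that AudioRecorder returns a complete WAV file including its header.

```csharp
// Hypothetical helper, not part of the SkillfulAI SDK.
public static class WavCheck
{
    // Returns true when the buffer starts with a RIFF/WAVE signature:
    // bytes 0-3 are "RIFF" and bytes 8-11 are "WAVE".
    public static bool LooksLikeWav(byte[] bytes)
    {
        // The smallest header (16-byte fmt chunk plus a data chunk header)
        // is 44 bytes, so anything shorter cannot be a usable WAV file.
        if (bytes == null || bytes.Length < 44)
        {
            return false;
        }

        return HasAscii(bytes, 0, "RIFF") && HasAscii(bytes, 8, "WAVE");
    }

    // Compares the bytes at the given offset with an ASCII tag.
    private static bool HasAscii(byte[] bytes, int offset, string tag)
    {
        for (int i = 0; i < tag.Length; i++)
        {
            if (bytes[offset + i] != (byte)tag[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

Inside VoiceToText, a guard such as `if (!WavCheck.LooksLikeWav(bytes)) { Debug.LogWarning("Recording is empty or not WAV"); return; }` placed before the SkillfulAPI.SpeechRecognition call would skip the request for unusable recordings.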

Usage Example

  1. Add the Script to a GameObject:

    • Attach the SpeechRecognition script to any GameObject in your Unity scene.

  2. Run the Scene:

    • Press and hold the space key to start recording audio.

    • Release the space key to stop recording.

    • The recorded audio is sent to the SkillfulAI API for speech recognition, and the recognized text is logged to the console.
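To go from logging the recognized text to acting on it, one option is a small lookup from phrases to actions. The sketch below is plain C# and rests on an assumption: that the recognized text is available as a plain string (this page does not specify the type of the response passed to the callback). VoiceCommandRouter, Register, and TryDispatch are hypothetical names, not part of the SkillfulAI SDK.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the SkillfulAI SDK.
public class VoiceCommandRouter
{
    // Phrase lookup that ignores letter case.
    private readonly Dictionary<string, Action> commands =
        new Dictionary<string, Action>(StringComparer.OrdinalIgnoreCase);

    // Associates a spoken phrase with the action to run when it is recognized.
    public void Register(string phrase, Action action)
    {
        commands[Normalize(phrase)] = action;
    }

    // Runs the action registered for the recognized text.
    // Returns false when the text is empty or no phrase matches.
    public bool TryDispatch(string recognizedText)
    {
        if (string.IsNullOrWhiteSpace(recognizedText))
        {
            return false;
        }

        if (commands.TryGetValue(Normalize(recognizedText), out Action action))
        {
            action();
            return true;
        }

        return false;
    }

    // Trims surrounding whitespace and trailing sentence punctuation.
    private static string Normalize(string text)
    {
        return text.Trim().TrimEnd('.', '!', '?').Trim();
    }
}
```

In a Unity script you might register phrases once, for example `router.Register("open door", () => Debug.Log("Opening door"));`, and then call `router.TryDispatch(text)` from the SpeechRecognition callback once the text has been extracted from the response. Exact matching keeps the sketch short; fuzzy or keyword matching would tolerate recognition errors better.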

Conclusion

The SpeechRecognition example shows a straightforward way to use the SkillfulAI Gaming SDK to integrate AI-powered speech recognition into your Unity project. You can extend and customize it to suit more complex scenarios and interactions in your game.
