The AudioRecorder class is a static utility that records audio from the default microphone in Unity and encodes it as 16-bit PCM WAV data. The resulting bytes can be saved, uploaded, or passed on to further processing such as speech recognition or audio analysis.
Overview
The AudioRecorder class exposes a method to start recording, a method to stop recording that returns the captured audio as a byte array in WAV format, and a property that reports the recording state. A private helper method performs the WAV encoding.
Class Definition
using System;
using System.IO;
using UnityEngine;

namespace SkillfulAI.API
{
    /// <summary>
    /// A static class for recording audio using the microphone and encoding it to WAV format.
    /// </summary>
    public static class AudioRecorder
    {
        private static AudioClip clip;
        private static bool recording;

        /// <summary>
        /// Indicates whether audio is currently being recorded.
        /// </summary>
        public static bool IsRecording => recording;

        /// <summary>
        /// Starts recording audio from the microphone.
        /// </summary>
        /// <param name="duration">The duration of the recording in seconds (default is 10 seconds).</param>
        /// <param name="frequency">The sample rate of the recording in Hz (default is 44100 Hz).</param>
        public static void StartRecording(int duration = 10, int frequency = 44100)
        {
            clip = Microphone.Start(null, false, duration, frequency);
            recording = true;
        }

        /// <summary>
        /// Stops the recording and returns the audio data as a byte array in WAV format.
        /// </summary>
        /// <returns>A byte array containing the recorded audio in WAV format, or null if not recording.</returns>
        public static byte[] StopRecordingAndGetBytes()
        {
            if (!recording) return null;
            var position = Microphone.GetPosition(null);
            Microphone.End(null);
            var samples = new float[position * clip.channels];
            clip.GetData(samples, 0);
            recording = false;
            return EncodeAsWAV(samples, clip.frequency, clip.channels);
        }

        /// <summary>
        /// Encodes the provided audio samples to WAV format.
        /// </summary>
        /// <param name="samples">An array of audio samples.</param>
        /// <param name="frequency">The sample rate of the audio in Hz.</param>
        /// <param name="channels">The number of audio channels.</param>
        /// <returns>A byte array containing the audio data in WAV format.</returns>
        private static byte[] EncodeAsWAV(float[] samples, int frequency, int channels)
        {
            using (var memoryStream = new MemoryStream(44 + samples.Length * 2))
            {
                using (var writer = new BinaryWriter(memoryStream))
                {
                    writer.Write("RIFF".ToCharArray());
                    writer.Write(36 + samples.Length * 2);
                    writer.Write("WAVE".ToCharArray());
                    writer.Write("fmt ".ToCharArray());
                    writer.Write(16);
                    writer.Write((ushort)1);
                    writer.Write((ushort)channels);
                    writer.Write(frequency);
                    writer.Write(frequency * channels * 2);
                    writer.Write((ushort)(channels * 2));
                    writer.Write((ushort)16);
                    writer.Write("data".ToCharArray());
                    writer.Write(samples.Length * 2);
                    foreach (var sample in samples)
                    {
                        writer.Write((short)(sample * short.MaxValue));
                    }
                }
                return memoryStream.ToArray();
            }
        }
    }
}
Detailed Explanation
IsRecording Property
Description: Indicates whether the audio is currently being recorded.
Code:
public static bool IsRecording => recording;
Functionality:
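Returns true between a call to StartRecording and the next call to StopRecordingAndGetBytes, and false otherwise. The flag is changed only by those two methods; it does not change on its own when the requested duration elapses.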
StartRecording Method
Description: Starts recording audio from the microphone with a specified duration and sample rate.
Code:
public static void StartRecording(int duration = 10, int frequency = 44100)
{
    clip = Microphone.Start(null, false, duration, frequency);
    recording = true;
}
Parameters:
duration: The maximum length of the recording in seconds (default is 10 seconds). The recording does not loop, so capture stops once this length is reached.
frequency: The sample rate of the recording in Hz (default is 44100 Hz).
Functionality:
Begins a non-looping recording from the default microphone (a null device name selects the default device) and stores the resulting AudioClip.
Sets the recording status to true.
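StartRecording passes the requested sample rate straight to the microphone, so it can help to check the rate against what the device supports first. The following is a minimal sketch, not part of AudioRecorder: the RecordingSettings class and its ResolveFrequency method are hypothetical names introduced for illustration. It assumes the two bounds come from Unity's Microphone.GetDeviceCaps, which reports 0 for both values when the device accepts any sample rate.

```csharp
using System;

// Hypothetical helper, not part of AudioRecorder.
public static class RecordingSettings
{
    // Returns a sample rate that lies inside the range reported by the device.
    // minFreq and maxFreq are assumed to come from Microphone.GetDeviceCaps.
    public static int ResolveFrequency(int requested, int minFreq, int maxFreq)
    {
        // Both bounds at zero means the device supports any sample rate.
        if (minFreq == 0 && maxFreq == 0)
        {
            return requested;
        }

        // Otherwise clamp the request into the supported range.
        return Math.Max(minFreq, Math.Min(requested, maxFreq));
    }
}
```

In a Unity script, the bounds could be obtained with Microphone.GetDeviceCaps(null, out int minFreq, out int maxFreq), and the resolved value passed as the frequency argument of AudioRecorder.StartRecording. Checking that Microphone.devices is not empty beforehand guards against starting a recording when no microphone is present.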
StopRecordingAndGetBytes Method
Description: Stops the recording and retrieves the audio data as a byte array in WAV format.
Code:
public static byte[] StopRecordingAndGetBytes()
{
    if (!recording) return null;
    var position = Microphone.GetPosition(null);
    Microphone.End(null);
    var samples = new float[position * clip.channels];
    clip.GetData(samples, 0);
    recording = false;
    return EncodeAsWAV(samples, clip.frequency, clip.channels);
}
Returns:
A byte array containing the recorded audio in WAV format, or null if no recording is in progress.
Functionality:
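Reads the current microphone position, ends the recording session, and copies the samples captured up to that position out of the clip.
Sets the recording status to false and encodes the copied samples to WAV format using the EncodeAsWAV method.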
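Because the returned byte array is already a complete WAV file, it can be written to disk without further processing. The sketch below shows one way to do that with standard .NET file APIs. WavFileWriter and its Save method are hypothetical names, not part of AudioRecorder, and the default file name is an arbitrary choice.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of AudioRecorder.
public static class WavFileWriter
{
    // Writes WAV bytes to directory/fileName and returns the full path.
    public static string Save(byte[] wavBytes, string directory, string fileName = "recording.wav")
    {
        // StopRecordingAndGetBytes returns null when no recording was in progress.
        if (wavBytes == null || wavBytes.Length == 0)
        {
            throw new ArgumentException("No audio data to save.", nameof(wavBytes));
        }

        // Creates the directory if it does not exist yet; does nothing otherwise.
        Directory.CreateDirectory(directory);

        string path = Path.Combine(directory, fileName);
        File.WriteAllBytes(path, wavBytes);
        return path;
    }
}
```

In a Unity project, Application.persistentDataPath is a common choice for the directory argument, for example WavFileWriter.Save(AudioRecorder.StopRecordingAndGetBytes(), Application.persistentDataPath).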
EncodeAsWAV Method
Description: Encodes an array of audio samples to WAV format.
Code:
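private static byte[] EncodeAsWAV(float[] samples, int frequency, int channels)
{
    // 44-byte header plus 2 bytes per 16-bit sample.
    using (var memoryStream = new MemoryStream(44 + samples.Length * 2))
    {
        using (var writer = new BinaryWriter(memoryStream))
        {
            writer.Write("RIFF".ToCharArray());            // RIFF chunk ID
            writer.Write(36 + samples.Length * 2);         // RIFF chunk size: total size minus the first 8 bytes
            writer.Write("WAVE".ToCharArray());            // RIFF format
            writer.Write("fmt ".ToCharArray());            // format subchunk ID
            writer.Write(16);                              // format subchunk size for PCM
            writer.Write((ushort)1);                       // audio format: 1 = PCM
            writer.Write((ushort)channels);                // number of channels
            writer.Write(frequency);                       // sample rate in Hz
            writer.Write(frequency * channels * 2);        // byte rate: bytes per second of audio
            writer.Write((ushort)(channels * 2));          // block align: bytes per sample frame
            writer.Write((ushort)16);                      // bits per sample
            writer.Write("data".ToCharArray());            // data subchunk ID
            writer.Write(samples.Length * 2);              // data subchunk size in bytes
            foreach (var sample in samples)
            {
                // Scale each float sample to a 16-bit signed integer.
                writer.Write((short)(sample * short.MaxValue));
            }
        }
        return memoryStream.ToArray();
    }
}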
Parameters:
samples: An array of audio samples as floats, nominally in the range -1.0 to 1.0, with channels interleaved.
frequency: The sample rate of the audio in Hz.
channels: The number of audio channels.
Returns:
A byte array containing the audio data in WAV format.
Functionality:
Writes a 44-byte WAV header to a MemoryStream, followed by the samples, each converted to a 16-bit signed integer.
Returns the WAV data as a byte array.
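To make the header layout concrete, the sketch below reads the 44 bytes back in the same order EncodeAsWAV writes them. It is an illustration only: WavHeader and WavHeaderReader are hypothetical names, and the reader assumes exactly the layout shown above (a 16-byte PCM format subchunk immediately followed by the data subchunk), which is not true of every WAV file.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical container for the header fields; not part of AudioRecorder.
public sealed class WavHeader
{
    public int ChunkSize;
    public ushort AudioFormat;
    public ushort Channels;
    public int SampleRate;
    public int ByteRate;
    public ushort BlockAlign;
    public ushort BitsPerSample;
    public int DataSize;

    // Playback length implied by the header fields.
    public double DurationSeconds => ByteRate > 0 ? (double)DataSize / ByteRate : 0.0;
}

// Hypothetical reader for the fixed 44-byte layout written by EncodeAsWAV.
public static class WavHeaderReader
{
    public static WavHeader Read(byte[] wav)
    {
        if (wav == null || wav.Length < 44)
        {
            throw new InvalidDataException("Expected at least a 44-byte WAV header.");
        }

        using (var reader = new BinaryReader(new MemoryStream(wav)))
        {
            var header = new WavHeader();

            Expect(ReadTag(reader), "RIFF");
            header.ChunkSize = reader.ReadInt32();
            Expect(ReadTag(reader), "WAVE");
            Expect(ReadTag(reader), "fmt ");
            if (reader.ReadInt32() != 16)
            {
                throw new InvalidDataException("Only a 16-byte PCM format subchunk is handled.");
            }

            header.AudioFormat = reader.ReadUInt16();
            header.Channels = reader.ReadUInt16();
            header.SampleRate = reader.ReadInt32();
            header.ByteRate = reader.ReadInt32();
            header.BlockAlign = reader.ReadUInt16();
            header.BitsPerSample = reader.ReadUInt16();

            Expect(ReadTag(reader), "data");
            header.DataSize = reader.ReadInt32();
            return header;
        }
    }

    // Reads a four-character ASCII tag such as "RIFF".
    private static string ReadTag(BinaryReader reader)
    {
        return Encoding.ASCII.GetString(reader.ReadBytes(4));
    }

    private static void Expect(string actual, string expected)
    {
        if (actual != expected)
        {
            throw new InvalidDataException("Expected '" + expected + "' but found '" + actual + "'.");
        }
    }
}
```

Passing the result of StopRecordingAndGetBytes to WavHeaderReader.Read gives a quick sanity check: the sample rate and channel count should match the recording, and DurationSeconds should be close to the time spent recording.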
Usage Example
Setup:
AudioRecorder is a static class, so it does not need to be attached to a GameObject. Call its methods directly from your own scripts.
Start Recording:
Invoke AudioRecorder.StartRecording() to begin recording.
Stop Recording and Get Bytes:
Invoke AudioRecorder.StopRecordingAndGetBytes() to stop recording and obtain the recorded audio data.
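The steps above can be combined in a small script. The following is a minimal sketch rather than a recommended implementation: RecordOnSpace is a hypothetical component name, and toggling with the space bar assumes the project uses Unity's built-in Input class.

```csharp
using UnityEngine;
using SkillfulAI.API;

// Hypothetical example component; attach it to any GameObject in the scene.
public class RecordOnSpace : MonoBehaviour
{
    private void Update()
    {
        // Assumption: the space bar toggles recording via the built-in Input class.
        if (!Input.GetKeyDown(KeyCode.Space)) return;

        if (!AudioRecorder.IsRecording)
        {
            // Uses the defaults: 10 seconds at 44100 Hz.
            AudioRecorder.StartRecording();
            Debug.Log("Recording started.");
        }
        else
        {
            byte[] wav = AudioRecorder.StopRecordingAndGetBytes();
            Debug.Log(wav == null
                ? "Nothing was recorded."
                : "Captured " + wav.Length + " bytes of WAV data.");
        }
    }
}
```

The component itself holds no recording state; it relies on AudioRecorder.IsRecording to decide whether the key press should start or stop a recording.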
Conclusion
The AudioRecorder class provides a compact way to capture audio from the default microphone and encode it as 16-bit PCM WAV data. It can be integrated into a Unity project wherever recorded audio needs to be saved, uploaded, or analyzed.