Ray Data audio API
AudioDatasource
A Datasource that reads audio files. It supports any audio format that ffmpeg
can decode, including .mp3, .wav, .flac, and .ogg.
Note
AudioDatasource requires the decord library. Install it with your preferred
package manager, for example pip install decord.
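Because the reader depends on decord, it can help to confirm the package is importable before building a pipeline. The following is a minimal sketch using only the Python standard library. The helper name `has_module` is hypothetical and is not part of Ray or decord.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found for import."""
    # find_spec returns None when no such top-level module is installed.
    return importlib.util.find_spec(name) is not None


# Hypothetical pre-flight check before calling ray.data.read_datasource.
if not has_module("decord"):
    print("decord is not installed; run `pip install decord` first.")
```

A check like this only confirms that the package is present. It does not verify that the installed build can decode a particular audio format.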
ray.data.read_datasource
ray.data.read_datasource(
    datasource: AudioDatasource,
    *,
    paths: Union[str, List[str]],
    sample_rate: Optional[int],
    mono_audio: Optional[bool],
) -> Dataset
Reads the audio files at paths into a Ray Dataset.
Parameters
paths: A file path or list of file paths to read audio files from.
sample_rate: The sample rate, in Hz, used when reading audio. Defaults to 44100 Hz.
mono_audio: If True, read audio as a mono signal; if False, keep the original channel layout. Defaults to False.
Returns
A Ray Dataset that contains data from audio files as arrays of floats.
Examples
import ray
from ray.anyscale.data import AudioDatasource

# Build URIs for ten FLAC files numbered 0000 through 0009.
audio_uri = [(
    "s3://anonymous@air-example-data-2/6G-audio-data-LibriSpeech-train-clean-100-flac"
    f"/train-clean-100/5022/29411/5022-29411-{n:04}.flac")
    for n in range(10)
]

# Read the files at 48 kHz as mono audio.
ds = ray.data.read_datasource(
    AudioDatasource(),
    paths=audio_uri,
    sample_rate=48000,
    mono_audio=True,
)
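To make the sample_rate and mono_audio parameters more concrete, here is a small pure-Python illustration of the two ideas. It is not AudioDatasource's implementation. The helper names are hypothetical, and averaging channels is just one common way to downmix; the documentation above does not say which method the reader uses.

```python
from typing import List, Sequence


def downmix_to_mono(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average the channels of each frame into a single mono sample.

    Assumption: simple channel averaging, used here for illustration only.
    """
    return [sum(frame) / len(frame) for frame in frames]


def expected_num_samples(duration_seconds: float, sample_rate: int) -> int:
    """Number of samples per channel covering `duration_seconds` at `sample_rate` Hz."""
    return round(duration_seconds * sample_rate)


# Three stereo frames, each a (left, right) pair of float samples.
stereo = [(0.5, -0.5), (1.0, 0.0), (0.25, 0.75)]
mono = downmix_to_mono(stereo)

# A two-second clip holds more samples at 48000 Hz than at the 44100 Hz default.
samples_at_48k = expected_num_samples(2.0, 48000)
samples_at_default = expected_num_samples(2.0, 44100)
```

In this sketch a higher sample rate yields longer arrays for the same clip, and a mono signal has one value per frame instead of one per channel.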