Ray Data audio API

AudioDatasource

A Datasource that reads audio files. It supports the audio formats that ffmpeg can decode, including .mp3, .wav, .flac, and .ogg.

info

AudioDatasource requires the decord library. Install it with your preferred package manager, for example: pip install decord


ray.data.read_datasource

ray.data.read_datasource(
    datasource: AudioDatasource,
    *,
    paths: Union[str, List[str]],
    sample_rate: Optional[int],
    mono_audio: Optional[bool],
) -> Dataset

Read audio files from the given paths into a Ray Dataset.

Parameters

  • paths: A file path or list of file paths to read audio files from.
  • sample_rate: The sample rate, in Hz, at which to read audio. Defaults to 44100 Hz.
  • mono_audio: If True, read audio as a single (mono) channel; if False, keep the file's original channel layout. Defaults to False.
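
To make the effect of these two parameters concrete, here is a minimal pure-Python sketch of how they shape the decoded arrays. It is not the AudioDatasource implementation: the helper names expected_num_samples and downmix_to_mono are hypothetical, and averaging channels is only one common way to produce a mono signal (this page does not say how the downmix is done).

```python
from typing import List


def expected_num_samples(duration_seconds: float, sample_rate: int = 44100) -> int:
    """Approximate number of samples per channel for a clip decoded at sample_rate."""
    return round(duration_seconds * sample_rate)


def downmix_to_mono(channels: List[List[float]]) -> List[float]:
    """Average equal-length channels sample by sample.

    Assumption: a simple mean downmix, used here for illustration only.
    """
    if not channels:
        return []
    n = len(channels)
    return [sum(frame) / n for frame in zip(*channels)]


# A 2-second clip yields more samples at 48000 Hz than at the 44100 Hz default.
samples_default = expected_num_samples(2.0)
samples_48k = expected_num_samples(2.0, 48000)

# Two channels collapse into one when downmixed.
mono = downmix_to_mono([[1.0, 0.0], [0.0, 1.0]])
```

In practice this means a higher sample_rate produces longer arrays for the same clip, and mono_audio=True removes the channel dimension that a stereo file would otherwise carry.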

Returns

A Ray Dataset that contains data from audio files as arrays of floats.

Examples

import ray
from ray.anyscale.data import AudioDatasource

# Build the URIs of the first ten FLAC files for one LibriSpeech speaker/chapter.
audio_uri = [
    (
        "s3://anonymous@air-example-data-2/6G-audio-data-LibriSpeech-train-clean-100-flac"
        f"/train-clean-100/5022/29411/5022-29411-{n:04}.flac"
    )
    for n in range(10)
]

# Read the files at 48000 Hz as mono audio.
ds = ray.data.read_datasource(
    AudioDatasource(),
    paths=audio_uri,
    sample_rate=48000,
    mono_audio=True,
)
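
If you need path lists for several speakers or chapters, the list comprehension above can be wrapped in a small helper. The sketch below is illustrative only: build_audio_uris and BASE_URI are hypothetical names, and the directory layout is inferred solely from the example path on this page.

```python
from typing import List

# Bucket prefix copied from the example above.
BASE_URI = (
    "s3://anonymous@air-example-data-2/"
    "6G-audio-data-LibriSpeech-train-clean-100-flac/train-clean-100"
)


def build_audio_uris(speaker: int, chapter: int, count: int) -> List[str]:
    """Return URIs for the first `count` utterances of one speaker/chapter.

    Assumption: files are named <speaker>-<chapter>-<4-digit index>.flac,
    as in the example path on this page.
    """
    return [
        f"{BASE_URI}/{speaker}/{chapter}/{speaker}-{chapter}-{n:04}.flac"
        for n in range(count)
    ]


uris = build_audio_uris(5022, 29411, 10)
# The list can then be passed as paths=uris to ray.data.read_datasource.
```

Building the list up front keeps the read call itself unchanged and makes it easy to check the paths before starting a read.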