Ray Data audio API
AudioDatasource
A Datasource that reads audio files. Audio formats supported by ffmpeg are compatible, including .mp3, .wav, .flac, and .ogg.
info
To use AudioDatasource, you need the decord library. Install it with your preferred package management system, such as pip install decord.
ray.data.read_datasource
ray.data.read_datasource(
    datasource: AudioDatasource,
    *,
    paths: Union[str, List[str]],
    sample_rate: Optional[int],
    mono_audio: Optional[bool],
) -> Dataset
Read data from the audio files in paths into a Ray Dataset.
Parameters
paths: A file path or list of file paths to read audio files from.
sample_rate: The sample rate to use when reading audio. The default value is 44100 Hz.
mono_audio: If true, read the audio as a mono signal; if false, keep the original channel layout. False by default.
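To make the effect of sample_rate and mono_audio concrete, the following NumPy sketch downmixes a synthetic stereo signal to mono and resamples it from 44100 Hz to 48000 Hz. It is an illustration only: the helper names to_mono and resample_linear are hypothetical, averaging channels and linear interpolation are assumptions chosen for simplicity, and nothing here describes how AudioDatasource implements these options internally.

```python
import numpy as np


def to_mono(samples: np.ndarray) -> np.ndarray:
    """Hypothetical helper: average the channels of a (num_samples, num_channels) array."""
    if samples.ndim == 1:
        return samples  # already a mono signal
    return samples.mean(axis=1)


def resample_linear(samples: np.ndarray, src_rate: int, dst_rate: int) -> np.ndarray:
    """Hypothetical helper: resample a mono signal by linear interpolation."""
    duration = len(samples) / src_rate
    num_out = int(round(duration * dst_rate))
    src_times = np.arange(len(samples)) / src_rate
    dst_times = np.arange(num_out) / dst_rate
    return np.interp(dst_times, src_times, samples)


# One second of synthetic stereo audio at 44100 Hz:
# a 440 Hz tone on the first channel, silence on the second.
t = np.arange(44100) / 44100
stereo = np.stack([np.sin(2 * np.pi * 440 * t), np.zeros_like(t)], axis=1)

mono = to_mono(stereo)                                             # shape (44100,)
resampled = resample_linear(mono, src_rate=44100, dst_rate=48000)  # shape (48000,)
```

Production resamplers typically apply band-limited filtering rather than linear interpolation, so treat this only as a picture of what the two parameters mean.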
Returns
A Ray Dataset that contains data from audio files as arrays of floats.
Examples
import ray
from ray.anyscale.data import AudioDatasource

# Build the URIs of ten FLAC files from the LibriSpeech sample bucket.
audio_uri = [
    (
        "s3://anonymous@air-example-data-2/6G-audio-data-LibriSpeech-train-clean-100-flac"
        f"/train-clean-100/5022/29411/5022-29411-{n:04}.flac"
    )
    for n in range(10)
]

# Read the files as mono audio at a sample rate of 48000 Hz.
ds = ray.data.read_datasource(
    AudioDatasource(),
    paths=audio_uri,
    sample_rate=48000,
    mono_audio=True,
)
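
When the list passed to paths comes from a mixed directory listing, one possible preparation step is to keep only files with a recognized audio extension. The sketch below is an assumption-laden illustration: filter_audio_paths is a hypothetical helper, the example-bucket URIs are made up, and the extension set contains only the formats named on this page even though ffmpeg supports more.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Only the extensions named on this page; ffmpeg supports many more.
KNOWN_AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg"}


def filter_audio_paths(paths: Iterable[str]) -> List[str]:
    """Hypothetical helper: keep paths whose extension is a known audio format."""
    return [
        p for p in paths
        if PurePosixPath(p).suffix.lower() in KNOWN_AUDIO_EXTENSIONS
    ]


# Made-up URIs used purely for illustration.
candidates = [
    "s3://example-bucket/audio/a.flac",
    "s3://example-bucket/audio/b.WAV",
    "s3://example-bucket/audio/notes.txt",
]
audio_paths = filter_audio_paths(candidates)
```

The resulting audio_paths list could then be passed as the paths argument in the example above.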