AudioDatasource API
Check your docs version
This version of the Anyscale docs is deprecated. Go to the latest version for up-to-date information.
Info: If you want to access this feature, contact the Anyscale team.
AudioDatasource
A Datasource that reads audio files. Audio formats supported by ffmpeg are compatible, including .mp3, .wav, .flac, and .ogg.
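As a rough illustration of the format list above, the following sketch filters a list of paths down to the four extensions named on this page before they are passed to a reader. The helper name `has_known_audio_extension` and the sample paths are hypothetical, and the extension set is only the subset mentioned here; ffmpeg supports more formats than these.

```python
from pathlib import PurePosixPath

# Extensions named on this page; ffmpeg supports more, so this set is illustrative only.
KNOWN_AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg"}


def has_known_audio_extension(path: str) -> bool:
    """Return True if the path ends in one of the extensions listed above (case-insensitive)."""
    return PurePosixPath(path).suffix.lower() in KNOWN_AUDIO_EXTENSIONS


# Hypothetical paths, used only to exercise the helper.
paths = ["s3://bucket/a.flac", "s3://bucket/b.txt", "local/c.MP3"]
audio_paths = [p for p in paths if has_known_audio_extension(p)]
print(audio_paths)
```

A pre-filter like this is optional. It only helps catch obviously non-audio files early and does not check whether a file can actually be decoded.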
Info: To use AudioDatasource, you will need the decord library. Install it with your preferred package management system (e.g. pip install --user decord).
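One way to confirm the dependency is present before reading any audio is to ask Python whether the package can be found. This is a minimal sketch using only the standard library; the function name `decord_available` is an illustrative choice, not part of any Anyscale or Ray API.

```python
import importlib.util


def decord_available() -> bool:
    """Return True if a top-level package named decord can be located in this environment."""
    return importlib.util.find_spec("decord") is not None


if not decord_available():
    # The install command is the one suggested in the note above.
    print("decord is not installed; try: pip install --user decord")
```

The check only looks the package up and does not import it, so it is cheap to run at the top of a script.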
ray.data.read_datasource
ray.data.read_datasource(
    datasource: AudioDatasource,
    *,
    paths: Union[str, List[str]],
    sample_rate: Optional[int],
    mono_audio: Optional[bool],
) -> Dataset
Read data from the audio files in paths into a Ray Dataset.
Parameters
paths: A file path or list of file paths to read audio files from.
sample_rate: The sample rate for reading audio; the default value is 44100 Hz.
mono_audio: If true, read audio as a mono signal; if false, use the original channel layout. False by default.
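To make the effect of sample_rate and mono_audio more concrete, here is a small pure-Python sketch of what the two settings mean for the shape of the decoded data. It is not how AudioDatasource is implemented: both helper names are hypothetical, and averaging channels is just one common downmixing convention assumed for illustration. This page does not say which method the datasource uses.

```python
from typing import List


def expected_num_samples(duration_seconds: float, sample_rate: int = 44100) -> int:
    """Approximate samples per channel for a clip decoded at the given sample rate."""
    return round(duration_seconds * sample_rate)


def downmix_to_mono(channels: List[List[float]]) -> List[float]:
    """Average the per-channel samples into a single channel (assumed convention)."""
    num_channels = len(channels)
    return [sum(frame) / num_channels for frame in zip(*channels)]


# A 2-second clip at the documented default rate and at 48 kHz.
print(expected_num_samples(2.0))         # 88200
print(expected_num_samples(2.0, 48000))  # 96000

# A toy stereo signal: two channels with three samples each.
stereo = [[0.5, -0.5, 1.0], [0.0, 0.5, 1.0]]
mono = downmix_to_mono(stereo)
print(mono)  # [0.25, 0.0, 1.0]
```

A higher sample rate gives more samples per second of audio, and reading as mono reduces the channel dimension to one, both of which affect the memory each record takes.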
Returns
A Ray Dataset that contains data from audio files as arrays of floats.
Examples
import ray
from ray.anyscale.data import AudioDatasource

# Build URIs for ten FLAC files, numbered 0000 through 0009.
audio_uri = [
    (
        "s3://anonymous@air-example-data-2/6G-audio-data-LibriSpeech-train-clean-100-flac"
        f"/train-clean-100/5022/29411/5022-29411-{n:04}.flac"
    )
    for n in range(10)
]

# Read the files at 48 kHz as a mono signal.
ds = ray.data.read_datasource(
    AudioDatasource(),
    paths=audio_uri,
    sample_rate=48000,
    mono_audio=True,
)
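The path list in the example above relies on the `{n:04}` format specifier, which pads the integer to four digits with leading zeros. The sketch below rebuilds the same list without Ray so the expansion can be inspected locally; the variable `base` is introduced here only for readability.

```python
# Same bucket prefix as in the example above, split out for readability.
base = "s3://anonymous@air-example-data-2/6G-audio-data-LibriSpeech-train-clean-100-flac"

# {n:04} zero-pads n to width 4, so n=0 becomes "0000" and n=9 becomes "0009".
audio_uri = [
    f"{base}/train-clean-100/5022/29411/5022-29411-{n:04}.flac"
    for n in range(10)
]

print(len(audio_uri))
print(audio_uri[0].rsplit("/", 1)[-1])
print(audio_uri[-1].rsplit("/", 1)[-1])
```

Printing the first and last file names is a quick way to confirm the range and padding before handing the list to read_datasource.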