Workspace Snapshots
The Workspace Snapshot feature persists the files and folders in your project directory (/home/ray/<your_project_name>) across restarts, so that you can pick up where you left off between workspace sessions.
Overview
- Periodic Snapshots: Workspaces automatically take snapshots of files and folders at regular intervals to preserve their state. Snapshots occur every 5 minutes.
- Persistence Rules: Files within the project directory are persisted across workspace restarts, excluding those specified in .gitignore or .anyscaleignore.
Each snapshot captures the full contents of every file rather than only the differences between file versions.
Excluding files with .anyscaleignore
To exclude specific files or folders from workspace snapshots, create a file named .anyscaleignore in the project directory (~/<project-name>, or ~/default by default) and list the items to exclude. The .anyscaleignore file supports the following patterns to match files and folders:
# .anyscaleignore example
*.txt # Ignore files with a .txt extension in the working directory.
**/*.txt # Ignore files with a .txt extension in ANY directory.
folder/ # Ignore all files under "folder/". The slash at the end is optional.
folder/*.txt # Ignore files with a .txt extension under "folder/".
path/to/filename.py # Ignore a specific file by providing its relative path.
file_[1,2].txt # Ignore file_1.txt and file_2.txt
The .anyscaleignore file supports a subset of the patterns available in .gitignore. Some patterns, such as negation and \ escaping, are not supported. For further details, refer to the official gitignore documentation.
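To preview which paths a set of patterns would exclude, the rules in the example above can be approximated locally. The following is a minimal sketch, not the matcher that workspaces actually use: the helper names `pattern_to_regex` and `is_ignored` are hypothetical, and the semantics are an assumption based only on the comments in the example (a bare `*` stays within one directory level, `**/` spans any number of directories, and a folder pattern covers everything beneath it).

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate one ignore pattern into a regex over POSIX-style relative paths.

    Assumed semantics (inferred from the example comments, not from the real
    implementation): '*' does not cross '/', '**/' matches zero or more
    directories, '[...]' is a character class, a trailing '/' is optional.
    """
    body = pattern.rstrip("/")
    parts = []
    i = 0
    while i < len(body):
        if body.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif body[i] == "*":
            parts.append("[^/]*")  # any run of characters within one level
            i += 1
        elif body[i] == "[" and body.find("]", i + 2) != -1:
            end = body.find("]", i + 2)
            # Copy the character class through, escaping only backslashes.
            parts.append("[" + body[i + 1:end].replace("\\", "\\\\") + "]")
            i = end + 1
        else:
            parts.append(re.escape(body[i]))  # literal character
            i += 1
    # Match the path itself, or anything underneath it if it names a folder.
    return re.compile("".join(parts) + "(?:/.*)?")


def is_ignored(path: str, patterns: list) -> bool:
    """Return True if the relative path matches any of the patterns."""
    return any(pattern_to_regex(p).fullmatch(path) for p in patterns)


if __name__ == "__main__":
    patterns = ["*.txt", "folder/", "path/to/filename.py"]
    for candidate in ["notes.txt", "src/notes.txt", "folder/data/a.bin"]:
        print(candidate, is_ignored(candidate, patterns))
```

Because negation and \ escaping are not supported, the sketch does not attempt to handle them either. Treat its output as a rough preview and verify the actual behavior in a workspace.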
Snapshot Limits
- Timeout: Each snapshot is subject to a 4-minute timeout so that workspaces are not blocked. The backup capacity is calculated based on this timeout. If a snapshot takes longer than 4 minutes, it is likely to be aborted.
- Capacity: Snapshots support backing up approximately 10 GB of data. If the data exceeds the 10 GB limit, there is a risk of data loss across workspace restarts, and an error banner is displayed to warn you.
Storage Suggestions: For data exceeding the 10 GB limit, we strongly recommend using alternative storage such as NFS or an object storage service like S3. Refer to our storage documentation for an in-depth look at the storage options suited to different workspace needs.
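To check how close a project directory is to the capacity limit, you can add up its file sizes. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the limit is taken as 10 GiB (the exact unit behind "approximately 10 GB" is an assumption), and files excluded by .gitignore or .anyscaleignore are still counted, so the result is an upper bound on what a snapshot would contain.

```python
import os

# Assumption: "approximately 10 GB" is treated as 10 GiB in this sketch.
SNAPSHOT_LIMIT_BYTES = 10 * 1024**3


def directory_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file disappeared or is unreadable; skip it.
                pass
    return total


def exceeds_snapshot_limit(root: str, limit: int = SNAPSHOT_LIMIT_BYTES) -> bool:
    """Return True if the directory is larger than the given limit."""
    return directory_size_bytes(root) > limit


if __name__ == "__main__":
    # ~/default is the default project directory mentioned above.
    project_dir = os.path.expanduser("~/default")
    size_gib = directory_size_bytes(project_dir) / 1024**3
    print(f"{project_dir}: {size_gib:.2f} GiB")
    if exceeds_snapshot_limit(project_dir):
        print("Over the snapshot limit; consider NFS or object storage.")
```

Running a check like this before a long session can show whether large datasets or checkpoints should move to NFS or object storage, or be listed in .anyscaleignore.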