-
Notifications
You must be signed in to change notification settings - Fork 6.5k
Add Azure Files support to persistent storage documentation #54055
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
- Add Azure File Shares to the list of supported shared filesystem options - Update section header to include Azure File Shares - Add commented example showing Azure File Share mount path - Update description text to mention Azure File Shares alongside other options This change documents that Azure File Shares with NFS protocol can be used as persistent storage for Ray Train, similar to AWS EFS and Google Filestore. Closes #54054 Signed-off-by: masoud@anyscale.com <masoud@anyscale.com>
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signed-off-by: elizabethhu13 <elizabeth@anyscale.com>
@@ -29,7 +29,7 @@ Here are some capabilities that persistent storage enables: | |||
and artifacts to share them with others or use them in downstream tasks. | |||
|
|||
|
|||
Cloud storage (AWS S3, Google Cloud Storage) | |||
Cloud storage (AWS S3, Google Cloud Storage, Azure Blob) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we tested this? I'm actually not sure whether azure blob is supported or not here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jjyao yes Azure blob has been tested and presigned url was fixed awhile ago (confirmed with Janet)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cloud storage (AWS S3, Google Cloud Storage, Azure Blob) | |
Cloud storage (AWS S3, Google Cloud Storage, Azure Blob Storage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some small questions and edits for official names, but generally LGTM
@@ -74,11 +74,13 @@ Use by specifying the shared storage path as the :class:`RunConfig(storage_path) | |||
storage_path="/mnt/cluster_storage", | |||
# HDFS example: | |||
# storage_path=f"hdfs://{hostname}:{port}/subpath", | |||
# Azure File Shares example: | |||
# storage_path="/mnt/azure-fileshare", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering: how is this getting mounted? Sorry if this is an ignorant question, I just don't see any new docs or x-refs that demonstrate this for Azure.
@@ -57,8 +57,8 @@ Ensure that all nodes in the Ray cluster have access to cloud storage, so output | |||
In this example, all files are uploaded to shared storage at ``s3://bucket-name/sub-path/experiment_name`` for further processing. | |||
|
|||
|
|||
Shared filesystem (NFS, HDFS) | |||
----------------------------- | |||
Shared filesystem (NFS, HDFS, Azure File Shares) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noting: NFS and HDFS are general, while the Azure File Shares is cloud-specific. Is it also subsumed under NFS technically or is it a different filesystem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point @masoudcharkhabi I am fine here to say either (NFS, HDFS)
OR (AWS EFS, Google Filestore, Azure File Shares, HDFS)
. Azure File Shares (when configured with NFS) is subsumed by NFS
@@ -15,7 +15,7 @@ A Ray Train run produces :ref:`checkpoints <train-checkpointing>` that can be sa | |||
|
|||
**Ray Train expects all workers to be able to write files to the same persistent storage location.** | |||
Therefore, Ray Train requires some form of external persistent storage such as | |||
cloud storage (e.g., S3, GCS) or a shared filesystem (e.g., AWS EFS, Google Filestore, HDFS) | |||
cloud storage (e.g., S3, GCS, Azure Blob) or a shared filesystem (e.g., AWS EFS, Google Filestore, Azure File Shares, HDFS) |
<
8000
svg aria-label="Show options" role="img" height="16" viewBox="0 0 16 16" version="1.1" width="16" data-view-component="true" class="octicon octicon-kebab-horizontal">
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cloud storage (e.g., S3, GCS, Azure Blob) or a shared filesystem (e.g., AWS EFS, Google Filestore, Azure File Shares, HDFS) | |
cloud storage (e.g., S3, GCS, Azure Blob Storage) or a shared filesystem (e.g., AWS EFS, Google Filestore, Azure File Shares, HDFS) |
This is the official name. Just wondering: do we require ADLS Gen2, or do we support vanilla blob storage?
@@ -29,7 +29,7 @@ Here are some capabilities that persistent storage enables: | |||
and artifacts to share them with others or use them in downstream tasks. | |||
|
|||
|
|||
Cloud storage (AWS S3, Google Cloud Storage) | |||
Cloud storage (AWS S3, Google Cloud Storage, Azure Blob) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cloud storage (AWS S3, Google Cloud Storage, Azure Blob) | |
Cloud storage (AWS S3, Google Cloud Storage, Azure Blob Storage) |
Why are these changes needed?
The Ray Train persistent storage documentation currently lists AWS EFS and Google Filestore as supported shared filesystem options, 8000 but does not mention Azure File Shares. Azure File Shares support NFS protocol (NFS 4.1) and are commonly used in Azure environments, making them a natural addition to the documentation alongside other cloud filesystem options.
This change documents that Azure File Shares can be used as persistent storage for Ray Train, similar to AWS EFS and Google Filestore.
Related issue number
Closes #54054
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.