Add Azure Files support to persistent storage documentation by masoudcharkhabi · Pull Request #54055 · ray-project/ray · GitHub

Add Azure Files support to persistent storage documentation #54055


Open
masoudcharkhabi wants to merge 4 commits into base: master


masoudcharkhabi
Member

Why are these changes needed?

The Ray Train persistent storage documentation currently lists AWS EFS and Google Filestore as supported shared filesystem options, but does not mention Azure File Shares. Azure File Shares support the NFS protocol (NFS 4.1) and are commonly used in Azure environments, making them a natural addition to the documentation alongside other cloud filesystem options.

This change documents that Azure File Shares can be used as persistent storage for Ray Train, similar to AWS EFS and Google Filestore.

Related issue number

Closes #54054

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

- Add Azure File Shares to the list of supported shared filesystem options
- Update section header to include Azure File Shares
- Add commented example showing Azure File Share mount path
- Update description text to mention Azure File Shares alongside other options

This change documents that Azure File Shares with NFS protocol can be used as persistent storage for Ray Train, similar to AWS EFS and Google Filestore.

Closes #54054

Signed-off-by: masoud@anyscale.com <masoud@anyscale.com>
masoudcharkhabi requested review from a team as code owners June 24, 2025 23:38
masoudcharkhabi self-assigned this Jun 24, 2025
masoudcharkhabi added the docs ("An issue or change related to documentation"), train ("Ray Train Related Issue"), and azure labels Jun 24, 2025
masoudcharkhabi requested review from matthewdeng and removed request for jjyao June 25, 2025 01:02
masoudcharkhabi changed the title from "Add Azure File Shares support to persistent storage documentation" to "Add Azure Files support to persistent storage documentation" Jun 25, 2025
@elizabethhu13

LGTM

Contributor
matthewdeng left a comment


elizabethhu13 and others added 2 commits June 24, 2025 20:18
masoudcharkhabi removed the request for review from brucez-anyscale June 25, 2025 03:22
@@ -29,7 +29,7 @@ Here are some capabilities that persistent storage enables:
and artifacts to share them with others or use them in downstream tasks.


-Cloud storage (AWS S3, Google Cloud Storage)
+Cloud storage (AWS S3, Google Cloud Storage, Azure Blob)
Collaborator


Have we tested this? I'm actually not sure whether Azure Blob is supported here.


@jjyao Yes, Azure Blob has been tested, and the presigned URL issue was fixed a while ago (confirmed with Janet).

Contributor


Suggested change
-Cloud storage (AWS S3, Google Cloud Storage, Azure Blob)
+Cloud storage (AWS S3, Google Cloud Storage, Azure Blob Storage)

Contributor
dstrodtman left a comment


Some small questions and edits for official names, but generally LGTM.

@@ -74,11 +74,13 @@ Use by specifying the shared storage path as the :class:`RunConfig(storage_path)
storage_path="/mnt/cluster_storage",
# HDFS example:
# storage_path=f"hdfs://{hostname}:{port}/subpath",
+# Azure File Shares example:
+# storage_path="/mnt/azure-fileshare",
Contributor


Just wondering: how is this getting mounted? Sorry if this is an ignorant question, I just don't see any new docs or x-refs that demonstrate this for Azure.
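
For reference, a minimal sketch of how a pre-mounted share might be sanity-checked before it is used as `storage_path`. It assumes the share has already been mounted on each node (this PR does not describe the mount step), and the helper name `check_shared_storage` is hypothetical, not part of Ray.

```python
import os
import tempfile


def check_shared_storage(path: str) -> str:
    """Return `path` if it is an existing, writable directory; raise otherwise.

    Hypothetical helper, not part of Ray. It only verifies access on the node
    where it runs; it cannot prove that every node sees the same share.
    """
    if not os.path.isdir(path):
        raise FileNotFoundError(f"storage path is not a directory: {path}")
    # Create and automatically remove a probe file to confirm write access.
    with tempfile.NamedTemporaryFile(dir=path, prefix=".probe-"):
        pass
    return path


# Assumed usage, with the mount point taken from the commented example above:
#   storage_path = check_shared_storage("/mnt/azure-fileshare")
#   ray.train.RunConfig(storage_path=storage_path, name="experiment_name")
```

Because the check is local to one node, it would need to run on every worker to catch a node where the share is missing.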

@@ -57,8 +57,8 @@ Ensure that all nodes in the Ray cluster have access to cloud storage, so output
In this example, all files are uploaded to shared storage at ``s3://bucket-name/sub-path/experiment_name`` for further processing.


-Shared filesystem (NFS, HDFS)
------------------------------
+Shared filesystem (NFS, HDFS, Azure File Shares)
Contributor


Just noting: NFS and HDFS are general, while Azure File Shares is cloud-specific. Is it technically subsumed under NFS, or is it a different filesystem?


Good point. @masoudcharkhabi, I am fine with saying either (NFS, HDFS) or (AWS EFS, Google Filestore, Azure File Shares, HDFS). Azure File Shares (when configured with NFS) is subsumed by NFS.
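
As an illustration of the "subsumed by NFS" point, here is a small sketch that reads a Linux mount table and reports the filesystem type for a mount point. It assumes the `/proc/mounts` column layout (device, mount point, type, options, dump, pass) and that NFS mounts are reported as `nfs` or `nfs4`. The function names and the sample entries are made up for the example.

```python
from typing import Optional


def filesystem_type(mounts_text: str, mount_point: str) -> Optional[str]:
    """Return the filesystem type recorded for `mount_point`, or None.

    `mounts_text` is expected in the /proc/mounts layout:
    device, mount point, filesystem type, options, dump, pass.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            found = fields[2]  # a later entry shadows an earlier one
    return found


def is_nfs(fs_type: Optional[str]) -> bool:
    """True for the NFS type names assumed here ("nfs", "nfs4")."""
    return fs_type in ("nfs", "nfs4")


# Illustrative sample only; the device names are invented.
SAMPLE = """\
example.file.core.windows.net:/example/share /mnt/azure-fileshare nfs4 rw,vers=4.1 0 0
//example.file.core.windows.net/share /mnt/smb-share cifs rw 0 0
"""
```

On a Linux node, the same functions could be applied to the contents of `/proc/mounts`; a share mounted over SMB would show a non-NFS type such as `cifs`.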

@@ -15,7 +15,7 @@ A Ray Train run produces :ref:`checkpoints <train-checkpointing>` that can be sa

**Ray Train expects all workers to be able to write files to the same persistent storage location.**
Therefore, Ray Train requires some form of external persistent storage such as
-cloud storage (e.g., S3, GCS) or a shared filesystem (e.g., AWS EFS, Google Filestore, HDFS)
+cloud storage (e.g., S3, GCS, Azure Blob) or a shared filesystem (e.g., AWS EFS, Google Filestore, Azure File Shares, HDFS)
Contributor


Suggested change
-cloud storage (e.g., S3, GCS, Azure Blob) or a shared filesystem (e.g., AWS EFS, Google Filestore, Azure File Shares, HDFS)
+cloud storage (e.g., S3, GCS, Azure Blob Storage) or a shared filesystem (e.g., AWS EFS, Google Filestore, Azure File Shares, HDFS)

This is the official name. Just wondering: do we require ADLS Gen2, or do we support vanilla blob storage?


Labels
azure, docs (An issue or change related to documentation), train (Ray Train Related Issue)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[train] Add Azure Files support to persistent storage documentation
5 participants