[Doc] Convert configuring-autoscaling.ipynb back to markdown docs by EagleLo · Pull Request #54111 · ray-project/ray

[Doc] Convert configuring-autoscaling.ipynb back to markdown docs #54111

Status: Open. EagleLo wants to merge 5 commits into base: master.

EagleLo (Contributor) commented Jun 25, 2025

Why are these changes needed?

This PR converts the configuring-autoscaling.ipynb notebook back to markdown documentation as part of the effort to remove doctests from documentation and improve maintainability.

Changes made:

  • Converted configuring-autoscaling.ipynb to configuring-autoscaling.md
  • Restored the previous markdown version from git history and updated it for KubeRay v1.4.0
  • Updated worker pod naming format to match KubeRay v1.4.0: raycluster-autoscaler-small-group-worker-xxxxx
  • Added autoscalerOptions.version: v2 configuration for KubeRay >= 1.4.0
  • Updated Autoscaler V2 prerequisites to mention KubeRay v1.4.0
  • Added kind delete cluster to cleanup steps
  • Removed the original notebook file after conversion
  • All quickstart steps have been tested and verified to work with KubeRay v1.4.0 operator
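
As an illustration of the worker pod naming format listed above, here is a minimal sketch that counts worker pods in a list of pod names. The helper name `count_worker_pods` is hypothetical, the split of the example name into cluster `raycluster-autoscaler` and group `small-group` is an assumption, and so is reading `xxxxx` as exactly five characters.

```bash
# Hypothetical helper: count pod names on stdin (one per line) that match
# <cluster>-<group>-worker-xxxxx, the format quoted in the PR description.
# Assumption: "xxxxx" stands for exactly five characters.
#   $1 = cluster name, $2 = worker group name
count_worker_pods() {
  count=0
  while IFS= read -r name; do
    case "$name" in
      "$1-$2-worker-"?????) count=$((count + 1)) ;;
    esac
  done
  echo "$count"
}

# Possible usage against a live cluster (not run here):
#   kubectl get pods -o custom-columns=POD:metadata.name --no-headers \
#     | count_worker_pods raycluster-autoscaler small-group
```

Comparing the count before and after Step 5 is one way to confirm that a scale-up actually added workers.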

Related issue number

Closes #54077

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/

Manual Testing:
✅ All quickstart steps (1-8) have been manually tested and verified to work

Step 1: Create a Kubernetes cluster with Kind
[Screenshot: 2025-06-25, 2:39:35 PM]

Step 2: Install the KubeRay operator
[Screenshot: 2025-06-25, 2:41:02 PM]

Step 3: Create a RayCluster custom resource with autoscaling enabled
[Screenshot: 2025-06-25, 2:41:40 PM]

Step 4: Verify the Kubernetes cluster status
[Screenshot: 2025-06-25, 2:47:50 PM]

Step 5: Trigger RayCluster scale-up by creating detached actors
kubectl exec -it $HEAD_POD -- python3 /home/ray/samples/detached_actor.py actor1
kubectl exec -it $HEAD_POD -- python3 /home/ray/samples/detached_actor.py actor2
[Screenshot: 2025-06-25, 3:02:55 PM]
[Screenshot: 2025-06-25, 3:03:08 PM]
[Screenshot: 2025-06-25, 3:07:15 PM]
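
The Step 5 commands rely on `$HEAD_POD` being set, which the description does not show. The sketch below is one possible way to set it and to wrap the two commands. The helper names `get_head_pod` and `create_detached_actor` are hypothetical, and the selector assumes the head pod carries the label `ray.io/node-type=head`.

```bash
# Hypothetical helper: print the head pod name.
# Assumption: the head pod is labeled ray.io/node-type=head.
get_head_pod() {
  kubectl get pods --selector=ray.io/node-type=head \
    -o custom-columns=POD:metadata.name --no-headers
}

# Hypothetical helper: create one detached actor with the sample script
# used in Step 5.
#   $1 = head pod name, $2 = actor name
create_detached_actor() {
  kubectl exec -it "$1" -- python3 /home/ray/samples/detached_actor.py "$2"
}

# Possible usage (not run here):
#   HEAD_POD=$(get_head_pod)
#   create_detached_actor "$HEAD_POD" actor1
#   create_detached_actor "$HEAD_POD" actor2
```

Looking the pod up by label rather than by name keeps the commands working regardless of how the head pod is named in a given KubeRay release.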

Step 6: Trigger RayCluster scale-down by terminating detached actors
[Screenshot: 2025-06-25, 3:08:47 PM]
[Screenshot: 2025-06-25, 3:10:50 PM]

Step 7: Ray Autoscaler observability
[Screenshot: 2025-06-25, 3:13:14 PM]

Step 8: Clean up
[Screenshot: 2025-06-25, 3:14:38 PM]
[Screenshot: 2025-06-25, 3:14:45 PM]

EagleLo added 3 commits June 25, 2025 14:32
… docs

Signed-off-by: Eagle Lo <eagle.lo@samsara.com>
Signed-off-by: Eagle Lo <eagle.lo@samsara.com>
…to markdown

Signed-off-by: Eagle Lo <eagle.lo@samsara.com>
EagleLo requested review from pcmoritz, kevin85421, and a team as code owners on June 25, 2025 22:15
kevin85421 (Member) left a comment

Quoting the PR description: "Restored the previous markdown version from git history and updated it for KubeRay v1.4.0"

Restoring the older markdown version may miss commits made to the notebook since then. You can check the commit history here:
https://github.com/ray-project/ray/commits/master/doc/source/cluster/kubernetes/user-guides/configuring-autoscaling.ipynb

Signed-off-by: Eagle Lo <eagle.lo@samsara.com>
MortalHappiness (Member) left a comment

Please rerun all steps in the doc with KubeRay v1.4.0 and correct any wrong information, especially the resource names. Ping me again when you are done. Thanks!

### Step 3: Create a RayCluster custom resource with autoscaling enabled

```bash
kubectl apply -f https://raw.githubusercontent.com/ray-project/kuberay/v1.3.0/ray-operator/config/samples/ray-cluster.autoscaler.yaml
```
Suggested change:
- kubectl apply -f https://raw.githubusercontent.com/ray-project/kuberay/v1.3.0/ray-operator/config/samples/ray-cluster.autoscaler.yaml
+ kubectl apply -f https://raw.githubusercontent.com/ray-project/kuberay/v1.4.0/ray-operator/config/samples/ray-cluster.autoscaler.yaml
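
The two URLs in this suggestion differ only in the release tag. As a sketch of how that tag could be kept in one place, the hypothetical helper `autoscaler_sample_url` below builds the URL from a version argument. The URL layout is copied from the lines in this suggestion, and nothing here checks that a given tag actually exists.

```bash
# Hypothetical helper: build the sample manifest URL for a KubeRay release tag.
#   $1 = release tag, for example v1.4.0
autoscaler_sample_url() {
  printf 'https://raw.githubusercontent.com/ray-project/kuberay/%s/ray-operator/config/samples/ray-cluster.autoscaler.yaml\n' "$1"
}

# Possible usage (not run here):
#   kubectl apply -f "$(autoscaler_sample_url v1.4.0)"
```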


# [Example output]
# NAME                               READY   STATUS    RESTARTS   AGE
# raycluster-autoscaler-head-6zc2t   2/2     Running   0          107s
Please manually rerun the whole doc with KubeRay v1.4.0. Some resource names changed in v1.4.0. For example, the name of the head pod here should be raycluster-autoscaler-head.
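
To make the naming difference in this comment concrete, the sketch below classifies a head pod name as the fixed form the reviewer describes or the suffixed form shown in the example output. The helper name `head_pod_name_style` is hypothetical, and treating `raycluster-autoscaler` as the cluster name is an assumption drawn from the pod names in this thread.

```bash
# Hypothetical helper: classify a head pod name.
#   $1 = cluster name, $2 = pod name
# Prints "fixed" for <cluster>-head, "suffixed" for <cluster>-head-<suffix>,
# and "other" for anything else.
head_pod_name_style() {
  case "$2" in
    "$1-head") echo fixed ;;
    "$1-head-"?*) echo suffixed ;;
    *) echo other ;;
  esac
}

# Possible usage (not run here):
#   head_pod_name_style raycluster-autoscaler raycluster-autoscaler-head-6zc2t
```

A check like this could flag example output in the doc that still shows the suffixed form.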

Signed-off-by: Eagle Lo <eagle.lo@samsara.com>
EagleLo (Contributor, Author) commented Jun 27, 2025

@MortalHappiness Hi! I've addressed your feedback and pushed the updates: 1. Fixed resource naming, 2. Updated to the v1.4.0 URL, 3. Updated all references.
However, I discovered an issue during testing. With the KubeRay v1.4.0 operator and the v1.4.0 YAML, the Ray head container crashes with:
ValueError: Attempting to cap object store memory usage at 1564262 bytes, but the minimum allowed is 78643200 bytes.
This appears to be a memory constraint issue in Kind environments: the v1.4.0 YAML seems to have tighter memory limits that cause Ray to fail in resource-constrained environments.
Could you help take a look at this?
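
For scale, the minimum in the error message, 78643200 bytes, is 75 MiB (75 × 1024 × 1024), while the attempted cap of 1564262 bytes is about 1.5 MiB. The sketch below only reproduces that comparison. The helper names are hypothetical, the constant is taken from the error text above, and nothing here reflects how Ray derives the cap internally.

```bash
# Minimum quoted in the error message: 78643200 bytes (75 MiB).
MIN_OBJECT_STORE_BYTES=78643200

# Hypothetical helper: succeed (exit 0) when a proposed cap meets the minimum.
#   $1 = proposed cap in bytes
object_store_cap_ok() {
  [ "$1" -ge "$MIN_OBJECT_STORE_BYTES" ]
}

# Hypothetical helper: print how many bytes short a proposed cap is
# (0 when the cap is large enough).
object_store_shortfall() {
  if object_store_cap_ok "$1"; then
    echo 0
  else
    echo $((MIN_OBJECT_STORE_BYTES - $1))
  fi
}

# Possible usage with the value from the error message (not run here):
#   object_store_shortfall 1564262
```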

Steps 1-3: [screenshot]

Step 4: [screenshot]

Linked issue (may be closed when this pull request merges): [Docs][KubeRay] Convert configuring-autoscaling.ipynb back to markdown docs
3 participants