Integrate SwanLab for offline/online experiment tracking for Accelerate by ShaohonChen · Pull Request #3605 · huggingface/accelerate · GitHub

Integrate SwanLab for offline/online experiment tracking for Accelerate #3605

Merged
merged 7 commits into from
Jun 18, 2025

@ShaohonChen (Contributor) commented Jun 3, 2025

What does this PR do?

This PR introduces SwanLab, a lightweight open-source experiment tracking tool, as a new logging option for the training framework. The integration provides both online and offline tracking capabilities, along with a local dashboard for visualizing results.

SwanLab has previously supported:

  • Tracking via Transformers' report_to parameter (documentation)
  • The Accelerate training framework through external callbacks (documentation)

We've received numerous requests from the community to add native Accelerate support (see here), and we're excited to officially integrate with this excellent project to provide a more seamless experience for developers. This integration is particularly valuable for users in regions with limited network connectivity (such as China), offering them robust experiment tracking capabilities.
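For reference, native usage after this change might look like the sketch below. It assumes the tracker is selected by passing the string "swanlab" to log_with (the name this PR registers; verify it against your installed Accelerate version). The fake_loss helper and the project name are hypothetical placeholders, used so that no model or dataset is needed.

```python
import math


def fake_loss(step: int) -> float:
    """Hypothetical stand-in for a real training loss: decays smoothly with step."""
    return math.exp(-0.1 * step)


def run_demo(num_steps: int = 5) -> None:
    """Log a few synthetic loss values through Accelerate's tracker API."""
    # Imported lazily so this module loads even where accelerate is not installed.
    from accelerate import Accelerator

    # Assumption: "swanlab" is the tracker name registered by this PR.
    accelerator = Accelerator(log_with="swanlab")
    accelerator.init_trackers(
        project_name="swanlab-accelerate-demo",  # placeholder project name
        config={"num_steps": num_steps},
    )
    for step in range(num_steps):
        accelerator.log({"train_loss": fake_loss(step)}, step=step)
    # Flush and close every active tracker.
    accelerator.end_training()
```

Calling run_demo() with both accelerate and swanlab installed should create one run containing a single train_loss curve.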

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case. (see here)
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests? (I don't see any tests for any of the callbacks, but please let me know if I missed them somewhere.)

Who can review?

@SunMarc I noticed that you recently reviewed some related PRs—would you mind helping review my PR as well? Thank you! (I believe you also reviewed the Hugging Face Transformers integration previously—looking forward to collaborating again! 😄)


Additional information about this PR

Usage guideline

Step 0: Set Up Accelerate and the Example Environment

This follows the official Accelerate CV example (a pet image classification task):

# prepare code and environments
git clone https://github.com/huggingface/accelerate
cd accelerate
pip install -e .
pip install timm     # used in the example

Step 1: Set Up SwanLab Online Tracking

Install:

pip install swanlab

To use SwanLab's online tracking, log in to the SwanLab website and obtain your API key from the Settings page. Then, authenticate using the following command:

swanlab login

If you prefer offline mode, skip the login and install the local dashboard instead:

pip install "swanlab[dashboard]"
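Before running the example, it can help to confirm that the packages installed so far are importable. The following is a minimal sketch using only the standard library; the package list simply mirrors the pip commands above, and missing_packages is a hypothetical helper name.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Top-level packages installed in Step 0 and Step 1.
REQUIRED = ["accelerate", "timm", "swanlab"]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```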

Step 2: Download the Oxford-IIIT Pet Dataset Used in the Example Code

You can find the download link here.
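After downloading, a quick sanity check of the image folder can catch a wrong path early. The sketch below assumes the images sit directly inside the data directory and are named "<breed>_<index>.jpg" (for example "Abyssinian_12.jpg"), which is how the example script is understood to derive labels. Treat the pattern and both helper names as assumptions and compare them with the script itself.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: image files are named "<breed>_<index>.jpg".
LABEL_PATTERN = re.compile(r"^(.*)_\d+\.jpg$")


def label_from_filename(filename: str):
    """Return the breed part of a filename, or None if it does not match."""
    match = LABEL_PATTERN.match(filename)
    return match.group(1) if match else None


def summarize_dataset(data_dir: str) -> Counter:
    """Count images per label among the .jpg files directly inside data_dir."""
    counts = Counter()
    for path in Path(data_dir).glob("*.jpg"):
        label = label_from_filename(path.name)
        if label is not None:
            counts[label] += 1
    return counts
```

An empty result from summarize_dataset usually means the path points at the wrong folder level.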

Step 3: Run the Official Example Script in the Accelerate Project

python examples/complete_cv_example.py  --data_dir <DOWNLOAD DATA PATH> --with_tracking

A visualization demo is available here.

Since my server is offline, I set pretrained=False in the create_model call to avoid downloading the model weights, which led to very poor accuracy after just 3 epochs 😂.
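One way to make that switch without editing the call by hand each time is sketched below. It is not part of this PR: use_pretrained_weights and build_model are hypothetical helpers, the "resnet50d" model name is assumed to match the example script, and HF_HUB_OFFLINE is the Hugging Face Hub offline environment variable.

```python
import os


def use_pretrained_weights(env=None) -> bool:
    """Hypothetical helper: skip pretrained weights when HF_HUB_OFFLINE is set."""
    env = os.environ if env is None else env
    flag = env.get("HF_HUB_OFFLINE", "0").strip().lower()
    return flag not in {"1", "true", "yes", "on"}


def build_model(num_classes: int):
    """Create the model, with or without pretrained weights."""
    # Imported lazily so the helper above can be used without timm installed.
    from timm import create_model

    return create_model(
        "resnet50d",  # assumed to match the example script
        pretrained=use_pretrained_weights(),
        num_classes=num_classes,
    )
```

With randomly initialized weights, low accuracy after a few epochs is expected, as noted above.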

test suite passes

Since my AI training server couldn't connect to Hugging Face, some tests failed during the automated testing process. 😭

@SunMarc (Member) left a comment

LGTM! Thanks for adding this! Can you add a couple of tests like the other trackers?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ShaohonChen (Contributor, Author) commented

Hi, we're currently updating the local log storage format for SwanLab, and a new version will be released within 2 days. We'll also complete the corresponding test cases for the new version!

@ShaohonChen ShaohonChen marked this pull request as draft June 4, 2025 04:02
@ShaohonChen ShaohonChen marked this pull request as ready for review June 8, 2025 10:49
@ShaohonChen (Contributor, Author) commented

I’ve added some new test cases. Would you mind helping me review them when you're available? Thanks a lot for your time! @SunMarc

@SunMarc (Member) left a comment

Thanks a lot!

@SunMarc SunMarc merged commit 6597dae into huggingface:main Jun 18, 2025
24 of 25 checks passed
@ShaohonChen (Contributor, Author) commented

Thanks for the review and merge! 🤗
