LID-5: Tri-stage learning rate scheduler by Qingzheng-Wang · Pull Request #6159 · espnet/espnet

LID-5: Tri-stage learning rate scheduler #6159


Open · wants to merge 6 commits into master

Conversation

@Qingzheng-Wang (Contributor) commented Jun 19, 2025

What did you change?

Introduced a tri-stage learning rate scheduler (TristageLR) in espnet2/schedulers/tristage_lr.py, inspired by fairseq’s scheduler.

espnet2/tasks/abs_task.py is updated to integrate the new scheduler into task training.


Why did you make this change?

Tri-stage schedulers help stabilize LID model training by supporting warm-up, hold, and decay phases.


Is your PR small enough?

Yes


Additional Context

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Jun 19, 2025
@mergify mergify bot added the ESPnet2 label Jun 19, 2025
@sw005320 (Contributor):

This pull request adds a new tri-stage learning rate scheduler, TristageLR, to the espnet2/schedulers module. The scheduler introduces functionality for warmup, hold, and exponential decay phases, providing a flexible way to adjust learning rates during training.

Addition of the TristageLR scheduler:

  • New class implementation: The TristageLR class is added, inheriting from _LRScheduler and AbsBatchStepScheduler. It supports three phases: warmup, hold, and decay, with customizable ratios and scaling factors.
  • Method definitions:
    • __init__: Initializes the scheduler with parameters like max_steps, warmup_ratio, hold_ratio, and decay_ratio. Computes internal variables such as warmup_steps, hold_steps, and decay_factor.
    • __repr__: Provides a string representation of the scheduler, detailing its configuration.
    • get_lr: Calculates the learning rate for the current step based on the scheduler's phase.
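The three phases described above can be sketched as a standalone function. This is a minimal sketch: parameter names follow the description above, while the `peak_lr`, `init_lr_scale`, and `final_lr_scale` defaults are assumptions, and the actual TristageLR implementation may differ in detail.

```python
import math

def tristage_lr(step, max_steps, peak_lr,
                warmup_ratio=0.1, hold_ratio=0.4, decay_ratio=0.5,
                init_lr_scale=0.01, final_lr_scale=0.01):
    """Piecewise LR: linear warmup -> constant hold -> exponential decay."""
    warmup_steps = int(max_steps * warmup_ratio)
    hold_steps = int(max_steps * hold_ratio)
    decay_steps = int(max_steps * decay_ratio)
    init_lr = init_lr_scale * peak_lr
    final_lr = final_lr_scale * peak_lr

    if step < warmup_steps:
        # Phase 1: linear warmup from init_lr to peak_lr
        return init_lr + (peak_lr - init_lr) * step / max(1, warmup_steps)
    step -= warmup_steps
    if step < hold_steps:
        # Phase 2: hold at peak_lr
        return peak_lr
    step -= hold_steps
    if step < decay_steps:
        # Phase 3: exponential decay from peak_lr toward final_lr
        decay_factor = -math.log(final_lr_scale) / max(1, decay_steps)
        return peak_lr * math.exp(-decay_factor * step)
    return final_lr
```

For example, with `max_steps=1000` and the default ratios, the LR rises linearly for the first 100 steps, stays at `peak_lr` for 400 steps, then decays exponentially toward `final_lr_scale * peak_lr`.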

@Copilot (Copilot AI) left a comment:

Pull Request Overview

Introduces a new tri-stage learning rate scheduler (TristageLR) that supports configurable warmup, hold, and exponential decay phases for model training stability.

  • Implements TristageLR class mirroring fairseq’s scheduler behavior.
  • Adds parameters for warmup/hold/decay ratios and initial/final LR scaling.
  • Places the new scheduler in espnet2/schedulers for downstream tasks.
Comments suppressed due to low confidence (1)

espnet2/schedulers/tristage_lr.py:11

  • [nitpick] No unit tests are provided for critical phases (warmup, hold, decay). Add tests to verify LR values at boundary steps and default behavior.
class TristageLR(_LRScheduler, AbsBatchStepScheduler):
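The boundary tests suggested in this nitpick could look like the following sketch. `schedule` here is a hypothetical stand-in with the same three phases; real tests would instead instantiate TristageLR with a torch optimizer and step it.

```python
import math

def schedule(step, max_steps=100, warmup_ratio=0.1, hold_ratio=0.4,
             decay_ratio=0.5, peak_lr=1.0, init_scale=0.01, final_scale=0.01):
    # Stand-in tri-stage schedule (not the real TristageLR implementation).
    warmup = int(max_steps * warmup_ratio)
    hold = int(max_steps * hold_ratio)
    decay = int(max_steps * decay_ratio)
    if step < warmup:
        return peak_lr * (init_scale + (1 - init_scale) * step / warmup)
    if step < warmup + hold:
        return peak_lr
    k = -math.log(final_scale) / decay
    return peak_lr * math.exp(-k * min(step - warmup - hold, decay))

def test_warmup_start():
    # First step should start at init_scale * peak_lr, not 0.
    assert math.isclose(schedule(0), 0.01)

def test_hold_phase_is_flat():
    # Every step inside the hold window should return peak_lr.
    assert schedule(10) == schedule(49) == 1.0

def test_decay_reaches_final_lr():
    # End of the decay window should land on final_scale * peak_lr.
    assert math.isclose(schedule(100), 0.01, rel_tol=1e-6)

test_warmup_start(); test_hold_phase_is_flat(); test_decay_reaches_final_lr()
```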

codecov bot commented Jun 19, 2025

Codecov Report

Attention: Patch coverage is 32.43243% with 25 lines in your changes missing coverage. Please review.

Project coverage is 57.09%. Comparing base (d3db636) to head (ded1519).
Report is 34 commits behind head on master.

Files with missing lines Patch % Lines
espnet2/schedulers/tristage_lr.py 30.55% 25 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6159      +/-   ##
==========================================
+ Coverage   55.45%   57.09%   +1.63%     
==========================================
  Files         882      886       +4     
  Lines       82812    83725     +913     
==========================================
+ Hits        45927    47801    +1874     
+ Misses      36885    35924     -961     
Flag Coverage Δ
test_integration_espnet2 46.62% <32.43%> (?)
test_integration_espnetez ?
test_python_espnet2 50.50% <32.43%> (-0.73%) ⬇️
test_python_espnetez 12.82% <32.43%> (?)
test_utils 20.63% <ø> (ø)


Labels
ESPnet2 size:L This PR changes 100-499 lines, ignoring generated files.
2 participants