[llm]add bf16 moment adamw #9732
Conversation
Codecov Report
Attention: Patch coverage is low; most of the 1,059 new lines are not covered (+83 hits vs. +976 misses).
Additional details and impacted files

@@            Coverage Diff             @@
##           develop    #9732      +/-   ##
===========================================
- Coverage    51.41%   51.03%    -0.39%
===========================================
  Files          745      745
  Lines       118351   119410    +1059
===========================================
+ Hits         60856    60939      +83
- Misses       57495    58471     +976
Thanks for your contribution!
@@ -0,0 +1,15 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
2024 -> 2025
done
# Update param
if master_weight_ptr is not None:
    tl.store(master_weight_ptr + offsets, param, mask=mask)
tl.store(param_ptr + offsets, param.to(tl.bfloat16), mask=mask)
The design here needs to take the dtype of the optimizer's original parameters into account. Should the float16 case also be handled? Some open-source models use float16.
This is handled now.
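For illustration, here is a minimal sketch (not this PR's actual kernel) of how the final store could respect the parameter's original dtype by passing a constexpr flag from the Python launcher. The kernel name, the `updated_ptr` fp32 buffer, and the `IS_BF16` flag are assumptions made for the example.

```python
import triton
import triton.language as tl


@triton.jit
def store_param_kernel(
    param_ptr,              # model parameter in its original dtype (bf16 or fp16)
    master_weight_ptr,      # optional fp32 master copy; may be passed as None
    updated_ptr,            # fp32 buffer holding the already-updated parameter values
    n_elements,
    IS_BF16: tl.constexpr,  # chosen by the launcher from the parameter's dtype
    BLOCK_SIZE: tl.constexpr,
):
    pid = tl.program_id(0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements

    param = tl.load(updated_ptr + offsets, mask=mask)

    # Keep the fp32 master copy when one is maintained.
    if master_weight_ptr is not None:
        tl.store(master_weight_ptr + offsets, param, mask=mask)

    # Cast back to the model's storage dtype before writing the parameter.
    if IS_BF16:
        tl.store(param_ptr + offsets, param.to(tl.bfloat16), mask=mask)
    else:
        tl.store(param_ptr + offsets, param.to(tl.float16), mask=mask)
```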
    BLOCK_SIZE: tl.constexpr,
):
    pid = tl.program_id(0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
Why are the offsets computed with arange here?
My understanding is that each program needs to read all elements in [0, BLOCK_SIZE) of its block and operate on them.
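For context, here is a minimal sketch of the standard Triton blocking pattern the reply describes (an illustrative kernel, not one from this PR): each program instance handles one block of BLOCK_SIZE consecutive elements, `pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)` gives that block's element indices, and the mask guards the tail of the tensor. The launcher would start roughly ceil(n_elements / BLOCK_SIZE) program instances.

```python
import triton
import triton.language as tl


@triton.jit
def scale_kernel(x_ptr, out_ptr, scale, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(0)                                  # which block this program handles
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)   # indices of this block's elements
    mask = offsets < n_elements                             # ignore lanes past the end of the tensor
    x = tl.load(x_ptr + offsets, mask=mask)                 # read the whole block at once
    tl.store(out_ptr + offsets, x * scale, mask=mask)       # write the result back
```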
paddlenlp/utils/optimizer.py
Outdated
@@ -149,3 +154,227 @@ def adamw_python(
    beta1_pow[:], beta2_pow[:] = beta1 * beta1_pow[:], beta2 * beta2_pow[:]
    # figure out how to update this
    return


class AdamWPython(AdamW):
Isn't this name a bit odd? Wouldn't a name signaling a naive implementation, something like AdamWSlow, be more appropriate?
Renamed to AdamWCustom.
    type = core.VarDesc.VarType.DENSE_TENSOR
except:
    type = core.VarDesc.VarType.LOD_TENSOR
self._add_accumulator(
Are beta1 and beta2 float32 in the paper?
yes
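For reference, a minimal numerical sketch (not this PR's implementation) of one AdamW step in which the two moments are stored in bfloat16 while beta1_pow / beta2_pow and all intermediate arithmetic stay in float32. Function and variable names are illustrative, and treating the beta powers as plain Python floats is an assumption made to keep the example short.

```python
import paddle


def adamw_step_bf16_moments(param, grad, m_bf16, v_bf16, beta1_pow, beta2_pow,
                            lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.01):
    # Promote everything to float32 for the arithmetic.
    p = param.astype("float32")
    g = grad.astype("float32")
    m = m_bf16.astype("float32")
    v = v_bf16.astype("float32")

    # Decoupled weight decay (the "W" in AdamW).
    p = p * (1.0 - lr * weight_decay)

    # Moment updates.
    m = beta1 * m + (1.0 - beta1) * g
    v = beta2 * v + (1.0 - beta2) * g * g

    # Bias correction uses the float32 beta powers.
    m_hat = m / (1.0 - beta1_pow)
    v_hat = v / (1.0 - beta2_pow)
    p = p - lr * m_hat / (paddle.sqrt(v_hat) + eps)

    # Parameters go back to their original dtype, moments back to bfloat16;
    # the beta powers stay float32.
    return (p.astype(param.dtype), m.astype("bfloat16"), v.astype("bfloat16"),
            beta1_pow * beta1, beta2_pow * beta2)
```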
LGTM
PR types
New features
PR changes
Others
Description
To use it, simply add --optim adamw_16bit_moment.