Replace Sequence Parallel to Paddle Sequence Parallel #7966
Thanks for your contribution!
d751a66 to 2c693cc
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@            Coverage Diff             @@
##           develop    #7966      +/-   ##
===========================================
+ Coverage    55.37%   55.47%   +0.10%
===========================================
  Files          596      595       -1
  Lines        91622    91419     -203
===========================================
- Hits         50732    50715      -17
+ Misses       40890    40704     -186
# and sp_configs.sp_fused_linear_param_grad_add
# )

self.sp_asyn_reduce_scatter = True
This is hard-coded by hand for now, for testing purposes.
@@ -29,7 +29,7 @@
from .feature_extraction_utils import BatchFeature, FeatureExtractionMixin
from .image_processing_utils import ImageProcessingMixin
from .attention_utils import create_bigbird_rand_mask_idx_list
from .sequence_parallel_utils import (
Which Paddle version does this require? 2.6.0 won't work, will it?
This import should be wrapped in a try/except.
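For illustration, a guarded import along the lines suggested here might look like the sketch below. It is not the code from this PR: the module path in `PADDLE_SP_MODULE` is an assumption about where Paddle exposes its sequence-parallel utilities, and `try_import` and `HAS_PADDLE_SP` are hypothetical names introduced only for this example.

```python
import importlib

# Assumption: the path under which Paddle is expected to expose its
# sequence-parallel utilities. It is not confirmed anywhere in this thread.
PADDLE_SP_MODULE = "paddle.distributed.fleet.utils.sequence_parallel_utils"


def try_import(module_name):
    """Return the imported module, or None if it cannot be found.

    ModuleNotFoundError is a subclass of ImportError, so this also covers
    the case where the top-level package is not installed at all.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# Hypothetical flag: callers can check it before using any SP feature,
# instead of failing at import time on an older Paddle release.
paddle_sp = try_import(PADDLE_SP_MODULE)
HAS_PADDLE_SP = paddle_sp is not None
```

With a guard like this, importing the package would still succeed on a Paddle build that lacks the module, and the sequence-parallel code paths could raise a clear error only when they are actually requested.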
LGTM
LGTM
PR types
Others
PR changes
Others
Description
This PR replaces the sequence-parallel (SP) code in PaddleNLP with the SP implementation provided by the Paddle framework.