Conversation

@dill21yu
Contributor

Purpose of the pull request

close #17670

Brief change log

Feature Enhancement
- Added disk usage monitoring for the data.basedir.path directory
  - Added dataBasedirPathDiskUsagePercentage field in Worker heartbeat data
  - Added display of dataBasedir disk usage on the frontend monitoring page
  - Added internationalization support (Chinese and English)
- Implemented load protection based on disk usage of the data.basedir.path directory
  - Added maxDataBasedirDiskUsagePercentageThresholds configuration item in BaseServerLoadProtectionConfig
  - Implemented disk usage check logic for the dataBasedir path in BaseServerLoadProtection
  - Added max-data-basedir-disk-usage-percentage-thresholds configuration option in Worker config files

Configuration Updates
- Kubernetes Deployment Configuration
  - Added description of the environment variable WORKER_SERVER_LOAD_PROTECTION_MAX_DATA_BASEDIR_DISK_USAGE_PERCENTAGE_THRESHOLDS in README.md
  - Added corresponding configuration items in values.yaml
- Docker Deployment Configuration
  - Added WORKER_SERVER_LOAD_PROTECTION_MAX_DATA_BASEDIR_DISK_USAGE_PERCENTAGE_THRESHOLDS configuration in all test docker-compose.yaml files

UI Improvements
- Adjusted layout of the Worker monitoring page
  - Added data directory disk usage metric; increased number of icons per row from 4 to 5, ensuring all monitoring metrics are displayed on the same line

These changes enhance DolphinScheduler's disk monitoring capabilities by providing fine-grained monitoring and overload protection for the data.basedir.path directory, helping prevent service issues caused by insufficient disk space.
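
To make the described check concrete, here is a minimal sketch of how a usage fraction for a directory could be computed and compared against a threshold. It is not the code from this pull request: the class name `DataBasedirDiskUsageSketch`, both method names, and the strict "greater than" comparison are assumptions made for illustration. Only the JDK's `java.nio.file.FileStore` API is used.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not the actual DolphinScheduler implementation.
final class DataBasedirDiskUsageSketch {

    // Returns the used fraction (0.0 to 1.0) of the file store that backs `dir`.
    public static double diskUsageFraction(Path dir) throws IOException {
        FileStore store = Files.getFileStore(dir);
        long total = store.getTotalSpace();
        if (total <= 0) {
            return 0.0; // some virtual file systems report no capacity
        }
        long usable = store.getUsableSpace();
        return (double) (total - usable) / total;
    }

    // Assumed rule: the worker counts as overloaded when usage exceeds the threshold.
    public static boolean isOverloaded(double usageFraction, double threshold) {
        return usageFraction > threshold;
    }

    public static void main(String[] args) throws IOException {
        // The temp directory stands in for data.basedir.path in this sketch.
        Path dir = Paths.get(System.getProperty("java.io.tmpdir"));
        double usage = diskUsageFraction(dir);
        System.out.printf("%s usage=%.2f overloaded=%b%n", dir, usage, isOverloaded(usage, 0.9));
    }
}
```

Under this assumed comparison, a threshold of 1.0 can never be exceeded, so setting it to 1.0 switches the check off in practice.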

Verify this pull request

This pull request is already covered by existing tests, such as WorkerServerLoadProtectionTest.

Pull Request Notice

If your pull request contains an incompatible change, you should also add it to docs/docs/en/guide/upgrade/incompatible.md

@SbloodyS added this to the 3.4.0 milestone Nov 14, 2025
yud8 added 2 commits November 17, 2025 18:07
…asedir.path directory (apache#17670) Add data basedir disk usage threshold but set to 1.0 (100%) to effectively disable the check
yud8 added 2 commits November 18, 2025 14:12
…asedir.path directory (apache#17670) Fix field shadowing and improve code clarity
@sonarqubecloud

Quality Gate failed

Failed conditions
2.7% Coverage on New Code (required ≥ 60%)

See analysis details on SonarQube Cloud

yud8 and others added 8 commits November 18, 2025 16:20
…asedir.path directory (apache#17670) fix :MasterServerLoadProtectionConfig
…asedir.path directory (apache#17670) move disk monitoring to workers only

- Remove master disk checks, keep only for workers
- Clean up related configs and constructors
- Fix config reference in WorkerServerLoadProtection
@dill21yu
Contributor Author

Hi @SbloodyS @ruanwenjun @EricGao888
I've addressed the CI issues reported earlier: I updated the docker-compose.yaml configurations for the new WORKER_SERVER_LOAD_PROTECTION_MAX_DATA_BASEDIR_DISK_USAGE_PERCENTAGE_THRESHOLDS setting and resolved the code style and CodeQL warnings (field masking in the config classes).
Could you please review the changes and let me know if anything else needs adjustment? Thanks!

@dill21yu
Contributor Author

The latest build error (No plugin found for prefix 'sonar') comes from a missing SonarQube plugin in CI and is unrelated to the changes in this PR.

Member

@SbloodyS left a comment

Adding max-data-basedir-disk-usage-percentage-thresholds will conflict with the current max-disk-usage-percentage-thresholds, which will make it more difficult for users to understand.

I think we should configure multiple directories in one of the following two ways:

1.

max-disk-usage-percentage-thresholds:
  /data1: 0.8
  /data2: 0.9

2.

max-disk-usage-percentage-thresholds:
  path: /data1,/data2
  percentage: 0.9

This needs to be discussed. cc @ruanwenjun @zhongjiajie @Gallardot
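
The practical difference between the two shapes is that the first allows a separate threshold per path, while the second applies one threshold to every listed path. The sketch below normalizes both into the same per-path map so that one check could serve either form. The class and method names are hypothetical, and the YAML binding itself (for example through Spring configuration properties) is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical normalizer for the two proposed config shapes.
final class DiskThresholdShapes {

    // Shape 1: the config is already a path -> threshold map; copy it defensively.
    public static Map<String, Double> fromPerPathMap(Map<String, Double> raw) {
        return new LinkedHashMap<>(raw);
    }

    // Shape 2: a comma-separated path list that shares a single threshold.
    public static Map<String, Double> fromSharedThreshold(String commaSeparatedPaths, double percentage) {
        Map<String, Double> thresholds = new LinkedHashMap<>();
        for (String path : commaSeparatedPaths.split(",")) {
            String trimmed = path.trim();
            if (!trimmed.isEmpty()) {
                thresholds.put(trimmed, percentage);
            }
        }
        return thresholds;
    }

    public static void main(String[] args) {
        System.out.println(fromSharedThreshold("/data1,/data2", 0.9));
    }
}
```

The first shape can express everything the second can, at the cost of repeating the threshold when all paths share it.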

@SbloodyS added the discussion label Nov 25, 2025
@dill21yu
Contributor Author

dill21yu commented Dec 2, 2025

Thank you for your suggestion! I understand your concerns about potential configuration conflicts. To maintain backward compatibility and spare users from manually specifying the Worker’s deployment directory, would the following approach work?

server-load-protection:
  max-disk-usage-percentage-thresholds: 0.8 # Continue monitoring the Worker's deployment directory (backward compatible)
  additional-disk-paths: # Optional: monitor additional directories
    /tmp/dolphinscheduler: 0.9
    /var/log: 0.85

Benefits of this approach:
- Full backward compatibility: Existing configurations like max-disk-usage-percentage-thresholds: 0.8 will keep working as before, automatically applying to the Worker’s deployment directory.
- User-friendly: Users don’t need to know or configure the exact deployment path; the system handles it automatically.
- No frontend changes required: The UI can continue displaying disk usage for the Worker’s deployment directory without modification, which avoids overcomplicating the UI.
- Extensible: When needed, users can optionally define additional paths to monitor via additional-disk-paths.

What do you think of this proposal? @SbloodyS @ruanwenjun @zhongjiajie @Gallardot
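
One way to read this proposal: the effective set of checks is the deployment directory with the existing threshold, plus any entries from additional-disk-paths, and the worker is overloaded if any path exceeds its own threshold. The sketch below assumes exactly that. All names are hypothetical, `/opt/dolphinscheduler` in the usage note is a made-up deployment path, and disk usage is supplied through a function so the logic can be exercised without touching the file system.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch of the "default path plus additional paths" proposal.
final class AdditionalDiskPathsSketch {

    // Combines the deployment directory's threshold with the optional extra paths.
    // An extra entry for the deployment directory itself overrides the default.
    public static Map<String, Double> effectiveThresholds(
            String deployDir, double defaultThreshold, Map<String, Double> additionalPaths) {
        Map<String, Double> thresholds = new LinkedHashMap<>();
        thresholds.put(deployDir, defaultThreshold);
        if (additionalPaths != null) {
            thresholds.putAll(additionalPaths);
        }
        return thresholds;
    }

    // Assumed rule: overloaded as soon as one path's usage exceeds its threshold.
    public static boolean anyOverloaded(Map<String, Double> thresholds, ToDoubleFunction<String> usageOf) {
        for (Map.Entry<String, Double> entry : thresholds.entrySet()) {
            if (usageOf.applyAsDouble(entry.getKey()) > entry.getValue()) {
                return true;
            }
        }
        return false;
    }
}
```

With the example values above, `effectiveThresholds("/opt/dolphinscheduler", 0.8, extras)` would yield three entries, and a usage of 0.9 on `/var/log` alone would be enough to mark the worker as overloaded.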

@ruanwenjun
Member

ruanwenjun commented Dec 10, 2025

@dill21yu It’s preferable to retain the existing configuration key max-disk-usage-percentage-thresholds, but mark it as deprecated in the documentation.

Introduce a new configuration:

max-disk-usage-percentage-thresholds-rules:
  - disk-path: /dev1
    usage-percentage-thresholds: 0.9
  - disk-path: /dev2
    usage-percentage-thresholds: 0.8

When the old configuration max-disk-usage-percentage-thresholds is used, we should log a warning indicating that it is deprecated and recommend switching to the new max-disk-usage-percentage-thresholds-rules configuration.
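
A minimal sketch of how the deprecation path might look follows. It assumes that the new rules take precedence when both keys are set and that the legacy value applies to a single known path; neither point is settled in this thread. The class, the `Rule` type, and the method name are hypothetical, and `java.util.logging` is used only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical resolver for the legacy key and the proposed rules list.
final class DiskThresholdRulesSketch {

    private static final Logger LOG = Logger.getLogger(DiskThresholdRulesSketch.class.getName());

    // One entry of max-disk-usage-percentage-thresholds-rules.
    static final class Rule {
        final String diskPath;
        final double usagePercentageThreshold;

        Rule(String diskPath, double usagePercentageThreshold) {
            this.diskPath = diskPath;
            this.usagePercentageThreshold = usagePercentageThreshold;
        }
    }

    // legacyThreshold is null when max-disk-usage-percentage-thresholds is not configured.
    public static List<Rule> resolveRules(Double legacyThreshold, String legacyPath, List<Rule> rules) {
        boolean hasRules = rules != null && !rules.isEmpty();
        if (legacyThreshold != null) {
            LOG.warning("max-disk-usage-percentage-thresholds is deprecated; "
                    + "use max-disk-usage-percentage-thresholds-rules instead");
        }
        if (hasRules) {
            return new ArrayList<>(rules); // assumption: new rules win over the legacy key
        }
        List<Rule> resolved = new ArrayList<>();
        if (legacyThreshold != null) {
            resolved.add(new Rule(legacyPath, legacyThreshold));
        }
        return resolved;
    }
}
```

Converting the legacy key into a single rule keeps one code path for the actual check, so the old key can later be removed without touching the check itself.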

@SbloodyS modified the milestones: 3.4.0, 3.4.1 Dec 23, 2025
Development

Successfully merging this pull request may close these issues.

[Improvement][Worker-monitoring] Add disk usage monitoring for data.basedir.path directory

3 participants