bug: compaction task stuck for hours · Issue #10209 · risingwavelabs/risingwave · GitHub
bug: compaction task stuck for hours #10209
Closed
@hzxa21

Description

Describe the bug

Recently we have seen a small L0->L0 compaction task in cg3 (the MV compaction group) get stuck for hours, which blocks L0->base level compaction for that compaction group. Some findings:

  • The stuck compaction task contains 60 SSTs with a total size of ~13MB, so it is a relatively small task.
  • The compactor is alive and handles other tasks, from both the same and different compaction groups, as usual.
  • We saw the "Ready to handle compaction task" log but not the "Finished compaction task" log. This means the compactor did receive the task but never finished it.
  • The task progress and heartbeat of the stuck task are continuously reported to the meta node. We use a guard to make sure the task progress is cleared if the task finishes or errors out, so this means the task is indeed stuck on the compactor side.
  • The "Compacting SSTable Count" metric reported by the meta node stays at 60 for cg3 level0. This means the meta node did not receive a ReportCompactionTasksRequest for this task from the compactor.
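The guard argument in the findings above can be illustrated with a minimal sketch of a drop guard. This is not RisingWave's actual code: `ProgressMap`, `ProgressGuard`, `run_task`, and `in_flight` are hypothetical names chosen for illustration. The point is only that, with such a guard, a progress entry that is still being reported implies the task function has neither returned successfully nor returned an error.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

type TaskId = u64;

/// Hypothetical shared map of in-flight task progress (task id -> units done).
type ProgressMap = Arc<Mutex<HashMap<TaskId, u64>>>;

/// Hypothetical guard: removes the task's progress entry when dropped,
/// which happens on every exit path of the scope that owns it.
struct ProgressGuard {
    task_id: TaskId,
    progress: ProgressMap,
}

impl ProgressGuard {
    /// Registers the task with zero progress and returns the guard.
    fn register(task_id: TaskId, progress: ProgressMap) -> Self {
        progress.lock().unwrap().insert(task_id, 0);
        Self { task_id, progress }
    }
}

impl Drop for ProgressGuard {
    fn drop(&mut self) {
        // Ignore a poisoned lock here so that drop itself never panics.
        if let Ok(mut map) = self.progress.lock() {
            map.remove(&self.task_id);
        }
    }
}

/// Stand-in for a compaction task; `fail` simulates an error exit.
fn run_task(task_id: TaskId, progress: ProgressMap, fail: bool) -> Result<(), String> {
    let _guard = ProgressGuard::register(task_id, progress.clone());
    // Pretend some work was done and record it.
    progress.lock().unwrap().insert(task_id, 1);
    if fail {
        return Err("compaction failed".to_string());
    }
    Ok(())
}

/// Number of tasks that still have a progress entry.
fn in_flight(progress: &ProgressMap) -> usize {
    progress.lock().unwrap().len()
}
```

Under these assumptions, calling `run_task` with either `fail = false` or `fail = true` leaves `in_flight` at zero afterwards, so an entry that persists for hours points at a task body that never returned.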

To Reproduce

No response

Expected behavior

No response

Additional context

[3 images attached]

log.csv

Labels

type/bug
