-
Notifications
You must be signed in to change notification settings - Fork 9k
HDFS-17049. EC: Fix duplicate block group IDs generated by SequentialBlockGroupIdGenerator. #5743
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you accou 8000 nt related emails.
Already on GitHub? Sign in to your account
Conversation
…BlockGroupIdGenerator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch here. leave some comments inline. JFYI.
@@ -51,7 +51,7 @@ public class SequentialBlockGroupIdGenerator extends SequentialNumber { | |||
} | |||
|
|||
@Override // NumberGenerator | |||
public long nextValue() { | |||
synchronized public long nextValue() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
synchronized public long nextValue()
-> public synchronized long nextValue()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
for (Thread t : threads) { | ||
t.join(); | ||
} | ||
Set<Long> allBlockIds = new HashSet<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here will be more readable?
Set<Long> allBlockIds = new HashSet<>();
for (List<Long> set : blockIds) {
for (long id : set) {
if (!allBlockIds.add(id)) {
fail("Same block group id is generated!");
}
}
}
Any more simple check way? Such as using HashMap
directly?
There was a problem hiding this comment.
C 8000 hoose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I'm not understand how to use HashMap
to get a more simple implement. I just modified the code as you did above.
@zhangshuyan0 Thank you very much for your valuable time. I would like to inquire about the concurrency issue in SequentialBlockGroupIdGenerator#nextValue. It seems that FSNamesystem#nextBlockId verifies FSNamesystemLock#writeLock. Block createNewBlock(BlockType blockType) throws IOException {
assert hasWriteLock();
Block b = new Block(nextBlockId(blockType), 0, 0);
// Increment the generation stamp for every new block.
b.setGenerationStamp(nextGenerationStamp(false));
return b;
} |
🎊 +1 overall
This message was automatically generated. |
@zhtttylz Thanks for your review. I met this problem because in our production code the directory locks have finer granularity, which means different files may obtain different locks. It looks that trunk code does not have this problem. Thanks again. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Quite surprised to see how this can happen, we have write lock while getting nextBlockID
|
cc @umamaheswararao FYI. |
Back to this PR. Considering that blockId/blockGroupId generator is protected by global lock, I think we do not need more |
@Hexiaoqiao Thanks for your suggestion. Could you please help close this PR and related JIRA? |
Got it. Close the PR since it does not need to fix. |
Description of PR
When I used multiple clients to write EC files concurrently, I found that NameNode generated the same block group ID for different files:
After diving into
SequentialBlockGroupIdGenerator
, I found that the current implementation ofnextValue
is not thread-safe.This problem must be fixed.
How was this patch tested?
Add a new unit test.