HDFS-16855. Remove the redundant write lock in addBlockPool. #5170
base: trunk
Conversation
💔 -1 overall
This message was automatically generated.
@dingshun3016 This seems to happen only when addBlockPool() is invoked and CachingGetSpaceUsed#used < 0, so why not handle it by, for example, forbidding refresh() when ReplicaCachingGetSpaceUsed#init() runs for the first time?
@MingXiangLi thanks for the reply. Forbidding refresh() when ReplicaCachingGetSpaceUsed#init() runs for the first time would leave the value of dfsUsage at 0 until the next refresh(). If we remove the BLOCK_POOL level write lock in the org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl#addBlockPool(String bpid, Configuration conf) method, what would the impact be? Do you have any other suggestions?
The BLOCK_POOL level lock protects replica consistency for FsDatasetImpl when read and write operations happen at the same time.
For example, we could use the df command instead for the first time, or some other approach. On my side it is less risky to change the ReplicaCachingGetSpaceUsed logic than to remove the write lock.
According to the discussion so far, there seem to be several ways to solve this problem.
Since this case only happens when addBlockPool() is invoked and CachingGetSpaceUsed#used < 0, I have an idea: is it possible to add a switch so that the lock is not taken when ReplicaCachingGetSpaceUsed#init() runs for the first time, but is taken at other times? Do you think that is possible? @MingXiangLi
💔 -1 overall
This message was automatically generated.
This makes sense to me; getting the replica usage figure does not need strong consistency. @Hexiaoqiao any suggestions?
Thanks for the detailed discussions. +1. It seems good to me.
Thanks for the reply. It looks like PR HDFS-14986 forbids refresh() when ReplicaCachingGetSpaceUsed#init() runs for the first time.
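For reference, below is a minimal sketch of the "skip refresh() on the first init()" idea discussed above. It is not the actual Hadoop ReplicaCachingGetSpaceUsed code; the class name, the refreshOnFirstInit switch, and the scanReplicas() helper are hypothetical stand-ins, assuming (as the comments above state) that init() triggers a synchronous refresh() whenever used < 0.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a CachingGetSpaceUsed-style class.
public class CachedSpaceUsage {
  private final AtomicLong used = new AtomicLong(-1);
  private final boolean refreshOnFirstInit; // the "switch" proposed above

  public CachedSpaceUsage(boolean refreshOnFirstInit) {
    this.refreshOnFirstInit = refreshOnFirstInit;
  }

  // Skipping the first synchronous refresh avoids scanning replicas while
  // addBlockPool() still holds the dataset write lock, at the cost of
  // reporting 0 until the first background refresh runs.
  public void init() {
    if (used.get() < 0 && refreshOnFirstInit) {
      refresh();
    }
  }

  // In the real implementation this is invoked periodically by a background thread.
  public void refresh() {
    used.set(scanReplicas());
  }

  // Placeholder for the expensive per-replica usage computation.
  private long scanReplicas() {
    return 0L;
  }

  public long getUsed() {
    return Math.max(used.get(), 0L);
  }
}
```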
When applying the datanode's fine-grained lock patch, we found that the datanode could not start; a deadlock likely occurred during addBlockPool, so we can remove the redundant lock.
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl#addBlockPool acquires the writeLock
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl#deepCopyReplica needs the readLock
Because deepCopyReplica runs on a different thread, the write lock cannot be downgraded to a read lock.
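To illustrate, here is a minimal, self-contained Java sketch of the deadlock pattern described above. It is not Hadoop code; the method names only mirror the call sites mentioned in this PR. The thread standing in for addBlockPool holds the write lock and waits on a helper thread that needs the read lock, so neither can make progress.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockDowngradeDeadlock {
  private static final ReentrantReadWriteLock LOCK = new ReentrantReadWriteLock();

  // Stand-in for FsDatasetImpl#deepCopyReplica: needs the read lock.
  static void deepCopyReplica() {
    LOCK.readLock().lock(); // blocks while another thread holds the write lock
    try {
      System.out.println("copied replicas");
    } finally {
      LOCK.readLock().unlock();
    }
  }

  // Stand-in for FsDatasetImpl#addBlockPool: holds the write lock, then waits
  // for a "refresh" that runs deepCopyReplica on a different thread.
  static void addBlockPool() throws InterruptedException {
    LOCK.writeLock().lock();
    try {
      Thread refresher = new Thread(LockDowngradeDeadlock::deepCopyReplica);
      refresher.start();
      refresher.join(); // deadlock: refresher can never acquire the read lock
    } finally {
      LOCK.writeLock().unlock();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    addBlockPool(); // never returns
  }
}
```

Because the read lock is requested by a different thread than the one holding the write lock, the usual write-to-read downgrade (acquire read lock, then release write lock in the same thread) is not available here.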