HDFS-16942. Send error to datanode if FBR is rejected due to bad lease #5460

sodonnel · 2023-03-07T14:34:27Z

Description of PR

When a datanode sends a FBR to the namenode, it requires a lease to send it. On a couple of busy clusters, we have seen an issue where the DN is somehow delayed in sending the FBR after requesting the least. Then the NN rejects the FBR and logs a message to that effect, but from the Datanodes point of view, it thinks the report was successful and does not try to send another report until the 6 hour default interval has passed.

If this happens to a few DNs, there can be missing and under replicated blocks, further adding to the cluster load. Even worse, I have see the DNs join the cluster with zero blocks, so it is not obvious the under replication is caused by lost a FBR, as all DNs appear to be up and running.

I believe we should propagate an error back to the DN if the FBR is rejected, that way, the DN can request a new lease and try again.

How was this patch tested?

Modified a test and added a new test. Also validated manually by changing the code to always throw the InvalidLease exception to ensure the DN handled it, reset the least ID and retried.

hadoop-yetus · 2023-03-07T20:30:25Z

💔 -1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 57s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 1s		codespell was not available.
+0 🆗	detsecrets	0m 1s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 2 new or modified test files.
			_ trunk Compile Tests _
+1 💚	mvninstall	40m 48s		trunk passed
+1 💚	compile	1m 30s		trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚	compile	1m 21s		trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚	checkstyle	1m 5s		trunk passed
+1 💚	mvnsite	1m 33s		trunk passed
+1 💚	javadoc	1m 6s		trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚	javadoc	1m 30s		trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚	spotbugs	3m 41s		trunk passed
+1 💚	shadedclient	22m 57s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	1m 18s		the patch passed
+1 💚	compile	1m 20s		the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚	javac	1m 20s		the patch passed
+1 💚	compile	1m 15s		the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚	javac	1m 15s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
-0 ⚠️	checkstyle	0m 51s	/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 79 unchanged - 0 fixed = 81 total (was 79)
+1 💚	mvnsite	1m 25s		the patch passed
+1 💚	javadoc	0m 53s		the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚	javadoc	1m 31s		the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚	spotbugs	3m 26s		the patch passed
+1 💚	shadedclient	23m 53s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
-1 ❌	unit	243m 39s	/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 51s		The patch does not generate ASF License warnings.
		354m 37s

Reason	Tests
Failed junit tests	hadoop.hdfs.TestRollingUpgrade

Subsystem	Report/Notes
Docker	ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5460/1/artifact/out/Dockerfile
GITHUB PR	#5460
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname	Linux e8a6e148e1c3 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `b201655`
Default Java	Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5460/1/testReport/
Max. process+thread count	3042 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5460/1/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

sodonnel · 2023-03-07T21:26:13Z

Not sure what to do about this checkstyle error:

./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/InvalidBlockReportLeaseException.java:1:/**: Missing package-info.java file. [JavadocPackage]

If I add the file:

hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/package-info.java

I get an enforcer error:

[INFO] -------< org.apache.hadoop:hadoop-client-check-test-invariants >--------
[INFO] Building Apache Hadoop Client Packaging Invariants for Test 3.4.0-SNAPSHOT [106/113]
[INFO] --------------------------------[ pom ]---------------------------------
[INFO] 
[INFO] --- maven-clean-plugin:3.1.0:clean (default-clean) @ hadoop-client-check-test-invariants ---
[INFO] Deleting /Users/sodonnell/source/upstream_hadoop/hadoop-client-modules/hadoop-client-check-test-invariants/target
[INFO] Deleting /Users/sodonnell/source/upstream_hadoop/hadoop-client-modules/hadoop-client-check-test-invariants (includes = [dependency-reduced-pom.xml], excludes = [])
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (create-testdirs) @ hadoop-client-check-test-invariants ---
[INFO] Executing tasks

main:
    [mkdir] Created dir: /Users/sodonnell/source/upstream_hadoop/hadoop-client-modules/hadoop-client-check-test-invariants/target/test-dir
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-enforcer-plugin:3.0.0:enforce (enforce-banned-dependencies) @ hadoop-client-check-test-invariants ---
[INFO] Adding ignore: module-info
[INFO] Adding ignore: META-INF/versions/*/module-info
[INFO] Adding ignorable dependency: org.apache.hadoop:hadoop-annotations:null
[INFO]   Adding ignore: *
[WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses failed with message:
Duplicate classes found:

  Found in:
    org.apache.hadoop:hadoop-client-minicluster:jar:3.4.0-SNAPSHOT:compile
    org.apache.hadoop:hadoop-client-api:jar:3.4.0-SNAPSHOT:compile
  Duplicate classes:
    org/apache/hadoop/hdfs/server/protocol/package-info.class

Hexiaoqiao · 2023-03-08T02:20:14Z

...project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java

@@ -791,6 +792,9 @@ private void offerService() throws Exception {
          shouldServiceRun = false;
          return;
        }
+        if (InvalidBlockReportLeaseException.class.getName().equals(reClass)) {
+          fullBlockReportLeaseId = 0;


@sodonnel Thanks for your works here. Do we also need set forceFullBlockReport to true here? Otherwise, datanode will send block report at next 6 hour by default, right?

At line 717, we can see where it attempts to get a lease from the heartbeat if the lease in the DN == 0:

boolean requestBlockReportLease = (fullBlockReportLeaseId == 0) && scheduler.isBlockReportDue(startTime);

So its the isBlockReportDue that controls this. Then later, if we have a non zero lease, it will try to create the block report:

boolean forceFullBr = scheduler.forceFullBlockReport.getAndSet(false); if (forceFullBr) { LOG.info("Forcing a full block report to " + nnAddr); } if ((fullBlockReportLeaseId != 0) || forceFullBr) { cmds = blockReport(fullBlockReportLeaseId); fullBlockReportLeaseId = 0; }

Its really the isBlockReportDue() method that controls whether a new one should be sent of not, and that is based on time since the last one. The the blockReport(), it updates the time after a successful block report, but if it gets an exception, like this change causes, it will not update the time and so it will try again on the next heartbeat if it gets a new lease.

I think forceFullBlockReport is only for tests, or the command to force a DN block from the CLI.

Also, do you have any idea about fixing the checkstyle issue? As I mentioned above, trying to add a package-info.java file broke my compile locally.

As we don't expect to reach here frequently (hopefully datanode is able to acquire lease successfully most of the times), perhaps we can add one liner log to indicate that the particular FBR went through this trouble (i.e. log report id and lease id)? (just in case if it helps debug further)

Or maybe reportId and leaseId can be added as constructor argument to InvalidBlockReportLeaseException. This way RemoteException in offerService log will likely print them anyways?

Its really the isBlockReportDue() method that controls whether a new one should be sent of not, and that is based on time since the last one. The the blockReport(), it updates the time after a successful block report, but if it gets an exception, like this change causes, it will not update the time and so it will try again on the next heartbeat if it gets a new lease.

Thanks for the detailed explain. Make sense to me.

perhaps we can add one liner log to indicate that the particular FBR went through this trouble (i.e. log report id and lease id)

+1 from my side.

I added the report and lease ID into the exception message, so it will get logged as part of the exception.

Still stuck on that checkstyle violation :-(

hadoop-yetus · 2023-03-08T03:19:58Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 58s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+0 🆗	detsecrets	0m 0s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 2 new or modified test files.
			_ trunk Compile Tests _
+1 💚	mvninstall	38m 41s		trunk passed
+1 💚	compile	1m 26s		trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚	compile	1m 24s		trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚	checkstyle	1m 7s		trunk passed
+1 💚	mvnsite	1m 28s		trunk passed
+1 💚	javadoc	1m 10s		trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚	javadoc	1m 34s		trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚	spotbugs	3m 25s		trunk passed
+1 💚	shadedclient	22m 34s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	1m 15s		the patch passed
+1 💚	compile	1m 19s		the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚	javac	1m 19s		the patch passed
+1 💚	compile	1m 11s		the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚	javac	1m 11s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
-0 ⚠️	checkstyle	0m 51s	/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 79 unchanged - 0 fixed = 80 total (was 79)
+1 💚	mvnsite	1m 22s		the patch passed
+1 💚	javadoc	0m 50s		the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚	javadoc	1m 22s		the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚	spotbugs	3m 17s		the patch passed
+1 💚	shadedclient	22m 25s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
+1 💚	unit	238m 22s		hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 51s		The patch does not generate ASF License warnings.
		345m 2s

Subsystem	Report/Notes
Docker	ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5460/2/artifact/out/Dockerfile
GITHUB PR	#5460
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname	Linux 1c427e3c3acb 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `15851db`
Default Java	Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5460/2/testReport/
Max. process+thread count	3164 (vs. ul 8000 imit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5460/2/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

virajjasani · 2023-03-08T23:14:00Z

Duplicate classes found:

  Found in:
    org.apache.hadoop:hadoop-client-minicluster:jar:3.4.0-SNAPSHOT:compile
    org.apache.hadoop:hadoop-client-api:jar:3.4.0-SNAPSHOT:compile
  Duplicate classes:
    org/apache/hadoop/hdfs/server/protocol/package-info.class

I believe this could be avoided by excluding it from either of the poms:

--- a/hadoop-client-modules/hadoop-client-api/pom.xml
+++ b/hadoop-client-modules/hadoop-client-api/pom.xml
@@ -126,6 +126,12 @@
                         <exclude>org/apache/hadoop/yarn/client/api/package-info.class</exclude>
                       </excludes>
                     </filter>
+                    <filter>
+                      <artifact>org.apache.hadoop:hadoop-hdfs</artifact>
+                      <excludes>
+                        <exclude>org/apache/hadoop/hdfs/server/protocol/package-info.class</exclude>
+                      </excludes>
+                    </filter>
                   </filters>
                   <relocations>
                     <relocation>

For instance, the above patch might help. I haven't tested this but I guess it might work.

virajjasani · 2023-03-08T23:50:27Z

@sodonnel also I am curious, was it just a specific log (only one case e.g. the lease was expired) or combination of logs from checkLease(DatanodeDescriptor dn, long monotonicNowMs, long id) that you have seen in various issues?

I wonder if lease expiry or invalid lease are worth having some dedicated metrics in NameNodeActivity (maybe not as with this patch, the subsequent attempt by BP actor should anyways have new lease id acquired from the response of heartbeat API before it reattempts sending FBR).

sodonnel · 2023-03-09T22:21:37Z

@sodonnel also I am curious, was it just a specific log (only one case e.g. the lease was expired) or combination of logs from checkLease(DatanodeDescriptor dn, long monotonicNowMs, long id) that you have seen in various issues?

I wonder if lease expiry or invalid lease are worth having some dedicated metrics in NameNodeActivity (maybe not as with this patch, the subsequent attempt by BP actor should anyways have new lease id acquired from the response of heartbeat API before it reattempts sending FBR).

In the examples I saw, its was expired leases that caused the problem. However the namenode was under significant pressure when it happened. In one example, it was actually the SBNN which was rejecting the reports. Tailing the edits was taking frequent long locks (over 300 seconds at time) which was beyond the lease expiry.

In another example, it was the ANN after startup. I am not sure, but I think the system perhaps out of safemode with many block reports still outstanding, and then between under replication and IBRs, contention on the NN lock seemed to block the FBRs until the lease expired.

sodonnel · 2023-03-09T22:45:18Z

Duplicate classes found:

  Found in:
    org.apache.hadoop:hadoop-client-minicluster:jar:3.4.0-SNAPSHOT:compile
    org.apache.hadoop:hadoop-client-api:jar:3.4.0-SNAPSHOT:compile
  Duplicate classes:
    org/apache/hadoop/hdfs/server/protocol/package-info.class

I believe this could be avoided by excluding it from either of the poms:

--- a/hadoop-client-modules/hadoop-client-api/pom.xml
+++ b/hadoop-client-modules/hadoop-client-api/pom.xml
@@ -126,6 +126,12 @@
                         <exclude>org/apache/hadoop/yarn/client/api/package-info.class</exclude>
                       </excludes>
                     </filter>
+                    <filter>
+                      <artifact>org.apache.hadoop:hadoop-hdfs</artifact>
+                      <excludes>
+                        <exclude>org/apache/hadoop/hdfs/server/protocol/package-info.class</exclude>
+                      </excludes>
+                    </filter>
                   </filters>
                   <relocations>
                     <relocation>

For instance, the above patch might help. I haven't tested this but I guess it might work.

@virajjasani This isn't working. I added what you suggested, but still the same error. I also don't understand why this style warning is appearing, as its not a new package folder, and it has plenty of other classes in it already.

virajjasani · 2023-03-10T00:01:28Z

I also don't understand why this style warning is appearing, as its not a new package folder, and it has plenty of other classes in it already.

I think it's appearing because it was not resolved earlier or it was ignored. For instance, I see the last time the module was touched was by this PR #4252 and in the last QA result, I see checkstyle issues, but of course since the PR is quite old, we can't see what the error was (it reports 404 for the checkstyle report page).
I guess it's not that important either.

This isn't working. I added what you suggested, but still the same error.

Strange, I think this one should hopefully work:

--- a/hadoop-client-modules/hadoop-client-api/pom.xml
+++ b/hadoop-client-modules/hadoop-client-api/pom.xml
@@ -126,6 +126,12 @@
                         <exclude>org/apache/hadoop/yarn/client/api/package-info.class</exclude>
                       </excludes>
                     </filter>
+                    <filter>
+                      <artifact>org.apache.hadoop:*</artifact>
+                      <excludes>
+                        <exclude>org/apache/hadoop/hdfs/server/protocol/package-info.class</exclude>
+                      </excludes>
+                    </filter>
                   </filters>
                   <relocations>
                     <relocation>

hadoop-yetus · 2023-03-10T04:53:42Z

💔 -1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	1m 1s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+0 🆗	detsecrets	0m 0s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 2 new or modified test files.
			_ trunk Compile Tests _
+1 💚	mvninstall	38m 29s		trunk passed
+1 💚	compile	1m 25s		trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚	compile	1m 20s		trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚	checkstyle	1m 10s		trunk passed
+1 💚	mvnsite	1m 30s		trunk passed
+1 💚	javadoc	1m 9s		trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚	javadoc	1m 35s		trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚	spotbugs	3m 27s		trunk passed
+1 💚	shadedclient	22m 52s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	1m 22s		the patch passed
+1 💚	compile	1m 18s		the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚	javac	1m 18s		the patch passed
+1 💚	compile	1m 12s		the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚	javac	1m 12s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
-0 ⚠️	checkstyle	0m 53s	/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 79 unchanged - 0 fixed = 81 total (was 79)
+1 💚	mvnsite	1m 19s		the patch passed
+1 💚	javadoc	0m 51s		the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚	javadoc	1m 28s		the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚	spotbugs	3m 16s		the patch passed
+1 💚	shadedclient	22m 30s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
-1 ❌	unit	241m 30s	/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 50s		The patch does not generate ASF License warnings.
		348m 18s

Reason	Tests
Failed junit tests	hadoop.hdfs.TestRollingUpgrade

Subsystem	Report/Notes
Docker	ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5460/3/artifact/out/Dockerfile
GITHUB PR	#5460
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname	Linux 8bfa574c9407 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `9604ad2`
Default Java	Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5460/3/testReport/
Max. process+thread count	2909 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5460/3/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

sodonnel · 2023-03-10T10:21:08Z

Strange, I think this one should hopefully work:

--- a/hadoop-client-modules/hadoop-client-api/pom.xml
+++ b/hadoop-client-modules/hadoop-client-api/pom.xml
@@ -126,6 +126,12 @@
                         <exclude>org/apache/hadoop/yarn/client/api/package-info.class</exclude>
                       </excludes>
                     </filter>
+                    <filter>
+                      <artifact>org.apache.hadoop:*</artifact>
+                      <excludes>
+                        <exclude>org/apache/hadoop/hdfs/server/protocol/package-info.class</exclude>
+                      </excludes>
+                    </filter>
                   </filters>
                   <relocations>
                     <relocation>

@virajjasani Thanks! That has fixed the compile. I just pushed the change, so hopefully we will get a green build this time.

hadoop-yetus · 2023-03-10T17:52:39Z

💔 -1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	1m 2s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 1s		codespell was not available.
+0 🆗	detsecrets	0m 1s		detect-secrets was not available.
+0 🆗	xmllint	0m 1s		xmllint was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 2 new or modified test files.
			_ trunk Compile Tests _
+0 🆗	mvndep	15m 25s		Maven dependency ordering for branch
+1 💚	mvninstall	28m 32s		trunk passed
+1 💚	compile	25m 37s		trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚	compile	21m 52s		trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚	checkstyle	4m 9s		trunk passed
+1 💚	mvnsite	2m 13s		trunk passed
+1 💚	javadoc	1m 45s		trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚	javadoc	2m 9s		trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+0 🆗	spotbugs	0m 30s		branch/hadoop-client-modules/hadoop-client-api no spotbugs output file (spotbugsXml.xml)
+1 💚	shadedclient	23m 1s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+0 🆗	mvndep	0m 24s		Maven dependency ordering for patch
+1 💚	mvninstall	4m 35s		the patch passed
+1 💚	compile	24m 38s		the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚	javac	24m 38s		the patch passed
+1 💚	compile	21m 52s		the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚	javac	21m 52s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	3m 52s		the patch passed
+1 💚	mvnsite	2m 8s		the patch passed
+1 💚	javadoc	1m 38s		the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚	javadoc	2m 14s		the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+0 🆗	spotbugs	0m 29s		hadoop-client-modules/hadoop-client-api has no data from spotbugs
+1 💚	shadedclient	22m 49s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
-1 ❌	unit	225m 51s	/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs in the patch passed.
+1 💚	unit	0m 36s		hadoop-client-api in the patch passed.
+1 💚	asflicense	1m 1s		The patch does not generate ASF License warnings.
		451m 39s

Reason	Tests
Failed junit tests	hadoop.hdfs.server.datanode.TestDirectoryScanner
	hadoop.hdfs.TestRollingUpgrade
	hadoop.hdfs.server.namenode.TestFsck

Subsystem	Report/Notes
Docker	ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5460/4/artifact/out/Dockerfile
GITHUB PR	#5460
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle
uname	Linux 0c2731d15925 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `0b4fe8e`
Default Java	Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5460/4/testReport/
Max. process+thread count	2266 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs hadoop-client-modules/hadoop-client-api U: .
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5460/4/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

virajjasani · 2023-03-10T17:55:39Z

The reported failures are just known flakes, not related.

+1 (non-binding) from my side, thank you @sodonnel @Hexiaoqiao

sodonnel · 2023-03-10T17:57:46Z

@Hexiaoqiao Are you happy with the latest revision to give a +1 so I can commit? Thanks.

Hexiaoqiao

LGTM. +1 from my side. The failed unit tests seems not related with this PR.
Thanks @sodonnel and @virajjasani for your works.

#5460) (cherry picked from commit ca6f5af) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockReportLease.java

#5460) (cherry picked from commit d7b89d0) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockReportLease.java

ayushtkn · 2023-03-14T15:09:19Z

This wasn't the fix: #5460 (comment), rather it broke the javadoc build.
Shoot a mvn clean site and find an exception like

[ERROR] /hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/package-info.java:22: error: package org.apache.hadoop.hdfs.server.protocol has already been annotated
[ERROR] @InterfaceAudience.Private
[ERROR] ^
[ERROR] java.lang.AssertionError
[ERROR] 	at com.sun.tools.javac.util.Assert.error(Assert.java:126)
[ERROR] 	at com.sun.tools.javac.util.Assert.check(Assert.java:45)
[ERROR] 	at com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:177)
[ERROR] 	at com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215)
[ERROR] 	at com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952)
[ERROR] 	at com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64)
[ERROR] 	at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876)
[ERROR] 	at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143)
[ERROR] 	at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129)
[ERROR] 	at com.sun.tools.javac.comp.Enter.complete(Enter.java:512)
[ERROR] 	at com.sun.tools.javac.comp.Enter.main(Enter.java:471)
[ERROR] 	at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78)
[ERROR] 	at com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186)

Now the original problem
That @sodonnel mentioned over here #5460 (comment)
That the enforcer is giving an exception once he adds package-info.java

  Duplicate classes:
    org/apache/hadoop/hdfs/server/protocol/package-info.class

And the fix went in the direction this enforcer has gone crazy, lets filter this file itself, but that poor fellow wasn't doing anything wrong :)

Check the file-1
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/server/protocol/package-info.java

Now your added File-2
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/package-info.java

One was already there in hdfs-client, now this got added for the same package on hdfs as well. Why same package on client and hdfs jar, I think all of us here know those reasons, so not getting into that...

@sodonnel can you delete this new package-info.java file. And we can fix the build post writing "Checkstyle warnings is irrelevant/unavoidable"

virajjasani · 2023-03-14T15:17:27Z

Or, we just need to remove IA.Private annotation from package-info.java? It seems we don't put IA.Private on any of package-info.java

virajjasani · 2023-03-14T15:18:57Z

That way in future also, if anyone makes changes in hadoop-hdfs module's hdfs/server/protocol, then jenkins won't given any redundant warning?

sodonnel · 2023-03-14T15:19:53Z

Do you want just a followup PR to remove the file, or is just removing the IA.private line enough? TBH, I just copied the contents from another package-info file.

ayushtkn · 2023-03-14T15:21:59Z

Just remove the file and the filter. no need to have dupes. many package-info must be having those IA.private.

TBH, I just copied the contents from another package-info file.

that's what all of us do :)

virajjasani · 2023-03-14T15:33:07Z

Ah, looks like we have IA annotations in some and not in others. Looks like removing might be better. Thanks!

I wonder how do we deal with it in future though? When someone makes changes only in hadoop-hdfs module's protocol package, the checkstyle failure will unnecessarily get attention?

sodonnel · 2023-03-14T15:36:45Z

But will the checkstyle not affect all future PRs? I think I have seen that happen before, where a new checkstyle issue comes in, and then future PRs are impacted by it, but I may be wrong.

I will remove the file and see what it does.

virajjasani · 2023-03-14T15:38:49Z

But will the checkstyle not affect all future PRs?

Correct, that's what I was wondering :)

ayushtkn · 2023-03-14T15:41:27Z

But will the checkstyle not affect all future PRs? I think I have seen that happen before, where a new checkstyle issue comes in, and then future PRs are impacted by it, but I may be wrong.

How? You folks mean the new PR will also show this checstyle warning? So, our checkstyle works by computing diff b/w the trunk and the patch and flags when the number changes.

if you check the warning above:

hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 79 unchanged - 0 fixed = 81 total (was 79)

There were already 79, the PR had 2 more, so that is what it flags

@sodonnel Ozone operates in different way to figure out checkstyle, may be you would have seen there, it would be true there most probably

virajjasani · 2023-03-14T16:17:20Z

How? You folks mean the new PR will also show this checstyle warning?

Correct, I am only worried about the amount of time folks will end up wasting in future for such warning (everytime new file is added or perhaps even existing file is modified under the package?):

./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/{new-file-name}.java:1:/**: Missing package-info.java file. [JavadocPackage]

As was the case with this PR:

sodonnel · 2023-03-14T19:36:43Z

Posted #5478 to address the problem

apache#5460)

HDFS-16942. Send error to datanode if FBR is rejected due to bad lease

b201655

Fix checkstyle

15851db

Hexiaoqiao reviewed Mar 8, 2023

View reviewed changes

Address review comments

9604ad2

S O'Donnell added 2 commits March 10, 2023 10:19

Fix package-info checkstyle

6d77c55

Fix further checkstyle

0b4fe8e

Hexiaoqiao approved these changes Mar 11, 2023

View reviewed changes

sodonnel merged commit ca6f5af into apache:trunk Mar 11, 2023

virajjasani mentioned this pull request Mar 15, 2023

HDFS-16942. Addendum. Send error to datanode if FBR is rejected due t… #5478

Merged

ferdelyi pushed a commit to ferdelyi/hadoop that referenced this pull request May 26, 2023

HDFS-16942. Send error to datanode if FBR is rejected due to bad lease (

739f38b

apache#5460)

HDFS-16942. Send error to datanode if FBR is rejected due to bad lease #5460

HDFS-16942. Send error to datanode if FBR is rejected due to bad lease #5460

Uh oh!

Conversation

Description of PR

How was this patch tested?

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!