-
Notifications
You must be signed in to change notification settings - Fork 9k
HDFS-16911. Distcp with snapshot diff to support Ozone filesystem. #5364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- all new methods and fields need a javadoc
- what new tests do you propose? the simple one is "how to show the dynamic binding is used"
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpConstants.java
Outdated
Show resolved
Hide resolved
This comment was marked as outdated.
This comment was marked as outdated.
… call respective fs impl
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not going to comment on the snapshot code itself.
I'm worried that removing a constant and the ability to set a new sync class will break something.
Do you know if anyone has actually used this feature?
...ect/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/SnapshotDiffReport.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpConstants.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java
Outdated
Show resolved
Hide resolved
// can expect other exceptions as source and target paths | ||
// are not created, assert if the exception is not arising | ||
// due to the filesystem not supporting snapshots. | ||
Assert.assertFalse(e.getMessage().contains("does not support snapshot")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you expecting an exception here? if so, intercept() has an overload where you declare text to look for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I'm checking that a particular exception i.e exception stating that 'fs doesn't support snapshots' is not thrown, The intercept utility method checks whether the caught exception has a particular text, I think there is no util method to check its negation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok.
Use GenericTestUtils.assertExceptionContains(does not support snapshot", e)
so on a failure, the unexpected exception message and stack is not lost. whoever has to debug a test failure from an html test report will love you for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After changing it to throw UnsupportedOperationException it became easier, The failure scenario for this test case is only if it gets UnsupportedException. It can expect other exceptions.
catch (UnsupportedOperationException e) {
Assert.fail("Dummy FS supports snapshots," +
"did not expect UnsupportedOperationException");
} catch (Exception e) {
// can expect other exceptions as source and target paths
// are not created.
}
Thanks @steveloughran for reviewing this PR. This constant is not present in the hadoop code but was added as part of a previous commit of this same PR, I'm no longer using that approach of setting the distcp.sync.class that I proposed earlier , instead just assigning the correct filesystem dynamically at run time did the job for me. |
thanks for the update; nothing to worry about. I was just concerned if some app was using distcp as a library (hive,...) and this would have broken it. |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
} | ||
try { | ||
getSnapshotDiffReportMethod(srcFs); | ||
} catch (NoSuchMethodException e) { | ||
throw new IllegalArgumentException( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to UnsupportedOperationException
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -161,20 +161,32 @@ private void checkFilesystemSupport(Path sourceDir, Path targetDir, | |||
if (!srcFs.hasPathCapability(sourceDir, | |||
CommonPathCapabilities.FS_SNAPSHOTS)) { | |||
throw new IllegalArgumentException( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UnsupportedOperationException
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -1305,7 +1305,7 @@ public void testSyncSnapshotDiffWithDummyFileSystem() throws Exception { | |||
new DistCp(conf, OptionsParser.parse(args)).execute(); | |||
} catch (Exception e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. catch UnsupportedOperationException
after tightening the exception raised
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
// can expect other exceptions as source and target paths | ||
// are not created, assert if the exception is not arising | ||
// due to the filesystem not supporting snapshots. | ||
Assert.assertFalse(e.getMessage().contains("does not support snapshot")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok.
Use GenericTestUtils.assertExceptionContains(does not support snapshot", e)
so on a failure, the unexpected exception message and stack is not lost. whoever has to debug a test failure from an html test report will love you for this
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
@@ -1303,12 +1303,13 @@ public void testSyncSnapshotDiffWithDummyFileSystem() throws Exception { | |||
FileSystem dummyFs = FileSystem.get(URI.create("dummy:///"), conf); | |||
Assert.assertTrue(dummyFs instanceof DummyFs); | |||
new DistCp(conf, OptionsParser.parse(args)).execute(); | |||
} catch (UnsupportedOperationException e) { | |||
Assert.fail("Dummy FS supports snapshots," + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you just rethrow e here, then the caller gets a full stack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
"dummy:///target" }; | ||
try { | ||
FileSystem dummyFs = FileSystem.get(URI.create("dummy:///"), conf); | ||
Assert.assertTrue(dummyFs instanceof DummyFs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not very informative. I'd prefer an assertJ assertion that the class is an instance of,
assertThat(dummyfs).isInstanceOf()...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
This comment was marked as outdated.
This comment was marked as outdated.
fs, snapshotDir, fromSnapshot, toSnapshot); | ||
} catch (InvocationTargetException e) { | ||
throw new IOException(e.getCause()); | ||
} catch (NoSuchMethodException|IllegalAccessException e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NoSuchMethod sanity check already covered looks like before calling this API. IllegalAccess should noyt happen ideally. So runtime should be ok.
Tiny Nit: Message: "Failed to ...."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @umamaheswararao for the review.
Done. Yes the exception will likely be detected earlier but these are checked exceptions so it requires either to define in catch
block or throws
.
This patch looks good to me. If @steveloughran agree that his comments addressed, we can proceed to commit. Thanks |
This comment was marked as outdated.
This comment was marked as outdated.
If no other comments, i will proceed to merge this. Thanks |
…pache#5364) (cherry picked from commit 74ddf69)
Description of PR
Currently in
DistcpSync
i.e the step which applies the diff b/w 2 provided snapshots as arguments to the distcp job with-diff
option, onlyDistributedFilesystem
andWebHDFS
filesystems are supported.Now that Ozone supports snapshots, the change here is to make DistcpSync also support OzoneFilesystem (ofs).
The change checks if the filesystem suports fs.hasPathCapability for snapshots and any HCFS that supports snapshots can override hasPathCapability and return true if it does.
Jira :https://issues.apache.org/jira/browse/HDFS-16911
How was this patch tested?
Unit tests
Tested by building hadoop code and replacing distcp jars in Ozone via Unit tests in Ozone.
Also tested on a cluster.