HADOOP-18546. ABFS: disable purging list of in-progress reads in abfs stream close() #5176
Conversation
💔 -1 overall (automated Yetus report; five consecutive failing runs)
🎊 +1 overall (automated Yetus report)
The ReadAhead feature can be re-enabled by default, since we are undoing the known problem behind the corruption issue reported earlier. Please include that change in this PR.
I also have some comments on the tests; please take a look.
movedToCompletedList.incrementAndGet();
    return successOp;
  })
  .when(client)
The test is trying to unit-test a bigger scope: existing in-progress buffers moving to the completed list. It would be nice to scope the test to the inProgressList and freeList counts before and after close.
At this client.read() mock, I would suggest mocks that invoke a long sleep for each read. That way, after the queueReadAheads call and a 1-second sleep, 3 buffers will be stuck in the inProgressList and the freeList should show 13 free. The asserts should hold the same numbers post-close as well; a sketch of this is shown below.
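One way to realize that suggestion (a sketch only: the matcher list is illustrative rather than the real AbfsClient.read signature, and successOp/client come from the test):

```java
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.doAnswer;

// Each mocked read parks its worker thread, so the buffer stays in the
// in-progress list while the test asserts, both before and after close().
doAnswer(invocation -> {
  Thread.sleep(60_000L); // hold the ReadBufferWorker well past the asserts
  return successOp;
}).when(client).read(any(), any(), any(), any());
```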
Have taken the sleep change; assertions on the freeList are also included in the tests.
this is very brittle being timing based. normally I'd say "no" here, but I know I have a forthcoming pr which uses object.wait/notify to synchronize
https://github.com/apache/hadoop/pull/5117/files#diff-e829dbaa29faf05ae0b331439e9aec3cd02248464a097c86a0227783337b9b76R370
if this test causes problems it should do the same
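For reference, a minimal self-contained sketch of that wait/notify pattern (all names hypothetical): the mocked read blocks on a shared gate until the test releases it, removing any dependency on wall-clock sleeps.

```java
/** Hypothetical gate for synchronizing mocked reads with test asserts. */
public class ReadGate {
  private final Object gate = new Object();
  private boolean released = false;

  /** Called from the mocked client.read(): park until the test opens the gate. */
  public void awaitRelease() throws InterruptedException {
    synchronized (gate) {
      while (!released) {
        gate.wait(); // worker parks here; its buffer stays "in progress"
      }
    }
  }

  /** Called by the test once its in-progress asserts are done. */
  public void release() {
    synchronized (gate) {
      released = true;
      gate.notifyAll(); // let every blocked read run to completion
    }
  }
}
```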
Hi Steve, the sleep time on these mock threads is meant to keep each thread blocked while the test goes ahead with asserts after queuing reads, and again after close. The 1-second sleep after queueing reads (which blocks the main test thread) is consistent with the timing expectations of pre-existing tests in this class that do the same. However, I agree that this test does a lot more beyond the close that needs time synchronization, which can make it brittle.
Hi Pranav, the test asserts after line 566, starting from the 3-second sleep, validate the correct movement of in-progress buffers to the completed list and their eviction, functionality that this PR's change does not interfere with. I would suggest we take them out and evaluate whether pre-existing test coverage doesn't already handle it. If there are gaps, let's take them in a separate PR.
Thanks. I have removed the assertions on the inProgress-to-completedList movement and the eviction.
it may be slow, but at least there's no assertion that something finishes before a specific timeout. those are the tests which really have problems on slow networks/overloaded systems
 * finally should get evicted.
 */
@Test
public void testStreamPurgeDuringReadAheadCallExecutingWithSomeCompletedBuffers()
This test seems to be validating the effect of purge on the completedList. Does it validate any scenario that is not already covered by the HADOOP-17156 commit's test cases?
Also, do all the test asserts from HADOOP-17156 still hold after this PR's change preventing the in-progress list purge?
Commit 69e50c7 (HADOOP-18528. Disable abfs prefetching by default (#5134)) is reverted in this PR in commit 02d39ca.
Changes and simpler tests look okay to me. Approving once the Yetus build results are here.
+1. LGTM
🎊 +1 overall (automated Yetus report)
+1 on the production code; a bit of minor tuning on the tests.
Normally I would reject the test on the basis that there are too many sleep() calls in it. Not only will this make the test slow, but relying on time to synchronise makes for incredibly brittle tests which fail intermittently, especially on scale runs when a machine is overloaded. We know about those, don't we?
However, my larger work on the same issue should obsolete those tests, and there I am using explicit Java object synchronisation. As long as we consider this an intermediate release, I will be okay with the delays.
Just put the values as named constants in the file rather than embedding them across the code, using _ to separate thousands and a capital L for long (avoids 1/l confusion). I have also suggested better AssertJ assertions. AssertJ is very powerful if you use its more sophisticated assertions -I am still learning how to use them properly myself.
Note that while working on my patch I managed to trigger an NPE in the readRemote call after the stream is closed...some buffer in the stream being freed. This is why there are now some more checkNotClosed() probes before and after the http read in https://github.com/apache/hadoop/pull/5117/files#diff-087a22909e9480806e19673301a3dec896a7e4e97ad7fb058c66f844d676ee47
This patch is going to potentially encounter those failures. But it doesn't really matter, does it? Because a stream which has closed is not going to be reading the prefetched buffers and so will never see the NPEs. For this reason I don't believe we need to make the same change here.
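To illustrate the two style points (the constant names here are made up for the sketch):

```java
// Named constants instead of literals embedded across the code:
// _ separates thousands, capital L marks the long (avoids 1/l confusion).
private static final long WORKER_SLEEP_MILLIS = 10_000L;
private static final int EXPECTED_FREE_BUFFERS = 13;

// AssertJ's richer assertions print the collection contents on failure:
Assertions.assertThat(readBufferManager.getFreeListCopy())
    .describedAs("free list after close")
    .hasSize(EXPECTED_FREE_BUFFERS);
```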
@@ -82,6 +84,16 @@ public class TestAbfsInputStream extends
    REDUCED_READ_BUFFER_AGE_THRESHOLD * 10; // 30 sec
private static final int ALWAYS_READ_BUFFER_SIZE_TEST_FILE_SIZE = 16 * ONE_MB;

@After
public void afterTest() throws InterruptedException {
override teardown() and call the superclass; that way you know the order of things happening
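A sketch of that override, assuming the JUnit base class exposes teardown() the way AbstractAbfsIntegrationTest does (the reset helper is hypothetical):

```java
@Override
public void teardown() throws Exception {
  // clean up this test's ReadBufferManager state first, then run the
  // superclass cleanup, so the ordering of the two is explicit
  resetReadBufferManager(); // hypothetical helper
  super.teardown();
}
```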
Have taken it in the new revision.
@After
public void afterTest() throws InterruptedException {
  //thread wait so that previous test's inProgress buffers are processed and removed.
  Thread.sleep(10000l);
don't like this as it potentially adds 10s to a test run, and could still be a bit flaky.
what about using testResetReadBufferManager()?
Have refactored in the new revision:
- Override of teardown()
- Usage of testResetReadBufferManager
    = ReadBufferManager.getBufferManager();

//Sleeping to give ReadBufferWorker to pick the readBuffers for processing.
Thread.sleep(1000l);
1_000L
        inputStream))
    .describedAs("InProgressList should have 3 elements")
    .isEqualTo(3);
Assertions.assertThat(readBufferManager.getFreeListCopy().size())
use .hasSize(13) in the assert, so assertj will provide info about the list if there's a mismatch
        readBufferManager.getCompletedReadListCopy(), inputStream))
    .describedAs("CompletedList should have 0 elements")
    .isEqualTo(0);
Assertions.assertThat(readBufferManager.getFreeListCopy().size())
use .hasSize(13)
        readBufferManager.getCompletedReadListCopy(), inputStream))
    .describedAs("CompletedList should have 0 elements")
    .isEqualTo(0);
Assertions.assertThat(readBufferManager.getFreeListCopy().size())
use .hasSize()
private int getStreamRelatedBufferCount(final List<ReadBuffer> bufferList,
    final AbfsInputStream inputStream) {
  int count = 0;
prefer java8 streaming:

bufferList.stream()
    .filter(buffer -> buffer.getStream() == inputStream)
    .count()
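Under that suggestion the whole helper collapses to (a sketch using the test's existing types):

```java
private int getStreamRelatedBufferCount(final List<ReadBuffer> bufferList,
    final AbfsInputStream inputStream) {
  // count only the buffers owned by the stream under test
  return (int) bufferList.stream()
      .filter(buffer -> buffer.getStream() == inputStream)
      .count();
}
```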
…ndency on thread.sleep for synchronization.^
…ove dependency on thread.sleep for synchronization.^" This reverts commit fc833f2.
…ch has been done for existing test.
:::: AGGREGATED TEST RESULT ::::
HNS-OAuth [INFO] Results:
HNS-SharedKey [INFO] Results:
NonHNS-SharedKey [INFO] Results:
AppendBlob-HNS-OAuth [INFO] Results:
Time taken: 40 mins 46 secs.
🎊 +1 overall (automated Yetus report; three consecutive runs)
happy with all the production code; just tuning tests.
now, has anyone tried a spark standalone cluster with a build of hadoop trunk, first without and then with this patch, to verify all is good there?
i can do this, but it is good for others to try too
Hi Steve, unfortunately we didn't get to the standalone cluster checks. We will try the setup and test on Monday. Thanks.
sorry, should have been clearer: a local spark build and spark-shell process is ideal for replication and validation -as all splits are processed in different worker threads in that process, it recreates the exact failure mode. here's a script you can take and tune for your system; it uses the mkcsv command in the cloudstore JAR. I am going to add this as a scalatest suite in the same module.
🎊 +1 overall (automated Yetus report)
Thanks for the script. I applied the following changes on it: saxenapranav/cloud-integration@1d779f2. With trunk's jar, I got exceptions:
Using the jar with the PR's code, I ran it multiple times:
Commands executed:
one final change; the cleanup of the input stream in the test.
giving a +1 pending that, and I'm going to test this through spark today ... writing a test to replicate the failure and then verify that all is good when the jar is updated
Assertions.assertThat(readBufferManager.getFreeListCopy())
    .describedAs("FreeList should have " + freeListBufferCount + " elements")
    .hasSize(13);
you can actually use String.format patterns here; most relevant for on-demand toString calls, which are more expensive. I'm not worrying about it here though.
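For example, AssertJ's describedAs(String, Object...) overload takes the pattern and arguments directly, and only renders the description when a failure is reported:

```java
Assertions.assertThat(readBufferManager.getFreeListCopy())
    .describedAs("FreeList should have %d elements", freeListBufferCount)
    .hasSize(freeListBufferCount);
```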
Thanks. We are doing inputStream.close() at https://github.com/apache/hadoop/pull/5176/files#diff-bdc464e1bfa3d270e552bdf740fc29ec808be9ab2c4f77a99bf896ac605a5698R546. Kindly advise what is expected from the inputStream cleanup. I agree with the comment on String.format; I shall refactor the code accordingly. Regards.
    .describedAs("CompletedList should have 0 elements")
    .hasSize(0);

inputStream.close();
the problem with the close() here is that it will only be reached if the assertions hold. if anything goes wrong, an exception is raised and the stream kept open, with whatever resources it consumes.
it should be closed in a finally block, or the stream opened in a try-with-resources clause. thanks
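For example (a sketch reusing the test's names; the open and queue helpers are assumed):

```java
// try-with-resources guarantees close() runs even if an assertion throws
try (AbfsInputStream inputStream = getAbfsInputStream(client, fileName)) {
  queueReadAheads(inputStream); // hypothetical helper
  // ... asserts on inProgressList / freeList / completedList ...
}
```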
clarified the cleanup problem
💔 -1 overall (automated Yetus report)
🎊 +1 overall (automated Yetus report)
+1.
approved. I'm going to cherry-pick into 3.3, run the tests and see how that merges. If you can do the PR and test, that'd be good too.
I've been working on real tests through spark; problems building spark with hadoop trunk mean i can't really test until the fix is in a version of 3.3 with the PR. plan:
- tests to show the bug (WiP) on branch-3.3
- verify it goes away with the PR
getting a test failure locally, ITestReadBufferManager failing as one of its asserts isn't valid. going to reopen the jira
it's a race condition in the test, which is why you didn't see it...different machine, network etc.
Thanks. I am checking on it.
update: full e2e tests through the spark shell are happy! i was trying to write scalatest tests for this but haven't been able to replicate the test failure through my test suite (which rebuilds the .csv file every run, so was also v. slow). with manual tests passing and #5198 in, all is good.
… stream close() (#5176)

This addresses HADOOP-18521, "ABFS ReadBufferManager buffer sharing across concurrent HTTP requests", by not trying to cancel in-progress reads. It supersedes HADOOP-18528, which disables the prefetching. If that patch is applied *after* this one, prefetching will be disabled.

As well as changing the default value in the code, core-default.xml is updated to set fs.azure.enable.readahead = true. As a result, if Configuration.get("fs.azure.enable.readahead") returns a non-null value, then it can be inferred that it was set either in core-default.xml (the fix is present) or in core-site.xml (someone asked for it).

Note: this commit contains the followup commit HADOOP-18546. Followup: ITestReadBufferManager fix (#5198), which is needed to avoid race conditions in the test.

Contributed by Pranav Saxena.
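That inference can be written directly (a sketch; only the config key comes from the commit message):

```java
import org.apache.hadoop.conf.Configuration;

/** Sketch: true when fs.azure.enable.readahead is set anywhere. */
static boolean readaheadKeyIsSet(Configuration conf) {
  // null  => neither core-default.xml nor core-site.xml sets the key: fix absent.
  // value => set by core-default.xml (fix present) or core-site.xml (user choice).
  return conf.get("fs.azure.enable.readahead") != null;
}
```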
(oh, and on my personal backport I have added a TRACE log in the buffer manager to record its state; abfsInputStream.toString does it too.
think i will retain those internally for a debug option)
#5205 is another followup with the logging and a probe through path capabilities; this allows me to verify that backports are in. an abfs instance is vulnerable if fs.hasPathCapability("fs.capability.paths.acls") && !fs.hasPathCapability("HADOOP-18546"). if that holds, then you need to make sure readahead is disabled/no queue depth. setting queue depth is the one guaranteed to work everywhere.
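In code, that probe might read as follows (a sketch: the capability names are quoted from the comment above, and fs.azure.readaheadqueue.depth is assumed to be the relevant ABFS queue-depth key):

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Sketch: disable ABFS prefetching when the build lacks the HADOOP-18546 fix. */
static void guardAgainstPrefetchBug(FileSystem fs, Configuration conf)
    throws IOException {
  Path root = new Path("/");
  boolean vulnerable = fs.hasPathCapability(root, "fs.capability.paths.acls")
      && !fs.hasPathCapability(root, "HADOOP-18546");
  if (vulnerable) {
    // zero queue depth is the mitigation guaranteed to work everywhere
    conf.setInt("fs.azure.readaheadqueue.depth", 0);
  }
}
```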
Backport HADOOP-18546. ABFS:disable purging list of in progress reads in abfs stream closed apache#5176
JIRA: https://issues.apache.org/jira/browse/HADOOP-18546
Details:
AbfsInputStream.close() can trigger the return of buffers used for active prefetch GET requests into the ReadBufferManager free buffer pool.
A subsequent prefetch by a different stream in the same process may acquire this same buffer. This risks corrupting that stream's own prefetched data, data which may then be returned to that other thread.
Parent JIRA: https://issues.apache.org/jira/browse/HADOOP-18521
In this PR, we disable the purging of the inProgressList. The read buffers in the inProgressList will be picked up by a ReadBufferWorker, processed, and finally moved into the completedList. After thresholdAgeMilliseconds, the readBuffer is evicted (https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/ReadBufferManager.java#L280-L285).
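A sketch of the resulting behaviour (not the exact patch; names follow ReadBufferManager's conventions):

```java
// On stream close: discard queued-but-unstarted reads and this stream's
// completed buffers, but leave in-progress reads alone. They complete
// normally, land in the completedList, and are evicted once older than
// thresholdAgeMilliseconds.
synchronized void purgeBuffersForStream(AbfsInputStream stream) {
  readAheadQueue.removeIf(buffer -> buffer.getStream() == stream);
  completedReadList.removeIf(buffer -> buffer.getStream() == stream);
  // inProgressList is deliberately not touched (HADOOP-18546)
}
```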
Commit 69e50c7 (HADOOP-18528. Disable readAhead/prefetching by default, #5134) is reverted in this PR in commit 02d39ca.
Config "fs.azure.enable.readahead" has value "true" in hadoop-common/src/main/resources/core-default.xml.
Testing:
HNS-OAuth
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[INFO]
[ERROR] Tests run: 109, Failures: 1, Errors: 0, Skipped: 1
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:329 » TestTimedOut test timed o...
[ERROR] ITestAzureBlobFileSystemOauth.testBlobDataContributor:84 » AccessDenied Operat...
[ERROR] ITestAzureBlobFileSystemOauth.testBlobDataReader:143 » AccessDenied Operation ...
[INFO]
[ERROR] Tests run: 566, Failures: 0, Errors: 3, Skipped: 98
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] ITestAbfsTerasort.test_120_terasort:262->executeStage:206 » IO The ownership o...
[INFO]
[ERROR] Tests run: 335, Failures: 0, Errors: 1, Skipped: 54
HNS-SharedKey
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[INFO]
[ERROR] Tests run: 109, Failures: 1, Errors: 0, Skipped: 2
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] ITestAzureBlobFileSystemRandomRead.testSkipBounds:218->Assert.assertTrue:42->Assert.fail:89 There should not be any network I/O (elapsedTimeMs=32).
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:329 » TestTimedOut test timed o...
[INFO]
[ERROR] Tests run: 566, Failures: 1, Errors: 1, Skipped: 54
[INFO] Results:
[INFO]
[WARNING] Tests run: 335, Failures: 0, Errors: 0, Skipped: 41
NonHNS-SharedKey
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] TestAbfsInputStream.testStreamPurgeDuringReadAheadCallExecuting:590 [CompletedList should have 0 elements] expected:<[0]> but was:<[2]>
[ERROR] TestAbfsInputStream.testStreamPurgeDuringReadAheadCallExecutingWithSomeCompletedBuffers:686 [CompletedList should have 0 elements] expected:<[0]> but was:<[2]>
[INFO]
[ERROR] Tests run: 109, Failures: 3, Errors: 0, Skipped: 2
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:344->lambda$testAcquireRetry$6:345 » TestTimedOut
[INFO]
[ERROR] Tests run: 566, Failures: 0, Errors: 1, Skipped: 276
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] ITestAbfsTerasort.test_110_teragen:244->executeStage:211->Assert.assertEquals:647->Assert.failNotEquals:835->Assert.fail:89 teragen(1000, abfs://testcontainer@pranavsaxenanonhns.dfs.core.windows.net/ITestAbfsTerasort/sortin) failed expected:<0> but was:<1>
[ERROR] Errors:
[ERROR] ITestAbfsJobThroughManifestCommitter.test_0420_validateJob » OutputValidation ...
[ERROR] ITestAbfsManifestCommitProtocol.testCommitLifecycle » OutputValidation abfs:/...
[ERROR] ITestAbfsManifestCommitProtocol.testCommitterWithDuplicatedCommit » OutputValidation
[ERROR] ITestAbfsManifestCommitProtocol.testConcurrentCommitTaskWithSubDir » OutputValidation
[ERROR] ITestAbfsManifestCommitProtocol.testMapFileOutputCommitter » OutputValidation ...
[ERROR] ITestAbfsManifestCommitProtocol.testOutputFormatIntegration » OutputValidation
[ERROR] ITestAbfsManifestCommitProtocol.testParallelJobsToAdjacentPaths » OutputValidation
[ERROR] ITestAbfsManifestCommitProtocol.testTwoTaskAttemptsCommit » OutputValidation
[INFO]
[ERROR] Tests run: 335, Failures: 1, Errors: 8, Skipped: 46
AppendBlob-HNS-OAuth
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] TestAbfsInputStream.testStreamPurgeDuringReadAheadCallExecuting:590 [CompletedList should have 0 elements] expected:<[0]> but was:<[2]>
[INFO]
[ERROR] Tests run: 109, Failures: 2, Errors: 0, Skipped: 1
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:329 » TestTimedOut test timed o...
[ERROR] ITestAzureBlobFileSystemOauth.testBlobDataContributor:84 » AccessDenied Operat...
[ERROR] ITestAzureBlobFileSystemOauth.testBlobDataReader:143 » AccessDenied Operation ...
[INFO]
[ERROR] Tests run: 566, Failures: 0, Errors: 3, Skipped: 98
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] ITestAbfsTerasort.test_120_terasort:262->executeStage:206 » IO The ownership o...
[INFO]
[ERROR] Tests run: 335, Failures: 0, Errors: 1, Skipped: 54