8000 HADOOP-18291. S3A prefetch - Implement thread-safe LRU cache for SingleFilePerBlockCache by virajjasani · Pull Request #5754 · apache/hadoop · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

HADOOP-18291. S3A prefetch - Implement thread-safe LRU cache for SingleFilePerBlockCache #5754

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Jul 14, 2023

Conversation

virajjasani
Copy link
Contributor

Jira: HADOOP-18291

@virajjasani
Copy link
Contributor Author

us-west-2:

$ mvn clean verify -Dparallel-tests -DtestsThreadCount=8 -Dscale -Dprefetch

$ mvn clean verify -Dparallel-tests -DtestsThreadCount=8 -Dscale

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 39s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 18m 1s Maven dependency ordering for branch
+1 💚 mvninstall 22m 54s trunk passed
+1 💚 compile 18m 37s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 16m 16s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 4m 25s trunk passed
+1 💚 mvnsite 2m 47s trunk passed
+1 💚 javadoc 2m 12s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 56s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 17s trunk passed
+1 💚 shadedclient 22m 41s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 33s Maven dependency ordering for patch
+1 💚 mvninstall 1m 29s the patch passed
+1 💚 compile 16m 29s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 16m 29s the patch passed
+1 💚 compile 16m 21s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 javac 16m 21s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 4m 19s the patch passed
+1 💚 mvnsite 2m 48s the patch passed
+1 💚 javadoc 2m 5s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 56s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 21s the patch passed
+1 💚 shadedclient 22m 55s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 19m 15s hadoop-common in the patch passed.
+1 💚 unit 2m 52s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 17s The patch does not generate ASF License warnings.
218m 36s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/1/artifact/out/Dockerfile
GITHUB PR #5754
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux e294eb7de530 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 282ddd1
Default Java Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/1/testReport/
Max. process+thread count 2580 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@virajjasani
Copy link
Contributor Author

@mukund-thakur @mehakmeet could you please review this PR?

Copy link
Contributor
@mehakmeet mehakmeet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did an initial review at first glance. Looks good. Need to properly understand the locking and if there are any scenarios we need to think of and would review the tests next.

private Entry head;

/**
* Tail of the lined list.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: "linked"

/**
* Prefetch max blocks count config.
*/
public static final String FS_PREFETCH_MAX_BLOCKS_COUNT = "fs.prefetch.max.blocks.count";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a Constants class where we can move this to?

this.maxBlocksCount =
conf.getInt(FS_PREFETCH_MAX_BLOCKS_COUNT, DEFAULT_FS_PREFETCH_MAX_BLOCKS_COUNT);
Preconditions.checkArgument(this.maxBlocksCount > 0,
"prefetch blocks total capacity should be more than 0");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Include the property name in the error message by which we can set this to a valid value

Comment on lines 338 to 342
if (tail != null) {
while (tail.getNext() != null) {
tail = tail.getNext();
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a bit about this part, not able to get why this is needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, let's say:

head -> 1, tail -> 2

new block: 3
so, we need to make: 3 -> 1 -> 2
i.e. new head -> 3, tail -> 2

new block: 4
updated list: 4 -> 3 -> 1 -> 2

now let's say input stream accesses block 2, hence block 2 needs to be put at the head.
new list should be: 2 -> 4 -> 3 -> 1

we change head to 2 and we also update next pointer of block 1
however, if we don't update tail (L322-L326), we will not be able to move tail to block 1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, but when we are adding the node to the head, doesn't it make more sense to check if the current node is tail, get the previous of this, and set that to tail? This would work, was just interested if we can avoid traversing the list 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, nice optimization for sure, let me double check this again, i ran through multiple test iterations and somehow found that this works for sure but let me check if the optimization works (i think it should but i am just wondering if i am missing some cases).

Thank you btw @mehakmeet !!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i applied this patch temporarily to debug further but somehow head and tail are getting screwed up:

diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/impl/prefetch/SingleFilePerBlockCache.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/impl/prefetch/SingleFilePerBlockCache.java
index ef685b54d30..1aad82ff9c2 100644
--- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/impl/prefetch/SingleFilePerBlockCache.java
+++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/impl/prefetch/SingleFilePerBlockCache.java
@@ -306,6 +306,7 @@ private void addToHeadOfLinkedList(Entry entry) {
           "Block num {} to be added to the head. Current head block num: {} and tail block num: {}",
           entry.blockNumber, head.blockNumber, tail.blockNumber);
       if (entry != head) {
+        boolean isEntryTail = entry == tail;
         Entry prev = entry.getPrevious();
         Entry nxt = entry.getNext();
         if (prev != null) {
@@ -318,10 +319,8 @@ private void addToHeadOfLinkedList(Entry entry) {
         entry.setNext(head);
         head.setPrevious(entry);
         head = entry;
-      }
-      if (tail != null) {
-        while (tail.getNext() != null) {
-          tail = tail.getNext();
+        if (isEntryTail) {
+          tail = prev;
         }
       }
     } finally {

however, somehow after eviction, the head and tail are getting screwed up, still trying to understand what is going wrong and why this patch would not work.
but i hope you were suggestion change like this one, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it works with slight modification, let me push the change

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, happy with the new change I think. Would be good to explicitly test certain tail changing scenarios in the IT like you mentioned above if we are not already doing it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, i was actually thinking about it but there is just not clean way of doing so, hence what i have been able to do so far was by "logging" head and tail nodes (as you also mentioned earlier) with all other nodes, so that i could track the exact nodes sequence. that's the best way i could find so far, but extracting that info in IT is really difficult (if we were to do it in clean way).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I was thinking, would it be possible via a simple UT as well, where we pass in the entries as we desire and access them in our preferences to test functionality, might be easier if we directly test the LRU logic than via the stream.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice idea, it might be more beneficial for UT to test this.

i am also planning to refactor Entry class on it's own new class rather than as an inner class of SingleFilePerBlockCache as part of next follow-up sub-task. once we do that, then it might be even more easier to write some UT to directly access head, tail pointers.

sorry, i was thinking this as sub-task so maybe adding UT can also be done with sub-task. does that sound good?

elementToPurge.takeLock(Entry.LockType.WRITE, PREFETCH_WRITE_LOCK_TIMEOUT,
PREFETCH_WRITE_LOCK_TIMEOUT_UNIT);
if (!lockAcquired) {
LOG.error("Cache file {} deletion would not be attempted as write lock could not"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, there can be a scenario where the current cache exceeds its normal capacity? Is 5 seconds enough time? or are we okay with this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we are already using 5s at other place also (PREFETCH_WRITE_LOCK_TIMEOUT), used it here as well but happy to change it in future as/if we encounter some problem with this, does that sound good?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we are okay with things not blowing up if eviction is not successful, are we okay with it? Can this hurt in the long run?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should be okay, in fact we have same logic for input stream close as well, if eviction or removal of disk block is unsuccessful, we are leaving them with a fat warning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

if eviction misses it, stream close would be able to clean it up.
if stream close misses it, then it stays on disk and we might eventually also come up with some "file last accessed" based check and maybe some crons removing them eventually. not a bad idea IMO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, sounds good

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 42s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 17m 53s Maven dependency ordering for branch
+1 💚 mvninstall 20m 55s trunk passed
+1 💚 compile 17m 15s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 16m 10s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 4m 27s trunk passed
+1 💚 mvnsite 2m 52s trunk passed
+1 💚 javadoc 2m 15s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 54s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 13s trunk passed
+1 💚 shadedclient 22m 51s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 33s Maven dependency ordering for patch
+1 💚 mvninstall 1m 27s the patch passed
+1 💚 compile 16m 9s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 16m 9s the patch passed
+1 💚 compile 16m 7s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 javac 16m 7s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 4m 21s the patch passed
+1 💚 mvnsite 2m 43s the patch passed
+1 💚 javadoc 2m 5s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 55s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
-1 ❌ spotbugs 2m 52s /new-spotbugs-hadoop-common-project_hadoop-common.html hadoop-common-project/hadoop-common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
+1 💚 shadedclient 22m 33s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 19m 11s hadoop-common in the patch passed.
+1 💚 unit 2m 53s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 16s The patch does not generate ASF License warnings.
213m 51s
Reason Tests
SpotBugs module:hadoop-common-project/hadoop-common
Nullcheck of SingleFilePerBlockCache.tail at line 322 of value previously dereferenced in org.apache.hadoop.fs.impl.prefetch.SingleFilePerBlockCache.addToHeadOfLinkedList(SingleFilePerBlockCache$Entry) At SingleFilePerBlockCache.java:322 of value previously dereferenced in org.apache.hadoop.fs.impl.prefetch.SingleFilePerBlockCache.addToHeadOfLinkedList(SingleFilePerBlockCache$Entry) At SingleFilePerBlockCache.java:[line 307]
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/2/artifact/out/Dockerfile
GITHUB PR #5754
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux e5f152550d4e 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 96b483c
Default Java Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/2/testReport/
Max. process+thread count 3011 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 37s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 17m 43s Maven dependency ordering for branch
+1 💚 mvninstall 20m 31s trunk passed
+1 💚 compile 17m 14s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 16m 16s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 4m 26s trunk passed
+1 💚 mvnsite 2m 52s trunk passed
+1 💚 javadoc 2m 8s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 52s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 11s trunk passed
+1 💚 shadedclient 22m 37s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 32s Maven dependency ordering for patch
+1 💚 mvninstall 1m 30s the patch passed
+1 💚 compile 16m 21s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 16m 21s the patch passed
+1 💚 compile 16m 17s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 javac 16m 17s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 4m 15s the patch passed
+1 💚 mvnsite 2m 50s the patch passed
+1 💚 javadoc 2m 3s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 56s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 19s the patch passed
+1 💚 shadedclient 23m 16s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 19m 12s hadoop-common in the patch passed.
+1 💚 unit 2m 52s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 16s The patch does not generate ASF License warnings.
213m 51s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/3/artifact/out/Dockerfile
GITHUB PR #5754
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 31ca6914db70 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 108cb96
Default Java Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/3/testReport/
Max. process+thread count 1263 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@virajjasani virajjasani requested a review from mehakmeet June 27, 2023 16:28
Copy link
Contributor
@mehakmeet mehakmeet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Some changes in the test. What would really be good is to Test this in a multi-threaded test too, nothing major but creating an executor Service with 5 threads then trying to read and evict 3 blocks and asserting the correct IOStats and read values should be enough. This would test the locking as well.


@Override
public Configuration createConfiguration() {
Configuration conf = super.createConfiguration();
S3ATestUtils.removeBaseAndBucketOverrides(conf, PREFETCH_ENABLED_KEY);
conf.setBoolean(PREFETCH_ENABLED_KEY, true);
conf.setInt(FS_PREFETCH_MAX_BLOCKS_COUNT, PREFETCH_MAX_NUM_BLOCKS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Remove base and bucket config for this property in L86, just so that test is consistent in diff env.

// backward seek, can't use block 0 as it is evicted
in.seek(S_1K * 5);
in.read();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assert the number of evictions being done or blocks being removed from the list. At certain points, test what the capacity of the list is to keep it consistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the num of assertions, i was thinking of adding metric as next sub-task so that this patch doesn't become too complicated to review. is that fine with you?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that should be fine as well, I saw this method, so thought we were already recording that metric blockRemovedFromFileCache()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, shall we introduce new metric that would help differentiate blockRemovedFromFileCache() vs blockEvictedFromFileCache()?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, why not, it'll be good for debugging purposes if there's any difference between them we would know that there's some issue with the proper deletion of the files from cache. Although an overkill but never hurts 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds great, let me create follow-up sub-task for introducing the metric, and update the test with the sub-task.
this will likely keep the commit history clean and easy to manage :)

thanks you once again!

ioStats = in.getIOStatistics();

byte[] buffer = new byte[blockSize];

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a comment here explaining what the test is doing on a high level, so that it's easier to understand the flow of how LRU is happening.

// Seek to block 3 and don't read completely
in.seek(blockSize * 3L);
in.read(buffer, 0, 2 * S_1K);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a seek to the intersection point of two blocks(eg: 4*blockSize - 10KB) and read some bytes(>10KBs) to read both blocks and assert if the head of the list is the correct block.

@virajjasani
Copy link
Contributor Author

Looks good. Some changes in the test. What would really be good is to Test this in a multi-threaded test too, nothing major but creating an executor Service with 5 threads then trying to read and evict 3 blocks and asserting the correct IOStats and read values should be enough. This would test the locking as well.

thanks for the review @mehakmeet
updated test to perform seeks and reads in parallel using thread pool.

so we have two follow-up sub-tasks to be created later:

  1. refactor Entry class to its own independent class and create UT to test LRU cache head and tail pointers
  2. add blockEvictedFromFileCache() metric

does this sound good?

i will re-run the whole test-suite against us-west-2

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 35s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 17m 50s Maven dependency ordering for branch
+1 💚 mvninstall 33m 4s trunk passed
+1 💚 compile 18m 40s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 17m 28s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 4m 30s trunk passed
+1 💚 mvnsite 2m 48s BE96 trunk passed
+1 💚 javadoc 1m 55s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 43s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 22s trunk passed
+1 💚 shadedclient 35m 35s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 33s Maven dependency ordering for patch
+1 💚 mvninstall 1m 31s the patch passed
+1 💚 compile 17m 49s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 17m 49s the patch passed
+1 💚 compile 17m 25s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 javac 17m 25s the patch passed
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
+1 💚 checkstyle 4m 37s the patch passed
+1 💚 mvnsite 2m 40s the patch passed
+1 💚 javadoc 1m 54s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 44s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 41s the patch passed
+1 💚 shadedclient 36m 2s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 19m 29s hadoop-common in the patch passed.
+1 💚 unit 2m 53s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 7s The patch does not generate ASF License warnings.
257m 18s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/4/artifact/out/Dockerfile
GITHUB PR #5754
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 3e9ea299b868 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / ab42829
Default Java Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/4/testReport/
Max. process+thread count 2148 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 46s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 16m 15s Maven dependency ordering for branch
+1 💚 mvninstall 32m 20s trunk passed
+1 💚 compile 18m 51s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 17m 6s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 5m 2s trunk passed
+1 💚 mvnsite 2m 35s trunk passed
+1 💚 javadoc 1m 59s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 43s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 3m 58s trunk passed
+1 💚 shadedclient 34m 25s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 30s Maven dependency ordering for patch
+1 💚 mvninstall 1m 56s the patch passed
+1 💚 compile 18m 8s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 18m 8s the patch passed
+1 💚 compile 16m 44s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 javac 16m 44s the patch passed
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
+1 💚 checkstyle 5m 1s the patch passed
+1 💚 mvnsite 2m 53s the patch passed
+1 💚 javadoc 1m 56s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 44s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 19s the patch passed
+1 💚 shadedclient 34m 31s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 19m 39s hadoop-common in the patch passed.
+1 💚 unit 2m 53s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 15s The patch does not generate ASF License warnings.
253m 17s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/5/artifact/out/Dockerfile
GITHUB PR #5754
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 12427714771b 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 87ea604
Default Java Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/5/testReport/
Max. process+thread count 1263 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/5/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Copy link
Contributor
@steveloughran steveloughran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

key change: pass size down directly; allows for an fs.s3a option which can then be set on a per-basis

tests: I want to see this working with a block size of 1; head==tail and adding a new block replaces the entire list.

this is probably the size I'd recommend once we add vector IO support. one block to store the orc/parquet footer with all other reads done in the vector API, which (probably) don't need caching

/**
* Constants used by prefetch implementation F438 s.
*/
public final class Constants {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have a different name please; too confusing

}

/**
* Prefetch max blocks count config.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add {@value} here and below so IDE popups show the value

/**
* Prefetch max blocks count config.
*/
public static final String FS_PREFETCH_MAX_BLOCKS_COUNT = "fs.prefetch.max.blocks.count";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use fs.s3a here for per bucket settings.

how about just pass in the count as a parameter, rather than a full Configuration object?

try {
in.read(buffer, 0, blockSize - S_1K * 10);
} catch (IOException e) {
throw new RuntimeException(e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should be an UncheckedIOException, but why not just return true from each closure so they become Callable rather than Runnable, so can throw exceptions

@mukund-thakur
Copy link
Contributor

Can we not use some already inbuilt cache rather than us implementing it from scratch ( this part is interesting for sure :))
https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/LocalCache.java
guava cache supports maximumSize for eviction and it internally uses LRU. ( see the Java docs of the class. )

@virajjasani
Copy link
Contributor Author

Can we not use some already inbuilt cache rather than us implementing it from scratch ( this part is interesting for sure :)) https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/LocalCache.java guava cache supports maximumSize for eviction and it internally uses LRU. ( see the Java docs of the class. )

Interesting,

from the javadoc:

   * The page replacement algorithm's data structures are kept casually consistent with the map. The
   * ordering of writes to a segment is sequentially consistent. An update to the map and recording
   * of reads may not be immediately reflected on the algorithm's data structures.

for not-so-heavy loaded cache, perhaps our own implementation might be better? given that even reads would have immediate reflection on the doubly linked list data structure in our case.
let me explore a bit more though. thanks for the reference!

@virajjasani
Copy link
Contributor Author

tests: I want to see this working with a block size of 1; head==tail and adding a new block replaces the entire list.

this will need new test class because if we change block size config to 1 for ITestS3APrefetchingInputStream, rest of the tests fail since we do lot of IOStats assertions with num of blocks prefetched etc

@virajjasani
Copy link
Contributor Author
virajjasani commented Jul 1, 2023

@mukund-thakur i tried using guava LoadingCache, it's not consistently able to evict cache entries, it's doing asynchronously (but it takes very long time to evict entries) with weak ref and hence leading to inconsistent num of entries.

for instance, even when i set max size as 1, i can see 8 entries in the map for more than 15s. hence, maintaining consistency with concurrency seems really problematic with this implementation.
there is option to set concurrency too, but still somehow eviction is not frequent enough, i suspect this might be because of this:

An update to the map and recording
of reads may not be immediately reflected on the algorithm's data structures.

the patch i tried:

  /**
   * Blocks stored in this cache.
   */
  private final LoadingCache<Integer, Entry> blocks;
...
...
...

  public SingleFilePerBlockCache(PrefetchingStatistics prefetchingStatistics, int maxBlocksCount) {
    this.prefetchingStatistics = requireNonNull(prefetchingStatistics);
    this.closed = new AtomicBoolean(false);
    this.maxBlocksCount = maxBlocksCount;
    Preconditions.checkArgument(maxBlocksCount > 0, "maxBlocksCount should be more than 0");
    blocks = CacheBuilder.newBuilder().concurrencyLevel(1).maximumSize(maxBlocksCount)
        .build(new CacheLoader<Integer, Entry>() {
          @Override
          public Entry load(Integer key) throws Exception {
            return null;
          }
        });
    blocksLock = new ReentrantReadWriteLock();
  }

Even with concurrencyLevel(1), we don't get strong consistency. The evictions are not taking place even after 40s+ wait :(

Edit:

https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/LocalCache.java

i tried LoadingCache because it is for public usage, whereas LocalCache is for guava's internal impl.

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 38s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 4 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 16m 34s Maven dependency ordering for branch
+1 💚 mvninstall 31m 31s trunk passed
+1 💚 compile 17m 19s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 16m 8s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 4m 27s trunk passed
+1 💚 mvnsite 2m 50s trunk passed
+1 💚 javadoc 2m 11s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 56s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 5s trunk passed
+1 💚 shadedclient 34m 26s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 34m 55s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 32s Maven dependency ordering for patch
+1 💚 mvninstall 1m 28s the patch passed
+1 💚 compile 16m 11s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 16m 11s the patch passed
+1 💚 compile 16m 14s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 javac 16m 14s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 4m 17s the patch passed
+1 💚 mvnsite 2m 47s the patch passed
+1 💚 javadoc 2m 6s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 56s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 23s the patch passed
+1 💚 shadedclient 34m 45s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 19m 11s hadoop-common in the patch passed.
+1 💚 unit 2m 52s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 13s The patch does not generate ASF License warnings.
246m 48s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/6/artifact/out/Dockerfile
GITHUB PR #5754
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux a7e8d6fb0e5a 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / c72a556
Default Java Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/6/testReport/
Max. process+thread count 1263 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/6/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 37s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 4 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 18m 21s Maven dependency ordering for branch
+1 💚 mvninstall 31m 10s trunk passed
+1 💚 compile 17m 21s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 16m 8s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 4m 27s trunk passed
+1 💚 mvnsite 2m 50s trunk passed
+1 💚 javadoc 2m 10s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 56s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 4s trunk passed
+1 💚 shadedclient 34m 51s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 35m 21s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 33s Maven dependency ordering for patch
+1 💚 mvninstall 1m 28s the patch passed
+1 💚 compile 16m 39s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 16m 39s the patch passed
+1 💚 compile 16m 8s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 javac 16m 8s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 4m 17s the patch passed
+1 💚 mvnsite 2m 43s the patch passed
+1 💚 javadoc 2m 5s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 55s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 20s the patch passed
+1 💚 shadedclient 35m 14s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 19m 19s hadoop-common in the patch passed.
+1 💚 unit 2m 51s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 15s The patch does not generate ASF License warnings.
249m 32s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/7/artifact/out/Dockerfile
GITHUB PR #5754
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 5ae178d25df9 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 2006290
Default Java Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/7/testReport/
Max. process+thread count 1263 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/7/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@virajjasani virajjasani requested a review from mehakmeet July 3, 2023 20:29
@virajjasani
Copy link
Contributor Author
virajjasani commented Jul 5, 2023

@mehakmeet @mukund-thakur @steveloughran
could you please review the latest revision? i have done multiple rounds of testing and run the whole suite with -scale too.

Copy link
Contributor
@steveloughran steveloughran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readFile() needs to always close the file channel, even of a read fails. (not caused by thos patch, but extant).
same for writeFile()

@virajjasani
Copy link
Contributor Author

readFile() needs to always close the file channel, even of a read fails.

sounds good for both read and write, will file a jira unless already filed.

Copy link
Contributor
@steveloughran steveloughran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

happy with production code (though worried about that default); test changes though

* Default value for max blocks count config.
* Value = {@value DEFAULT_PREFETCH_MAX_BLOCKS_COUNT}
*/
public static final int DEFAULT_PREFETCH_MAX_BLOCKS_COUNT = 10;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is per stream? it's big. we may want to see what happens in apps with many threads -risk of out of disk is high.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps 4 is enough?


@Override
public Configuration createConfiguration() {
Configuration conf = super.createConfiguration();
S3ATestUtils.removeBaseAndBucketOverrides(conf, PREFETCH_ENABLED_KEY);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be restored, and add PREFETCH_MAX_BLOCKS_COUNT, as another to cut.

in all Itests, assume that any option set in `createConfiguration()' MAY have a per-bucket override set by someone, so you MUST explicitly remove it.

byte[] buffer = new byte[blockSize];
// Don't read block 0 completely
try {
in.read(buffer, 0, blockSize - S_1K * 10);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use readFully(); always.

byte[] buffer = new byte[blockSize];
// Seek to block 1 and don't read completely
try {
in.seek(blockSize);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use PositionedReadable to include it in the test coverage

readFully(blockSize * 4L, buffer, 0, 2 * S_1K);

// and let LRU eviction take place as more cache entries
// are added with multiple block reads.

executorService.submit(() -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these submitted closures are almost identical. why not define one for readFully(), one for PositionedRedable.readFully() and submit them; maybe as some function

Callable<Boolean> readFully(CountDownLatch countDownLatch, boolean positionedReadable, long offset, long size) {
  // return one of these with seek(offset) and read to a buffer of [size]
}
then submit these

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 36s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 15m 51s Maven dependency ordering for branch
+1 💚 mvninstall 31m 32s trunk passed
+1 💚 compile 17m 30s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 16m 12s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 4m 13s trunk passed
+1 💚 mvnsite 2m 41s trunk passed
+1 💚 javadoc 1m 59s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 47s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 12s trunk passed
+1 💚 shadedclient 37m 28s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 37m 58s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 34s Maven dependency ordering for patch
+1 💚 mvninstall 1m 29s the patch passed
+1 💚 compile 16m 21s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 16m 21s the patch passed
+1 💚 compile 16m 16s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 javac 16m 16s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 4m 20s the patch passed
+1 💚 mvnsite 2m 51s the patch passed
+1 💚 javadoc 2m 4s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 54s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 20s the patch passed
+1 💚 shadedclient 34m 48s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 19m 18s hadoop-common in the patch passed.
-1 ❌ unit 8m 6s /patch-unit-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch passed.
+1 💚 asflicense 1m 14s The patch does not generate ASF License warnings.
254m 54s
Reason Tests
Failed junit tests hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/10/artifact/out/Dockerfile
GITHUB PR #5754
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 786bfee867de 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 709163c
Default Java Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/10/testReport/
Max. process+thread count 1302 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/10/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 48s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 15m 55s Maven dependency ordering for branch
+1 💚 mvninstall 38m 5s trunk passed
+1 💚 compile 18m 45s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 17m 49s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 4m 50s trunk passed
+1 💚 mvnsite 2m 35s trunk passed
+1 💚 javadoc 1m 46s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 32s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 3m 49s trunk passed
+1 💚 shadedclient 38m 49s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 39m 15s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 28s Maven dependency ordering for patch
+1 💚 mvninstall 1m 28s the patch passed
+1 💚 compile 17m 46s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 17m 46s the patch passed
+1 💚 compile 17m 1s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 javac 17m 1s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 4m 40s the patch passed
+1 💚 mvnsite 2m 29s the patch passed
+1 💚 javadoc 1m 40s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 32s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 7s the patch passed
+1 💚 shadedclient 39m 28s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 19m 0s hadoop-common in the patch passed.
-1 ❌ unit 7m 4s /patch-unit-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch passed.
+1 💚 asflicense 0m 59s The patch does not generate ASF License warnings.
268m 46s
Reason Tests
Failed junit tests hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/9/artifact/out/Dockerfile
GITHUB PR #5754
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux abe9c31f0de3 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 709163c
Default Java Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/9/testReport/
Max. process+thread count 2497 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/9/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 37s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 16m 12s Maven dependency ordering for branch
+1 💚 mvninstall 38m 2s trunk passed
+1 💚 compile 19m 19s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 18m 55s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 5m 46s trunk passed
+1 💚 mvnsite 2m 52s trunk passed
+1 💚 javadoc 2m 0s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 48s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 13s trunk passed
+1 💚 shadedclient 40m 52s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 41m 20s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 33s Maven dependency ordering for patch
+1 💚 mvninstall 1m 28s the patch passed
+1 💚 compile 16m 21s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 16m 21s the patch passed
+1 💚 compile 16m 21s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 javac 16m 21s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 4m 23s the patch passed
+1 💚 mvnsite 2m 47s the patch passed
+1 💚 javadoc 2m 5s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 56s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 22s the patch passed
+1 💚 shadedclient 35m 34s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 19m 12s hadoop-common in the patch passed.
-1 ❌ unit 8m 50s /patch-unit-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch passed.
+1 💚 asflicense 1m 14s The patch does not generate ASF License warnings.
273m 23s
Reason Tests
Failed junit tests hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/8/artifact/out/Dockerfile
GITHUB PR #5754
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux e55254152ab5 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 709163c
Default Java Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/8/testReport/
Max. process+thread count 1263 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/8/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Copy link
Contributor
@steveloughran steveloughran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes are good, though test failures are due to DEFAULT_PREFETCH_MAX_BLOCKS_COUNT being < the #of blocks fetched; the tests. they will need to assert that the number of cached blocks matches min(default, requested blocks)

java.lang.IllegalStateException: waitForCaching: expected: 8, actual: 4, read errors: 0, caching errors: 0
	at org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.waitForCaching(TestS3ACachingBlockManager.java:360)
	at org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingOfPrefetched(TestS3ACachingBlockManager.java:288)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.lang.Thread.run(Thread.java:750)

@virajjasani
Copy link
Contributor Author

re-running the whole test suite with -prefetch and -scale combinations

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 39s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 4 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 16m 28s Maven dependency ordering for branch
+1 💚 mvninstall 31m 7s trunk passed
+1 💚 compile 17m 24s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 16m 19s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 4m 23s trunk passed
+1 💚 mvnsite 2m 49s trunk passed
+1 💚 javadoc 2m 11s trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 55s trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 4s trunk passed
+1 💚 shadedclient 34m 22s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 34m 52s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 33s Maven dependency ordering for patch
+1 💚 mvninstall 1m 26s the patch passed
+1 💚 compile 16m 18s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 16m 18s the patch passed
+1 💚 compile 16m 20s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 javac 16m 20s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 4m 16s the patch passed
+1 💚 mvnsite 2m 50s the patch passed
+1 💚 javadoc 2m 5s the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 1m 57s the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 4m 22s the patch passed
+1 💚 shadedclient 35m 13s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 19m 11s hadoop-common in the patch passed.
+1 💚 unit 2m 56s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 11s The patch does not generate ASF License warnings.
247m 23s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/11/artifact/out/Dockerfile
GITHUB PR #5754
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 749d07bca684 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 796a3d4
Default Java Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/11/testReport/
Max. process+thread count 2330 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5754/11/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@virajjasani
Copy link
Contributor Author

us-west-2:

$ mvn clean verify -Dparallel-tests -DtestsThreadCount=8 -Dscale -Dprefetch

$ mvn clean verify -Dparallel-tests -DtestsThreadCount=8 -Dscale

test results look good

Copy link
Contributor
@steveloughran steveloughran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@steveloughran steveloughran merged commit e7d74f3 into apache:trunk Jul 14, 2023
@steveloughran
Copy link
Contributor

merged, though now i'm using it that new test is way too slow. in my rebased unbuffered pr I have moved it to -Dscale, but really we can just set the block size down to something minimal and then work with a small file

asfgit pushed a commit that referenced this pull request Jul 14, 2023
…leFilePerBlockCache (#5754)

Contributed by Viraj Jasani
@virajjasani
Copy link
Contributor Author

created addendum PR for dealing with small file #5843

jiajunmao pushed a commit to jiajunmao/hadoop-MLEC that referenced this pull request Feb 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0