HADOOP-18291. S3A prefetch - Implement thread-safe LRU cache for SingleFilePerBlockCache #5754
Conversation
🎊 +1 overall
This message was automatically generated.
@mukund-thakur @mehakmeet could you please review this PR?
Did an initial review at first glance. Looks good. I need to properly understand the locking and whether there are any scenarios we need to think of; I'll review the tests next.
private Entry head;

/**
 * Tail of the lined list.
typo: "linked"
/**
 * Prefetch max blocks count config.
 */
public static final String FS_PREFETCH_MAX_BLOCKS_COUNT = "fs.prefetch.max.blocks.count";
Is there a Constants class we can move this to?
this.maxBlocksCount =
    conf.getInt(FS_PREFETCH_MAX_BLOCKS_COUNT, DEFAULT_FS_PREFETCH_MAX_BLOCKS_COUNT);
Preconditions.checkArgument(this.maxBlocksCount > 0,
    "prefetch blocks total capacity should be more than 0");
Include the property name in the error message, so that the user knows which setting to change to a valid value.
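i.e. something along these lines, a sketch that reuses the constant from the diff above (the exact wording is illustrative):

Preconditions.checkArgument(this.maxBlocksCount > 0,
    "%s should be more than 0: %s", FS_PREFETCH_MAX_BLOCKS_COUNT, this.maxBlocksCount);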
if (tail != null) {
  while (tail.getNext() != null) {
    tail = tail.getNext();
  }
}
Can you explain this part a bit? I'm not able to see why it is needed.
sure, let's say:
head -> 1, tail -> 2
new block: 3
so, we need to make: 3 -> 1 -> 2
i.e. new head -> 3, tail -> 2
new block: 4
updated list: 4 -> 3 -> 1 -> 2
now let's say the input stream accesses block 2, so block 2 needs to be put at the head.
new list should be: 2 -> 4 -> 3 -> 1
we change the head to 2 and we also update the next pointer of block 1.
however, if we don't re-derive the tail (L322-L326), we will not be able to move the tail to block 1.
Okay, but when we are adding the node to the head, doesn't it make more sense to check whether the current node is the tail, get its previous node, and set that as the tail? The current approach works; I was just wondering if we can avoid traversing the list 🤔
yes, nice optimization for sure, let me double check this again. i ran through multiple test iterations and found that the current approach works, but let me check whether the optimization also works (i think it should, i am just wondering if i am missing some cases).
Thank you btw @mehakmeet !!
i applied this patch temporarily to debug further but somehow head and tail are getting screwed up:
diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/impl/prefetch/SingleFilePerBlockCache.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/impl/prefetch/SingleFilePerBlockCache.java
index ef685b54d30..1aad82ff9c2 100644
--- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/impl/prefetch/SingleFilePerBlockCache.java
+++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/impl/prefetch/SingleFilePerBlockCache.java
@@ -306,6 +306,7 @@ private void addToHeadOfLinkedList(Entry entry) {
"Block num {} to be added to the head. Current head block num: {} and tail block num: {}",
entry.blockNumber, head.blockNumber, tail.blockNumber);
if (entry != head) {
+ boolean isEntryTail = entry == tail;
Entry prev = entry.getPrevious();
Entry nxt = entry.getNext();
if (prev != null) {
@@ -318,10 +319,8 @@ private void addToHeadOfLinkedList(Entry entry) {
entry.setNext(head);
head.setPrevious(entry);
head = entry;
- }
- if (tail != null) {
- while (tail.getNext() != null) {
- tail = tail.getNext();
+ if (isEntryTail) {
+ tail = prev;
}
}
} finally {
however, somehow after eviction, the head and tail are getting screwed up; still trying to understand what is going wrong and why this patch would not work.
but i hope you were suggesting a change like this one, correct?
it works with a slight modification, let me push the change
Yes, happy with the new change I think. Would be good to explicitly test certain tail changing scenarios in the IT like you mentioned above if we are not already doing it.
thanks, i was actually thinking about it, but there is just no clean way of doing so. what i have been able to do so far is "logging" the head and tail nodes (as you also mentioned earlier) along with all other nodes, so that i could track the exact node sequence. that's the best way i could find so far, but extracting that info in an IT is really difficult (if we were to do it in a clean way).
True, I was thinking it might be possible via a simple UT as well, where we pass in the entries we want and access them in a chosen order to test the functionality; it might be easier to test the LRU logic directly than via the stream.
nice idea, it might be more beneficial to test this in a UT.
i am also planning to refactor the Entry class into its own class, rather than keeping it as an inner class of SingleFilePerBlockCache, as part of the next follow-up sub-task. once we do that, it might be even easier to write some UTs that directly access the head and tail pointers.
sorry, i was thinking of this as a sub-task, so maybe adding the UT can also be done with that sub-task. does that sound good?
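To make that follow-up concrete, here is a minimal, self-contained model of the head/tail bookkeeping discussed above that such a UT could target. The class and method names are hypothetical, not the actual SingleFilePerBlockCache internals:

// Hypothetical, simplified model of the LRU list for a direct unit test.
final class LruBlockList {
  static final class Entry {
    final int blockNumber;
    Entry prev;
    Entry next;
    Entry(int blockNumber) { this.blockNumber = blockNumber; }
  }

  private Entry head;
  private Entry tail;

  /** Move an existing entry (or add a new one) to the head; O(1). */
  void touch(Entry entry) {
    if (entry == head) {
      return;
    }
    // unlink from the current position
    if (entry.prev != null) {
      entry.prev.next = entry.next;
    }
    if (entry.next != null) {
      entry.next.prev = entry.prev;
    }
    if (entry == tail) {
      // the point debated above: keep the tail correct without traversal
      tail = entry.prev;
    }
    // link at the head
    entry.prev = null;
    entry.next = head;
    if (head != null) {
      head.prev = entry;
    }
    head = entry;
    if (tail == null) {
      tail = entry;
    }
  }

  Entry head() { return head; }
  Entry tail() { return tail; }
}

A test could then add a few entries, touch one in the middle and the current tail, and assert head() and tail() directly.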
elementToPurge.takeLock(Entry.LockType.WRITE, PREFETCH_WRITE_LOCK_TIMEOUT,
    PREFETCH_WRITE_LOCK_TIMEOUT_UNIT);
if (!lockAcquired) {
  LOG.error("Cache file {} deletion would not be attempted as write lock could not"
So, there can be a scenario where the current cache exceeds its normal capacity? Is 5 seconds enough time? or are we okay with this?
since we are already using 5s elsewhere (PREFETCH_WRITE_LOCK_TIMEOUT), i used it here as well, but happy to change it in the future if we encounter some problem with this. does that sound good?
It seems like we are okay with things not blowing up if eviction is not successful, are we okay with it? Can this hurt in the long run?
it should be okay; in fact we have the same logic for input stream close as well: if eviction or removal of a disk block is unsuccessful, we leave it behind with a fat warning.
if eviction misses it, stream close would be able to clean it up.
if stream close misses it, then it stays on disk; we might eventually come up with some "file last accessed" based check and maybe a cron removing such files. not a bad idea IMO.
Okay, sounds good
💔 -1 overall
This message was automatically generated.
🎊 +1 overall
This message was automatically generated.
Looks good. Some changes needed in the test. What would really be good is to cover this in a multi-threaded test too; nothing major, but creating an ExecutorService with 5 threads, then trying to read and evict 3 blocks and asserting the correct IOStats and read values should be enough. This would test the locking as well.
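A rough sketch of the kind of test being suggested, assuming the in stream, blockSize and IOStatistics assertions already used in this test class (thread count, offsets and timeouts are illustrative; uses java.util.concurrent ExecutorService, Executors, CountDownLatch, Future and TimeUnit):

ExecutorService executorService = Executors.newFixedThreadPool(5);
try {
  CountDownLatch latch = new CountDownLatch(3);
  List<Future<Boolean>> futures = new ArrayList<>();
  for (int i = 0; i < 3; i++) {
    final long offset = i * (long) blockSize;
    futures.add(executorService.submit(() -> {
      byte[] buffer = new byte[blockSize];
      // PositionedReadable read: exercises prefetch, caching and eviction under concurrency
      in.readFully(offset, buffer, 0, buffer.length);
      latch.countDown();
      return true;
    }));
  }
  latch.await(30, TimeUnit.SECONDS);
  for (Future<Boolean> future : futures) {
    future.get();   // surfaces any read failure as a test failure
  }
  // then assert the expected IOStatistics: blocks prefetched, blocks evicted, bytes read
} finally {
  executorService.shutdownNow();
}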
@Override
public Configuration createConfiguration() {
  Configuration conf = super.createConfiguration();
  S3ATestUtils.removeBaseAndBucketOverrides(conf, PREFETCH_ENABLED_KEY);
  conf.setBoolean(PREFETCH_ENABLED_KEY, true);
  conf.setInt(FS_PREFETCH_MAX_BLOCKS_COUNT, PREFETCH_MAX_NUM_BLOCKS);
nit: Remove the base and bucket overrides for this property in L86, just so that the test is consistent in different environments.
// backward seek, can't use block 0 as it is evicted
in.seek(S_1K * 5);
in.read();
Assert the number of evictions being done or blocks being removed from the list. At certain points, check the size of the list to make sure it stays consistent with the capacity.
for the assertions on the number of evictions, i was thinking of adding a metric as the next sub-task so that this patch doesn't become too complicated to review. is that fine with you?
Yes, that should be fine as well. I saw this method, so I thought we were already recording that metric, blockRemovedFromFileCache().
got it, shall we introduce a new metric that would help differentiate blockRemovedFromFileCache() vs blockEvictedFromFileCache()?
Yea, why not, it'll be good for debugging purposes: if there's any difference between them, we would know that there's some issue with the proper deletion of files from the cache. Perhaps overkill, but it never hurts 😄
sounds great, let me create a follow-up sub-task for introducing the metric, and update the test with that sub-task.
this will likely keep the commit history clean and easy to manage :)
thank you once again!
ioStats = in.getIOStatistics();

byte[] buffer = new byte[blockSize];
add a comment here explaining what the test is doing on a high level, so that it's easier to understand the flow of how LRU is happening.
// Seek to block 3 and don't read completely
in.seek(blockSize * 3L);
in.read(buffer, 0, 2 * S_1K);
Add a seek to the intersection point of two blocks (e.g. 4*blockSize - 10KB) and read some bytes (>10KB) so that both blocks are read, then assert that the head of the list is the correct block.
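A sketch of that suggestion, reusing in, blockSize and S_1K from the surrounding test (crossBlockBuffer is a name introduced here for illustration):

// seek into the last 10KB of block 3, then read 20KB so the read spans blocks 3 and 4
byte[] crossBlockBuffer = new byte[S_1K * 20];
in.seek(blockSize * 4L - S_1K * 10);
in.readFully(crossBlockBuffer, 0, crossBlockBuffer.length);
// block 4 should now be the most recently used entry, i.e. the head of the LRU list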
thanks for the review @mehakmeet so we have two follow-up sub-tasks to be created later:
does this sound good? i will re-run the whole test-suite against
💔 -1 overall
This message was automatically generated.
💔 -1 overall
This message was automatically generated.
key change: pass the size down directly; this allows for an fs.s3a option which can then be set on a per-bucket basis.
tests: I want to see this working with a cache size of 1 block; head==tail, and adding a new block replaces the entire list.
this is probably the size I'd recommend once we add vector IO support: one block to store the ORC/Parquet footer, with all other reads done in the vector API, which (probably) don't need caching.
/**
 * Constants used by prefetch implementations.
 */
public final class Constants {
can we have a different name please; too confusing
}

/**
 * Prefetch max blocks count config.
add {@value} here and below so IDE popups show the value
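For illustration, the javadoc change being asked for, applied to the constant from the diff:

/**
 * Prefetch max blocks count config.
 * Value: {@value}.
 */
public static final String FS_PREFETCH_MAX_BLOCKS_COUNT = "fs.prefetch.max.blocks.count";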
/**
 * Prefetch max blocks count config.
 */
public static final String FS_PREFETCH_MAX_BLOCKS_COUNT = "fs.prefetch.max.blocks.count";
I think we should use fs.s3a here for per-bucket settings.
how about just passing in the count as a parameter, rather than a full Configuration object?
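A sketch of that second suggestion; the constructor shape here is an assumption, not the final signature:

// instead of reading the Configuration inside the cache, pass the resolved limit down:
public SingleFilePerBlockCache(PrefetchingStatistics prefetchingStatistics, int maxBlocksCount) {
  this.prefetchingStatistics = prefetchingStatistics;
  this.maxBlocksCount = maxBlocksCount;
  Preconditions.checkArgument(maxBlocksCount > 0,
      "maxBlocksCount should be more than 0");
}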
try {
  in.read(buffer, 0, blockSize - S_1K * 10);
} catch (IOException e) {
  throw new RuntimeException(e);
it should be an UncheckedIOException, but why not just return true from each closure so they become Callable rather than Runnable, and so can throw exceptions?
Can we not use some already inbuilt cache rather than implementing it from scratch? (this part is interesting for sure :))
Interesting, from the javadoc:
for a not-so-heavily loaded cache, perhaps our own implementation might be better? given that in our case even reads are immediately reflected in the doubly linked list data structure.
this will need a new test class, because if we change the block size config to 1 for ITestS3APrefetchingInputStream, the rest of the tests fail, since we do a lot of IOStats assertions on the number of blocks prefetched etc.
@mukund-thakur i tried using the guava LoadingCache; it's not consistently able to evict cache entries. it does the eviction asynchronously (and it takes a very long time to evict entries) with weak refs, hence leading to an inconsistent number of entries. for instance, even when i set the max size to 1, i can see 8 entries in the map for more than 15s. hence, maintaining consistency with concurrency seems really problematic with this implementation.
the patch i tried:
Even with concurrencyLevel(1), we don't get strong consistency. The evictions are not taking place even after a 40s+ wait :( Edit: https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/LocalCache.java. i tried LoadingCache because it is for public usage, whereas LocalCache is for guava's internal impl.
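For readers unfamiliar with the Guava API being discussed, a generic example of this kind of configuration (this is not the patch referenced above, which is not reproduced here; createEntry is a hypothetical loader):

// uses com.google.common.cache.CacheBuilder, CacheLoader, LoadingCache
LoadingCache<Integer, Entry> cache = CacheBuilder.newBuilder()
    .maximumSize(maxBlocksCount)   // the LRU bound under discussion
    .concurrencyLevel(1)
    .build(new CacheLoader<Integer, Entry>() {
      @Override
      public Entry load(Integer blockNumber) {
        return createEntry(blockNumber);
      }
    });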
🎊 +1 overall
This message was automatically generated.
🎊 +1 overall
This message was automatically generated.
@mehakmeet @mukund-thakur @steveloughran
readFile() needs to always close the file channel, even if a read fails (not caused by this patch, but extant).
same for writeFile()
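A sketch of the shape of that fix, assuming a FileChannel-based readFile(); the signature is illustrative, not the actual method:

protected int readFile(Path path, ByteBuffer buffer) throws IOException {
  int numBytesRead = 0;
  // try-with-resources guarantees the channel is closed even if a read fails
  try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
    int numBytes;
    while ((numBytes = channel.read(buffer)) > 0) {
      numBytesRead += numBytes;
    }
  }
  buffer.limit(buffer.position());
  return numBytesRead;
}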
sounds good for both read and write, will file a jira unless one is already filed.
happy with the production code (though worried about that default); some test changes still needed, though.
 * Default value for max blocks count config.
 * Value = {@value DEFAULT_PREFETCH_MAX_BLOCKS_COUNT}
 */
public static final int DEFAULT_PREFETCH_MAX_BLOCKS_COUNT = 10;
this is per stream? it's big. we may want to see what happens in apps with many threads - the risk of running out of disk is high.
perhaps 4 is enough?
@Override
public Configuration createConfiguration() {
  Configuration conf = super.createConfiguration();
  S3ATestUtils.removeBaseAndBucketOverrides(conf, PREFETCH_ENABLED_KEY);
this needs to be restored, and add PREFETCH_MAX_BLOCKS_COUNT as another one to cut.
in all ITests, assume that any option set in createConfiguration() MAY have a per-bucket override set by someone, so you MUST explicitly remove it.
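i.e. something like this in createConfiguration(), a sketch using the constant names from the comments above:

Configuration conf = super.createConfiguration();
S3ATestUtils.removeBaseAndBucketOverrides(conf,
    PREFETCH_ENABLED_KEY,
    PREFETCH_MAX_BLOCKS_COUNT);
conf.setBoolean(PREFETCH_ENABLED_KEY, true);
conf.setInt(PREFETCH_MAX_BLOCKS_COUNT, PREFETCH_MAX_NUM_BLOCKS);
return conf;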
byte[] buffer = new byte[blockSize];
// Don't read block 0 completely
try {
  in.read(buffer, 0, blockSize - S_1K * 10);
use readFully(); always.
byte[] buffer = new byte[blockSize];
// Seek to block 1 and don't read completely
try {
  in.seek(blockSize);
use PositionedReadable to include it in the test coverage
readFully(blockSize * 4L, buffer, 0, 2 * S_1K);
// and let LRU eviction take place as more cache entries
// are added with multiple block reads.

executorService.submit(() -> {
these submitted closures are almost identical. why not define one for readFully(), one for PositionedReadable.readFully(), and submit them; maybe as some function
Callable<Boolean> readFully(CountDownLatch countDownLatch, boolean positionedReadable, long offset, long size) {
// return one of these with seek(offset) and read to a buffer of [size]
}
then submit these
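One possible shape for that helper, as a sketch; the in stream, buffer handling and usage line are assumptions based on the surrounding test:

private Callable<Boolean> readFully(CountDownLatch countDownLatch,
    boolean positionedReadable, long offset, int size) {
  return () -> {
    byte[] buffer = new byte[size];
    if (positionedReadable) {
      // PositionedReadable path: no seek, explicit position
      in.readFully(offset, buffer, 0, size);
    } else {
      in.seek(offset);
      in.readFully(buffer, 0, size);
    }
    countDownLatch.countDown();
    return true;
  };
}

// submitted as e.g.:
// executorService.submit(readFully(countDownLatch, true, blockSize * 4L, 2 * S_1K));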
💔 -1 overall
This message was automatically generated.
💔 -1 overall
This message was automatically generated.
💔 -1 overall
This message was automatically generated.
changes are good, though the test failures are due to DEFAULT_PREFETCH_MAX_BLOCKS_COUNT being < the # of blocks fetched; the tests will need to assert that the number of cached blocks matches min(default, requested blocks)
java.lang.IllegalStateException: waitForCaching: expected: 8, actual: 4, read errors: 0, caching errors: 0
at org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.waitForCaching(TestS3ACachingBlockManager.java:360)
at org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingOfPrefetched(TestS3ACachingBlockManager.java:288)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:750)
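A sketch of the kind of assertion being suggested; blockManager and numCached() are hypothetical accessors, not necessarily the existing test API:

// the cache can never hold more than the configured maximum
int expectedCachedBlocks = Math.min(DEFAULT_PREFETCH_MAX_BLOCKS_COUNT, blockCount);
assertEquals("cached blocks", expectedCachedBlocks, blockManager.numCached());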
re-running the whole test suite with -prefetch and -scale combinations
🎊 +1 overall
This message was automatically generated.
test results look good
+1
merged, though now that i'm using it, that new test is way too slow. in my rebased unbuffered PR I have moved it to -Dscale, but really we can just set the block size down to something minimal and then work with a small file
…leFilePerBlockCache (#5754) Contributed by Viraj Jasani
created addendum PR for dealing with small file #5843
…leFilePerBlockCache (apache#5754) Contributed by Viraj Jasani
Jira: HADOOP-18291