8000 Read files with memory mapping by jaen · Pull Request #141 · evmar/n2 · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Read files with memory mapping #141

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the c 8000 ommunity.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from
Draft

Read files with memory mapping #141

wants to merge 6 commits into from

Conversation

jaen
Copy link
@jaen jaen commented Apr 29, 2025

I've been looking into trying to see how nix-ninja (which re-uses n2) would deal with building Android. Of course even if it works, it's ways off — but getting there would probably necessitate upstreaming their improvements. It turns out it probably won't happen on their end — pdtpartners/nix-ninja#27 (reply in thread) — so I've been looking into how hard it would be to do this myself. The less divergence from upstream, the better, after all.

I've picked the lowest hanging fruit I could think of, which is reading files via memory mapping — the Android build.ninja I've generated is ~2GB, so it's bound to take a while to read (and it will also be a memory hog). Contrary to the Android fork I've used a pre-existing memory mapping crate, hopefully abstracting out any platform differences.

I have also tried a few map-like data structures to associate FileIds to file handles and went with vanilla HashMap, at least for now (the Android fork used DashMap for concurrency, but I guess I'll evaluate that when I start porting actual concurrency). The way the code is structured (split between a FileMap trait and a FileLoader that can be be parameterised by any implementation) stems from that (it made it easier to swap implementations for benchmarking). I think it might make sense to leave it like that, because I feel like it's going to be useful when comparing how different data structures work with concurrency, but if you feel strongly about inlining it at this point, I don't mind either. I have also based this on nix-ninja changes, because I wanted to be able to test with it, but I don't mind rebasing it it on the master branch (or just waiting for those changes to be merged first)

This is at least partially based on the existing Android change, so it shouldn't be too egregiously wrong (at least both cargo test --release and manual tests with the Android build file seem to work), but I also haven't done a lot of low-level coding since the university and while I've followed Rust for over a decade, I don't have a whole lot of practical experience in actually writing non-trivial code in it. So here there could be dragons. There's two things I'm aware might be iffy (but there could be more):

  • from what I've understand of the original change, an additional page is mapped R/W at the end to ensure the file is null-terminated (which the parser requires). The library I've picked doesn't expose a method to use MMAP_FIXED (and it's as far as I can tell not cross-platform either). What I've done, is I also extended the mapping by a page, made all of it R/W, written 0 at the last byte of file and made it R/O again. While it works in tests, I'm not entirely sure it's correct — for one I have a nagging feeling that this only works as long as the file never ends at the last byte of the page. If it ever did and we had to write the null byte at the start of the next page, we'de probably get a SIGBUS. Do you think that's a reasonable suspicion? If so, I guess I could first add support for MMAP_FIXED to the memory mapping crate first and do exactly what the original PR had done,
  • I wanted to have a slightly nicer API at the read_file call site, so I've pushed resolving the file path from FileId to the FileLoader. But that necessitated passing at least part of the Loader below and to get it to work I had to use Rc<RefCell<_>>. Now, like I said my practical experience wit non-trivial Rust code is scarce, so I don't really have a good working intuition of borrow checker and when to use which smart pointer type. This is what had gotten the code working, but looks kinda eh with all those borrow_muts. If there's a better way to get FileLoader to resolve paths itself (instead of having to remember to do that at each call site) without having to resort to smart pointer zoo, then I can do that instead, but I'm not quite sure what that would be. I guess I could pass a closure closing over the path resolve method instead, but I don't think that's allow me to sidestep the issue, it will just move the Rcs to a different place? But like I said, I don't have a working intution of the borrows checker, so maybe there's some obvious solution I'm not seeing.
Here are some benchmarks I've done on my machine (Ryzen 9 5900HX, so not great, not terrible).
# Current
===
Timer precision: 10 ns
parse                  fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ load_synthetic      739.4 µs      │ 1.242 ms      │ 759.8 µs      │ 771 µs        │ 100     │ 100
╰─ parser                            │               │               │               │         │
   ├─ file             1.52 s        │ 1.52 s        │ 1.52 s        │ 1.52 s        │ 1       │ 5
   ├─ file_via_loader  7.548 s       │ 7.64 s        │ 7.598 s       │ 7.595 s       │ 5       │ 25
   ╰─ synthetic        173.2 µs      │ 2.538 ms      │ 181.7 µs      │ 215.3 µs      │ 100     │ 100

# Mmap + vanilla HashMap
===
Timer precision: 20 ns
parse                  fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ load_synthetic      731.5 µs      │ 1.208 ms      │ 752.5 µs      │ 759.5 µs      │ 100     │ 100
╰─ parser                            │               │               │               │         │
   ├─ file             1.199 s       │ 1.199 s       │ 1.199 s       │ 1.199 s       │ 1       │ 5
   ├─ file_via_loader  6.287 s       │ 6.342 s       │ 6.302 s       │ 6.312 s       │ 5       │ 25
   ╰─ synthetic        157.4 µs      │ 89.98 ms      │ 158.8 µs      │ 1.96 ms       │ 100     │ 100

# Mmap + DashMap
===
Timer precision: 20 ns
parse                  fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ load_synthetic      730.1 µs      │ 1.323 ms      │ 749.3 µs      │ 758 µs        │ 100     │ 100
╰─ parser                            │               │               │               │         │
   ├─ file             1.205 s       │ 1.205 s       │ 1.205 s       │ 1.205 s       │ 1       │ 5
   ├─ file_via_loader  6.199 s       │ 6.303 s       │ 6.258 s       │ 6.252 s       │ 5       │ 25
   ╰─ synthetic        164.6 µs      │ 12.08 ms      │ 168.6 µs      │ 371.4 µs      │ 100     │ 100

# Running on the Android ninja file
~~~

# Current
===
[
{"pid":0, "name":"read file", "ts":121, "tid": 0, "ph":"X", "dur":735320}
,{"pid":0, "name":"loader.read_file", "ts":119, "tid": 0, "ph":"X", "dur":6884229}
,{"pid":0, "name":"db::open", "ts":6884351, "tid": 0, "ph":"X", "dur":117}
,{"pid":0, "name":"load::read", "ts":108, "tid": 0, "ph":"X", "dur":7192148}
,{"pid":0, "name":"work.run", "ts":7208336, "tid": 0, "ph":"X", "dur":2}
,{"pid":0, "name":"work.run", "ts":7212151, "tid": 0, "ph":"X", "dur":11}
,{"pid":0, "name":"main", "ts":0, "tid": 0, "ph":"X", "dur":7558642}
]

n2: error: build.ninja:528568: input out/host/linux-x86/bin/go/androidmk/linux_glibc_x86_64/obj/androidmk missing

________________________________________________________
Executed in    7.77 secs    fish           external
   usr time    4.92 secs    0.75 millis    4.92 secs
   sys time    2.82 secs    1.86 millis    2.81 secs

# Mmap + vanilla HashMap
[
{"pid":0, "name":"read file", "ts":118, "tid": 0, "ph":"X", "dur":35}
,{"pid":0, "name":"loader.read_file", "ts":117, "tid": 0, "ph":"X", "dur":6045406}
,{"pid":0, "name":"db::open", "ts":6045526, "tid": 0, "ph":"X", "dur":118}
,{"pid":0, "name":"load::read", "ts":101, "tid": 0, "ph":"X", "dur":6377801}
,{"pid":0, "name":"work.run", "ts":6394685, "tid": 0, "ph":"X", "dur":3}
,{"pid":0, "name":"work.run", "ts":6398520, "tid": 0, "ph":"X", "dur":11}
,{"pid":0, "name":"main", "ts":0, "tid": 0, "ph":"X", "dur":6712011}
]


===
n2: error: build.ninja:528568: input out/host/linux-x86/bin/go/androidmk/linux_glibc_x86_64/obj/androidmk missing

________________________________________________________
Executed in    6.94 secs    fish           external
   usr time    4.88 secs    1.22 millis    4.88 secs
   sys time    2.03 secs    2.14 millis    2.03 secs

# Mmap + DashMap
===
[
{"pid":0, "name":"read file", "ts":169, "tid": 0, "ph":"X", "dur":17}
,{"pid":0, "name":"loader.read_file", "ts":168, "tid": 0, "ph":"X", "dur":6340570}
,{"pid":0, "name":"db::open", "ts":6340740, "tid": 0, "ph":"X", "dur":129}
,{"pid":0, "name":"load::read", "ts":104, "tid": 0, "ph":"X", "dur":6673717}
,{"pid":0, "name":"work.run", "ts":6690839, "tid": 0, "ph":"X", "dur":2}
,{"pid":0, "name":"work.run", "ts":6694673, "tid": 0, "ph":"X", "dur":12}
,{"pid":0, "name":"main", "ts":0, "tid": 0, "ph":"X", "dur":7036351}
]

n2: error: build.ninja:528568: input out/host/linux-x86/bin/go/androidmk/linux_glibc_x86_64/obj/androidmk missing

________________________________________________________
Executed in    7.26 secs    fish           external
   usr time    5.16 secs    1.10 millis    5.15 secs
   sys time    2.07 secs    1.07 millis    2.07 secs

So it looks like there's about a second of a gain, for a relatively straightforward change. Hopefully this PR is not too terribly and acceptable for upstreaming. If there's something I could have done differently let me know, and I'll try to improve that.

@jaen
Copy link
Author
jaen commented Apr 30, 2025

Turns out I was right with my SIGBUS suspicion — I added a test for that and copied the Linux-specific MMAP_FIXED part from the Android fork. This makes it platform-specific again, not quite sure what to do about that. I'll mark this as a draft in the mean time.

I have also added a benchmark comparing using memory mapping to loading the whole file into memory, had to expose a bit of internals for that, hopefully that's acceptable. I also bumped iterations a bit so that the numbers are a bit more meaningful — let me know if you want that reduced again.

Up to date benchmarks
Timer precision: 20 ns
loader                      fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ loader                                 │               │               │               │         │
   ├─ file_via_loader       6.392 s       │ 6.467 s       │ 6.409 s       │ 6.423 s       │ 3       │ 9
   ╰─ file_via_loader_slow  7.078 s       │ 7.145 s       │ 7.085 s       │ 7.103 s       │ 3       │ 9

# Android build with memmap
n2: error: build.ninja:528568: input out/host/linux-x86/bin/go/androidmk/linux_glibc_x86_64/obj/androidmk missing

________________________________________________________
Executed in    6.98 secs    fish           external
   usr time    4.84 secs    0.16 millis    4.84 secs
   sys time    2.10 secs    1.06 millis    2.10 secs

# Android build without memmap
n2: error: build.ninja:528568: input out/host/linux-x86/bin/go/androidmk/linux_glibc_x86_64/obj/androidmk missing

________________________________________________________
Executed in    7.27 secs    fish           external
   usr time    5.15 secs    0.03 millis    5.15 secs
   sys time    2.09 secs    1.00 millis    2.09 secs

@jaen jaen marked this pull request as draft April 30, 2025 08:41
pub struct Loader {
pub graph: graph::Graph,
_graph: Rc<RefCell<graph::Graph>>,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to wrap the graph like this, so I can share it with file_loader. Not sure how kosher that is, but I felt not having to remember about converting FileId to PathBuf at the site is nice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0