Read files with memory mapping #141

jaen · 2025-04-29T22:30:33Z

I've been looking into trying to see how nix-ninja (which re-uses n2) would deal with building Android. Of course even if it works, it's ways off — but getting there would probably necessitate upstreaming their improvements. It turns out it probably won't happen on their end — pdtpartners/nix-ninja#27 (reply in thread) — so I've been looking into how hard it would be to do this myself. The less divergence from upstream, the better, after all.

I've picked the lowest hanging fruit I could think of, which is reading files via memory mapping — the Android build.ninja I've generated is ~2GB, so it's bound to take a while to read (and it will also be a memory hog). Contrary to the Android fork I've used a pre-existing memory mapping crate, hopefully abstracting out any platform differences.

I have also tried a few map-like data structures to associate FileIds to file handles and went with vanilla HashMap, at least for now (the Android fork used DashMap for concurrency, but I guess I'll evaluate that when I start porting actual concurrency). The way the code is structured (split between a FileMap trait and a FileLoader that can be be parameterised by any implementation) stems from that (it made it easier to swap implementations for benchmarking). I think it might make sense to leave it like that, because I feel like it's going to be useful when comparing how different data structures work with concurrency, but if you feel strongly about inlining it at this point, I don't mind either. I have also based this on nix-ninja changes, because I wanted to be able to test with it, but I don't mind rebasing it it on the master branch (or just waiting for those changes to be merged first)

This is at least partially based on the existing Android change, so it shouldn't be too egregiously wrong (at least both cargo test --release and manual tests with the Android build file seem to work), but I also haven't done a lot of low-level coding since the university and while I've followed Rust for over a decade, I don't have a whole lot of practical experience in actually writing non-trivial code in it. So here there could be dragons. There's two things I'm aware might be iffy (but there could be more):

from what I've understand of the original change, an additional page is mapped R/W at the end to ensure the file is null-terminated (which the parser requires). The library I've picked doesn't expose a method to use MMAP_FIXED (and it's as far as I can tell not cross-platform either). What I've done, is I also extended the mapping by a page, made all of it R/W, written 0 at the last byte of file and made it R/O again. While it works in tests, I'm not entirely sure it's correct — for one I have a nagging feeling that this only works as long as the file never ends at the last byte of the page. If it ever did and we had to write the null byte at the start of the next page, we'de probably get a SIGBUS. Do you think that's a reasonable suspicion? If so, I guess I could first add support for MMAP_FIXED to the memory mapping crate first and do exactly what the original PR had done,
I wanted to have a slightly nicer API at the read_file call site, so I've pushed resolving the file path from FileId to the FileLoader. But that necessitated passing at least part of the Loader below and to get it to work I had to use Rc<RefCell<_>>. Now, like I said my practical experience wit non-trivial Rust code is scarce, so I don't really have a good working intuition of borrow checker and when to use which smart pointer type. This is what had gotten the code working, but looks kinda eh with all those borrow_muts. If there's a better way to get FileLoader to resolve paths itself (instead of having to remember to do that at each call site) without having to resort to smart pointer zoo, then I can do that instead, but I'm not quite sure what that would be. I guess I could pass a closure closing over the path resolve method instead, but I don't think that's allow me to sidestep the issue, it will just move the Rcs to a different place? But like I said, I don't have a working intution of the borrows checker, so maybe there's some obvious solution I'm not seeing.

Here are some benchmarks I've done on my machine (Ryzen 9 5900HX, so not great, not terrible).

# Current
===
Timer precision: 10 ns
parse                  fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ load_synthetic      739.4 µs      │ 1.242 ms      │ 759.8 µs      │ 771 µs        │ 100     │ 100
╰─ parser                            │               │               │               │         │
   ├─ file             1.52 s        │ 1.52 s        │ 1.52 s        │ 1.52 s        │ 1       │ 5
   ├─ file_via_loader  7.548 s       │ 7.64 s        │ 7.598 s       │ 7.595 s       │ 5       │ 25
   ╰─ synthetic        173.2 µs      │ 2.538 ms      │ 181.7 µs      │ 215.3 µs      │ 100     │ 100

# Mmap + vanilla HashMap
===
Timer precision: 20 ns
parse                  fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ load_synthetic      731.5 µs      │ 1.208 ms      │ 752.5 µs      │ 759.5 µs      │ 100     │ 100
╰─ parser                            │               │               │               │         │
   ├─ file             1.199 s       │ 1.199 s       │ 1.199 s       │ 1.199 s       │ 1       │ 5
   ├─ file_via_loader  6.287 s       │ 6.342 s       │ 6.302 s       │ 6.312 s       │ 5       │ 25
   ╰─ synthetic        157.4 µs      │ 89.98 ms      │ 158.8 µs      │ 1.96 ms       │ 100     │ 100

# Mmap + DashMap
===
Timer precision: 20 ns
parse                  fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ load_synthetic      730.1 µs      │ 1.323 ms      │ 749.3 µs      │ 758 µs        │ 100     │ 100
╰─ parser                            │               │               │               │         │
   ├─ file             1.205 s       │ 1.205 s       │ 1.205 s       │ 1.205 s       │ 1       │ 5
   ├─ file_via_loader  6.199 s       │ 6.303 s       │ 6.258 s       │ 6.252 s       │ 5       │ 25
   ╰─ synthetic        164.6 µs      │ 12.08 ms      │ 168.6 µs      │ 371.4 µs      │ 100     │ 100

# Running on the Android ninja file
~~~

# Current
===
[
{"pid":0, "name":"read file", "ts":121, "tid": 0, "ph":"X", "dur":735320}
,{"pid":0, "name":"loader.read_file", "ts":119, "tid": 0, "ph":"X", "dur":6884229}
,{"pid":0, "name":"db::open", "ts":6884351, "tid": 0, "ph":"X", "dur":117}
,{"pid":0, "name":"load::read", "ts":108, "tid": 0, "ph":"X", "dur":7192148}
,{"pid":0, "name":"work.run", "ts":7208336, "tid": 0, "ph":"X", "dur":2}
,{"pid":0, "name":"work.run", "ts":7212151, "tid": 0, "ph":"X", "dur":11}
,{"pid":0, "name":"main", "ts":0, "tid": 0, "ph":"X", "dur":7558642}
]

n2: error: build.ninja:528568: input out/host/linux-x86/bin/go/androidmk/linux_glibc_x86_64/obj/androidmk missing

________________________________________________________
Executed in    7.77 secs    fish           external
   usr time    4.92 secs    0.75 millis    4.92 secs
   sys time    2.82 secs    1.86 millis    2.81 secs

# Mmap + vanilla HashMap
[
{"pid":0, "name":"read file", "ts":118, "tid": 0, "ph":"X", "dur":35}
,{"pid":0, "name":"loader.read_file", "ts":117, "tid": 0, "ph":"X", "dur":6045406}
,{"pid":0, "name":"db::open", "ts":6045526, "tid": 0, "ph":"X", "dur":118}
,{"pid":0, "name":"load::read", "ts":101, "tid": 0, "ph":"X", "dur":6377801}
,{"pid":0, "name":"work.run", "ts":6394685, "tid": 0, "ph":"X", "dur":3}
,{"pid":0, "name":"work.run", "ts":6398520, "tid": 0, "ph":"X", "dur":11}
,{"pid":0, "name":"main", "ts":0, "tid": 0, "ph":"X", "dur":6712011}
]


===
n2: error: build.ninja:528568: input out/host/linux-x86/bin/go/androidmk/linux_glibc_x86_64/obj/androidmk missing

________________________________________________________
Executed in    6.94 secs    fish           external
   usr time    4.88 secs    1.22 millis    4.88 secs
   sys time    2.03 secs    2.14 millis    2.03 secs

# Mmap + DashMap
===
[
{"pid":0, "name":"read file", "ts":169, "tid": 0, "ph":"X", "dur":17}
,{"pid":0, "name":"loader.read_file", "ts":168, "tid": 0, "ph":"X", "dur":6340570}
,{"pid":0, "name":"db::open", "ts":6340740, "tid": 0, "ph":"X", "dur":129}
,{"pid":0, "name":"load::read", "ts":104, "tid": 0, "ph":"X", "dur":6673717}
,{"pid":0, "name":"work.run", "ts":6690839, "tid": 0, "ph":"X", "dur":2}
,{"pid":0, "name":"work.run", "ts":6694673, "tid": 0, "ph":"X", "dur":12}
,{"pid":0, "name":"main", "ts":0, "tid": 0, "ph":"X", "dur":7036351}
]

n2: error: build.ninja:528568: input out/host/linux-x86/bin/go/androidmk/linux_glibc_x86_64/obj/androidmk missing

________________________________________________________
Executed in    7.26 secs    fish           external
   usr time    5.16 secs    1.10 millis    5.15 secs
   sys time    2.07 secs    1.07 millis    2.07 secs

So it looks like there's about a second of a gain, for a relatively straightforward change. Hopefully this PR is not too terribly and acceptable for upstreaming. If there's something I could have done differently let me know, and I'll try to improve that.

…s on Build

src/load.rs

jaen · 2025-04-30T08:41:03Z

Turns out I was right with my SIGBUS suspicion — I added a test for that and copied the Linux-specific MMAP_FIXED part from the Android fork. This makes it platform-specific again, not quite sure what to do about that. I'll mark this as a draft in the mean time.

I have also added a benchmark comparing using memory mapping to loading the whole file into memory, had to expose a bit of internals for that, hopefully that's acceptable. I also bumped iterations a bit so that the numbers are a bit more meaningful — let me know if you want that reduced again.

Up to date benchmarks

Timer precision: 20 ns
loader                      fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ loader                                 │               │               │               │         │
   ├─ file_via_loader       6.392 s       │ 6.467 s       │ 6.409 s       │ 6.423 s       │ 3       │ 9
   ╰─ file_via_loader_slow  7.078 s       │ 7.145 s       │ 7.085 s       │ 7.103 s       │ 3       │ 9

# Android build with memmap
n2: error: build.ninja:528568: input out/host/linux-x86/bin/go/androidmk/linux_glibc_x86_64/obj/androidmk missing

________________________________________________________
Executed in    6.98 secs    fish           external
   usr time    4.84 secs    0.16 millis    4.84 secs
   sys time    2.10 secs    1.06 millis    2.10 secs

# Android build without memmap
n2: error: build.ninja:528568: input out/host/linux-x86/bin/go/androidmk/linux_glibc_x86_64/obj/androidmk missing

________________________________________________________
Executed in    7.27 secs    fish           external
   usr time    5.15 secs    0.03 millis    5.15 secs
   sys time    2.09 secs    1.00 millis    2.09 secs

jaen · 2025-04-30T08:45:21Z

src/load.rs

 pub struct Loader {
-    pub graph: graph::Graph,
+    _graph: Rc<RefCell<graph::Graph>>,


I had to wrap the graph like this, so I can share it with file_loader. Not sure how kosher that is, but I felt not having to remember about converting FileId to PathBuf at the site is nice.

hinshun added 5 commits March 31, 2025 20:49

Make modules graph, densemap and depfile public for external tools

e7327a4

Refactor parse_showincludes as method on Build to expose value of dep…

d34a26f

…s on Build

Refactor graph::Build ins & outs to BuildDependencies and derive Clone

2d1d137

Derive Clone on graph::File

328d703

Move all_ids implementation into DenseMap

341b511

jaen commented Apr 29, 2025

View reviewed changes

src/load.rs Outdated Show resolved Hide resolved

Read build files using memory mapping

badb109

jaen force-pushed the feature/mmap branch from df7dde8 to badb109 Compare April 30, 2025 08:30

jaen marked this pull request as draft April 30, 2025 08:41

jaen commented Apr 30, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Read files with memory mapping #141

Read files with memory mapping #141

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Read files with memory mapping #141

Are you sure you want to change the base?

Read files with memory mapping #141

Uh oh!

Conversation

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!