8000 Optimize editing long text by caching each paragraph by afishhh · Pull Request #5411 · emilk/egui · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
8000

Optimize editing long text by caching each paragraph #5411

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 77 commits into from
Apr 1, 2025

Conversation

afishhh
Copy link
Contributor
@afishhh afishhh commented Nov 28, 2024

What

(written by @emilk)
When editing long text (thousands of line), egui would previously re-layout the entire text on each edit. This could be slow.

With this PR, we instead split the text into paragraphs (split on \n) and then cache each such paragraph. When editing text then, only the changed paragraph needs to be laid out again.

Still, there is overhead from splitting the text, hashing each paragraph, and then joining the results, so the runtime complexity is still O(N).

In our benchmark, editing a 2000 line string goes from ~8ms to ~300 ms, a speedup of ~25x.

In the future, we could also consider laying out each paragraph in parallel, to speed up the initial layout of the text.

Details

This is an almost complete implementation of the approach described by emilk in this comment, excluding CoW semantics for LayoutJob (but including them for Row).
It supersedes the previous unsuccessful attempt here: #4000.

Draft because:

  • Currently individual rows will have ends_with_newline always set to false.
    This breaks selection with Ctrl+A (and probably many other things)
  • The whole block for doing the splitting and merging should probably become a function (I'll do that later).
  • I haven't run the check scr 8000 ipt, the tests, and haven't made sure all of the examples build (although I assume they probably don't rely on Galley internals).
  • Layout is sometimes incorrect (missing empty lines, wrapping sometimes makes text overlap).
  • A lot of text-related code had to be changed so this needs to be properly tested to ensure no layout issues were introduced, especially relating to the now row-relative coordinate system of Rows. Also this requires that we're fine making these very breaking changes.

It does significantly improve the performance of rendering large blocks of text (if they have many newlines), this is the test program I used to test it (adapted from #3086):

code
use eframe::egui::{self, CentralPanel, TextEdit};
use std::fmt::Write;

fn main() -> Result<(), eframe::Error> {
    let options = eframe::NativeOptions {
        ..Default::default()
    };

    eframe::run_native(
        "editor big file test",
        options,
        Box::new(|_cc| Ok(Box::<MyApp>::new(MyApp::new()))),
    )
}

struct MyApp {
    text: String,
}

impl MyApp {
    fn new() -> Self {
        let mut string = String::new();
        for line_bytes in (0..50000).map(|_| (0u8..50)) {
            for byte in line_bytes {
                write!(string, " {byte:02x}").unwrap();
            }
            write!(string, "\n").unwrap();
        }
        println!("total bytes: {}", string.len());
        MyApp { text: string }
    }
}

impl eframe::App for MyApp {
    fn update(&mut self, ctx: &egui::Context, _frame: &mut eframe::Frame) {
        CentralPanel::default().show(ctx, |ui| {
            let start = std::time::Instant::now();
            egui::ScrollArea::vertical().show(ui, |ui| {
                let code_editor = TextEdit::multiline(&mut self.text)
                    .code_editor()
                    .desired_width(f32::INFINITY)
                    .desired_rows(40);
                let response = code_editor.show(ui).response;
                if response.changed() {
                    println!("total bytes now: {}", self.text.len());
                }
            });
            let end = std::time::Instant::now();
            let time_to_update = end - start;
            if time_to_update.as_secs_f32() > 0.5 {
                println!("Long update took {:.3}s", time_to_update.as_secs_f32())
            }
        });
    }
}

I think the way to proceed would be to make a new type, something like PositionedRow, that would wrap an Arc<Row> but have a separate pos and ends_with_newline (that would mean Row only holds a size instead of a rect). This type would of course have getters that would allow you to easily get a Rect from it and probably a Deref to the underlying Row.
I haven't done this yet because I wanted to get some opinions whether this would be an acceptable API first. This is now implemented, but of course I'm still open to discussion about this approach and whether it's what we want to do.

Breaking changes (currently):

  • The Galley::rows field has a different type.
  • There is now a PlacedRow wrapper for Row.
  • Row now uses a coordinate system relative to itself instead of the Galley.

Copy link
github-actions bot commented Nov 28, 2024

Preview available at https://egui-pr-preview.github.io/pr/5411-cachegalleylines
Note that it might take a couple seconds for the update to show up after the preview_build workflow has completed.

@afishhh
Copy link
Contributor Author
afishhh commented Nov 29, 2024

Okay, so I did the thing and implemented the Row wrapper. This turned out to be more difficult than expected, and requires changing the semantics of Rows themselves. glyphs and the mesh(_bounds) must be relative to the Row itself, instead of the whole Galley, as otherwise repositioning them wouldn't work (unless we did something hacky like store two different offsets). Currently I also implemented Deref<Target = Row> for the PlacedRow wrapper (bike-shedding appreciated) which may not be the best idea as it makes it easy to confuse the different coordinate systems of the offset and non-offset Row.

Currently this "somewhat" works, I've managed to fix the selection issues I was seeing, but there is some layout issues you can actually see in the live demo above where the text is overlapping sometimes when wrapped (since this was present before the whole PlacedRow mess, I think this may be related to galley-merging).

@afishhh
Copy link
Contributor Author
afishhh commented Nov 30, 2024

So I've managed to make this look pretty correct, these are the remaining issues:

  • Rows now have very weird semantics because an empty Row that is in-between two non-empty lines will actually have a size of zero. This could be solved by actually incorporating the line-terminating newline into the cached job, but there is one issue with that approach because then each of the line galleys will now contain a trailing Row that would have to be treated specially by the merging code.
  • I've definitely introduced changes in rounding which is probably what makes the snapshot tests fail (apart from the fact that they sometimes segfault on my machine).
    Issues now solved (except the rendering_tests are still utterly broken on my machine)

@afishhh
Copy link
Contributor Author
afishhh commented Nov 30, 2024

I think I've made some good progress:

  • Made Row hold ends_with_newline again and just dropped excess rows during galley-merging, this means Row once again has pretty sensible semantics.
  • Properly respecting LayoutJob::round_output_size_to_nearest_ui_point fixed a lot of the snapshot test failures.

The remaining test failures are:

  • All rendering tests seem to have slight differences but in separators not in text which suggests that sizing might be ever so slightly wrong. Turns out this also happens on master for me, so it probably is unrelated to my changes.
    It looks like I can make more tests pass by using software rendering (WGPU_ADAPTER_NAME=llvmpipe) but still not all of them (dpi_1.25 and dpi_1.75 fail), so I'm going to pass this off as the graphics driver's fault.
  • The "Misc Demos" snapshot test also seems to fail, possibly due to a slightly incorrect alignment after the button widget? I've managed to remove the most hacky line of this patch and fix this in one fell swoop!.

I'm wondering whether it is worth fixing fixing these tests, should I just regenerate the snapshots instead?
Also I think stats don't take into account the fact multiple Galleys can point to the same Row.

@afishhh afishhh marked this pull request as ready for review December 1, 2024 23:57
Copy link
Owner
@emilk emilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! I haven't reviewed all of GalleyCache yet, because it is a very dense and complex function.

I wonder how we best test and/or benchmark this though 🤔

I defiantly feel we should have a benchmark to justify this code, i.e. something that shows a big speedup before/after this PR.

Comment on lines 598 to 599
// Note we **don't** ignore the leading/trailing whitespace here!
row.size.x = target_max_x - target_min_x;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? What does this affect?

Copy link
Contributor Author
@afishhh afishhh Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, now that I think about it changing this comment was a mistake, leading and trailing whitespace is actually still ignored (I don't know why I thought otherwise). I'll change it back.
However ignoring whitespace here does seem kind of strange since the current docs for Row::rect say that it includes whitespace. Wouldn't this be wrong after it passes through this function? Maybe at least that documentation could be adjusted to clarify this is only true for left-aligned non-justified text.

I also noticed a more important issue though I think, this function actually makes Rows have glyphs that are outside their PlacedRow::pos + Row::size rectangle (because it just repositions the glyphs and leaves the position unchanged). This currently breaks selecting such text. I'll try to fix this soon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the issue found above and the comment. Although I still want to know what you think about changing the Row::size comment, in general halign is surprising with how it suddenly totally changes how the coordinates work even though it makes perfect sense in hindsight.

@afishhh
Copy link
Contributor Author
afishhh commented Dec 4, 2024

I wonder how we best test and/or benchmark this though 🤔

I defiantly feel we should have a benchmark to justify this code, i.e. something that shows a big speedup before/after this PR.

I agree having a benchmark would be nice, currently the only thing I have is "it looks quicker" when you have a lot of text (deletion of small segments works instantly for example).
At first I wanted to suggest a kittest benchmark but then I looked in crates/egui_demo_lib/benches/benchmark.rs and noticed there already seem to be text layout benchmarks, so it'd probably be relatively easy to do something like:

  • Have a big (few megabytes) block of text
  • Lay it out
  • In the benchmark loop:
    • Randomly remove an N-byte segment from it (into a copy)
    • Lay it out

As for tests, I've encountered issues with selection drawing multiple times while working on this at this point, so maybe it would be beneficial if this was actually checked by the test suite, it looks like mouse events are supported in kittest so maybe that's a good start. Especially since I also think selection visuals might be another good place for optimization because currently selecting the whole text will force all Rows to be cloned again.

It also looks like the rendering_test tests still seem to be failing on CI. The diffs seem to contain three lines near the middle that look like the bottom of the text background, maybe the background is slightly too small/large? I have no idea why that might be though.

@afishhh afishhh force-pushed the cache_galley_lines branch from f406701 to 40f237d Compare December 5, 2024 17:45
@emilk emilk marked this pull request as ready for review March 30, 2025 18:31
@emilk
Copy link
Owner
emilk commented Mar 31, 2025

I need to take a break from this now.

The benchmark shows only about a ~2x speedup, which is less than I would have thought, so I think it is doing extra layouting even when it shouldn't. If anyone wants to take a look at that I would appreciate 9E88 it.

@afishhh
Copy link
Contributor Author
afishhh commented Apr 1, 2025

I don't know if this is the issue you're observing, but I noticed another performance issue in the current implementation.
It actually doesn't reuse any cached rows across frames more than one generation apart.
Whenever we lay out the entire job per-paragraph, the final galley is also inserted into the cache, which means that subsequent accesses will not update last_used for the individual rows.

Although even after completely disabling cache eviction, the re-layout still takes quite a bit of time even if just one row was invalidated, I'll see if that can be improved after fixing the eviction issue.

@afishhh
Copy link
Contributor Author
afishhh commented Apr 1, 2025

Okay I put off fixing the invalidation because it's non-trivial, but I managed to improve performance quite a bit by simply not re-rounding all row sizes in round_output_to_gui when called from Galley::concat, the sizes should already all be correctly rounded after initial layout and rounding them again here calls make_mut on all rows, which takes a lot of time.

@afishhh
Copy link
Contributor Author
afishhh commented Apr 1, 2025

Not adding the merged Galley into the cache entirely seems to still give acceptable performance so I opted for that as a fix for the invalidation issue.
If necessary something more complex like actually accounting for dependencies between CachedGalleys could be done but this seems to work fine.

// TODO(afishhh): This Galley cannot be added directly into the cache without taking
// extra precautions to make sure all component paragraph Galleys are not invalidated
// immediately next frame (since their `last_used` will not be updated).
Arc::new(galley)
Copy link
Owner
@emilk emilk Apr 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means we don't cache the full galley, and so do the paragraph-splitting and concatenation each frame… that's potentially a performance regression for large texts.

Better I think to just improve the cache to store the hashes of all sub-paragraphs and also update them as being used when the big parent is used

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I implemented this, but I still seem to experience eviction of all the paragraphs in my practical application of this, although it doesn't seem to happen in a small test app. I've pushed the implementation for review (better name or approach for layout_internal?) and may investigate the weird cache evictions later (would blame deferred viewpoints but that wouldn't explain why it worked slightly better when re-splitting every frame).

@afishhh afishhh force-pushed the cache_galley_lines branch from ff1d46b to 9d61f9c Compare April 1, 2025 14:35
Copy link
Owner
@emilk emilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, let's get this in!

Copy link
Owner
@emilk emilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, let's get this in!

@emilk emilk merged commit 557bd56 into emilk:master Apr 1, 2025
25 checks passed
@valadaptive valadaptive mentioned this pull request Apr 1, 2025
13 tasks
valadaptive added a commit to valadaptive/egui that referenced this pull request Apr 2, 2025
valadaptive added a commit to valadaptive/egui that referenced this pull request Apr 24, 2025
lucasmerlin added a commit that referenced this pull request May 6, 2025
* Closes <#6904>
* [x] I have followed the instructions in the PR template

This was broken in #5411. Not sure if
this is the best fix or if `PlacedRow::rect` should be updated, but I
think it makes sense that PlacedRow::rect ignores leading space.
emilk pushed a commit that referenced this pull request May 8, 2025
#7031)

Fixes a regression introduced in #5411
(possibly
d74bee5)
that breaks `leading_space` handling.
I think this is what the condition should be but I haven't touched this
code in a while.
valadaptive added a commit to valadaptive/egui that referenced this pull request May 9, 2025
valadaptive added a commit to valadaptive/egui that referenced this pull request May 20, 2025
valadaptive added a commit to valadaptive/egui that referenced this pull request May 24, 2025
valadaptive added a commit to valadaptive/egui that referenced this pull request May 26, 2025
valadaptive added a commit to valadaptive/egui that referenced this pull request Jun 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
egui epaint performance Lower CPU/GPU usage (optimize)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimizing re-layout of 1MB+ pieces of text in a TextEdit
2 participants
0