8000 Fix negative duration panic on macos by Enter-tainer · Pull Request #475 · linebender/vello · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Fix negative duration panic on macos #475

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 4, 2024

Conversation

Enter-tainer
Copy link
Contributor

The winit example sometimes panics when running on macos. This pr fix this by adding a .max(0.0) to clip it to 0 before making a Duration

can not convert float seconds to Duration: value is negative
stack backtrace:
   0: rust_begin_unwind
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
   1: core::panicking::panic_fmt
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
   2: core::panicking::panic_display
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:196:5
   3: core::time::Duration::from_secs_f64::panic_cold_display
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panic.rs:99:13
   4: core::time::Duration::from_secs_f64
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/time.rs:762:23
   5: with_winit::stats::draw_gpu_profiling::{{closure}}
             at ./examples/with_winit/src/stats.rs:389:17
   6: with_winit::stats::traverse_profiling
             at ./examples/with_winit/src/stats.rs:443:9
   7: with_winit::stats::draw_gpu_profiling
             at ./examples/with_winit/src/stats.rs:348:5
   8: with_winit::run::{{closure}}
             at ./examples/with_winit/src/lib.rs:418:33

@@ -386,7 +386,7 @@ pub fn draw_gpu_profiling(
cur_text_y = text_y + text_height;
let label = format!(
"{:.2?} - {:.30}",
Duration::from_secs_f64(this_time),
Duration::from_secs_f64(this_time.max(0.0)),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this is a bug in wgpu or something else where a monotonic timer isn't being used ...

@DJMcNab
Copy link
Member
DJMcNab commented Feb 29, 2024

Thanks for the PR - I just have one question, which is have you validated how negative the number is?

I ask because if it's very small ($\approx -\frac{1}{5000}$, i.e. -0.2ms), then that's probably fine, but if it's much larger than that, I suspect there's some data corruption going on. If that's the case, we should add a note to these text values of the actual amount, instead of just clamping it to zero.

I believe @DasLixou was seeing something similar on the Zulip, where the timings from the profiler didn't seem right. Although in that case, it was AMD DirectX12 rather than macOS

@Enter-tainer
Copy link
Contributor Author

It is about -20ms 🤔 I guess it might be a problem?

My device is MacBook M2 Pro 32GB

image image

@DJMcNab
Copy link
Member
DJMcNab commented Feb 29, 2024

Yeah, that suggests there's some definite weirdness going on. The number always being the same is also extremely suspect - that suggests there's something more specific going on.

Can you print the byte value of that number (i.e. eprintln!("{:x?}", this_time.to_bits())?

@Enter-tainer
Copy link
Contributor Author
Enter-tainer commented Feb 29, 2024
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Many same lines...
Negative time: -0.01713495794683695, bits: bf918bd390000000
...
Negative time: -0.01640879176557064, bits: bf90cd7760000000
...
Details

Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.014637250453233719, bits: bf8dfa2280000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01713495794683695, bits: bf918bd390000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000
Negative time: -0.01640879176557064, bits: bf90cd7760000000

@DJMcNab
Copy link
Member
DJMcNab commented Feb 29, 2024

I think this PR should show the negative value in the visible output (instead of zero). That is, by producing a Duration of the absolute value.
Maybe we should also log?

It's not clear to me why this is happening - @Wumpf have you got any ideas?

Edit: To be clear - I'll happily merge this if it doesn't clamp but shows the negative value, perhaps with an (!) for warning. Because it's important that this works, even though it's not working as well as it should

@DasLixou
Copy link
Contributor

No I did some more digging with the profiler, my other graphics card works fine, but the RX 6600 (now with latest drivers) somehow misplaces the timing for the scopes (parent scope sometimes begin 40ms after all the other stuff has already run. This happens for dx12 and Vulkan so I think it's a bug in the driver giving back the time, but I'm not sure and don't know any further now

@DasLixou
Copy link
Contributor

OK just read the whole conversation, if this is the same problem as I have, then i think its a bug between wgpu_profiler and the graphics card. Maybe we could do a crappy wgpu render example with wgpu_profiler and see if there's also the problem

@Wumpf
Copy link
Wumpf commented Feb 29, 2024

I believe this is somewhere between known & hard to avoid issue and problematic barrier placement in wgpu 🤔
I'm a little bit surprised it happens on dx12 & Vulkan since I've experienced this most commonly on Metal so far: The issue is that right now it's a bit too poorly defined what waits on what in the presence of timer scopes, so pairs of queries may internally be placed with different command buffer/encoder. This is also why the WebGPU spec recently removed timer queries for anything but the ones that are directly placed on compute/draw-pass boundaries.

@Wumpf
Copy link
Wumpf commented Feb 29, 2024

that said this all depends ofc where much on the use of timer scopes. I'll see if I can find a bit of time to poke around here to learn more - any particular places in the code that I should look at?

@DJMcNab
Copy link
Member
DJMcNab commented Feb 29, 2024

Thanks - at least it's not too unexpected. Is there a wgpu issue we can track?
I'm slightly concerned that WebGPU is removing this - it is useful to have a view into what's taking the most time.

The main use of the scopes is in WgpuEngine::run_recording, and a lot of the runtime/interaction with wgpu_profiler is in the with_winit crate, which is our main demo

@Wumpf
Copy link
Wumpf commented Mar 1, 2024

These two are fitting best (given that it's really the issue I'm suspecting):

@DJMcNab
Copy link
Member
DJMcNab commented Mar 1, 2024

I don't think those explain the really odd behaviour we're seeing.

Like, I don't think it's reasonable for a dispatch to finish 20ms before it started.

Those issues seem to be discussing how two timestamps can happen arbitrarily close to one another, even if the passes themselves are slow. But I don't think that should mean one timestamp happens before the other?

@Wumpf
Copy link
Wumpf commented Mar 1, 2024

Since gpu processing of things that aren't protected by barriers can be reordered in execution arbitrarily it is still very relevant. Right now it can easily happen that one query is part of a block of memory operations and another is part of a dispatch and then the driver/gpu decides to reorder one after the other (that sort of thing is what happens on wgpu's Metal implementation a lot because of the way it creates MTLCommandEncoder on the fly).

But there could be more to it. Probably also best to check with RenderDoc/XCode-frame-profiler to see where which barrier and which query ends up

Enter-tainer and others added 2 commits March 3, 2024 00:31
< 8000 input type="hidden" name="disable_live_updates" value="false" autocomplete="off" data-targets="batch-deferred-content.inputs" />
Copy link
Member
@DJMcNab DJMcNab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I've also added a comment which explains this

@DJMcNab DJMcNab enabled auto-merge March 4, 2024 08:39
@DJMcNab DJMcNab added this pull request to the merge queue Mar 4, 2024
Merged via the queue into linebender:main with commit 387c7be Mar 4, 2024
@Enter-tainer Enter-tainer deleted the fix-panic branch March 4, 2024 09:22
@xStrom
Copy link
Member
xStrom commented Mar 5, 2024

I don't think those explain the really odd behaviour we're seeing.

Like, I don't think it's reasonable for a dispatch to finish 20ms before it started.

Those issues seem to be discussing how two timestamps can happen arbitrarily close to one another, even if the passes themselves are slow. But I don't think that should mean one timestamp happens before the other?

The Vulkan spec only provides vkCmdWriteTimestamp which has happens-after syncing. That means that even a malicious implementation that returns the same value for all calls to vkCmdWriteTimestamp is compliant. In less malicious cases you can still end up with a scenario described in the first post of gpuweb#3952:

writeTimestamp(A)
do_some_gpu_work(...)
writeTimestamp(B)

// Running via Vulkan, gets reordered on the GPU as:

do_some_gpu_work(...)
writeTimestamp(A) // Totally compliant with Vulkan spec
writeTimestamp(B)

So the quality of measurements will depend on how much the GPU manufacturer went beyond the spec in retaining execution order. Very unfortunate, especially as the reordering can be dynamic and depend on the measured instructions. So measuring a different execution time may just mean that the GPU optimizer resulted in a different execution order.


The issue on macOS is slightly different, because Metal doesn't provide even the already weak options that Vulkan has. Or at least the implementations haven't found those options yet. I didn't find a Metal specific wgpu issue but for Dawn there is this issue. Dawn has a custom solution for writing timestamps on Metal and that solution is completely broken in a way where different timestamp writes can be completely reorderd.

writeTimestamp(A)
do_some_gpu_work(...)
writeTimestamp(B)

// Running via Metal, gets even worse:

do_some_gpu_work(...)
writeTimestamp(B)
writeTimestamp(A) // The joys of the current Dawn implementation

Even though this is a Dawn issue and I didn't find the matching wgpu issue, there is good reason to believe that wgpu suf 628C fers from the same problem. That's because I looked through the notes of two different WebGPU meetings (1, 2) and the problem is discussed. There was no mention that it's solved in wgpu. Instead we get:

Couldn't make writeTimestamp work as expected on Metal

I would like to reintroduce writeTimestamp but not 100% sure how to do it.

can do an investigation for writeTimestamp offline (for Metal) and come back to writeTimestamp later?


In summary, here's where we stand right now:

  • writeTimestamp was removed from the WebGPU spec (gpuweb#4370).
  • The already existing implementation on Vulkan may not work as developers expect, because the spec demands only happens-after sync.
  • The already existing implementation on Metal is completely unreliable, negative deltas should be fully expected right now.
  • Hopefully a better solution is found for Metal, maybe also for Vulkan & DX12. More reliable timers that have a performance hit cost are being considered. After more clarity is found, timers will be added back to the WebGPU spec.

@DJMcNab
Copy link
Member
DJMcNab commented Mar 5, 2024

Even taking that into account, which I agree definitely permits some negative durations, I still think there's some strangeness. I'm less sure that permits this negative duration.

I'm assuming two fairly weak invariants in making that assertion:

  1. The timestamp can't be at a time before its command was sent to the GPU
  2. The timestamp can't be at a time after its data was recieved from the GPU

The 20ms gap is long enough for a complete cycle to have been done, i.e. if everything is working ccorrectly, we're seeing one of those invariants be broken (also because we're blocking on the timestamp data being returned, because of using Poll::Wait in our maintain call)
Which means that either:

  1. Metal's timestamp API is fundamentally meaningless, in that it just makes up any time it likes
  2. We have some data corruption occuring

@xStrom
Copy link
Member
xStrom commented Mar 5, 2024

20ms does seem awfully long for M2 Pro, so indeed there's probably an additional factor.

I checked the timer Metal code in wgpu and it does seem to have the same strategy as Dawn. It also has a bunch of comments describing various bugs with Metal.

So we know for a fact that both the wgpu timer implementation is buggy and that Metal has bugs in relation to timers. Wouldn't be that surprising if there is another bug in there causing the 20ms result.

@Wumpf
Copy link
Wumpf commented Mar 6, 2024

excellent writeup @xStrom !
And oh god that metal implementation was awful. There's a higher level issue in wgpu on top around we set up new CommandEncoder (even when not hitting the bug-case with the lengthy description): it can very easily happen that the timestamp ends up on the "wrong one" which has nothing to do with the thing you're trying to measure, which is essentially "encouraging" metal to schedule it somewhere else entirely. The only thing that is somewhat reliable is timestamp writes that are specified as part of a render/compute pass. This is also the only thing that remains in WebGPU for the time being.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants
0