Tags · SiQube/pytorch · GitHub

Tags: SiQube/pytorch

Tags

ciflow/trunk/139952

Use TORCH_DECLARE_XXX

ciflow/trunk/139837

Update on "[pytorch/profiler] Profiler NCCL metadata can now contain collective Input and Output Tensor addrs"


Studying memory access patterns is the primary use case.

Internal: The data may be used to find the percentage of operators that may cause alignment-related overhead.

Differential Revision: [D64413699](https://our.internmc.facebook.com/intern/diff/D64413699/)

cc robieta chaekit guotuofeng guyang3532 dzhulgakov davidberard98 briancoutinho sraikund16

[ghstack-poisoned]

ciflow/trunk/139588

update executorch commit hash

ciflow/trunk/125806

update vision commit hash

ciflow/mps/140196


ciflow/mps/139952

Use TORCH_DECLARE_XXX

ciflow/inductor/140195

improve heuristics for operator reordering for peak memory

[ghstack-poisoned]

ciflow/inductor/139849

Update on "[logging] Overhaul dynamo_timed and CompilationMetrics logging."


Here's the overview:

There's a new context manager singleton called MetricsContext. Entering the MetricsContext demarcates the boundary within which we create a single CompilationMetrics object and, therefore, a single dynamo_compile log entry. While inside the MetricsContext, we can update or set many different metrics. Most importantly, `dynamo_timed` can also update the in-progress MetricsContext: in the proposal here, we tell `dynamo_timed` to do so by providing the name of the MetricsContext field to increment. There can be many `dynamo_timed` calls in different parts of the code updating different fields. When the MetricsContext exits, everything gathered is finally logged. One potential footgun is using `dynamo_timed` when we haven't entered the MetricsContext, but we assert on that problem. Another is re-entering the context recursively, but we watch for that and do the logging only when the outermost context exits.
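The behavior described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the code from the PR: the class body, the `on_exit` callback, the `increment` method name, and the field names `phase_a_ns` and `phase_b_ns` are all hypothetical. It shows only the two guarded cases: timing outside the context asserts, and a recursive re-entry logs once, when the outermost context exits.

```python
import time
from contextlib import contextmanager


class MetricsContext:
    """Toy re-entrant metrics collector (hypothetical, not PyTorch's class)."""

    def __init__(self, on_exit):
        self._on_exit = on_exit  # called once with the gathered metrics
        self._level = 0          # nesting depth of active `with` blocks
        self._metrics = {}

    def in_progress(self):
        return self._level > 0

    def __enter__(self):
        if self._level == 0:
            self._metrics = {}   # start a fresh record at the outermost entry
        self._level += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._level -= 1
        if self._level == 0:     # log only when the outermost context exits
            self._on_exit(dict(self._metrics))
        return False

    def increment(self, field, value):
        assert self.in_progress(), "metric updated outside a MetricsContext"
        self._metrics[field] = self._metrics.get(field, 0) + value


logged = []  # stands in for the real log sink
metrics_context = MetricsContext(on_exit=logged.append)


@contextmanager
def dynamo_timed(field):
    """Time the enclosed block and add the elapsed ns to `field`."""
    assert metrics_context.in_progress(), "dynamo_timed used outside a MetricsContext"
    start = time.perf_counter_ns()
    try:
        yield
    finally:
        metrics_context.increment(field, time.perf_counter_ns() - start)


with metrics_context:
    with dynamo_timed("phase_a_ns"):
        with metrics_context:  # recursive re-entry: no extra log entry
            with dynamo_timed("phase_b_ns"):
                pass
```

After the outer `with` block finishes, `logged` holds a single dictionary containing both fields, which mirrors the "one context, one log entry" rule.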

Some specifics:
* Introduce MetricsContext - a context manager that, on exit, records the CompilationMetrics (which also logs to dynamo_compile).
* Completely remove the concept of frame_phase_timing. Instead, update the MetricsContext during compilation, either directly or via dynamo_timed.
* Remove some globals we previously used to accumulate counters to later populate a CompilationMetrics. We use CompilationMetrics set/update/increment APIs instead.
* `record_compilation_metrics` is now called on exit from MetricsContext.
* Populate legacy CompilationMetrics fields right before logging, inside `record_compilation_metrics`.
* Remove the one-off `add_remote_cache_time_saved` helper; capture that timing directly into the MetricsContext.

And specifically, several changes to dynamo_timed:
* "Modernize" the parameters and update all callsites accordingly.
* Move the backwards logging of the CompilationMetrics to the backwards compile location.
* Add a parameter for which CompilationMetrics field to update.

cc ezyang SherlockNoMad EikanWang jgong5 wenzhe-nrv voznesenskym penguinwu Guobing-Chen XiaobingSuper zhuhaozhe blzheng jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang aakhundov

[ghstack-poisoned]

ciflow/inductor/139839

[Inductor] Expand dtype aware codegen for unary ops (pytorch#139839)

Summary:

Previously, only `torch.sqrt` was dtype aware. This PR makes most of the unary ops dtype aware as well. This is necessary to get correct code for libdevice and tl.math ops, as their inputs need to be upcast to float32.
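As a rough illustration of what "dtype aware" means here, the toy generator below emits an expression string for a unary op and inserts an upcast when the input dtype is low precision. It is a sketch only and does not reflect Inductor's actual codegen classes: the `emit_unary` helper, the `LOW_PRECISION` set, and the choice of which dtypes count as low precision are assumptions made for the example.

```python
# Hypothetical set of dtypes that must be widened before a libdevice call.
LOW_PRECISION = {"float16", "bfloat16"}


def emit_unary(op, operand, dtype):
    """Return a Triton-style expression string applying `op` to `operand`.

    Low-precision inputs are upcast to float32 first; other dtypes are
    passed through unchanged.
    """
    if dtype in LOW_PRECISION:
        operand = f"{operand}.to(tl.float32)"
    return f"libdevice.{op}({operand})"


half_expr = emit_unary("sqrt", "tmp0", "float16")
full_expr = emit_unary("sqrt", "tmp0", "float32")
print(half_expr)  # libdevice.sqrt(tmp0.to(tl.float32))
print(full_expr)  # libdevice.sqrt(tmp0)
```

A dtype-unaware generator would emit the second form for both inputs, which is the case the summary says produces incorrect code for these ops.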

Test Plan: Added CI tests for all the new ops.

Differential Revision: D65517197

ciflow/inductor/139588

update executorch commit hash
