tg4perfetto Additional instructions for Quick Installation / Quickstart · Issue #339 · mirage-project/mirage · GitHub
tg4perfetto Additional instructions for Quick Installation / Quickstart #339
Open
@robrussell

To get the example python demo/qwen3/demo.py running, I followed the Quick Installation section but had to do something more like this:

git clone --recursive --branch mpk https://www.github.com/mirage-project/mirage
cd mirage
python -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
pip install transformers mpi4py google protobuf google protobuf3 tg4perfetto
pip install -e . -v
export MIRAGE_HOME=$(pwd)

This is with Python 3.10 on Ubuntu 22.04 in WSL2 (without conda). After that I was able to run python demo/qwen3/demo.py and python demo/qwen3/demo.py --use-mirage and see the reduced per-token latency from compilation.
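An empty or wrong MIRAGE_HOME is easy to miss, because the export itself never fails. The following is a minimal sketch of a sanity check, not part of Mirage. The function name check_mirage_home and the list of expected subdirectories are assumptions; the subdirectories are taken from the nvcc include paths in the logs below, and whether Mirage itself reads MIRAGE_HOME this way is not confirmed here.

```python
import os
from pathlib import Path

# Subdirectories that show up in the nvcc include paths logged in this issue.
# Treating them as markers of a Mirage checkout is an assumption.
EXPECTED_SUBDIRS = ("include", "python/mirage", "deps/cutlass/include")


def check_mirage_home(environ=os.environ):
    """Return a list of problems; an empty list means the check passed."""
    value = environ.get("MIRAGE_HOME", "")
    if not value:
        return ["MIRAGE_HOME is unset or empty"]
    root = Path(value)
    if not root.is_dir():
        return [f"MIRAGE_HOME={value!r} is not a directory"]
    # Report every expected subdirectory that is absent under the root.
    return [f"missing {sub}" for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


for problem in check_mirage_home():
    print("problem:", problem)
```

Run from the activated venv, this prints nothing when MIRAGE_HOME points at a directory containing all three subdirectories.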

There does still seem to be some failure with traces from tg4perfetto when I try python demo/qwen3/demo.py --use-mirage --profiling, though. There was a release of tg4perfetto just yesterday. Neither 0.0.4 nor 0.0.6 worked in my case. Maybe it needs to be installed from a particular git commit?

Here are the errors from some attempts:

Version 0.0.4

$ pip install tg4perfetto==0.0.4
Collecting tg4perfetto==0.0.4
  Using cached tg4perfetto-0.0.4-py3-none-any.whl (208 kB)
Installing collected packages: tg4perfetto
Successfully installed tg4perfetto-0.0.4
$ python demo/qwen3/demo.py --use-mirage --profiling
Input arguments: Namespace(use_mirage=True, profiling=True)
world_size(1) rank(0)
Loading checkpoint shards: 100%|███████████████████████████████████| 5/5 [00:17<00:00,  3.44s/it]
Triggered events: 531
Executed tasks: 12002
Triggered events: 531
Executed tasks: 12002
Compiling megakernel using the following command line:
['/usr/local/cuda-12.6/bin/nvcc', '/tmp/tmpw913bysw/test.cu', '-O3', '-I/usr/include/python3.10', '-I/home/.../mirage/python/mirage/../../include', '-I/home/.../mirage/python/mirage/../../include/mirage/persistent_kernel', '-I/home/.../mirage/python/mirage/../../deps/cutlass/include', '-Ideps/json/include', '-arch=native', '-shared', '-std=c++17', '-rdc=true', '-use_fast_math', '-Xcompiler=-fPIC', '--expt-relaxed-constexpr', '-o', '/tmp/tmpw913bysw/test.cpython-38-x86_64-linux-gnu.so']
Finished megakernel compilation...
[SCHD] sched_id(37) first_worker(74) last_worker(76)
[SCHD] sched_id(16) first_worker(32) last_worker(34)

...

[SCHD] sched_id(35) first_worker(70) last_worker(72)
Finished Launch Persistent Kernel
Traceback (most recent call last):
  File "/home/.../mirage/demo/qwen3/demo.py", line 436, in <module>
    mpk()
  File "/home/.../mirage/python/mirage/persistent_kernel.py", line 591, in __call__
    from .profiler_persistent import export_to_perfetto_trace
  File "/home/.../mirage/python/mirage/profiler_persistent.py", line 9, in <module>
    from tg4perfetto import TraceGenerator
  File "/home/.../mirage/.venv/lib/python3.10/site-packages/tg4perfetto/__init__.py", line 1, in <module>
    from ._tgen import TraceGenerator
  File "/home/.../mirage/.venv/lib/python3.10/site-packages/tg4perfetto/_tgen.py", line 1, in <module>
    from ._core import _BaseTraceGenerator
  File "/home/.../mirage/.venv/lib/python3.10/site-packages/tg4perfetto/_core.py", line 1, in <module>
    from . import perfetto_trace_pb2 as pb2
  File "/home/.../mirage/.venv/lib/python3.10/site-packages/tg4perfetto/perfetto_trace_pb2.py", line 32, in <module>
    _descriptor.EnumValueDescriptor(
  File "/home/.../mirage/.venv/lib/python3.10/site-packages/google/protobuf/descriptor.py", line 933, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates

Latest version (0.0.6):

$ pip uninstall tg4perfetto
Found existing installation: tg4perfetto 0.0.4
Uninstalling tg4perfetto-0.0.4:
...
Proceed (Y/n)?
  Successfully uninstalled tg4perfetto-0.0.4
$ pip install tg4perfetto
Collecting tg4perfetto
  Using cached tg4perfetto-0.0.6-py3-none-any.whl
Requirement already satisfied: protobuf in ./.venv/lib/python3.10/site-packages (from tg4perfetto) (6.31.1)
Installing collected packages: tg4perfetto
Successfully installed tg4perfetto-0.0.6
$ python demo/qwen3/demo.py --use-mirage --profiling
Input arguments: Namespace(use_mirage=True, profiling=True)
world_size(1) rank(0)
Loading checkpoint shards: 100%|███████████████████████████████████| 5/5 [00:18<00:00,  3.70s/it]
Triggered events: 531
Executed tasks: 12002
Triggered events: 531
Executed tasks: 12002
Compiling megakernel using the following command line:
['/usr/local/cuda-12.6/bin/nvcc', '/tmp/tmps8s3m7k1/test.cu', '-O3', '-I/usr/include/python3.10', '-I/home/.../mirage/python/mirage/../../include', '-I/home/.../mirage/python/mirage/../../include/mirage/persistent_kernel', '-I/home/.../mirage/python/mirage/../../deps/cutlass/include', '-Ideps/json/include', '-arch=native', '-shared', '-std=c++17', '-rdc=true', '-use_fast_math', '-Xcompiler=-fPIC', '--expt-relaxed-constexpr', '-o', '/tmp/tmps8s3m7k1/test.cpython-38-x86_64-linux-gnu.so']
Finished megakernel compilation...
[SCHD] sched_id(19) first_worker(38) last_worker(40)
...
[SCHD] sched_id(30) first_worker(60) last_worker(62)
[SCHD] sched_id(35) first_worker(70) last_worker(72)
Finished Launch Persistent Kernel
Traceback (most recent call last):
  File "/home/.../mirage/demo/qwen3/demo.py", line 436, in <module>
    mpk()
  File "/home/.../mirage/python/mirage/persistent_kernel.py", line 591, in __call__
    from .profiler_persistent import export_to_perfetto_trace
  File "/home/.../mirage/python/mirage/profiler_persistent.py", line 9, in <module>
    from tg4perfetto import TraceGenerator
  File "/home/.../mirage/.venv/lib/python3.10/site-packages/tg4perfetto/__init__.py", line 1, in <module>
    from ._tgen import TraceGenerator
  File "/home/.../mirage/.venv/lib/python3.10/site-packages/tg4perfetto/_tgen.py", line 1, in <module>
    from ._core import _BaseTraceGenerator
  File "/home/.../mirage/.venv/lib/python3.10/site-packages/tg4perfetto/_core.py", line 1, in <module>
    from . import perfetto_trace_pb2 as pb2
  File "/home/.../mirage/.venv/lib/python3.10/site-packages/tg4perfetto/perfetto_trace_pb2.py", line 33, in <module>
    _descriptor.EnumValueDescriptor(
  File "/home/.../mirage/.venv/lib/python3.10/site-packages/google/protobuf/descriptor.py", line 933, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
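Both tracebacks end in the same protobuf error, and the pip output above shows protobuf 6.31.1 in the venv. As a rough sketch, the check below reports whether the installed protobuf is newer than the 3.20.x series that the error message names as the downgrade target. The helper names breaks_legacy_pb2 and describe_protobuf are made up for illustration. It only tests the condition stated in the error message; whether a downgrade is compatible with the other packages in the venv is not something this checks.

```python
from importlib import metadata


def breaks_legacy_pb2(protobuf_version: str) -> bool:
    """True if the version is newer than 3.20.x, the series the error message
    says still accepts old generated _pb2.py files."""
    major, minor = (int(part) for part in protobuf_version.split(".")[:2])
    return (major, minor) > (3, 20)


def describe_protobuf(package: str = "protobuf") -> str:
    """Summarise the installed protobuf version, if any."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} is not installed"
    if breaks_legacy_pb2(version):
        return f"{package} {version}: old generated _pb2.py files are rejected by default"
    return f"{package} {version}: old generated _pb2.py files should still load"


print(describe_protobuf())
```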

Trying the PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python recommendation:

$ PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python python demo/qwen3/demo.py --use-mirage --profiling
Input arguments: Namespace(use_mirage=True, profiling=True)
world_size(1) rank(0)
Loading checkpoint shards: 100%|███████████████████████████████████| 5/5 [00:03<00:00,  1.27it/s]
Triggered events: 531
Executed tasks: 12002
Triggered events: 531
Executed tasks: 12002
Compiling megakernel using the following command line:
['/usr/local/cuda-12.6/bin/nvcc', '/tmp/tmplzj2xwtr/test.cu', '-O3', '-I/usr/include/python3.10', '-I/home/.../mirage/python/mirage/../../include', '-I/home/.../mirage/python/mirage/../../include/mirage/persistent_kernel', '-I/home/.../mirage/python/mirage/../../deps/cutlass/include', '-Ideps/json/include', '-arch=native', '-shared', '-std=c++17', '-rdc=true', '-use_fast_math', '-Xcompiler=-fPIC', '--expt-relaxed-constexpr', '-o', '/tmp/tmplzj2xwtr/test.cpython-38-x86_64-linux-gnu.so']
Finished megakernel compilation...
[SCHD] sched_id(15) first_worker(30) last_worker(32)
[SCHD] sched_id(18) first_worker(36) last_worker(38)
...
[SCHD] sched_id(16) first_worker(32) last_worker(34)
Finished Launch Persistent Kernel
Traceback (most recent call last):
  File "/home/.../mirage/demo/qwen3/demo.py", line 436, in <module>
    mpk()
  File "/home/.../mirage/python/mirage/persistent_kernel.py", line 593, in __call__
    export_to_perfetto_trace(
  File "/home/.../mirage/python/mirage/profiler_persistent.py", line 83, in export_to_perfetto_trace
    event = event_name_list[event_idx] + f"_{event_no}"
KeyError: 0
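A KeyError (rather than an IndexError) on event_name_list[event_idx] suggests that event_name_list behaves like a mapping here and has no entry for key 0. The sketch below is not Mirage's code: label_event and the sample mapping are hypothetical, and the real contents of event_name_list are not visible in the traceback. It only shows how a lookup with a fallback keeps the export going when an index is missing.

```python
def label_event(event_names, event_idx, event_no):
    """Build a label such as 'name_7', using a placeholder for unknown indices."""
    name = event_names.get(event_idx, f"unknown_event_{event_idx}")
    return f"{name}_{event_no}"


# Hypothetical mapping for illustration only; it has no entry for index 0.
example_names = {1: "example_a", 2: "example_b"}

print(label_event(example_names, 1, 7))  # known index
print(label_event(example_names, 0, 7))  # missing index, no KeyError
```

A fallback like this hides the symptom only. Why index 0 is missing from the mapping would still need to be tracked down.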

The trace file mirage_0.perfetto-trace is created but I don't know what's in it exactly.
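One way to get a first look at the file without any extra dependencies is to count its top-level packets. This is a minimal sketch, assuming the file is a serialized Perfetto Trace message, which is a sequence of length-delimited TracePacket entries in field 1. The function names are made up, and a partially written file may raise an error on its last, truncated packet.

```python
def read_varint(data: bytes, pos: int):
    """Decode a protobuf base-128 varint at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def count_trace_packets(data: bytes) -> int:
    """Count top-level length-delimited fields with field number 1."""
    pos = 0
    count = 0
    while pos < len(data):
        tag, pos = read_varint(data, pos)
        field_number, wire_type = tag >> 3, tag & 0x7
        if wire_type != 2:
            raise ValueError(f"unexpected wire type {wire_type} at top level")
        length, pos = read_varint(data, pos)
        if pos + length > len(data):
            raise ValueError("truncated packet")
        if field_number == 1:
            count += 1
        pos += length  # skip the packet body without decoding it
    return count


# Usage, with the file name from this issue:
#   from pathlib import Path
#   print(count_trace_packets(Path("mirage_0.perfetto-trace").read_bytes()))
```

A count of zero, or an error on the very first field, would suggest the file was created but never filled with events.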

Metadata

Labels: bug (Something isn't working)
Status: Todo