vLLM: use CUDA graph for generation #115

parthchadha · 2025-04-01T21:28:26Z

parthchadha self-assigned this Apr 1, 2025

parthchadha mentioned this issue Apr 1, 2025

feat: use cuda_graph by default for vllm #116

Merged

4 tasks

parthchadha closed this as completed in #116 Apr 1, 2025
