Zihang Lai, Andrea Vedaldi
Visual Geometry Group, University of Oxford
A plug-and-play transformer layer to turn image-based models into state-of-the-art video models using point tracking.
Tracktention is a novel architectural module that improves temporal consistency in video tasks like depth estimation and colorization. It leverages modern point trackers to explicitly align features across frames using attention — converting powerful image-based models into robust, temporally aware video models with minimal overhead.
- Tracktention Layer: Enhances existing ViT and ConvNet backbones with motion-aware temporal reasoning.
- Plug-and-Play: Easily integrates into existing models like Depth Anything.
- Lightweight: Only ~17M additional parameters with minimal runtime overhead.
- State-of-the-Art: Outperforms leading video models in depth prediction and video colorization benchmarks.
Tracktention consists of three stages (see the sketch after this list):
- Attentional Sampling: Pool features from image tokens to track tokens using cross-attention.
- Track Transformer: Propagate features along tracks for temporal consistency.
- Attentional Splatting: Redistribute processed track tokens back to image tokens.
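
The PyTorch sketch below illustrates these three stages for intuition only. It is not the released implementation: the tensor shapes, module sizes, and the use of `nn.MultiheadAttention` / `nn.TransformerEncoder` are our assumptions.

```python
import torch
import torch.nn as nn


class TracktentionSketch(nn.Module):
    """Minimal, illustrative sketch of the three Tracktention stages.

    Assumed shapes: image tokens `x` are (B, T, N, C) for T frames of N tokens
    each; track embeddings `track_emb` are (B, T, K, C) for K point tracks,
    e.g. encodings of CoTracker3 track coordinates.
    """

    def __init__(self, dim=384, heads=8, depth=2):
        super().__init__()
        # Attentional Sampling: track tokens query image tokens.
        self.sample_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Track Transformer: temporal self-attention along each track.
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.track_transformer = nn.TransformerEncoder(layer, depth)
        # Attentional Splatting: image tokens query the processed track tokens.
        self.splat_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x, track_emb):
        B, T, N, C = x.shape
        K = track_emb.shape[2]

        # 1) Attentional Sampling: pool image features into track tokens
        #    (per-frame cross-attention: queries = tracks, keys/values = image tokens).
        q = track_emb.reshape(B * T, K, C)
        kv = x.reshape(B * T, N, C)
        track_tok, _ = self.sample_attn(q, kv, kv)

        # 2) Track Transformer: propagate features along time for each track.
        track_tok = track_tok.reshape(B, T, K, C).permute(0, 2, 1, 3)    # (B, K, T, C)
        track_tok = self.track_transformer(track_tok.reshape(B * K, T, C))
        track_tok = track_tok.reshape(B, K, T, C).permute(0, 2, 1, 3)    # (B, T, K, C)

        # 3) Attentional Splatting: redistribute track features back to image tokens
        #    (queries = image tokens, keys/values = track tokens), added residually.
        q = x.reshape(B * T, N, C)
        kv = track_tok.reshape(B * T, K, C)
        out, _ = self.splat_attn(q, kv, kv)
        return x + out.reshape(B, T, N, C)
```

In a real backbone, a layer like this would be inserted between existing transformer blocks, so the image model keeps its pretrained weights while gaining track-guided temporal reasoning.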
We use CoTracker3 to generate point tracks.
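
As a rough end-to-end illustration, the snippet below obtains point tracks from CoTracker3 via torch.hub and feeds them to the sketch above. The hub entry-point name, the `grid_size` argument, and the naive coordinate embedding are assumptions rather than the official pipeline; consult the CoTracker3 repository and the released Tracktention code for the exact interfaces.

```python
import torch
import torch.nn as nn

# Dummy 16-frame clip; real usage would load actual video frames,
# since tracking random noise produces meaningless tracks.
video = torch.randn(1, 16, 3, 384, 384)  # (B, T, 3, H, W)

# Point tracks from CoTracker3 (hub entry-point name assumed).
cotracker = torch.hub.load("facebookresearch/co-tracker", "cotracker3_offline")
pred_tracks, pred_visibility = cotracker(video, grid_size=20)  # tracks: (B, T, K, 2)

# Naive track embedding: project normalized (x, y) coordinates to the token
# dimension. Tracktention may encode tracks differently; this is a placeholder.
coord_embed = nn.Linear(2, 384)
track_emb = coord_embed(pred_tracks / video.shape[-1])

# Per-frame image tokens from an image backbone (random placeholders here).
x = torch.randn(1, 16, 576, 384)  # (B, T, N, C)

layer = TracktentionSketch(dim=384)
out = layer(x, track_emb)  # temporally refined tokens, same shape as x
```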
Note: Usage instructions will be provided once the codebase is officially released.
If you use this code or Tracktention in your research, please cite:
@inproceedings{lai2025tracktention,
  title={Tracktention: Leveraging Point Tracking to Attend Videos Faster and Better},
  author={Zihang Lai and Andrea Vedaldi},
  booktitle={CVPR},
  year={2025}
}