CUDA Programming Tutorials

A comprehensive collection of CUDA programming examples for teaching parallel computing concepts using NVIDIA GPUs. This repository is designed for students learning CUDA programming in Windows environments with Visual Studio.

Repository Structure

Each tutorial is a self-contained project with full CMake support, detailed documentation, and batch files for easy building and execution:

01-cuda-basics - Introduction to CUDA fundamentals and device querying
02-vector-addition - Basic vector addition operations
03-memory-types - Global, shared, constant, and texture memory types
04-thread-organization - Thread, block, and grid organization
05-matrix-multiplication - Different matrix multiplication strategies
06-reduction-operations - Parallel reduction operations
07-atomic-operations - Atomic operations and thread safety
08-stream-processing - Asynchronous operations and CUDA streams
09-texture-processing - Image processing and texture memory usage
10-dynamic-parallelism - Dynamic parallelism (launching kernels from device code)
11-cuda-libraries - Using CUDA libraries
12-multi-gpu - Multi-GPU programming
13-unified-memory - Unified Memory (managed memory shared by host and device)
14-optimization - Performance optimization techniques
15-debugging - Debugging CUDA applications

Prerequisites

Windows 10/11
Visual Studio 2022 Community Edition
CUDA Toolkit 12.0 or newer
CMake 3.20 or higher
C++17-compatible compiler
NVIDIA GPU with Compute Capability 5.0 or higher (CUDA Toolkit 12.x no longer supports Compute Capability 3.x devices)
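
The Compute Capability requirement listed above can be checked in code. The following is a minimal host-side C++ sketch of the comparison only. The ComputeCapability struct and the meetsMinimum helper are hypothetical names introduced for illustration; they are not part of this repository or of the CUDA API. In a real program the two numbers would typically be read from the major and minor fields of cudaDeviceProp after calling cudaGetDeviceProperties.

```cpp
#include <cassert>

// Compute capability split into its two parts, e.g. 8.6 -> {8, 6}.
// Hypothetical helper type; in a CUDA program these values would
// typically come from cudaDeviceProp::major and cudaDeviceProp::minor.
struct ComputeCapability {
    int ccMajor;
    int ccMinor;
};

// Returns true when `device` is at least `required`.
// The major version decides first; the minor version only matters
// when both major versions are equal.
inline bool meetsMinimum(ComputeCapability device, ComputeCapability required) {
    assert(device.ccMajor >= 0 && device.ccMinor >= 0);
    if (device.ccMajor != required.ccMajor) {
        return device.ccMajor > required.ccMajor;
    }
    return device.ccMinor >= required.ccMinor;
}
```

Comparing the major version before the minor version avoids the common mistake of treating the capability as a decimal number, which breaks once a minor version needs two digits.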

Getting Started

Setting Up Your Environment

Install Visual Studio 2022 Community Edition
- During installation, select "Desktop development with C++"
- Ensure MSBuild tools are installed
Install CUDA Toolkit
- Download the installer from the NVIDIA CUDA Toolkit downloads page
- Select your version of Windows
- Follow the installation instructions
- Verify the installation with nvcc --version in a command prompt
Install CMake
- Download the installer from the CMake website
- Add CMake to the system PATH during installation

Building and Running the Examples

Each project follows the same pattern for building and execution:

Navigate to the project directory

cd 01-cuda-basics

Configure the project

configure.bat

Build the project

build_all.bat

Run the example

run.bat

Advanced Run Options

Most examples support additional command-line parameters:

run.bat --debug         # Run in debug mode with additional information
run.bat --release       # Run in release mode (optimized)
run.bat --benchmark     # Run performance benchmarks
run.bat --threads N     # Set specific thread configuration
run.bat --help          # Show all available options
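
This README does not say how the examples interpret --threads N. Assuming N means threads per block for a one-dimensional launch, the host-side arithmetic behind such an option can be sketched as below. The names isReasonableBlockSize and blocksFor are hypothetical and are not taken from the tutorials. The two constants reflect the warp size of 32 and the limit of 1024 threads per block on current NVIDIA GPUs.

```cpp
#include <cassert>
#include <cstddef>

constexpr int kWarpSize = 32;              // threads per warp
constexpr int kMaxThreadsPerBlock = 1024;  // per-block thread limit

// True when `threadsPerBlock` is a sensible 1D block size: positive,
// within the per-block limit, and a whole number of warps. The warp
// multiple is a guideline, not a hard requirement; it avoids launching
// partially filled warps.
inline bool isReasonableBlockSize(int threadsPerBlock) {
    return threadsPerBlock > 0 &&
           threadsPerBlock <= kMaxThreadsPerBlock &&
           threadsPerBlock % kWarpSize == 0;
}

// Number of blocks needed so that blocks * threadsPerBlock >= n.
// This is ceiling division written with integer arithmetic.
inline std::size_t blocksFor(std::size_t n, int threadsPerBlock) {
    assert(threadsPerBlock > 0);
    const std::size_t t = static_cast<std::size_t>(threadsPerBlock);
    return (n + t - 1) / t;
}
```

Under these assumptions, 1000 elements at 256 threads per block need 4 blocks, which launches 1024 threads in total, so a kernel has to guard against the 24 surplus threads in the last block with a bounds check on its global index.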

Learning Path

For the best learning experience, follow the tutorials in numerical order:

Start with 01-cuda-basics to understand CUDA fundamentals
Work through vector addition, memory types, thread organization, and matrix multiplication (02 through 05)
Learn parallel reduction and atomic operations (06 and 07)
Explore streams, texture processing, dynamic parallelism, CUDA libraries, multi-GPU programming, unified memory, and performance optimization (08 through 14)
Finish with debugging techniques in 15-debugging

Each directory contains its own detailed README.md with:

Explanation of concepts
Code walkthrough
Build and run instructions
Expected output and interpretation
Exercises and further exploration ideas

Common Issues and Troubleshooting

CUDA Installation Problems

Ensure your NVIDIA driver is up to date
Make sure you've installed the CUDA Toolkit that matches your system architecture
Verify the installation with nvcc --version

Build Errors

Check whether CMake can find CUDA by running cmake .. -G "Visual Studio 17 2022" -A x64 from a build subdirectory of the project
Ensure your GPU has the required Compute Capability
Check that Visual Studio has the "Desktop development with C++" workload installed

Runtime Errors

Verify your GPU driver is up to date
Check Windows Device Manager to ensure your GPU is recognized
Run examples in debug mode for more information: run.bat --debug

Additional Resources

Contributing

Contributions to improve the tutorials are welcome! Please feel free to submit pull requests or open issues to suggest improvements.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

NVIDIA for the CUDA platform and documentation
The parallel computing community for sharing knowledge and best practices
All contributors who help improve these teaching materials

Happy CUDA Programming!

ucoruh/cen310-cuda-tutorial-demos
