Aravind-11/Introduction-to-CUDA


CUDA Thrust Examples & Exercises: Introduction to CUDA

Dive into high-performance parallel computing with this collection of CUDA Thrust examples and hands-on exercises designed to sharpen your CUDA skills.

Overview

This repository is more than a showcase of code; it is meant as a hands-on learning environment. It provides CUDA samples built on NVIDIA's Thrust library, focusing on practical scenarios such as particle-based simulations. You can run the code, experiment with it, and test your understanding of parallel computing patterns and performance optimization techniques.

Examples & Exercises

Explore the code, then put your knowledge to the test! Each example serves as a basis for exercises designed to reinforce the concepts.

Advanced Performance Tuning

  • optimized_max_displacement.cu: Study an optimized approach using iterators to compute maximum displacement between particle sets. (Exercise: Analyze the memory access patterns.)
  • performance_comparison.cu: Compare naive vs. optimized methods for particle displacement calculations. (Exercise: Benchmark the code with varying data sizes and analyze the performance difference.)

(Note: Specific exercises might be detailed within the code comments or accompanying materials.)
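To make the iterator-based approach concrete, here is a minimal sketch of a fused maximum-displacement computation. It is not the code from optimized_max_displacement.cu; the functor name `abs_displacement`, the function name `max_displacement`, and the one-dimensional `float` positions are assumptions chosen for illustration. It pairs the two position arrays with a zip iterator and lets `thrust::transform_reduce` compute each displacement and reduce it in a single pass.

```cuda
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/transform_reduce.h>
#include <thrust/tuple.h>
#include <cmath>

// Hypothetical functor: absolute displacement of one particle between
// two snapshots, read from a (before, after) tuple.
struct abs_displacement {
    __host__ __device__
    float operator()(const thrust::tuple<float, float>& p) const {
        return fabsf(thrust::get<1>(p) - thrust::get<0>(p));
    }
};

// Hypothetical helper: largest |after[i] - before[i]| over all particles.
// Assumes both vectors have the same length; an empty input returns 0.
float max_displacement(const thrust::device_vector<float>& before,
                       const thrust::device_vector<float>& after) {
    // The zip iterator presents the two arrays as one sequence of pairs,
    // so no intermediate array of displacements is ever allocated.
    auto first = thrust::make_zip_iterator(
        thrust::make_tuple(before.begin(), after.begin()));
    auto last = thrust::make_zip_iterator(
        thrust::make_tuple(before.end(), after.end()));

    // Transform (displacement) and reduce (maximum) in one pass.
    return thrust::transform_reduce(first, last, abs_displacement{},
                                    0.0f, thrust::maximum<float>());
}
```

Because both input arrays are read sequentially by neighboring threads, this layout is also a reasonable starting point for the memory access pattern exercise.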

Key Learning Objectives: Performance Optimization

By working through these examples and exercises, you can gain practical experience with key CUDA optimization techniques, including:

  1. Fused Operations: Understand how transform iterators eliminate the need for temporary memory storage.
  2. Memory Coalescing: Learn to structure data and access patterns for optimal GPU memory bandwidth.
  3. Work Reduction: Practice minimizing redundant computations and memory transfers.
  4. Algorithm Selection: Gain insight into choosing and utilizing Thrust's highly-optimized parallel algorithms effectively.
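As a small illustration of the first and third objectives, the sketch below computes a sum of squares two ways. The names `square`, `sum_squares_naive`, and `sum_squares_fused` are hypothetical and do not come from the repository. The naive version writes every squared value to a temporary device vector and then reads it back for the reduction; the fused version wraps the input in a transform iterator so the squares are computed on the fly as `thrust::reduce` consumes them.

```cuda
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/iterator/transform_iterator.h>
#include <thrust/reduce.h>
#include <thrust/transform.h>

// Hypothetical functor used by both versions.
struct square {
    __host__ __device__
    float operator()(float x) const { return x * x; }
};

// Naive: two passes over memory plus a temporary array of v.size() floats.
float sum_squares_naive(const thrust::device_vector<float>& v) {
    thrust::device_vector<float> tmp(v.size());
    thrust::transform(v.begin(), v.end(), tmp.begin(), square{});
    return thrust::reduce(tmp.begin(), tmp.end(), 0.0f, thrust::plus<float>());
}

// Fused: the transform iterator applies square() when dereferenced,
// so the reduction reads the input once and allocates no temporary.
float sum_squares_fused(const thrust::device_vector<float>& v) {
    auto first = thrust::make_transform_iterator(v.begin(), square{});
    auto last = thrust::make_transform_iterator(v.end(), square{});
    return thrust::reduce(first, last, 0.0f, thrust::plus<float>());
}
```

Both functions should return the same value up to floating-point rounding; the difference is in memory traffic, which is what a benchmark with varying input sizes would expose.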

Requirements

To compile and run the examples and complete the exercises, you'll need:

  • CUDA Toolkit 11.0 or higher
  • A CUDA-capable NVIDIA GPU
  • A C++14-compatible host compiler (such as g++ or clang++) for nvcc to use
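If you are unsure whether your machine meets these requirements, a short check like the following may help. It is a sketch rather than part of the repository: `meets_requirements` and `environment_ok` are hypothetical names, and the 11.0 threshold simply mirrors the list above. It queries the CUDA runtime version and device count and prints them alongside the Thrust version macros from `<thrust/version.h>`.

```cuda
#include <cuda_runtime.h>
#include <thrust/version.h>
#include <cstdio>

// Hypothetical helper: true if the runtime is 11.0 or newer and at least
// one device is present. The runtime encodes its version as
// 1000 * major + 10 * minor, so 11.0 is 11000.
bool meets_requirements(int runtime_version, int device_count) {
    return runtime_version >= 11000 && device_count > 0;
}

// Hypothetical helper: queries the local environment, prints what it
// finds, and reports whether it satisfies the requirements above.
bool environment_ok() {
    int runtime_version = 0;
    int device_count = 0;
    if (cudaRuntimeGetVersion(&runtime_version) != cudaSuccess) return false;
    if (cudaGetDeviceCount(&device_count) != cudaSuccess) return false;

    std::printf("CUDA runtime %d.%d, Thrust %d.%d, %d device(s)\n",
                runtime_version / 1000, (runtime_version % 1000) / 10,
                THRUST_MAJOR_VERSION, THRUST_MINOR_VERSION, device_count);
    return meets_requirements(runtime_version, device_count);
}
```

Calling `environment_ok()` from a `main` function and compiling the file with nvcc gives a quick sanity check before building the examples.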

Building and Running the Exercises

To compile an example and prepare for testing:

# Navigate to the directory containing the source file
nvcc -std=c++14 -o <executable_name> <source_file.cu>
  • Replace <executable_name> with your desired output file name (e.g., thrust_zip_test).
  • Replace <source_file.cu> with the example file you want to compile (e.g., thrust_zip.cu).
  • The -std=c++14 flag tells nvcc to compile against the C++14 standard that the examples require.

After compiling, you can run the executable directly from your terminal:

./<executable_name>

Now you're ready to experiment and learn!
