Santa Clara, California, United States (UTC -07:00)
https://www.linkedin.com/in/jaemincs/
NeMo Public
Forked from NVIDIA/NeMo
NeMo: a toolkit for conversational AI
Python Apache License 2.0 Updated May 2, 2025
TransformerEngine Public
Forked from NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…
Python Apache License 2.0 Updated Mar 21, 2025
Megatron-LM Public
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
Python Other Updated Oct 22, 2024
apex Public
Forked from NVIDIA/apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
Python BSD 3-Clause "New" or "Revised" License Updated Sep 18, 2024
triton Public
Forked from triton-lang/triton
Development repository for the Triton language and compiler
training Public
Forked from mlcommons/training
Reference implementations of MLPerf™ training benchmarks
Python Apache License 2.0 Updated Oct 16, 2023
multi-gpu-programming-models Public
Forked from NVIDIA/multi-gpu-programming-models
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
Cuda BSD 3-Clause "New" or "Revised" License Updated Jun 1, 2022
pytorch Public
Forked from pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
C++ Other Updated Aug 10, 2021
dlrm Public
Forked from facebookresearch/dlrm
An implementation of a deep learning recommendation model (DLRM)
Python MIT License Updated Jul 8, 2021
ompi Public
Forked from open-mpi/ompi
Open MPI main development repository
C Other Updated May 18, 2021
charm Public
Forked from charmplusplus/charm
The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information.
kokkos-tutorials Public
Forked from kokkos/kokkos-tutorials
Tutorials for the Kokkos C++ Performance Portability Programming EcoSystem
C++ Updated Aug 23, 2020
gpuroofperf-toolkit Public
Forked from ekondis/gpuroofperf-toolkit
A GPU performance prediction toolkit for CUDA programs
Cuda MIT License Updated Jan 28, 2020
miniFE Public
Forked from Mantevo/miniFE
MiniFE Finite Element Mini-Application
C++ GNU Lesser General Public License v3.0 Updated Jan 24, 2020
codes Public
Forked from codes-org/codes
The Co-Design of Exascale Storage Architectures (CODES) simulation framework builds upon the ROSS parallel discrete event simulation engine to provide high-performance simulation utilities and mode…
C Other Updated Jan 24, 2020
gpu Public
Contains pieces of GPU-related research that are too small to warrant a separate repository.
C Updated Dec 3, 2019
sst-dumpi Public
Forked from sstsimulator/sst-dumpi
SST DUMPI Trace Library
C Other Updated Oct 13, 2019
baseenv Public
A fork of Bill Gropp's baseenv (http://wgropp.cs.illinois.edu/projects/software/baseenv.htm)
dumpi-cortex Public
A fork of https://xgitlab.cels.anl.gov/mdorier/dumpi-cortex
C++ Other Updated Sep 25, 2019
TraceR Public
Forked from hpcgroup/TraceR
Trace Replay and Network Simulation Framework
C MIT License Updated Sep 25, 2019
changa Public
Forked from N-BodyShop/changa
Mirror of UIUC/PPL version of ChaNGa
C++ GNU General Public License v2.0 Updated Sep 24, 2019
sw4lite Public
Forked from geodynamics/sw4lite
Testing numerical kernels in SW4
C Other Updated Aug 16, 2019