GitHub - jatali/KuiperLLama_yes: A hands-on implementation of an LLM inference framework


cmake
demo
imgs
kuiper
test
.clang-format
CMakeLists.txt
readme.md


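A Self-Built LLM Inference Framework

🙋🙋🙋 The self-built LLM inference framework course is in full swing. Add the WeChat contact below for details.

Build, from scratch, an LLM framework that supports LLama inference and CUDA acceleration.

Course outline

Third-party dependencies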

google glog https://github.com/google/glog
google gtest https://github.com/google/googletest
sentencepiece https://github.com/google/sentencepiece
armadillo + openblas https://arma.sourceforge.net/download.html

openblas serves as the backend math library for armadillo and accelerates operations such as matrix multiplication. Intel MKL can be used instead.
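To make concrete what the backend accelerates, here is a minimal sketch of a naive row-major matrix multiply in plain C++. The helper `naive_matmul` is hypothetical and is not taken from this repository. It only illustrates the operation that a BLAS library such as openblas performs far faster with blocked, vectorized kernels when armadillo dispatches to it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; not part of this repository.
// Computes C (m x n) = A (m x k) * B (k x n), all stored row-major.
std::vector<float> naive_matmul(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t m, std::size_t k, std::size_t n) {
  std::vector<float> c(m * n, 0.0f);
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t p = 0; p < k; ++p) {
      // Hoist A(i, p) so the inner loop walks B and C contiguously.
      const float a_ip = a[i * k + p];
      for (std::size_t j = 0; j < n; ++j) {
        c[i * n + j] += a_ip * b[p * n + j];
      }
    }
  }
  return c;
}
```

For example, multiplying the 2x2 matrices `{1, 2, 3, 4}` and `{5, 6, 7, 8}` yields `{19, 22, 43, 50}`. The triple loop costs O(m·k·n) scalar operations, which is why an optimized backend matters for the large weight matrices in LLM inference.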

Model download links

- llama2 https://pan.baidu.com/s/1PF5KqvIvNFR8yDIY1HmTYA?pwd=ma8r or https://huggingface.co/fushenshen/lession_model/tree/main
- tinyllama model https://huggingface.co/karpathy/tinyllamas/tree/main
- tinyllama tokenizer https://huggingface.co/yahma/llama-7b-hf/blob/main/tokenizer.model

Build instructions

  # Assumes the third-party dependencies above are already installed
  mkdir build
  cd build
  cmake ..
  make -j16

Generating text

./llama_infer llama2_7b.bin tokenizer.model
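The command above passes two positional arguments: the model checkpoint first, then the tokenizer model. As a rough sketch of how an entry point might validate that order, the following uses a hypothetical `InferArgs` struct and a hypothetical `parse_infer_args` function. Neither name comes from this repository, and the real demo program may handle its arguments differently.

```cpp
#include <string>

// Hypothetical container for the two positional arguments (assumption).
struct InferArgs {
  std::string checkpoint_path;  // e.g. llama2_7b.bin
  std::string tokenizer_path;   // e.g. tokenizer.model
};

// Hypothetical parser: returns true and fills `out` only when exactly two
// positional arguments follow the program name; otherwise returns false.
bool parse_infer_args(int argc, const char* const argv[], InferArgs* out) {
  if (argc != 3 || out == nullptr) {
    return false;
  }
  out->checkpoint_path = argv[1];
  out->tokenizer_path = argv[2];
  return true;
}
```

A `main` function could call `parse_infer_args(argc, argv, &args)` and print a usage line such as `usage: ./llama_infer <checkpoint> <tokenizer>` when it returns false.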
