Orion-14B

🌐English | 🇨🇳中文 | 🇯🇵日本語 | 🇰🇷한국어

🤗 HuggingFace Homepage | 🤖 ModelScope Homepage | 🧰 OpenXLab Homepage
🎬 HuggingFace Demo | 🎫 ModelScope Demo
😺 GitHub
📖 Tech Report

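1. Model Introduction

The Orion-14B series models are open-source multilingual large language models trained from scratch by OrionStarAI. The base model is trained on a 2.5T-token multilingual corpus that includes Chinese, English, Japanese, and Korean, and it performs strongly in these languages. For details, please refer to the tech report.
The Orion-14B series models have the following features:
- Among models at the 20B-parameter scale, Orion-14B-Base stands out in comprehensive evaluations.
- Strong multilingual capabilities, with notably strong results on the Japanese and Korean test sets.
- The fine-tuned models demonstrate strong adaptability and excel in human-annotated blind tests.
- The long-chat version supports extremely long texts, performing exceptionally well at a length of 200k tokens and supporting up to 320k.
- The quantized versions reduce model size by 70% and improve inference speed by 30%, with a performance loss of less than 1%.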

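The Orion-14B series includes the following models:
- Orion-14B-Base: A multilingual large language foundation model with 14 billion parameters, pretrained on a diverse dataset of 2.5 trillion tokens.
- Orion-14B-Chat: A chat model fine-tuned on a high-quality corpus, aimed at providing an excellent interactive experience for users in the large model community.
- Orion-14B-LongChat: A long-context version that excels at handling extremely long texts, performing well at a length of 200k tokens and supporting up to 320k.
- Orion-14B-Chat-RAG: A chat model fine-tuned on a custom retrieval-augmented generation dataset, achieving superior performance on retrieval-augmented generation tasks.
- Orion-14B-Chat-Plugin: A chat model tailored for plugin and function-calling tasks, well suited to agent scenarios in which the LLM acts as the plugin and function-calling system.
- Orion-14B-Base-Int4: A quantized base model using 4-bit integer weights. It reduces model size by 70% and improves inference speed by 30%, with a minimal performance loss of only 1%.
- Orion-14B-Chat-Int4: A quantized chat model using 4-bit integer weights.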

2. Model Download

The table below lists the released models and their download links:

Model Name	HuggingFace Download Link	ModelScope Download Link	OpenXLab Download Link
⚾Orion-14B-Base	Orion-14B-Base	Orion-14B-Base	Orion-14B-Base
😛Orion-14B-Chat	Orion-14B-Chat	Orion-14B-Chat	Orion-14B-Chat
📃Orion-14B-LongChat	Orion-14B-LongChat	Orion-14B-LongChat	Orion-14B-LongChat
🔎Orion-14B-Chat-RAG	Orion-14B-Chat-RAG	Orion-14B-Chat-RAG	Orion-14B-Chat-RAG
🔌Orion-14B-Chat-Plugin	Orion-14B-Chat-Plugin	Orion-14B-Chat-Plugin	Orion-14B-Chat-Plugin
💼Orion-14B-Base-Int4	Orion-14B-Base-Int4	Orion-14B-Base-Int4	Orion-14B-Base-Int4
📦Orion-14B-Chat-Int4	Orion-14B-Chat-Int4	Orion-14B-Chat-Int4	Orion-14B-Chat-Int4

3. Model Benchmarks

3.1. Base Model Orion-14B-Base Benchmarks

3.1.1. LLM evaluation results on examination and professional knowledge

Model	C-Eval	CMMLU	MMLU	AGIEval	Gaokao	BBH
LLaMA2-13B	41.4	38.4	55.0	30.9	18.2	45.6
Skywork-13B	59.1	61.4	62.7	43.6	56.1	48.3
Baichuan2-13B	59.0	61.3	59.5	37.4	45.6	49.0
QWEN-14B	71.7	70.2	67.9	51.9	62.5	53.7
InternLM-20B	58.8	59.0	62.1	44.6	45.5	52.5
Orion-14B-Base	72.9	70.6	69.9	54.7	62.1	56.5

3.1.2. LLM evaluation results on language understanding and common knowledge

Model	RACE-middle	RACE-high	HellaSwag	PIQA	Lambada	WSC
LLaMA2-13B	63.0	58.9	77.5	79.8	76.5	66.3
Skywork-13B	87.6	84.1	73.7	78.3	71.8	66.3
Baichuan2-13B	68.9	67.2	70.8	78.1	74.1	66.3
QWEN-14B	93.0	90.3	80.2	79.8	71.4	66.3
InternLM-20B	86.4	83.3	78.1	80.3	71.8	68.3
Orion-14B-Base	93.2	91.3	78.5	79.5	78.8	70.2

3.1.3. LLM evaluation results on the OpenCompass test sets

Model	Average	Examination	Language	Knowledge	Understanding	Reasoning
LLaMA2-13B	47.3	45.2	47.0	58.3	50.9	43.6
Skywork-13B	53.6	61.1	51.3	52.7	64.5	45.2
Baichuan2-13B	49.4	51.8	47.5	48.9	58.1	44.2
QWEN-14B	62.4	71.3	52.67	56.1	68.8	60.1
InternLM-20B	59.4	62.5	55.0	60.1	67.3	54.9
Orion-14B-Base	64.3	71.4	55.0	60.0	71.9	61.6

3.1.4. Comparison of LLM performance on Japanese test sets

Model	Average	JCQA	JNLI	MARC	JSQD	JQK	XLS	XWN	MGSM
PLaMo-13B	52.3	56.7	42.8	95.8	70.6	71.0	8.70	70.5	2.40
WebLab-10B	50.7	66.6	53.7	82.1	62.9	56.2	10.0	72.0	2.40
ELYZA-jp-7B	48.8	71.7	25.3	86.6	70.8	64.1	2.50	62.1	7.20
StableLM-jp-7B	51.1	33.4	43.3	96.7	70.6	78.1	10.7	72.8	2.80
LLaMA2-13B	46.3	75.0	47.6	38.8	76.1	67.7	18.1	63.2	10.4
Baichuan2-13B	57.1	73.7	31.3	91.6	80.5	63.3	18.6	72.2	25.2
QWEN-14B	65.8	85.9	60.7	97.0	83.3	71.8	18.8	70.6	38.0
Yi-34B	67.1	83.8	61.2	95.2	86.1	78.5	27.2	69.2	35.2
Orion-14B-Base	69.1	88.2	75.8	94.1	75.7	85.1	17.3	78.8	38.0

3.1.5. Comparison of LLM performance on Korean test sets. n = 0 and n = 5 denote the number of shots (n-shot prompts) used in the evaluation

Model	Average (n=0)	Average (n=5)	HellaSwag (n=0)	HellaSwag (n=5)	COPA (n=0)	COPA (n=5)	BoolQ (n=0)	BoolQ (n=5)	SentiNeg (n=0)	SentiNeg (n=5)
KoGPT	53.0	70.1	55.9	58.3	73.5	72.9	45.1	59.8	37.5	89.4
Polyglot-ko-13B	69.6	73.7	59.5	63.1	79.4	81.1	48.2	60.4	91.2	90.2
LLaMA2-13B	46.7	63.7	41.3	44.0	59.3	63.8	34.9	73.8	51.5	73.4
Baichuan2-13B	52.1	58.7	39.2	39.6	60.6	60.6	58.4	61.5	50.3	72.9
QWEN-14B	53.8	73.7	45.3	46.8	64.9	68.9	33.4	83.5	71.5	95.7
Yi-34B	54.2	72.1	44.6	44.7	58.0	60.6	65.9	90.2	48.3	92.9
Orion-14B-Chat	74.5	79.6	47.0	49.6	77.7	79.4	81.6	90.7	92.4	98.7
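
To make the n-shot setting concrete, the sketch below assembles a prompt from n solved examples followed by the unanswered test question. The `build_n_shot_prompt` helper, the `Question:`/`Answer:` template, and the sample items are illustrative assumptions; the source does not specify the prompt format used in these evaluations.

```python
from typing import List, Tuple


def build_n_shot_prompt(examples: List[Tuple[str, str]], question: str, n: int) -> str:
    """Prepend the first n solved examples to the test question.

    n=0 yields a zero-shot prompt (question only); n=5 would yield a 5-shot prompt.
    The "Question:/Answer:" template is a hypothetical format for illustration.
    """
    if n < 0 or n > len(examples):
        raise ValueError("n must be between 0 and the number of available examples")
    shots = [f"Question: {q}\nAnswer: {a}" for q, a in examples[:n]]
    # The test question is left unanswered so that the model completes it.
    shots.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(shots)


# Hypothetical yes/no items, used only to show the prompt layout.
demo_examples = [
    ("Is Seoul the capital of South Korea?", "yes"),
    ("Is the Pacific Ocean smaller than the Atlantic Ocean?", "no"),
]

zero_shot = build_n_shot_prompt(demo_examples, "Is Busan a city in South Korea?", n=0)
two_shot = build_n_shot_prompt(demo_examples, "Is Busan a city in South Korea?", n=2)
print(two_shot)
```

Under this layout, the n=0 and n=5 columns differ only in how many solved examples precede the test question.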

3.1.6. Multilingual evaluation

Model	Train Lang	Japanese	Korean	Chinese	English
PLaMo-13B	En,Jp	52.3	*	*	*
WebLab-10B	En,Jp	50.7	*	*	*
ELYZA-jp-7B	En,Jp	48.8	*	*	*
StableLM-jp-7B	En,Jp	51.1	*	*	*
KoGPT-6B	En,Ko	*	70.1	*	*
Polyglot-ko-13B	En,Ko	*	70.7	*	*
Baichuan2-13B	Multi	57.1	58.7	50.8	57.1
Qwen-14B	Multi	65.8	73.7	64.5	65.4
Llama2-13B	Multi	46.3	63.7	41.4	55.3
Yi-34B	Multi	67.1	72.2	58.7	68.8
Orion-14B-Chat	Multi	69.1	79.5	67.9	67.3

3.2. Chat Model Orion-14B-Chat Benchmarks

3.2.1. Chat model subjective evaluation on MTBench

Model	First-Turn	Second-Turn	Average
Baichuan2-13B-Chat	7.05	6.47	6.76
Qwen-14B-Chat	7.30	6.62	6.96
Llama2-13B-Chat	7.10	6.20	6.65
InternLM-20B-Chat	7.03	5.93	6.48
Orion-14B-Chat	7.68	7.07	7.37
* Inference performed with vLLM

3.2.2. Chat model subjective evaluation on AlignBench

Model	Math.	Logi.	Basic.	Chi.	Comp.	Writ.	Role.	Prof.	Avg.
Baichuan2-13B-Chat	3.76	4.07	6.22	6.05	7.11	6.97	6.75	6.43	5.25
Qwen-14B-Chat	4.91	4.71	6.90	6.36	6.74	6.64	6.59	6.56	5.72
Llama2-13B-Chat	3.05	3.79	5.43	4.40	6.76	6.63	6.99	5.65	4.70
InternLM-20B-Chat	3.39	3.92	5.96	5.50	7.18	6.19	6.49	6.22	4.96
Orion-14B-Chat	4.00	4.24	6.18	6.57	7.16	7.36	7.16	6.99	5.51
* Inference performed with vLLM

3.3. LongChat Model Orion-14B-LongChat Benchmarks

3.3.1. LongChat evaluation on LongBench

Model	NarrativeQA	MultiFieldQA-en	MultiFieldQA-zh	DuReader	QMSum	VCSUM	TREC	TriviaQA	LSHT	RepoBench-P
GPT-3.5-Turbo-16k	23.60	52.30	61.20	28.70	23.40	16.00	68.00	91.40	29.20	53.60
LongChat-v1.5-7B-32k	16.90	41.40	29.10	19.50	22.70	9.90	63.50	82.30	23.20	55.30
Vicuna-v1.5-7B-16k	19.40	38.50	43.00	19.30	22.80	15.10	71.50	86.20	28.80	43.50
Yi-6B-200K	14.11	36.74	22.68	14.01	20.44	8.08	72.00	86.61	38.00	63.29
Orion-14B-LongChat	19.47	48.11	55.84	37.02	24.87	15.44	77.00	89.12	45.50	54.31

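3.4. Chat RAG Model Benchmarks

3.4.1. LLM evaluation results on a self-built RAG test set

Model	Effectiveness of Response (Keyword)	*Effectiveness of Response (Subjective Evaluation)	Quoting Ability	Fallback Ability	*AutoQA	*Data Extraction
Baichuan2-13B-Chat	85	76	1	0	69	51
Qwen-14B-Chat	79	77	75	47	68	72
Qwen-72B-Chat(Int4)	87	89	90	32	67	76
GPT-4	91	94	96	95	75	86
Orion-14B-Chat-RAG	86	87	91	97	73	71
* indicates manual assessment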

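3.5. Chat Plugin Model Orion-14B-Chat-Plugin Benchmarks

3.5.1. LLM evaluation results on a self-built plugin test set

Model	Intent Recognition with Full Params	Intent Recognition with Missing Params	Non-Plugin Invocation Recognition
Baichuan2-13B-Chat	25	0	0
Qwen-14B-Chat	55	0	50
GPT-4	95	52.38	70
Orion-14B-Chat-Plugin	92.5	60.32	90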

3.6. Quantized Model Orion-14B-Base-Int4 Benchmarks

3.6.1. Comparison before and after quantization

Model	Size (GB)	Inference Speed (tokens/s)	C-Eval	CMMLU	MMLU	RACE	HellaSwag
OrionStar-14B-Base	28.0	135	72.8	70.6	70.0	93.3	78.5
OrionStar-14B-Base-Int4	8.3	178	71.8	69.8	69.2	93.1	78.0
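
The headline quantization figures can be checked directly against the table above. The short calculation below derives the size reduction, the speed-up, and the per-benchmark score drops from the table values, and adds a rough size estimate from the parameter count. The bytes-per-weight figures (2 for bfloat16, 0.5 for 4-bit integers) are standard, but treating the model as exactly 14 billion parameters and ignoring quantization metadata are simplifying assumptions.

```python
# Values copied from the table above.
size_full_gb, size_int4_gb = 28.0, 8.3
speed_full, speed_int4 = 135, 178  # tokens/s
full = {"C-Eval": 72.8, "CMMLU": 70.6, "MMLU": 70.0, "RACE": 93.3, "HellaSwag": 78.5}
int4 = {"C-Eval": 71.8, "CMMLU": 69.8, "MMLU": 69.2, "RACE": 93.1, "HellaSwag": 78.0}

size_reduction = 1 - size_int4_gb / size_full_gb
speed_gain = speed_int4 / speed_full - 1
# Absolute score drop per benchmark, in points.
drops = {name: round(full[name] - int4[name], 1) for name in full}

# Rough estimate: parameters x bytes per weight (assumes exactly 14e9 parameters).
params = 14e9
est_bf16_gb = params * 2 / 1e9    # 2 bytes per bfloat16 weight
est_int4_gb = params * 0.5 / 1e9  # 0.5 bytes per 4-bit weight, no scales or zero-points

print(f"size reduction: {size_reduction:.1%}")  # prints "size reduction: 70.4%"
print(f"speed gain: {speed_gain:.1%}")          # prints "speed gain: 31.9%"
print(f"largest score drop: {max(drops.values())} points")
print(f"estimated size: bf16 {est_bf16_gb:.1f} GB, int4 {est_int4_gb:.1f} GB")
```

The bfloat16 estimate matches the reported 28.0 GB. The reported Int4 size of 8.3 GB is above the 7.0 GB estimate, which would be consistent with a quantized checkpoint storing additional data alongside the 4-bit weights, although the source does not break this down.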

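4. Model Inference

The model weights, source code, and configuration needed for inference are published on Hugging Face; the download links are in the table in section 2. The inference methods demonstrated below automatically download the required resources from Hugging Face.

4.1. Python Code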

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation.utils import GenerationConfig

# trust_remote_code=True allows the custom model code in the repository to run.
tokenizer = AutoTokenizer.from_pretrained("OrionStarAI/Orion-14B", use_fast=False, trust_remote_code=True)
# device_map="auto" spreads the weights across all visible GPUs.
model = AutoModelForCausalLM.from_pretrained("OrionStarAI/Orion-14B", device_map="auto",
                                             torch_dtype=torch.bfloat16, trust_remote_code=True)

# Use the generation settings published with the model.
model.generation_config = GenerationConfig.from_pretrained("OrionStarAI/Orion-14B")
# Single-turn chat request in role/content message format.
messages = [{"role": "user", "content": "Hello, what is your name? "}]
response = model.chat(tokenizer, messages, streaming=False)
print(response)

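In the Python code above, the model is loaded with device_map='auto' to use all available GPUs. To restrict it to specific devices, set the environment variable before launching, for example export CUDA_VISIBLE_DEVICES=0,1 (use GPUs 0 and 1).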
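
The same device restriction can be applied from inside a Python script, as long as the variable is set before CUDA is initialized. A minimal sketch, assuming a machine on which GPUs 0 and 1 exist; the `select_gpus` helper name is hypothetical.

```python
import os


def select_gpus(device_ids):
    """Restrict this process to the given GPU indices.

    Call this before CUDA is initialized (the safest place is before
    importing torch or loading the model) so that the setting takes effect.
    """
    value = ",".join(str(i) for i in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Equivalent to `export CUDA_VISIBLE_DEVICES=0,1` in the shell.
select_gpus([0, 1])
print(os.environ["CUDA_VISIBLE_DEVICES"])  # prints "0,1"
```

With this in place, a later load with `device_map="auto"` only sees the selected GPUs.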

4.2. Command Line Tool

CUDA_VISIBLE_DEVICES=0 python cli_demo.py

This command-line tool is designed for chat scenarios, so it does not support calling the base model.

4.3. Direct Script Inference

# base model
CUDA_VISIBLE_DEVICES=0 python demo/text_generation_base.py --model OrionStarAI/Orion-14B --tokenizer OrionStarAI/Orion-14B --prompt hello

# chat model
CUDA_VISIBLE_DEVICES=0 python demo/text_generation.py --model OrionStarAI/Orion-14B-Chat --tokenizer OrionStarAI/Orion-14B-Chat --prompt hi


4.4. Inference by vLLM (>=v0.3.3)

vLLM project URL and quickstart:
https://github.com/vllm-project/vllm
https://docs.vllm.ai/en/latest/getting_started/quickstart.html
Start the vLLM server:

python -m vllm.entrypoints.openai.api_server --model OrionStarAI/Orion-14B-Chat
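
Once the server is running, it can be queried over vLLM's OpenAI-compatible HTTP API. The sketch below uses only the standard library and assumes the server is listening on vLLM's default address `http://localhost:8000`; the helper names `build_chat_request` and `chat` are hypothetical, and the `max_tokens` value is a placeholder.

```python
import json
import urllib.request

# Assumption: default host and port of the vLLM OpenAI-compatible server.
BASE_URL = "http://localhost:8000"


def build_chat_request(prompt, model="OrionStarAI/Orion-14B-Chat", max_tokens=256):
    """Build the JSON body for the /v1/chat/completions endpoint."""
    return {
        "model": model,  # must match the --model value passed to the server
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def chat(prompt, base_url=BASE_URL):
    """POST a single-turn chat request and return the assistant's reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]


# Show the request body; sending it requires the server started above.
print(json.dumps(build_chat_request("Hello, what is your name?"), indent=2))
# reply = chat("Hello, what is your name?")
```

Keeping the request builder separate from the network call makes the payload easy to inspect before a server is available.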

4.5. Inference by llama.cpp

Project URL
https://github.com/ggerganov/llama.cpp
Orion-14B-Chat.gguf
https://huggingface.co/OrionStarAI/Orion-14B-Chat/blob/main/Orion-14B-Chat.gguf

How to convert to a GGUF model

python convert-hf-to-gguf.py path/to/Orion-14B-Chat --outfile chat.gguf

How to run generation

./main --frequency-penalty 0.5 --top-k 5 --top-p 0.9 -m chat.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 400 -e
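
When generation is scripted rather than typed by hand, the same invocation can be assembled in Python. A sketch, assuming the `./main` binary and the `chat.gguf` file from the commands above are in the working directory; `build_llama_cpp_command` is a hypothetical helper, and it uses only flags that already appear in the command above.

```python
import shlex
import subprocess


def build_llama_cpp_command(prompt, model_path="chat.gguf", n_predict=400,
                            frequency_penalty=0.5, top_k=5, top_p=0.9):
    """Return the argument list for llama.cpp's ./main binary."""
    return [
        "./main",
        "--frequency-penalty", str(frequency_penalty),
        "--top-k", str(top_k),
        "--top-p", str(top_p),
        "-m", model_path,
        "-p", prompt,
        "-n", str(n_predict),  # number of tokens to generate
        "-e",                  # process escape sequences such as \n in the prompt
    ]


cmd = build_llama_cpp_command(
    "Building a website can be done in 10 simple steps:\\nStep 1:"
)
print(shlex.join(cmd))
# To actually run it (requires the binary and the model file):
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues with prompts that contain spaces or quotes.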

4.6. Example Output

4.6.1. Casual Chat

User: Hello
Orion-14B: Hello! How can I help you today? If you have any questions or need assistance, feel free to ask.
User: What functions do you have?
Orion-14B: As an AI language model, I have a wide range of functions and capabilities, including but not limited to:
  Answering general knowledge questions: I can provide information on various topics such as history, science, geography, and more.
  Providing definitions and explanations: I can explain complex concepts and provide definitions for words and phrases.
  Assisting with tasks: I can help with simple tasks like setting reminders, converting units of measurement, and even ordering food from restaurants.
  Generating text: I can write short stories, essays, articles, and other types of content based on prompts given to me.
  Translating languages: I can translate text from one language to another.
  Chatting: I can engage in conversation with users and respond to their queries in a conversational manner.
  Summarizing long texts: I can summarize longer texts like articles, books, and research papers.
  Generating random numbers and names: I can generate random numbers within a specified range and also suggest names based on certain criteria.
  Playing games: I can play simple games like Hangman, Trivia, and more.
  Providing recommendations: I can recommend movies, music, books, and other forms of entertainment based on user preferences.
  These are just some of the many functions that I am capable of performing. If you have a specific task in mind, feel free to ask me about it.
User: Tell me a joke.
Orion-14B: Sure, here's a classic one-liner: Why don't scientists trust atoms? Because they make up everything.
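
A multi-turn exchange like the one above is typically driven by resending the accumulated message history on each turn. The sketch below keeps that history in the same role/content message format used in section 4.1. The `ChatSession` class is hypothetical, the `"assistant"` role name for model replies is an assumption, and the reply function is injected so that the example runs without loading the model.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class ChatSession:
    """Accumulates a conversation as a list of role/content messages."""

    def __init__(self, reply_fn: Callable[[List[Message]], str]):
        # reply_fn receives the full history and returns the next reply.
        self.reply_fn = reply_fn
        self.messages: List[Message] = []

    def send(self, user_text: str) -> str:
        self.messages.append({"role": "user", "content": user_text})
        reply = self.reply_fn(self.messages)
        # Assumption: model turns are stored under the "assistant" role.
        self.messages.append({"role": "assistant", "content": reply})
        return reply


# Stand-in reply function so that the sketch is runnable without a GPU.
def echo_reply(history: List[Message]) -> str:
    return f"(turn {len(history) // 2 + 1}) you said: {history[-1]['content']}"


session = ChatSession(echo_reply)
print(session.send("Hello"))
print(session.send("Tell me a joke."))
```

With the model from section 4.1 loaded, the stand-in could be replaced by `lambda history: model.chat(tokenizer, history, streaming=False)`, assuming `model.chat` accepts the full history in this format.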

4.6.2. Japanese & Korean Chat

User：自己を紹介してください
Orion-14B：こんにちは、私の名前はChatMaxで、OrionStarによって開発されたAIアシスタントです。私はユーザーに正確で、客観的で、有用な回答を提供し、彼らが問題を解決し、情報を取得するのを助けることができます。
User：자기소개를 해주세요.
Orion-14B：안녕하세요, 제 이름은 ChatMax입니다. OrionStar에서 개발한 AI 어시스턴트입니다. 저는 사용자들에게 정확하고 객관적이며 유용한 답변을 제공하여 문제를 해결하고 정보를 얻는 데 도움을 줄 수 있습니다.

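5. Declarations, License

5.1. Declarations

We strongly urge all users not to use the Orion-14B model for any activity that may harm national or social security or that violates the law. Additionally, we request that users not use the Orion-14B model to provide internet services without proper security review and filing. We hope all users abide by this principle to ensure that technological development takes place in a regulated and legal environment. We have done our best to ensure the compliance of the data used in the model training process. However, despite our significant efforts, unforeseen issues may still arise due to the complexity of the model and data. Therefore, if any problems arise from the use of the Orion-14B open-source model, including but not limited to data security issues, public opinion risks, or any risks and issues arising from the model being misled, abused, disseminated, or improperly used, we will not assume any responsibility.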

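5.2. License

Community use of the Orion-14B series models

For code, please comply with the Apache License Version 2.0
For models, please comply with the 【Orion-14B Series】 Models Community License Agreement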

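6. Company Introduction

OrionStar is a leading global service robot solutions company, founded in September 2016. OrionStar is dedicated to using artificial intelligence technology to create the next generation of revolutionary robots, freeing people from repetitive physical labor and making human work and life more intelligent and enjoyable. Through technology, OrionStar aims to make society and the world a better place.

OrionStar possesses fully self-developed end-to-end artificial intelligence technologies, such as voice interaction and visual navigation, and combines product development capabilities with technology application capabilities. Based on the Orion robotic arm platform, it has launched products such as OrionStar AI Robot Greeting, AI Robot Greeting Mini, Lucki, and Coffee Master, and has established OrionOS, the open platform for Orion robots. Following the philosophy of "Born for Truly Useful Robots", OrionStar empowers more people through AI technology.

The core strength of OrionStar lies in its end-to-end AI application capabilities, including big data preprocessing, large model pretraining, fine-tuning, prompt engineering, and agents. Its comprehensive end-to-end model training capabilities, including a systematic data processing workflow and parallel model training across hundreds of GPUs, have been successfully applied in industry scenarios such as government affairs, cloud services, international e-commerce, and fast-moving consumer goods.

Companies that need to deploy large-scale model applications are welcome to contact us.
Enquiry Hotline: 400-898-7779
E-mail: ai@orionstar.com
Discord Link: https://discord.gg/zumjDWgdAs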
