Stars
GraphVite: A General and High-performance Graph Embedding System
Local model support for Microsoft's graphrag using ollama (llama3, mistral, gemma2, phi3) - LLM & Embedding extraction
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
Automated question answering over a local knowledge base, built on LangChain and LLMs such as the ChatGLM-6B series
A powerful tool for creating fine-tuning datasets for LLM
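A visual web interface integrated with DataX: select a data source to generate a data synchronization task in one click. Supports data sources such as RDBMS, Hive, HBase, ClickHouse, and MongoDB, and batch creation of RDBMS synchronization tasks. Integrates an open-source scheduling system and supports distributed execution, incremental data synchronization, real-time viewing of run logs, monitoring of executor resources, killing running processes, encryption of data source information, and more.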
Code for PipeNet: Question Answering with Semantic Pruning over Knowledge Graphs
KAG is a logical form-guided reasoning and retrieval framework based on OpenSPG engine and LLMs. It is used to build logical reasoning and factual Q&A solutions for professional domain knowledge ba…
Fully open data curation for reasoning models
Fully open reproduction of DeepSeek-R1
WildEval / ZeroEval
Forked from allenai/WildBench. A simple unified framework for evaluating LLMs
Official implementation of Adaptive Feature Transfer (AFT)
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Netease Youdao's open-source embedding and reranker models for RAG products.
Knowledge Graph Large Language Model (KG-LLM)
Aligning pretrained language models with instruction data generated by themselves.
Tips for asking ChatGPT questions
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source data extraction tool that converts PDF files into Markdown and JSON formats.
A Comprehensive Toolkit for High-Quality PDF Content Extraction
This project is a collection of Jupyter notebooks dedicated to learning and exploring the LangChain framework.
🦜🔗 Build context-aware reasoning applications
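An online tool for batch-downloading WeChat Official Account articles. It supports downloading read-count and comment data, supports private deployment, and runs in the browser with no installation required.
"奇伢爬虫" is a crawler built on Spring Boot and WebMagic that scrapes articles from WeChat Official Accounts, news sites, CSDN, info, and other websites. Crawling rules and cleaning rules can be configured dynamically, and it can crawl articles from most websites.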
"A White-Box Guide to Building Large Models": a fully hand-built Tiny-Universe
A Wikipedia search engine built entirely in Java that works on Wikipedia XML dumps