Using open source AI models with tool-calling capabilities, it is possible to write an intelligent program that performs agentic functions. In this project, I am building a personalized experience driven by artificial intelligence, using LangChain and other agentic AI frameworks.
Customize and bring open source models into your own environment. Enable private use of AI while keeping your information secure.

- Streaming chat output to the web server over WebSocket
- Data persistence via threads
- Data streaming for real-time responses
- Backend AI server (LangChain, FastAPI)
- Web server (Next.js)
- Ollama and local models (running on the host OS)
- Docker (for the servers)
For type hinting in your editor or for running the notebooks on your local machine, create a virtual environment and install the requirements:
```shell
# from the project root folder
python -m venv env
source env/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
```

To start the servers:

```shell
docker compose up -d
# or: docker compose up --build (for the first run)
```
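For reference, a compose file for the two servers might look like the sketch below. The service names, ports, and build contexts are assumptions, not the project's actual configuration; note that Ollama runs on the host OS, so it is not a compose service here:

```yaml
# Hypothetical docker-compose.yml sketch
services:
  api:
    build: ./backend          # FastAPI + LangChain server (assumed path)
    ports:
      - "8000:8000"
  web:
    build: ./frontend         # Next.js web server (assumed path)
    ports:
      - "3000:3000"
    depends_on:
      - api
```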
The frontend is hooked up with Next.js so that we can get the best of React.
The API is developed with FastAPI; for example:
```
POST http://localhost:8000/chat
```

Request body:

```json
{
  "prompt": "Tell me a joke"
}
```

Response:

```json
{
  "response": "Here's one:\n\nWhat do you call a fake noodle?\n\nAn impasta.\n\n(I hope that made your day!) Do want to hear another? I have plenty of them! :) )"
}
```
With Llama, Mistral, and Qwen, tool calling is possible. See scripts/tool_calling.py for details: it defines a test tool that returns a static result, and the model is able to call it and use the result.
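The core of a tool-calling loop can be sketched in plain Python. The tool name and the call shape below are hypothetical, chosen only to mirror the idea of a test tool that returns a static result:

```python
# Sketch: dispatching a tool call of the shape many models emit,
# e.g. {"name": ..., "args": {...}}. All names here are illustrative.
import json

def get_current_weather(city: str) -> str:
    # Static test result, like the test tool described above
    return json.dumps({"city": city, "forecast": "sunny"})

TOOLS = {"get_current_weather": get_current_weather}

def dispatch_tool_call(call: dict) -> str:
    """Look up the requested tool and invoke it with the model's arguments."""
    return TOOLS[call["name"]](**call["args"])
```

The tool's string result is then appended to the conversation as a tool message, and the model produces its final answer from it.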