🏛️ MainData.id

🔍 Your AI-powered companion to explore, query, and make sense of Indonesia’s public data — all in one tool.

✨ What is MainData.id?

MainData.id is a modern web-based platform that helps people search, understand, query, and analyze open government data from Indonesia, without having to download and clean files or write SQL by hand.

💡 Why?

Despite many government open data initiatives, the current user journey is still fragmented:

Search across portals to find the right dataset.
Download multiple CSV/Excel files.
Learn the structure and metadata manually.
Load into a spreadsheet or SQL engine.
Write custom queries and build charts.

MainData.id collapses all of that into one step: ask your question, get instant SQL and answers.

👥 Who is it for?

📊 Civil servants working with public dashboards and policy research.
🎓 Students & educators looking to teach or learn data literacy.
🔬 Researchers doing data-driven investigations.
🧑‍💻 Developers building tools using public data.

🧠 Key Features

🔍 Ask in Natural Language
“Berapa jumlah penduduk DKI Jakarta tiap tahun?” (“What is the population of DKI Jakarta each year?”) → auto-generated SQL.
📚 Smart Dataset Catalog
All datasets are indexed in a vector database so the app knows which data to use — and why.
🧠 RAG + LLM Engine
We use a Retrieval-Augmented Generation approach, with LiteLLM as the model gateway, so you can plug in OpenAI, Mistral, Anthropic, etc.
⚡ In-browser SQL Engine
DuckDB-WASM executes the generated SQL entirely in your browser, so query results stay on your computer. Only the question itself is sent to the backend. Blazing fast.
🪶 Lightweight by design
No sign-up, no chat history. Just ask, explore, and move forward.

🏗️ Architecture

🔹 Frontend: `Next.js`

User inputs a natural language question.
Sends the question to the backend.
Receives generated SQL and runs it using DuckDB-WASM.
Displays results and charts.
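
The "receive SQL, run it locally, display rows" part of this flow can be sketched in a few lines. The real frontend runs DuckDB-WASM inside Next.js; the sketch below swaps in Python's built-in `sqlite3` purely as a stand-in for an in-process SQL engine. The `population` table, its columns, and the sample figures are made up for illustration and do not come from any real dataset.

```python
import sqlite3
from typing import Dict, List, Tuple


def run_generated_sql(rows: List[Tuple[str, int, int]], sql: str) -> List[Dict]:
    """Load rows into an in-memory table and run a generated query locally."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema; the real catalog defines its own tables.
        conn.execute(
            "CREATE TABLE population (province TEXT, year INTEGER, total INTEGER)"
        )
        conn.executemany("INSERT INTO population VALUES (?, ?, ?)", rows)
        cursor = conn.execute(sql)
        columns = [col[0] for col in cursor.description]
        # Return rows as dictionaries so a UI layer could render them directly.
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        conn.close()


# Made-up sample rows, not real statistics.
SAMPLE_ROWS = [
    ("DKI Jakarta", 2020, 100),
    ("DKI Jakarta", 2021, 110),
    ("Jawa Barat", 2020, 200),
]

# The kind of SQL the backend might return for the example question
# about DKI Jakarta's population per year.
GENERATED_SQL = """
SELECT year, SUM(total) AS total
FROM population
WHERE province = 'DKI Jakarta'
GROUP BY year
ORDER BY year
"""

result = run_generated_sql(SAMPLE_ROWS, GENERATED_SQL)
```

Because the engine runs in the same process as the caller, the rows never travel over the network, which is the property the in-browser design relies on.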

🔹 Backend: `FastAPI` + `LiteLLM`

Receives user input.
Embeds the query and performs semantic search (the retrieval step of RAG) in Supabase.
Builds a context-aware prompt.
Calls LLM (via LiteLLM) to generate SQL.
Returns generated SQL back to frontend.
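
The prompt-building and generation steps above might look roughly like the following. This is a minimal sketch, not the project's actual code: `DatasetInfo`, `build_prompt`, `generate_sql`, and the prompt wording are all hypothetical names chosen for illustration. The model call is passed in as a plain callable so the example runs without network access; in the real backend that callable would presumably wrap a LiteLLM request.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DatasetInfo:
    """Hypothetical shape of one catalog entry returned by semantic search."""
    table: str
    description: str
    columns: List[str]


def build_prompt(question: str, datasets: List[DatasetInfo]) -> str:
    """Combine the retrieved dataset metadata and the question into one prompt."""
    lines = ["You translate questions into DuckDB SQL.", "Available tables:"]
    for ds in datasets:
        lines.append(f"- {ds.table}({', '.join(ds.columns)}): {ds.description}")
    lines.append(f"Question: {question}")
    lines.append("Answer with a single SQL query and nothing else.")
    return "\n".join(lines)


def generate_sql(
    question: str,
    datasets: List[DatasetInfo],
    complete: Callable[[str], str],
) -> str:
    """Build the prompt, call the model, and return the trimmed SQL text."""
    return complete(build_prompt(question, datasets)).strip()


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return "  SELECT year, total FROM population ORDER BY year;\n"


catalog_hits = [
    DatasetInfo(
        table="population",
        description="Population per province per year",
        columns=["province", "year", "total"],
    )
]
prompt = build_prompt("Population of DKI Jakarta per year?", catalog_hits)
sql = generate_sql("Population of DKI Jakarta per year?", catalog_hits, fake_llm)
```

Keeping the model behind a callable mirrors the provider-agnostic idea: swapping providers changes only the function passed in, not the pipeline.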

🔹 Supabase

Hosts the vectorized dataset catalog.
Enables RAG for dataset understanding.
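
To make "vectorized catalog" concrete, here is a brute-force sketch of the lookup such a catalog supports: rank dataset embeddings by cosine similarity to the question's embedding. The three-dimensional vectors and dataset names are invented for illustration. Real embeddings have hundreds of dimensions, and the hosted database would typically do this ranking with a vector index rather than a Python loop; how this project configures that is not stated here.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: List[float], catalog: Dict[str, List[float]], k: int = 2
) -> List[Tuple[str, float]]:
    """Return the k catalog entries most similar to the query, best first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings; names and values are hypothetical.
CATALOG = {
    "population_by_province": [0.9, 0.1, 0.0],
    "school_enrollment": [0.1, 0.9, 0.1],
    "rainfall_by_station": [0.0, 0.1, 0.9],
}

# Pretend this is the embedding of a question about population.
query = [0.8, 0.2, 0.0]
matches = top_k(query, CATALOG, k=2)
```

The top-ranked names are what the backend would feed into its prompt as context.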

📁 Project Structure

opendata-lab/
├── frontend/ # Next.js UI, DuckDB, user interaction
├── backend/ # FastAPI server, RAG engine, LiteLLM
└── docs/ # project documentation

🧪 Local Development

🚀 Run Frontend

cd frontend
pnpm install
pnpm dev

⚙️ Run Backend

cd backend
uv sync

# copy .env.example to .env, then edit the values
cp .env.example .env

# run db migrations
uv run alembic upgrade head

# seed the dataset catalog with dummy data
uv run scripts/seed_example_data.py

# fetch the dataset catalog
uv run scripts/fetch_datasets.py

# for dev
uv run fastapi dev --host 0.0.0.0
# alternative if above doesn't work
uv run uvicorn main:app --host 0.0.0.0 --port 8000 --reload

# for prod
uv run uvicorn main:app --host 0.0.0.0 --port 8000

# run tests
uv run pytest

# run tests in watch mode (requires the pytest-watcher package)
uv run ptw .

📬 Roadmap

MVP with NL → SQL + DuckDB
RAG-enhanced query understanding
Stream-based responses to the frontend
Basic user registration and login, plus history management
Chart suggestion + auto-viz
One-click CSV upload for user datasets
Dataset versioning and provenance info

🤝 Contributing

We currently don't accept pull requests, but we are very open to ideas and suggestions! We would also really appreciate it if you spread the word about this project.

📃 License

GNU Affero General Public License, version 3 (AGPLv3)

💬 Made for Indonesia 🇮🇩

We believe public data should be truly accessible — not just downloadable. Let’s make data a common good, not a technical barrier.
