I’m a Software Engineer at PayPal, crafting intelligent, scalable, and privacy-first AI systems — from idea to production.
I engineer GenAI systems that understand your data, automate decision-making, and stay compliant — across development, deployment, and optimization.
- 🧠 GenAI & LLM Systems: Fine-tune and deploy custom pipelines with LLaMA, GPT-4o, and Mistral, using feedback loops, LoRA, quantization, and prompt engineering.
- ⚙️ AI Automation & Bots: Automate real-time decision systems such as database Q&A bots, regulatory insight engines, and context-aware assistants.
- 🔎 RAG (Retrieval-Augmented Generation): Build vector-database RAG with FAISS/Pinecone, plus prompt decomposition, caching, and ranking strategies.
- 🛠️ Full-Stack AI Engineering: Build and serve ML APIs, package them in Docker containers, deploy to the cloud, and own MLOps pipelines end to end.
Languages: Python | TypeScript | SQL | Bash
Backend & APIs: FastAPI | Flask | gRPC | Gunicorn | Nginx
LLMs & GenAI: LLaMA | Mistral | GPT-4o | LoRA | Transformers | Prompt Engineering | LangChain | Custom Fine-tuning
Model Optimization: Quantization | Memory Offloading | LoRA | TorchServe | ONNX
Retrieval & DBs: FAISS | Pinecone | PostgreSQL | Dynamic SQL | Chroma | VectorDBs
DevOps / MLOps: Docker | Kubernetes | AWS | Lambda | S3 | EC2 | SageMaker | CloudWatch | MLflow | Weights & Biases | Airflow | GitHub Actions | Shell Scripting
Infra Tools: Redis | Celery | Kafka | Elasticsearch
- 🧠 Hallucination detection + explainability in RAG/LLMs
- 🧪 Real-time data agents + multi-hop question answering
- ⚡ Scaling inference on single-GPU + multi-tenant workloads
- 🔍 Evaluation frameworks like RAGAS, TruLens, Promptfoo
"I don’t just deploy models — I deploy intelligence."
— vhx