13+ projects spanning RAG systems, machine learning, NLP, and web apps — all with source code.
FastAPI · FAISS · Qdrant · Groq · Llama-3.3-70B
Modular RAG API built from scratch with FastAPI for document upload and natural language querying. Features switchable vector backends (FAISS/Qdrant), multiple chunking strategies, cross-encoder reranking, and Groq-powered generation with Llama-3.3-70B.
FAISS · Cross-Encoder · Reranking
Comparative study showing why retrieval alone isn't enough. Two-stage retrieval + reranking pipeline with noise injection to stress-test semantic search performance.
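The two-stage idea can be sketched in a few lines. This is a minimal illustration, not the project's code: a bag-of-words cosine stands in for the FAISS dense search, and a hypothetical ordered-bigram scorer stands in for the cross-encoder. All function names here are assumptions made for the example.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (stand-in for a dense encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, docs, k):
    """Stage 1: cheap similarity search over every document, keep top k."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def rerank_score(query, doc):
    """Stage 2 scorer (stand-in for a cross-encoder): counts query
    bigrams that appear in the same order in the document."""
    q = query.lower().split()
    d = doc.lower().split()
    doc_bigrams = set(zip(d, d[1:]))
    return sum(1 for bigram in zip(q, q[1:]) if bigram in doc_bigrams)


def search(query, docs, k=3, top_n=1):
    """Retrieve k candidates cheaply, then reorder them with the costlier scorer."""
    candidates = retrieve(query, docs, k)
    reranked = sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)
    return reranked[:top_n]
```

With the documents `"reset password"` and `"how to password reset quickly"` and the query `"password reset"`, stage one ranks the first document highest because word order is ignored, while the reranker promotes the second one.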
Flask · Scikit-learn · SHAP · SQLite
Flask web app for real-time churn prediction with user authentication, admin dashboard, CSV export, and SHAP-powered explainability. Automated retention strategy generator included.
XGBoost · SHAP · Imbalanced Learning · Optuna
Transaction classification on 1M+ Kaggle IEEE-CIS records. Achieved ROC-AUC 0.95 and F1 0.66 on the imbalanced fraud class with Optuna hyperparameter tuning.
LightGBM · Prophet · Feature Engineering
SKU-level supermarket price forecasting on 1.7M+ rows using LightGBM and Prophet with lag, rolling window, and calendar features.
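As a rough sketch of what lag, rolling window, and calendar features look like for a single SKU's price series (standard library only; the function name, feature names, and default window sizes are assumptions for illustration, not the project's actual pipeline):

```python
from datetime import date, timedelta


def build_features(dates, prices, lag=1, window=3):
    """Build one feature row per day that has enough history.

    The rolling mean only uses prices strictly before the target day,
    so the target never leaks into its own features.
    """
    rows = []
    for i, (day, price) in enumerate(zip(dates, prices)):
        if i < max(lag, window):
            continue  # not enough history for this row
        past = prices[i - window:i]
        rows.append({
            "date": day,
            "target": price,
            f"lag_{lag}": prices[i - lag],
            f"roll_mean_{window}": sum(past) / window,
            "day_of_week": day.weekday(),  # Monday = 0
            "month": day.month,
        })
    return rows


# Five consecutive days of prices for one hypothetical SKU.
days = [date(2024, 1, 1) + timedelta(days=i) for i in range(5)]
rows = build_features(days, [10.0, 12.0, 14.0, 16.0, 18.0])
```

Rows built this way can be fed to a tree model such as LightGBM as ordinary tabular features.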
Random Forest · Optuna · SHAP
Competition-aware ML workflow integrating historical sales and competitor pricing. Achieved R² ≈ 0.92 with Random Forest, with SHAP explanations to support business decisions.
Isolation Forest · LOF · KMeans · DBSCAN
Multi-method anomaly detection pipeline combining model-based, cluster-based, and statistical approaches on fraud and retail datasets with PCA visualization.
sentence-transformers · FAISS · Visualization
Scripts and experiments for exploring text embedding models, visualizing embedding spaces, and benchmarking retrieval quality across different encoder models.
FAISS · sentence-transformers · Python
End-to-end semantic search implementation using FAISS for fast approximate nearest neighbor search with sentence-transformers for dense embedding generation.
Qdrant · Python · Vector Search
Experiments with Qdrant vector database — collection management, payload filtering, hybrid search, and performance benchmarks for RAG applications.
Python · LangChain · Benchmarking
Comparative study of text chunking strategies — fixed-size, sentence-based, recursive, and semantic chunking — evaluating their impact on RAG retrieval quality.
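To make the difference between two of these strategies concrete, here is a minimal sketch of fixed-size and sentence-based chunking using only the standard library. The function names and the simple punctuation-based sentence splitter are assumptions for illustration. The study itself lists LangChain, whose splitters are not reproduced here.

```python
import re


def fixed_size_chunks(text, size, overlap=0):
    """Split text into character windows of `size`, sharing `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def sentence_chunks(text, max_sentences=2):
    """Group whole sentences so that no chunk cuts a sentence in half."""
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]
```

Fixed-size chunks are predictable in length but can cut a sentence mid-thought. Sentence-based chunks keep sentences intact at the cost of uneven chunk sizes.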
Scikit-learn · XGBoost · LightGBM · Optuna
Collection of regression projects across different domains — housing prices, demand forecasting, and insurance costs — with thorough EDA, feature engineering, and evaluation.
KMeans · DBSCAN · PCA · Silhouette Analysis
Unsupervised learning project exploring clustering algorithms for customer segmentation. Includes PCA for dimensionality reduction and silhouette analysis for optimal cluster selection.
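For readers unfamiliar with silhouette analysis, the score can be computed by hand as below. This is a standard-library sketch of the usual definition, not the project's code, which would more likely call scikit-learn's `silhouette_score`. The function name `silhouette` is an assumption.

```python
import math


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    For each point: a = mean distance to the rest of its own cluster,
    b = lowest mean distance to any other cluster, s = (b - a) / max(a, b).
    Points in single-member clusters score 0 by convention.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # The point's distance to itself is 0, so divide by len(own) - 1.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = silhouette(points, [0, 0, 1, 1])  # tight, well-separated clusters
bad = silhouette(points, [0, 1, 0, 1])   # clusters that interleave
```

Choosing the number of clusters then amounts to running the clustering for several candidate values and keeping the one with the highest mean score.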