Self-taught Data Scientist & AI Engineer crafting production-ready ML systems from Kathmandu, Nepal.
I'm a self-taught Data Scientist with deep hands-on experience building end-to-end machine learning solutions across classification, regression, clustering, anomaly detection, and time series forecasting.
What started as curiosity about data turned into a full-stack journey through the ML ecosystem. I've gone from training my first Random Forest to building production-grade RAG systems integrating LLMs like Llama-3.3-70B via Groq, all without a formal CS degree, driven entirely by self-study and project-based learning.
Currently working as a Data Scientist Intern at CR Equity AI, Inc. (remote, Tallahassee, FL, USA), where I design and implement production-grade RAG pipelines, integrate LLM APIs, and build FastAPI services for document intelligence. My work spans classical ML to cutting-edge AI systems with a strong focus on security, scalability, and real-world deployment.
I've completed 20+ structured ML projects handling large-scale datasets: 1M+ rows for fraud detection and 1.7M+ rows for time series forecasting. My approach combines rigorous evaluation, model explainability through SHAP values, and reproducible research workflows.
Proper evaluation, CV splits, no data leakage: results you can trust.
SHAP values and feature importance: black-box outputs become actionable insights.
Docker, FastAPI, JWT auth, rate limiting: models that actually ship.
From self-study to production AI systems
CR Equity AI, Inc. · Remote (Tallahassee, FL, USA)
Building production-grade RAG pipelines, LLM integrations, and FastAPI services for document intelligence. Collaborating with a US-based engineering team using Agile practices.
Self-directed Research & Projects
Transitioned from classical ML to generative AI. Built RAG systems with FAISS and Qdrant, explored vector embeddings, cross-encoder reranking, and LLM API integration. Developed a production-grade Personal Knowledge Base RAG API.
Independent Research
Published a preprint, "SHAP-Based Feature Selection and Iterative Hyperparameter Tuning for Customer Churn Prediction in Telecommunication Datasets," demonstrating interpretability and optimization in production ML models.
Self-directed Learning
Built 20+ end-to-end ML projects spanning classification, regression, time series, clustering, and anomaly detection. Handled 1M+ row datasets, implemented SHAP explainability, and developed web applications around ML models including ChurnShield.
Tribhuvan University · Kathmandu, Nepal
Formal education in computer science fundamentals: algorithms, databases, programming, and software engineering. Applied formal knowledge through independent data science and AI projects running in parallel with academic studies.
Explores interpretability and optimization techniques for churn modeling, using SHAP values for transparent feature importance and iterative hyperparameter tuning to improve model performance on real-world telecom data. Bridges the gap between black-box ML models and actionable business insights.
I'm open to collaborations, freelance work, and exciting opportunities.