Hello, I'm Pavan Pandya. I'm driven by curiosity and a passion for solving real-world problems through technology.
About me
My journey with technology started with a simple curiosity—one that grew into a lifelong passion. Movies like "The Internship" and "Iron Man" fueled my dream of creating something extraordinary, like JARVIS. Today, I’m pursuing a Master’s in Computer Science and building AI applications that push the boundaries of what's possible. My work is driven by a desire to solve complex puzzles, innovate, and use technology to make life better. Collaboration is key to my approach, and I’m always excited to connect with others who share my passion.
When I'm not coding, you’ll find me experimenting in the kitchen or taking long walks to clear my head. Those moments help me reconnect, find inspiration, and often lead to my best ideas.
My projects
MediLink - Patient and Insurance Management System
As a full-stack developer, I developed a healthcare management platform enabling patients, doctors, and insurance providers to efficiently manage appointments, medical records, and insurance plans.
- Django
- Next.js
- Render
- PostgreSQL
- Docker
DocQnA – Intelligent PDF Querying LLM System
I led the creation of a PDF-based question-answering system using retrieval-augmented generation (RAG), integrating Apache Cassandra for data management and Streamlit for a user-friendly interface.
- Apache Cassandra
- Astra DB
- LangChain
- Streamlit
- Hugging Face
Sales & Customer Data Analysis Dashboard
I developed Tableau dashboards for sales and customer analysis, enhancing trend identification and customer segmentation with interactive, data-driven insights.
- Tableau
- ETL
- SQL
- Data Collection
Kidney Disease Classification
I improved kidney disease classification accuracy with a deep-learning model and streamlined deployment with AWS, Docker, and an efficient CI/CD pipeline.
- Python
- Deep Learning
- MLflow
- DVC
- AWS
- Docker
- GitHub Actions
Unveiling Trends - A Cloud-Driven Data Engineering Project
I created a custom YouTube data scraper and built interactive QuickSight dashboards to analyze and visualize trending topics, supporting informed decision-making.
- AWS
- S3
- Glue
- Lambda
- Athena
- QuickSight
- Python
Uber Data Analysis Pipeline using GCP
I developed a GCP data pipeline to analyze NYC taxi trip data, enhancing processing efficiency and operational effectiveness through insightful visualizations with Looker.
- Python
- BigQuery
- Data Extraction
- Data Transformation
- MageAI
- Looker
My skills
- Python
- R
- C
- C++
- Node.js
- Express.js
- Django
- Flask
- HTML
- CSS
- JavaScript
- RESTful APIs
- Tailwind CSS
- Streamlit
- PyTorch
- TensorFlow
- Keras
- LangChain
- Scikit-learn
- Pandas
- NumPy
- NLTK
- Matplotlib
- Seaborn
- Selenium
- Hugging Face
- Regex
- SQL
- PostgreSQL
- MySQL
- SQLite
- NoSQL
- MongoDB
- Apache Cassandra
- DynamoDB
- Redis
- AWS
- Azure
- Docker
- GitHub Actions
- CI/CD
- Kubernetes
- Git
- Linux
- Elasticsearch
- Hadoop
- Apache Spark
- Postman
- Tableau
- Render
- Vercel
- Jira
- Agile
- Scrum
- SDLC
- Unit Testing
- Integration Testing
My experience
Marchup Inc.
Software Developer
San Jose, CA
Enabled real-time interaction for high school students and parents by engineering a Flask microservice as middleware between the PHP backend and Azure OpenAI, providing personalized guidance on university admissions and career paths. Delivered near-instant AI responses (<1 second) by developing an async polling mechanism that fetches and processes new messages every second. Improved deployment efficiency by 50% by Dockerizing the chatbot application and deploying it on AWS ECS Fargate, allowing the system to handle hundreds of daily queries with reliable uptime and faster updates. Improved recommendation accuracy and response time by 20% by integrating Elasticsearch, which uses cosine similarity over message and profile embeddings to retrieve personalized recommendations.
May 2024 - Present
Dhirubhai Ambani Institute of Information and Communication Technology
Data Science Intern
Gandhinagar, India
Analyzed inflation-related sentiment by collecting and processing 1.5M+ tweets via the Twitter API and Selenium. Improved sentiment analysis depth by 30% by applying advanced text preprocessing and BERT embeddings to 157K inflation-specific tweets, resulting in a more nuanced understanding of public sentiment. Improved forecast accuracy by 15% by developing machine learning models with BERTopic and LDA, using manual annotation of 300 tweets and hyperparameter tuning to reveal predictive relationships between sentiment and inflation trends.
Nov 2022 - June 2023
Hate Speech and Offensive Content Identification
Data Science Intern
Ahmedabad, India
Increased annotation throughput by 40% by engineering a REST API backend for a submission platform, enabling seamless participant submissions and dashboard filtering by accuracy and task category. Improved model accuracy by 15% by leading the collection of 1,200 Gujarati tweets for hate speech detection, using Selenium for data gathering, manually annotating 200 tweets, and fine-tuning with few-shot learning techniques.
June 2021 - June 2023
Brainly Beam Technologies Pvt Ltd.
Data Science Intern
Ahmedabad, India
Achieved 78% accuracy in sentiment identification by developing a recommender system using linguistic and contextual approaches with SVM and Bayes classifiers. Deepened content understanding by analyzing and predicting sentiment in reviews and comments with NLP methods, using RNN and LSTM models to extract actionable insights.
June 2022 - July 2022
Contact me
Please contact me directly at pavanpandya.iu@gmail.com or through this form.