👋

Hello, I'm Pavan Pandya. I'm driven by curiosity and a passion for solving real-world problems through technology.

About me

My journey with technology started with a simple curiosity that grew into a lifelong passion. Movies like "The Internship" and "Iron Man" fueled my dream of creating something extraordinary, like JARVIS. Today, I’m pursuing a Master’s in Computer Science and building AI applications that push the boundaries of what’s possible. My work is driven by a desire to solve complex puzzles, innovate, and use technology to make life better. Collaboration is key to my approach, and I’m always excited to connect with others who share my passion.

When I’m not coding, you’ll find me experimenting in the kitchen or taking long walks to clear my head. Those moments help me reconnect, find inspiration, and often lead to my best ideas.

My projects

MediLink - Patient and Insurance Management System

As a full-stack developer, I built a healthcare management platform that lets patients, doctors, and insurance providers efficiently manage appointments, medical records, and insurance plans.

Django
Next.js
Render
PostgreSQL
Docker

DocQnA – Intelligent PDF Querying LLM System

I led the creation of a PDF-based question-answering system using retrieval-augmented generation (RAG), integrating Apache Cassandra for data management and Streamlit for a user-friendly interface.

Apache Cassandra
Astra DB
LangChain
Streamlit
Hugging Face

Sales & Customer Data Analysis Dashboard

I developed Tableau dashboards for sales and customer analysis, enhancing trend identification and customer segmentation with interactive, data-driven insights.

Tableau
ETL
SQL
Data Collection

Kidney Disease Classification

I improved kidney disease classification accuracy with a deep-learning model and streamlined deployment with AWS, Docker, and an efficient CI/CD pipeline.

Python
Deep Learning
MLflow
DVC
AWS
Docker
GitHub Actions

Unveiling Trends - A Cloud-Driven Data Engineering Project

I created a custom YouTube data scraper and built interactive QuickSight dashboards to analyze and visualize trending topics, supporting informed decision-making.

AWS
S3
Glue
Lambda
Athena
QuickSight
Python

Uber Data Analysis Pipeline using GCP

I developed a GCP data pipeline to analyze NYC taxi trip data, enhancing processing efficiency and operational effectiveness through insightful visualizations with Looker.

Python
BigQuery
Data Extraction
Data Transformation
MageAI
Looker

My skills

Python
R
C
C++
Node.js
Express.js
Django
Flask
HTML
CSS
JavaScript
RESTful APIs
Tailwind CSS
Streamlit
PyTorch
TensorFlow
Keras
LangChain
Scikit-learn
Pandas
NumPy
NLTK
Matplotlib
Seaborn
Selenium
Hugging Face
Regex
SQL
PostgreSQL
MySQL
SQLite
NoSQL
MongoDB
Apache Cassandra
DynamoDB
Redis
AWS
Azure
Docker
GitHub Actions
CI/CD
Kubernetes
Git
Linux
Elasticsearch
Hadoop
Apache Spark
Postman
Tableau
Render
Vercel
Jira
Agile
Scrum
SDLC
Unit Testing
Integration Testing

My experience

Marchup Inc.

Software Developer

San Jose, CA

Enabled real-time interaction for high school students and parents by engineering a Flask microservice as middleware between the PHP backend and Azure OpenAI, providing personalized guidance for university admissions and career paths. Enhanced user experience by delivering near-instant AI responses (under 1 second) through an async polling mechanism that fetches and processes new messages every second. Improved deployment efficiency by 50% by Dockerizing the chatbot application and deploying it on AWS ECS Fargate, allowing the system to handle hundreds of daily queries with seamless uptime and faster updates. Improved recommendation accuracy and response time by 20% by integrating Elasticsearch, which used cosine similarity over message and profile embeddings to retrieve personalized user recommendations.

May 2024 - Present
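
To illustrate the cosine-similarity retrieval mentioned in this role, here is a minimal sketch in plain Python. It is not the production Elasticsearch setup: the function names, the three-dimensional vectors, and the profile labels are all made up for the example, and real embeddings would have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, candidates, k=2):
    """Return the k candidate names most similar to the query vector."""
    scored = [(name, cosine_similarity(query, vec))
              for name, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored[:k]]


# Hypothetical toy embeddings (illustrative values only).
message_embedding = [1.0, 0.0, 1.0]
profile_embeddings = {
    "engineering": [0.9, 0.1, 0.8],
    "arts": [0.0, 1.0, 0.1],
    "science": [0.5, 0.5, 0.5],
}

best_matches = top_k(message_embedding, profile_embeddings, k=2)
```

Ranking by cosine similarity compares the direction of two embeddings rather than their magnitude, which is why it is a common choice for matching a message against stored profiles.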

Dhirubhai Ambani Institute of Information and Communication Technology

Data Science Intern

Gandhinagar, India

Analyzed inflation-related sentiment by collecting and processing 1.5M+ tweets via the Twitter API and Selenium. Improved sentiment analysis depth by 30% by applying advanced text preprocessing and BERT embeddings to analyze 157K inflation-specific tweets, resulting in a more nuanced understanding of public sentiment. Improved forecast accuracy by 15% by developing machine learning models with BERTopic and LDA, using manual annotation of 300 tweets and hyperparameter tuning to reveal predictive relationships between sentiment and inflation trends.

Nov 2022 - June 2023

Hate Speech and Offensive Content Identification

Data Science Intern

Ahmedabad, India

Increased annotation throughput by 40% by engineering a REST API backend for a submission platform, enabling seamless participant submissions and dashboard filtering by accuracy and task category. Improved model accuracy by 15% by leading the collection of 1,200 Gujarati tweets for hate speech detection, using Selenium for data gathering, manually annotating 200 tweets, and fine-tuning with few-shot learning techniques.

June 2021 - June 2023

Brainly Beam Technologies Pvt Ltd.

Data Science Intern

Ahmedabad, India

Achieved 78% accuracy in sentiment identification by developing a recommender system using linguistic and contextual approaches with SVM and Bayes classifiers. Deepened content understanding by analyzing and predicting sentiments in reviews and comments using NLP methodologies, leveraging RNN and LSTM models to extract actionable insights.

June 2022 - July 2022

Contact me

Please contact me directly at pavanpandya.iu@gmail.com or through this form.