Skills
Technical
- Python | SQL | Java | Natural Language Processing | Machine Learning Algorithms | Neural Networks |
Regression Analysis | Predictive Analytics | Topic Modeling | Data Mining | Social Media Analysis |
Inferential Statistics | Data Transformation | ETL/ELT Workflows
Packages & Tools
- Pandas | NumPy | dbt Cloud | PyMongo | TensorFlow | PyTorch | BeautifulSoup |
OpenCV | Tableau | Matplotlib | NetworkX | Tweepy | X API | OpenAI API |
SQLite | PostgreSQL | Google BigQuery | Google Sheets | Microsoft Excel |
Git | GitHub
Certifications & Online Learning
- Google Data Analytics Professional Certificate |
dbt Fundamentals (dbt Labs) |
Neural Networks and Deep Learning (DeepLearning.AI) |
Improving Deep Neural Networks: Hyperparameter Tuning, Regularization
and Optimization (DeepLearning.AI)
Projects
End-to-End Data Pipeline: United States Non-Immigrant Visa Analysis and Prediction
Led a 4-member team to collect, clean, and analyze extensive open data
on non-immigrant visa issuance to study global immigration to the United
States.
Concluded by developing an end-to-end data transformation and prediction pipeline
using dbt and Python to streamline data processing, feature engineering, and predictive modeling.
Figuring out Neural Networks: Classifying Breast Cancer from Mammography Images
Trained a neural network to classify mammography images and compared its performance with transfer learning approaches. Also compared results against standard architectures such as ResNet50 and VGG16 used as feature extractors, achieving a maximum positive-class recall of 82%.
Harry Potter and the Next Word: LSTM RNNs for predicting the next word + Streamlit app interface
Processed, prepared, and cleaned text data (movie script lines) using techniques such as tokenization, vocabulary creation, one-hot encoding, n-grams, and word embeddings to predict the next word given a sequence of words. Trained a two-layer stacked LSTM RNN on script lines from the Harry Potter movie franchise to predict the next word. Tuned the model to achieve a 15% increase in accuracy and a 32% decrease in loss. Finally, integrated the model with a Streamlit web application.
View More on my GitHub
Maximize Marketing: A Data Analysis Case Study
Conducted comprehensive data analysis on fitness data to examine user behavior and identify trends. Used MS Excel features such as Pivot Tables and VLOOKUP, and SQL features including JOINs and Common Table Expressions, to clean, prepare, and query data. Shared actionable insights and recommendations through Tableau dashboards and a GitHub Markdown report.
Predicting with Volatility: Stacked LSTMs for AMZN price forecasting
Trained a stacked LSTM RNN to predict and forecast the opening price of AMZN stock, with the aim of observing how well such networks forecast when the input sequence values change suddenly. Used the Tiingo API to pull data. Achieved an RMSE of 7.4 on the test set and forecasted the next 30 days.