9 years of experience in Data Science and Machine Learning.
Develop neural network architectures from scratch, including Convolutional Neural Networks (CNNs),
LSTMs, and Transformers.
Build unsupervised models such as K-Means, Gaussian Mixture Models, and Autoencoders.
Program in R, Python, SQL, Spark, Scala, and MATLAB.
Build visualizations using R, ggplot2, Plotly, Matplotlib, and Tableau for end-user ad-hoc
reporting.
Design custom BI reporting dashboards using Shiny, shinydashboard, and Plotly to provide
actionable insights and data-driven solutions.
Create analytical models, algorithms, and custom software solutions based on accurate
understanding of business requirements.
Experienced with the standard supervised machine learning methods – Linear Regression, Logistic
Regression, Support Vector Machines, Random Forests, Gradient Boosting, and Survival Modeling –
using the NumPy stack (NumPy, SciPy, Pandas, and Matplotlib) and scikit-learn.
Skilled with TensorFlow and PyTorch for building, validating, testing, and deploying reliable deep
learning algorithms for specific business challenges.
Experience with ensemble algorithm techniques, including Bagging, Boosting, and Stacking.
Applied knowledge of Natural Language Processing (NLP) methods, in particular fastText,
Word2vec, and Sentiment Analysis.
Apply Naïve Bayes, regression and classification analysis, Neural Networks/Deep Neural Networks
(including CNNs), Support Vector Machines (SVM), Decision Trees/Random Forests, and Boosting
machine learning techniques.
Implement statistical models on Big Data sets using cloud/cluster computing resources on AWS and
Azure.
Apply statistical analysis and machine learning techniques to live data streams from Big Data
sources using Spark and Scala.
Creative thinker with strong ability to devise and propose innovative ways to look at problems by
using business acumen, mathematical theories, data models, and statistical analysis.
Develop predictive models using Decision Tree, Random Forest and Naïve Bayes.
Develop regression, classification, and recommender systems with large datasets in distributed
systems and constrained environments.
Experienced in using Python for data loading, extraction, and manipulation, working with libraries
such as Matplotlib, NumPy, SciPy, and Pandas for data analysis.
Skilled in using Python, R, SQL, and the Hadoop ecosystem to extract data and build predictive
models.
Experience building statistical models on large data sets using cloud computing services such as
AWS, Azure, and GCP.
Technology Skills
Programming - Python, Spark, SQL, R, Git, MATLAB, Bash
Analytical Methods - Advanced Data Modeling, Regression Analysis, Predictive Analytics,
Statistical Analysis (ANOVA, correlation analysis, t-tests and z-tests, descriptive statistics),
Sentiment Analysis, Exploratory Data Analysis, Time Series Analysis (ARIMA) and Forecasting
(TBATS, LSTM, ARCH, GARCH), Principal Component Analysis (PCA) and SVD, Linear and
Logistic Regression, Decision Trees and Random Forests.
Machine Learning - Supervised and Unsupervised Learning algorithms, Natural Language
Processing, Deep Learning, Data Mining, Neural Networks, Naïve Bayes Classifier,
Clustering (K-Means, GMMs, DBSCAN), PCA, SVD, ARIMA, Linear Regression, Lasso and
Ridge, Logistic Regression, Ensemble Classifiers (Bagging, Boosting, and Voting), Ensemble
Regressors, KNN.
Libraries - NumPy, Pandas, SciPy, Scikit-Learn, TensorFlow, Keras, PyTorch, StatsModels,
Prophet, lifelines, PyFlux.
IDE - PyCharm, Sublime, Atom, Jupyter Notebook, Spyder.
Version Control & Collaboration - Git, GitHub, Bitbucket, Box, Quip.
Data Stores - Large data stores, both SQL and NoSQL; data warehouses, data lakes, Hadoop
HDFS, S3.
RDBMS - SQL, MySQL, PL/SQL, T-SQL, PostgreSQL.
Data Visualization - Matplotlib, Seaborn, rasterio, Plotly, Bokeh.
NLP - NLTK, spaCy, Gensim, BERT, ELMo.
Cloud Data Systems - AWS (RDS, S3, EC2, Lambda), Azure, GCP.
Computer Vision - Convolutional Neural Network (CNN), Faster R-CNN, YOLO.
Experience
September 2020 to Current
AT&T, San Antonio TX
Senior NLP Engineer/Data Scientist
As an NLP Engineer at AT&T, I lead a team that designs and maintains NLP models that interact with
customers in verbal and textual communication. My team utilizes Neural Networks to group, cluster, and
classify communication types and to provide sentiment and chatbot data. We also work with voice-
to-text and chat data, implementing BERT and Doc2Vec embeddings as well as ad-hoc and pre-built
chatbot solutions such as Dialogflow. The team primarily focuses on NLP data streamed from raw
sources into our data lake and transformed using in-house applications.
Hands-on technical work:
Extract data from HDFS stored in JSON and Parquet files.
Deploy clustering and other unsupervised models on text training data.
Maintain and update training data as new models are developed.
Apply sentiment analytics.
Utilize supervised and unsupervised NLP models.
Build models for intent classification and part-of-speech tagging.
Discern meaningful insights from large datasets, define key metrics to track performance, and
develop actionable solutions to tackle business problems.
Use the terabyte-scale data lake to identify opportunities to enhance understanding of the customer
base and content consumption.
Use visualization tools such as Tableau and Power BI to present complex models and business
insights in a simple, engaging manner for business stakeholders.
Work closely with engineers to develop and deploy scalable machine-learning applications.
Identify opportunities for new data projects.
Productionize machine-learning models.
Apply a strong foundation in inferential statistics and machine-learning algorithms.
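The intent-classification work above can be sketched with a minimal multinomial Naive Bayes classifier over bag-of-words counts. This is an illustrative toy only: the utterances, intent labels, and smoothing value below are invented, not AT&T data, and the production systems described here used neural models and BERT/Doc2Vec embeddings rather than this baseline.

```python
import math
from collections import Counter, defaultdict

# Toy intent-labeled utterances (hypothetical examples, not real customer data).
TRAIN = [
    ("what is my current balance", "billing"),
    ("why is my bill so high this month", "billing"),
    ("my internet keeps dropping", "support"),
    ("the router will not connect", "support"),
    ("i want to cancel my plan", "cancel"),
    ("how do i close my account", "cancel"),
]

def train_nb(data, alpha=1.0):
    """Multinomial Naive Bayes with Laplace smoothing over a tiny corpus."""
    word_counts = defaultdict(Counter)  # intent -> word frequencies
    intent_counts = Counter()
    vocab = set()
    for text, intent in data:
        words = text.split()
        intent_counts[intent] += 1
        word_counts[intent].update(words)
        vocab.update(words)
    return word_counts, intent_counts, vocab, alpha

def predict_intent(model, text):
    """Pick the intent with the highest log posterior for the utterance."""
    word_counts, intent_counts, vocab, alpha = model
    total = sum(intent_counts.values())
    best, best_lp = None, float("-inf")
    for intent in intent_counts:
        lp = math.log(intent_counts[intent] / total)  # log prior
        denom = sum(word_counts[intent].values()) + alpha * len(vocab)
        for w in text.split():
            lp += math.log((word_counts[intent][w] + alpha) / denom)
        if lp > best_lp:
            best, best_lp = intent, lp
    return best

model = train_nb(TRAIN)
print(predict_intent(model, "my bill looks wrong"))  # → billing
```

Naive Bayes is a common baseline for intent routing because it trains instantly and degrades gracefully on small corpora; neural intent models are typically benchmarked against it.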
July 2018 to September 2020
Schlumberger-Doll Research (SDR) Center, Cambridge MA/ Youngsville, LA
Algorithm Scientist
Worked in Drilling and Measurements division with focus on logging well formation data. Our team was
focused on developing autonomous systems that facilitated energy access while reducing greenhouse
gas emissions in alignment with our sustainability ambitions. The team invented and prototyped robotics
systems that operated under water and underground. I led the AI-based algorithm development to
improve robustness and scalability of robotics-based drilling operations. Duties included operating and
maintaining Measurement While Drilling (MWD) and Logging While Drilling (LWD) tools on the rig site;
frequently performing full rig-up and rig-down of the logging unit and sensors; and diagnosing and
resolving problems related to tool and/or rig performance to maximize efficiency. I developed solutions
for Bayesian state estimation, uncertainty quantification, path planning, and robotics control. On a
related project, I worked
with the Robotics division to improve Computer Vision Algorithms for automated tool monitoring and
security.
Hands-on technical work:
Developed solutions for Bayesian state estimation, uncertainty quantification, path planning, and
robotics control.
Led the adoption of AI-based algorithms that improved robustness and scalability.
Analyzed Big Data sets to assess and correct performance deficiencies.
Validated and assessed algorithm performance on real and synthetic data.
Developed and maintained processes and supporting tools for information and data control.
Interfaced information and data control resources with partners, vendors, regulatory agencies, and
other external bodies, keeping distribution contacts current.
Extracted data from well-logging systems (e.g., OpenWells, CasingWear, StressCheck, Well Cost,
and Campos, among others) to build machine-learning algorithms to solve various problems.
Built and trained convolutional neural network (CNN) architectures with the PyTorch Python API.
Exploited transfer learning with custom-built classifiers in PyTorch to speed up production time and
improve results.
Fine-tuned ResNet-50, ResNet-101, and ResNet-152 models to adapt their pre-trained weights to
our use case.
Used a fully convolutional network (a pre-trained YOLOv3 model) to speed up prediction.
Used Logistic Regression to predict whether there would be deviation at any given well depth in a
drilling operation.
Used XGBoost with IoT data to predict Torque and Drag to minimize well casing and formation
damage. Increased production by 10,000 barrels per day across 5 production wells and increased
monthly revenue by US$17.4 million.
Used NLP to perform sentiment analysis, then applied LDA to generate topics from the sentiment
categories.
Processed huge datasets (over a billion data points and 2 TB in size) for data association pairing and
provided insights into meaningful associations and trends.
Deployed machine learning models on Azure Stack (disconnected, on drilling rigs) while ingesting
data from IoT sensors.
Used Python 3 (NumPy, SciPy, Pandas, Scikit-Learn, Seaborn, NLTK) and Spark 2.0 (PySpark,
MLlib) to develop a variety of models and algorithms for analytic purposes.
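The Logistic Regression bullet above (predicting whether deviation occurs at a given well depth) can be illustrated from scratch with batch gradient descent on the logistic loss. Everything below is a synthetic sketch: the two features, the deviation rule, and the hyperparameters are invented stand-ins for the proprietary well-log data, and the real work used library implementations.

```python
import math
import random

# Synthetic stand-in for well-log features: (depth_km, vibration) -> deviated (1/0).
# Hypothetical rule for the toy labels: deeper, high-vibration sections deviate.
random.seed(0)
DATA = []
for _ in range(200):
    depth = random.uniform(0.5, 4.0)
    vib = random.uniform(0.0, 1.0)
    label = 1 if depth * vib > 1.2 else 0
    DATA.append(((depth, vib), label))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(data, lr=0.5, epochs=2000):
    """Plain batch gradient descent on the mean logistic loss."""
    w, b = [0.0, 0.0], 0.0
    n = len(data)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for (x1, x2), y in data:
            err = sigmoid(w[0] * x1 + w[1] * x2 + b) - y  # prediction error
            gw[0] += err * x1
            gw[1] += err * x2
            gb += err
        w[0] -= lr * gw[0] / n
        w[1] -= lr * gw[1] / n
        b -= lr * gb / n
    return w, b

w, b = train_logreg(DATA)
acc = sum(
    (sigmoid(w[0] * x1 + w[1] * x2 + b) > 0.5) == bool(y)
    for (x1, x2), y in DATA
) / len(DATA)
print(f"training accuracy: {acc:.2f}")
```

Because the toy decision boundary (depth × vibration > threshold) is curved, a linear model can only approximate it; this mirrors why the real work also reached for tree ensembles like XGBoost.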
January 2015 to June 2018
Dominion Energy, Richmond VA
Energy Markets ML Scientist
In this role, my principal responsibilities were to research, compile, and manipulate energy industry
competitive intelligence using industry-related publications, databases, and other sources. I analyzed
energy markets and evaluated the economics of specific projects, and developed recommendations
for new research tools and improvements to existing ones. I assisted with the development and
maintenance of proprietary in-house and other forecasts, structural databases, and models of regional
energy markets, including quantifying ranges on potential outcomes, and performed qualitative and
quantitative analysis and quality control on large amounts of data. The principal goal was to generate
effective near- and short-term electrical energy demand models as well as optimal supply-mix
models.
Hands-on technical work:
Applied multiple approaches for predicting day-ahead energy demand with Python, including
exponential smoothing, ARIMA, Prophet, TBATS, and RNNs (LSTM).
Successfully built a Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model using
PyFlux to model the uncertainty of Dominion’s other time series, ensuring a ‘safety’ stock of
generating units.
Incorporated geographical and socio-economic data scraped from outside resources to improve
accuracy.
Validated models using a train-validate-test split to ensure forecasts were accurate enough to
determine the optimal number of generation facilities needed to meet system load.
Prevented over-fitting with the use of a validation set while training.
Built a meta-model to ensemble the predictions of several different models.
Engineered time-series features using NumPy, Pandas, and FeatureTools.
Coordinated with facility engineers to understand the problem and ensure our predictions were
beneficial.
Participated in daily standups in an Agile Kanban environment.
Queried Hive via Spark using Python’s PySpark library.
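Of the forecasting approaches listed above, simple exponential smoothing is the easiest to show end-to-end. A minimal sketch follows; the toy load values are invented for illustration, and the production work used library-grade tooling (statsmodels, Prophet, TBATS, PyFlux) rather than this hand-rolled version.

```python
# Simple exponential smoothing for a one-step-ahead load forecast.
# Recursion: level_t = alpha * y_t + (1 - alpha) * level_{t-1};
# the final level is the forecast for the next period.
def ses_forecast(series, alpha=0.3):
    level = series[0]  # initialize the level at the first observation
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

# Toy demand values in MW (invented, not Dominion data).
load = [100, 102, 101, 105, 107, 106, 110, 112]
print(round(ses_forecast(load), 2))  # → 107.78
```

A small alpha smooths aggressively (slow to react to demand ramps); in practice alpha is chosen by minimizing validation error, which is what the train-validate-test split above is for.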
March 2014 to December 2014
Xtreme Drilling & Coil Services, Jourdanton, TX
Coiled Tubing Engineer
Worked in Coiled Tubing Services Branch, duties similar to those undertaken at Cetco Energy Services.
Additional hands-on work:
Provided technical support for field crews and sales.
Modelled job simulations.
Monitored CT fatigue and daily activity.
Prepared post job reports.
Performed on-site supervision.
Reviewed real-time analysis reports.
Interacted with clients and 3rd-party representatives.
August 2012 to February 2014
Cetco Energy Services, New Iberia, LA & Robstown, TX
Failure Prediction Analyst, Data Scientist
Worked in Coiled Tubing Division, first in New Iberia branch, then Robstown TX. Duties included
providing technical support for field crews and sales; preparing cost estimates and quotes; modeling CT
interventions and preparing model reports; tracking CT fatigue; maintaining data acquisition systems; on-
site supervision and real-time analysis; and interaction with client engineers. Later, I built algorithmic
predictions of equipment failure using Cox Proportional Hazards and Accelerated Failure Time models.
This was performed to support the automation of various routine manufacturing processes by predicting
time-to-failure to prevent extended downtime and schedule appropriate preventative maintenance.
Incorporated IoT data for up-to-date predictions. We focused on generating automated system alerts and
predictive solutions to increase the reliability of the plants under reduced staff.
Hands-on technical work:
Applied survival analysis techniques and machine-learning algorithms to improve how the
manufacturing teams could predict part failures.
Worked hands-on with data mining methods (e.g., hypothesis testing, regression analysis, and
various other statistical analysis and modeling methods).
Presented weekly updates to managers and key stakeholders to preview user interface designs and
share analytical results such as stress analysis findings.
Presented data work and charts using PowerPoint, Tableau, and Excel.
Participated in the Software Development Life Cycle (SDLC), including Requirements Analysis,
Design Specification, and Testing, following Agile methodologies; operated in 2-week sprints with
weekly stand-ups.
Worked in Git development environment.
Responsible for preparing data for use with machine learning models.
Used Python to create a semi-automated conversion process that generated raw archive-linked data
files.
Provided software training and further education about model applications to incoming team
members.
Reported initial findings on converting Excel, text, and image files to CSV.
Collaborated with the computer vision team to better understand how to extract meaning from
images and PDF files.
Used Predictive Modeling, Data Mining Methods, Factor Analysis, ANOVA, Hypothesis Testing,
and Normal Distribution analysis.
The project was implemented with custom APIs in Python, with visualization tools such as
Tableau and ggplot2 used to create dashboards.
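The failure-prediction work above used Cox Proportional Hazards and Accelerated Failure Time models (in practice typically via a library such as lifelines). Their simplest parametric relative, an exponential time-to-failure model with right-censoring, fits in a few lines: the maximum-likelihood hazard is observed failures divided by total observed run time. The durations below are invented for illustration.

```python
# Exponential time-to-failure with right-censoring.
# MLE of the hazard: rate = (number of observed failures) / (total run time).
def exp_survival_mle(durations, observed):
    """durations: hours each part ran; observed: 1 if it failed, 0 if censored."""
    events = sum(observed)
    total_time = sum(durations)
    rate = events / total_time   # failures per hour (hazard)
    mean_ttf = 1.0 / rate        # mean time to failure under the model
    return rate, mean_ttf

durations = [120, 300, 450, 80, 600, 250]  # toy hours on test (invented)
observed  = [1,   1,   0,   1,  0,   1]    # 0 = still running when observed

rate, mean_ttf = exp_survival_mle(durations, observed)
print(f"hazard: {rate:.5f}/hr, mean TTF: {mean_ttf:.1f} hr")
```

Under this model the survival function is S(t) = exp(-rate * t); Cox and AFT models generalize it by letting covariates (e.g., IoT sensor readings) shift the hazard per part, which is what made per-machine preventative-maintenance scheduling possible.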
August 2008 to August 2012
US Nuclear Regulatory Commission, Rockville, MD
General Engineer (Professional Development Program)
Worked in the Reactor Systems, Project Management, and Nuclear Materials Safeguards branches.
Hands-on technical work:
Reviewed license amendments and wrote safety evaluations for several plants' technical
specifications.
Reviewed core modeling code for new reactor designs.
Provided safety analysis for fuel cladding.
Assisted with Power Uprate Program monthly duties, status reports, and website updates.
Worked on project management for the Fermi 2 nuclear power plant while taking BWR-emphasis
training courses.
Education
Master of Science in Mechanical Engineering
Thesis: “Cyber Physical System Modeling of Smart Charging Process”
University of Louisiana at Lafayette
Bachelor of Science in Nuclear Engineering
Purdue University, Indiana