shel.

Data Scientist

Experienced data scientist with advanced Python programming skills and expertise in Pandas, NumPy, scikit-learn, Matplotlib, and SciPy for data manipulation and visualization. Proficient in SQL, MongoDB (GeoJSON), and relational database design. 7+ years of experience in complex analytics, collaborating cross-functionally, and delivering impactful results. Skilled in developing data pipelines, implementing advanced algorithms, deriving metrics, and creating compelling visualizations. Proficient in statistical modeling, hypothesis testing, dimensionality reduction, and current machine learning techniques. Experienced in Agile and Kanban for project management. Passionate about driving business impact and achieving company goals through data-driven decision-making.

Experience

Data Scientist

9/2021 - Present

  • Led and directed all data science projects, providing comprehensive leadership and guidance to ensure successful project delivery.
  • Managed projects using Agile workflows in Jira and Kanban boards in Trello to streamline delivery.
  • Facilitated effective communication and collaboration among team members using platforms such as Slack, ensuring seamless coordination and timely project completion.
  • Leveraged MongoDB with GeoJSON data structures for efficient storage and retrieval of geolocation data, enabling visualization of spatial patterns and trends for informed decision-making.
  • Applied common data science frameworks such as Keras, TensorFlow, and PyTorch to develop Convolutional Neural Networks (CNNs) for anomaly detection in clinical imaging data, improving diagnoses and subject care.
  • Applied Large Language Models (LLMs) to sentiment analysis and other natural language processing (NLP) tasks on survey data comprising millions of data points.
  • Conducted ETL and analysis of client data using a typical data science stack including Pandas, NumPy, scikit-learn, Matplotlib, and SciPy, translating findings into actionable strategies fulfilling business objectives.
  • Implemented customer segmentation and analysis using machine learning techniques such as K-means, PCA, and UMAP to identify key segments and uncover trends for strategic decision-making.
  • Developed interactive dashboards with Plotly Dash and Tableau to visually showcase data trends, enabling stakeholders to make data-driven decisions and achieve organizational success.
  • Aligned client objectives with KPIs and metrics tailored to personalized self-care, health, and beauty trends, ensuring precise strategic insights for informed decision-making.

Data Scientist

9/2021 - 9/2022

  • Collaborated cross-functionally throughout the end-to-end data science lifecycle, including data wrangling, exploratory analysis, hypothesis testing, modeling, rapid prototyping, validation/testing, and deployment.
  • Applied advanced analytical techniques such as predictive modeling, machine learning, and optimization.
  • Utilized diverse structured and unstructured data to derive meaningful insights for modeling.
  • Effectively communicated complex analytical work to technical and non-technical stakeholders.
  • Maintained knowledge of emerging data science techniques, technologies, and ML/AI applications.
  • Launched a decision-making analytic platform as Product Owner, leveraging Agile and Scrum.
  • Partnered with business leaders, delivering high-impact data products aligned with company strategy and customer experience.

Postdoctoral Researcher

9/2020 - 9/2021

  • Created, designed, and developed analytical pipelines for big data, e.g., next-generation sequencing (NGS) data.
  • Performed extraction of data from relevant databases using Python in UNIX and Cloud/High-Performance Computing (HPC) environments (data management).
  • Led several lab projects; collaborated on and reviewed research methodology protocols; assisted in planning and documentation of group projects and grants.

Graduate Researcher & Teaching Assistant

1/2017 - 9/2020

  • Dissertation focus: Annotation of A. sativa genome (Illumina and PacBio) and identification of transposable elements.
  • Designed, developed, and deployed a real-time analytical pipeline utilizing data extracted from large bioinformatics databases (NCBI, GenBank, EMBL); worked in an academic Agile environment throughout the development cycle.
  • Instructed master's and undergraduate courses and performed research (BINF 3201, Genomic Methods for Bioinformatics); introduced students to various technologies and methodologies utilized in the bioinformatics and biotechnology industries.

Education

Doctor of Philosophy, Bioinformatics (Data Science)
University of North Carolina at Charlotte
2020

Master of Science, Bioinformatics (Data Science)
University of North Carolina at Charlotte
2016

Bachelor of Science, Biology
University of North Carolina at Charlotte
2013

Skills

  • Machine Learning
  • Statistical Analysis
  • Data Visualization
  • Python Programming
  • Big Data Analytics
  • SQL
  • Data Wrangling

Projects

Comprehensive showcase of diverse and impactful projects demonstrating expertise in data science and bioinformatics.

AnomalySense

A convolutional neural network that automatically detects and categorizes anomalous images, streamlining outlier identification and improving accuracy, efficiency, and data quality.

  • Built a deep learning system in TensorFlow that automatically detects and tags anomalous images within a large database.
  • The neural network identified patterns and features within input images, enabling classification into user-defined classes.
  • The architecture includes convolutional and pooling layers that learn visual features hierarchically. Fully connected layers perform the final classification, ensuring the accurate categorization of images into the specified classes.
  • This streamlined outlier identification and improved accuracy and efficiency while reducing costs.

PrecisionTone

An innovative skin-tone classification schema based on an extensive internal database.

  • Built a skin-tone classification system by applying k-means clustering to a rich and diverse skin-tone database collected from within the company.
  • Employed advanced data-driven techniques to perform accurate and precise categorization of skin tones, resulting in significantly improved classification results and reduced ambiguity.
  • Pioneered novel methodologies in data analysis and clustering to push the boundaries of skin-tone classification, making significant contributions to the field of data science.
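
As a rough illustration of the clustering idea behind PrecisionTone, the sketch below runs plain k-means (Lloyd's algorithm) on a handful of made-up RGB triples. Everything in it is an assumption for illustration: the sample values, the names `SAMPLE_TONES`, `farthest_first`, and `kmeans`, the choice of three clusters, the use of RGB as the feature space, and the farthest-first initialization (picked only to keep the run deterministic). It does not reproduce the internal database or the production pipeline.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iterations=50):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


# Hypothetical RGB samples (not real data): three light, three medium, three deep.
SAMPLE_TONES = [
    (245, 220, 200), (240, 215, 195), (250, 225, 205),
    (200, 150, 120), (195, 145, 115), (205, 155, 125),
    (100, 65, 45), (95, 60, 40), (105, 70, 50),
]

centroids, labels = kmeans(SAMPLE_TONES, k=3)
for tone, label in zip(SAMPLE_TONES, labels):
    print(tone, "-> class", label)
```

In practice a perceptually uniform color space and a library implementation such as scikit-learn's `KMeans` would likely be a better fit than this hand-rolled version; the sketch is only meant to show how cluster centroids can serve as tone classes.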

Insights Dashboard

A user-friendly and engaging dashboard for client business intelligence.

  • Identified and reported on client data using an interactive dashboard to provide businesses with insight into client performance.
  • Automated reporting of KPI data per client at daily, weekly, and quarterly intervals.
  • Displayed predicted KPI performance based on time series data; forecasts were produced weekly with data retrieved from AWS Athena (Postgres/SQL).
