- basic Python (Jupyter Notebook, NumPy, Matplotlib, NLTK, and RegEx) • intermediate pandas
- feature extraction with NumPy • text vectorization with TF-IDF and SVD • feature engineering with pandas • interactive data visualization with Matplotlib • data augmentation for ML with object-oriented programming (OOP) • statistical modeling with SciPy

The tasks you’ll tackle in this series of liveProjects are typical of those a data scientist or data engineer would encounter at an online recruiting tech company, in a large organization’s HR department, or in similar environments. You’ll develop a data pipeline for processing, extracting, and transforming various types of data for different types of users, including machine learning engineers, data analysts, and product developers. You’ll build data processing tools with NumPy, use pandas for feature extraction and engineering, use Matplotlib to explore, visualize, and analyze processed data, and build data augmentation tools to improve ML modeling. By the end, you’ll have finished 80% of the work of a typical data science project, and you’ll have acquired skills, experience, and confidence that will take you closer to a career in data science.

Project 1 Data Processing Tools with NumPy

As a data engineer in an online recruiting tech company or a large organization’s HR department, you’ll build a series of practical tools to process and extract useful information from unstructured text data using NumPy. You’ll learn important methods (including the trie data structure, TF-IDF, and SVD), how to implement them, and how they are applied in the real world. When you’re finished, you’ll have the know-how to build data processing tools that meet the needs of machine learning engineers, data analysts, and product developers.
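To give a flavor of the TF-IDF and SVD methods named above, here is a minimal NumPy sketch. The toy corpus, the variable names, and the smoothed IDF formula are assumptions chosen for illustration (several TF-IDF variants exist); the project itself may use different data and a different formulation.

```python
import numpy as np

# Hypothetical toy corpus; not taken from the project's data.
docs = [
    "python developer with pandas experience",
    "data engineer python numpy",
    "hr manager recruiting experience",
]

# Whitespace tokenization is enough for this sketch.
tokens = [d.split() for d in docs]
vocab = sorted({t for doc in tokens for t in doc})
index = {term: j for j, term in enumerate(vocab)}

# Term counts: one row per document, one column per vocabulary term.
counts = np.zeros((len(docs), len(vocab)))
for i, doc in enumerate(tokens):
    for t in doc:
        counts[i, index[t]] += 1

# Term frequency, normalized by document length.
tf = counts / counts.sum(axis=1, keepdims=True)

# Smoothed inverse document frequency (one common variant, assumed here).
df = (counts > 0).sum(axis=0)
idf = np.log((1 + len(docs)) / (1 + df)) + 1
tfidf = tf * idf

# Truncated SVD: keep the k largest singular values to get dense document vectors.
k = 2
U, s, Vt = np.linalg.svd(tfidf, full_matrices=False)
doc_embeddings = U[:, :k] * s[:k]
```

Each row of `doc_embeddings` is a k-dimensional representation of one document, which downstream users could compare or feed into a model.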

Project 2 Pandas for Feature Extraction

Project 3 Data Visualization for Exploratory Analysis

Project 4 Data Augmentation for ML

These liveProjects are for Python beginners who are passionate about data and who would like to advance their careers as data analysts, data engineers, or data scientists. To begin these liveProjects, you’ll need to be familiar with the following:

TOOLS
- Basic Python
- Basic Jupyter Notebook
- Basic NumPy
- Intermediate pandas
- Basic Matplotlib
- Basic NLTK
- Basic RegEx

TECHNIQUES
- Basic matrix operations
- Basics of trie data structure
- Basics of TF-IDF, SVD
- Basics of tokenization and text cleaning
- Basics of plot types
- Basic statistics
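As a quick refresher on the trie data structure listed above, the following is a minimal sketch of a prefix tree with insertion and prefix lookup. The class names, method names, and sample words are hypothetical and only illustrate the idea; the projects may structure their trie differently.

```python
class TrieNode:
    """One node of the trie: child nodes keyed by character."""

    def __init__(self):
        self.children = {}
        self.is_word = False


class Trie:
    """Minimal prefix tree supporting insert and prefix lookup."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            # Create the child node on first use, then descend into it.
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def starts_with(self, prefix):
        # Walk down to the node that represents the prefix, if it exists.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Collect every complete word stored below that node.
        results = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return sorted(results)


# Hypothetical sample vocabulary.
trie = Trie()
for skill in ["python", "pandas", "pytorch", "numpy"]:
    trie.insert(skill)

matches = trie.starts_with("py")
```

Lookup cost depends on the length of the prefix rather than on the number of stored words, which is why tries suit tasks such as keyword matching in text.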

In this liveProject series, you’ll learn to build tools for data processing, data augmentation, and feature extraction and engineering, and to create interactive data analytics dashboards for storytelling.

- Use built-in Python modules: string and re (regular expressions)
- Use NumPy for different matrix operations
- Use SciPy to compute cosine similarity
- Use SciPy’s stats module for probability distribution fitting
- Use pandas for dataframe operations
- Use Matplotlib plot types
- Use ipywidgets for interactive widgets
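Two of the skills listed above, cosine similarity and probability distribution fitting, can be sketched with SciPy as follows. The vectors and the synthetic sample are made up for illustration, and fitting a normal distribution is an assumption; the projects may fit other distributions to real data.

```python
import numpy as np
from scipy import stats
from scipy.spatial.distance import cosine

# Hypothetical feature vectors pointing in the same direction.
a = np.array([1.0, 2.0, 0.0])
b = np.array([2.0, 4.0, 0.0])

# scipy.spatial.distance.cosine returns a distance, so similarity is 1 - distance.
similarity = 1.0 - cosine(a, b)

# Synthetic sample drawn from a normal distribution with a fixed seed.
rng = np.random.default_rng(0)
sample = rng.normal(loc=50.0, scale=5.0, size=1000)

# Maximum-likelihood fit; returns the estimated mean and standard deviation.
mu, sigma = stats.norm.fit(sample)
```

The fitted `mu` and `sigma` should land close to the parameters used to generate the sample, which is a simple way to sanity-check a fitting routine before applying it to real data.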

