4.53 out of 5
14163 reviews on Udemy

Machine Learning, Data Science and Deep Learning with Python

Complete hands-on machine learning tutorial with data science, Tensorflow, artificial intelligence, and neural networks
Instructor:
Sundog Education by Frank Kane
91,136 students enrolled
English
Build artificial neural networks with Tensorflow and Keras
Make predictions using linear regression, polynomial regression, and multivariate regression
Classify images, data, and sentiments using deep learning
Implement machine learning at massive scale with Apache Spark's MLLib
Understand reinforcement learning - and how to build a Pac-Man bot
Classify data using K-Means clustering, Support Vector Machines (SVM), KNN, Decision Trees, Naive Bayes, and PCA
Use train/test and K-Fold cross validation to choose and tune your models
Build a movie recommender system using item-based and user-based collaborative filtering
Clean your input data to remove outliers
Design and evaluate A/B tests using T-Tests and P-Values

New! Updated for TensorFlow 1.10

Machine learning and artificial intelligence (AI) are everywhere; if you want to know how companies like Google, Amazon, and even Udemy extract meaning and insights from massive data sets, this data science course will give you the fundamentals you need. Data Scientists enjoy one of the top-paying jobs, with an average salary of $120,000 according to Glassdoor and Indeed. That’s just the average! And it’s not just about money – it’s interesting work too!

If you’ve got some programming or scripting experience, this course will teach you the techniques used by real data scientists and machine learning practitioners in the tech industry – and prepare you for a move into this hot career path. This comprehensive machine learning tutorial includes over 80 lectures spanning 12 hours of video, and most topics include hands-on Python code examples you can use for reference and for practice. I’ll draw on my 9 years of experience at Amazon and IMDb to guide you through what matters, and what doesn’t.

Each concept is introduced in plain English, avoiding confusing mathematical notation and jargon. It’s then demonstrated using Python code you can experiment with and build upon, along with notes you can keep for future reference. You won’t find academic, deeply mathematical coverage of these algorithms in this course – the focus is on practical understanding and application of them. At the end, you’ll be given a final project to apply what you’ve learned!

The topics in this course come from an analysis of real requirements in data scientist job listings from the biggest tech employers. We’ll cover the machine learning, AI, and data mining techniques real employers are looking for, including:

  • Deep Learning / Neural Networks (MLP’s, CNN’s, RNN’s) with TensorFlow and Keras

  • Sentiment analysis

  • Image recognition and classification

  • Regression analysis

  • K-Means Clustering

  • Principal Component Analysis

  • Train/Test and cross validation

  • Bayesian Methods

  • Decision Trees and Random Forests

  • Multivariate Regression

  • Multi-Level Models

  • Support Vector Machines

  • Reinforcement Learning

  • Collaborative Filtering

  • K-Nearest Neighbor

  • Bias/Variance Tradeoff

  • Ensemble Learning

  • Term Frequency / Inverse Document Frequency

  • Experimental Design and A/B Tests

…and much more! There’s also an entire section on machine learning with Apache Spark, which lets you scale up these techniques to “big data” analyzed on a computing cluster. And you’ll also get access to this course’s Facebook Group, where you can stay in touch with your classmates.

If you’re new to Python, don’t worry – the course starts with a crash course. If you’ve done some programming before, you should pick it up quickly. This course shows you how to get set up on Microsoft Windows-based PC’s; the sample code will also run on MacOS or Linux desktop systems, but I can’t provide OS-specific support for them.

If you’re a programmer looking to switch into an exciting new career track, or a data analyst looking to make the transition into the tech industry – this course will teach you the basic techniques used by real-world industry data scientists. These are topics any successful technologist absolutely needs to know about, so what are you waiting for? Enroll now!

  • “I started doing your course in 2015… Eventually I got interested and never thought that I will be working for corporate before a friend offered me this job. I am learning a lot which was impossible to learn in academia and enjoying it thoroughly. To me, your course is the one that helped me understand how to work with corporate problems. How to think to be a success in corporate AI research. I find you the most impressive instructor in ML, simple yet convincing.” – Kanad Basu, PhD

Getting Started

1
Introduction

What to expect in this course, who it's for, and the general format we'll follow.

2
Udemy 101: Getting the Most From This Course
3
[Activity] Getting What You Need

We'll show you where to download the scripts and sample data used in this course, and where to put it.

4
[Activity] Installing Enthought Canopy

We'll install our Python 3.5 environment, Enthought Canopy, and install the Python libraries and packages we'll need for this course. When we're done, we'll do a quick test of running a real Python notebook!

5
Python Basics, Part 1 [Optional]

In a crash course on Python and what's different about it, we'll cover the importance of whitespace in Python scripts, how to import Python modules, and Python data structures including lists, tuples, and dictionaries.

6
[Activity] Python Basics, Part 2 [Optional]

In part 2 of our Python crash course, we'll cover functions, boolean expressions, and looping constructs in Python.

7
Running Python Scripts [Optional]

This course presents Python examples in the form of iPython Notebooks, but we'll cover the other ways to run Python code: interactively from the Python shell, or running stand-alone Python script files.

8
Introducing the Pandas Library [Optional]

Pandas is a library we'll use throughout the course for loading, examining, and manipulating data. Let's see how it works with some examples, and you'll have an exercise at the end too.

Statistics and Probability Refresher, and Python Practice

1
Types of Data

We cover the differences between continuous and discrete numerical data, categorical data, and ordinal data.

2
Mean, Median, Mode

A refresher on mean, median, and mode - and when it's appropriate to use each.

3
[Activity] Using mean, median, and mode in Python

We'll use mean, median, and mode in some real Python code, and set you loose to write some code of your own.
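
As a minimal sketch of the idea (using Python's built-in statistics module rather than whatever library the lecture itself uses, and a made-up list of incomes), here is how the three measures react to a single extreme value:

```python
import statistics

# Hypothetical sample: household incomes in dollars, with one extreme value.
incomes = [27000, 31000, 31000, 35000, 42000, 48000, 1000000]

mean_income = statistics.mean(incomes)      # pulled far upward by the outlier
median_income = statistics.median(incomes)  # middle value, barely affected
mode_income = statistics.mode(incomes)      # most frequently occurring value

print(mean_income, median_income, mode_income)
```

The gap between the mean and the median is the reason the median is usually the better summary when a data set contains outliers.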

4
[Activity] Variation and Standard Deviation

We'll cover how to compute the variance and standard deviation of a data distribution, and how to do it using some examples in Python.
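
A small sketch of the arithmetic, assuming a made-up five-value data set: variance is the average squared distance from the mean, and standard deviation is its square root. The sample variance divides by n - 1 instead of n.

```python
import math
import statistics

# Hypothetical data set, chosen only to keep the arithmetic easy to follow.
data = [1, 4, 5, 4, 8]

mean = sum(data) / len(data)
squared_diffs = [(x - mean) ** 2 for x in data]

population_variance = sum(squared_diffs) / len(data)    # divide by n
sample_variance = sum(squared_diffs) / (len(data) - 1)  # divide by n - 1
std_dev = math.sqrt(population_variance)

print(population_variance, sample_variance, std_dev)
```

The standard library's statistics.pvariance and statistics.variance compute the same two quantities.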

5
Probability Density Function; Probability Mass Function

Introducing the concepts of probability density functions (PDF's) and probability mass functions (PMF's).

6
Common Data Distributions

We'll show examples of continuous, normal, exponential, binomial, and Poisson distributions using iPython.

7
[Activity] Percentiles and Moments

We'll look at some examples of percentiles and quartiles in data distributions, and then move on to the concept of the first four moments of data sets.

8
[Activity] A Crash Course in matplotlib

An overview of different tricks in matplotlib for creating graphs of your data, using different graph types and styles.

9
[Activity] Covariance and Correlation

We cover the concepts of covariance and correlation, which are used to look for relationships between different sets of attributes, along with some examples in Python.
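
Here is one way this could look in plain Python. The page speed and purchase numbers are invented for illustration, and the helper names covariance and correlation are ours, not the lecture's:

```python
import statistics

def covariance(xs, ys):
    """Sample covariance: average product of deviations from each mean."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

def correlation(xs, ys):
    """Covariance normalized by both standard deviations, giving -1 to 1."""
    return covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))

# Hypothetical data: page load time in seconds vs. amount purchased.
page_speeds = [1.0, 2.0, 3.0, 4.0, 5.0]
purchases = [50.0, 42.0, 35.0, 24.0, 18.0]

cov = covariance(page_speeds, purchases)
corr = correlation(page_speeds, purchases)
print(cov, corr)
```

A correlation close to -1 here would suggest that slower pages go along with smaller purchases, though correlation alone never proves causation.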

10
[Exercise] Conditional Probability

We cover the concepts and equations behind conditional probability, and use it to try and find a relationship between age and purchases in some fabricated data using Python.
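
As a rough sketch of the calculation, assuming a tiny hand-made list of (age group, purchased) records in place of the lecture's fabricated data: the probability of a purchase given an age group is the number of buyers in that group divided by the size of the group.

```python
# Hypothetical records: (age group, whether that person bought something).
records = [
    (20, True), (20, False), (20, False),
    (30, True), (30, True), (30, False),
    (40, True), (40, True), (40, True), (40, False),
]

def p_purchase_given_age(records, age):
    """P(purchase | age) = count(purchase and age) / count(age)."""
    in_group = [bought for group, bought in records if group == age]
    return sum(in_group) / len(in_group)

# Overall probability of a purchase, ignoring age.
p_purchase = sum(bought for _, bought in records) / len(records)

for age in (20, 30, 40):
    print(age, p_purchase_given_age(records, age), "vs overall", p_purchase)
```

If the conditional probabilities differ clearly from the overall probability, age and purchasing are probably related in this data.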

11
Exercise Solution: Conditional Probability of Purchase by Age

Here we'll go over my solution to the exercise I challenged you with in the previous lecture - changing our fabricated data to have no real correlation between ages and purchases, and seeing if you can detect that using conditional probability.

12
Bayes' Theorem

An overview of Bayes' Theorem, and an example of using it to uncover misleading statistics surrounding the accuracy of drug testing.
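
To make the drug-testing point concrete, here is a worked sketch of Bayes' Theorem, P(A|B) = P(B|A) P(A) / P(B). The three input numbers are assumptions picked for illustration, not figures from the lecture:

```python
# Assumed inputs (hypothetical, for illustration only).
p_user = 0.003                # 0.3% of the population uses the drug
p_positive_given_user = 0.99  # test catches 99% of users
p_positive_given_clean = 0.01 # test wrongly flags 1% of non-users

# Total probability of a positive test, over users and non-users.
p_positive = (p_positive_given_user * p_user
              + p_positive_given_clean * (1 - p_user))

# Bayes' Theorem: probability of being a user given a positive test.
p_user_given_positive = p_positive_given_user * p_user / p_positive

print(round(p_user_given_positive, 4))
```

Under these assumptions, fewer than one in four people who test positive are actual users, even though the test sounds "99% accurate".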

Predictive Models

1
[Activity] Linear Regression

We introduce the concept of linear regression and how it works, and use it to fit a line to some sample data using Python.
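
A minimal sketch of ordinary least squares for a single input, written in plain Python with made-up sample points (fit_line is a hypothetical helper name, not something from the course materials):

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical sample data lying roughly along y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

In practice a library routine would do the same job with less code; the point here is only to show what the fit computes.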

2
[Activity] Polynomial Regression

We cover the concepts of polynomial regression, and use it to fit a more complex relationship between page speed and purchases in Python.

3
[Activity] Multivariate Regression, and Predicting Car Prices

Multivariate models let us predict some value given more than one attribute. We cover the concept, then use it to build a model in Python to predict car prices based on their number of doors, mileage, and number of cylinders. We'll also get our first look at the statsmodels library in Python.

4
Multi-Level Models

We'll just cover the concept of multi-level modeling, as it is a very advanced topic. But you'll get the ideas and challenges behind it.

Machine Learning with Python

1
Supervised vs. Unsupervised Learning, and Train/Test

The concepts of supervised and unsupervised machine learning, and how to evaluate the ability of a machine learning model to predict new values using the train/test technique.

2
[Activity] Using Train/Test to Prevent Overfitting a Polynomial Regression

We'll apply train/test to a real example using Python.
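
The splitting step itself can be sketched in a few lines. This version assumes an 80/20 split and a fixed random seed, both arbitrary choices, and train_test_split here is our own small helper rather than a library function:

```python
import random

def train_test_split(samples, test_fraction=0.2, seed=42):
    """Shuffle a copy of the samples, then hold out a fraction for testing."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical data set of 100 (x, y) pairs.
samples = [(x, 2 * x + 1) for x in range(100)]

train, test = train_test_split(samples)
print(len(train), len(test))
```

The model is then fit on train only, and its error is measured on test, which it has never seen. A model that does well on train but badly on test is overfitting.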

3
Bayesian Methods: Concepts

We'll introduce the concept of Naive Bayes and how we might apply it to the problem of building a spam classifier.

4
[Activity] Implementing a Spam Classifier with Naive Bayes

We'll actually write a working spam classifier, using real email training data and a surprisingly small amount of code!

5
K-Means Clustering

K-Means is a way to identify things that are similar to each other. It's a case of unsupervised learning, which could result in clusters you never expected!

6
[Activity] Clustering people based on income and age

We'll apply K-Means clustering to find interesting groupings of people based on their age and income.
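
For reference, a bare-bones sketch of the K-Means loop in plain Python. It assumes six invented (age, income in thousands) points and uses the first k points as starting centroids to keep the run deterministic, whereas real implementations pick starting points randomly and normalize the features first:

```python
import math

def kmeans(points, k, iterations=10):
    # Assumption: start from the first k points instead of random ones.
    centroids = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 2: move each centroid to the mean of its assigned points.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its old centroid
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters

# Hypothetical people: (age, income in thousands of dollars).
people = [(25, 30), (62, 95), (27, 32), (24, 28), (60, 90), (65, 100)]

centroids, clusters = kmeans(people, k=2)
print(centroids)
```

With this data the loop settles on one younger, lower-income group and one older, higher-income group.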

7
Measuring Entropy

Entropy is a measure of the disorder in a data set - we'll learn what that means, and how to compute it mathematically.
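
One way to compute it, as a sketch: Shannon entropy is the negative sum of p * log2(p) over the proportion p of each class. The hire / no-hire labels below are invented for illustration:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over class proportions."""
    total = len(labels)
    proportions = [count / total for count in Counter(labels).values()]
    return -sum(p * math.log2(p) for p in proportions)

mixed = ["hire", "hire", "no", "no"]      # evenly split: maximum disorder
uniform = ["hire", "hire", "hire", "hire"]  # all the same: no disorder

print(entropy(mixed), entropy(uniform))
```

Decision trees use this measure to choose splits that leave each branch as orderly as possible.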

8
[Activity] Install GraphViz

In order to run the next lecture on decision trees, you'll need some software called "GraphViz" installed. Here's how.

9
Decision Trees: Concepts

Decision trees can automatically create a flow chart for making some decision, based on machine learning! Let's learn how they work.

10
[Activity] Decision Trees: Predicting Hiring Decisions

We'll create a decision tree and an entire "random forest" to predict hiring decisions for job candidates.

11
Ensemble Learning

Random Forests were one example of ensemble learning; we'll cover other techniques for combining the results of many models to create a better result than any one could produce on its own.

12
Support Vector Machines (SVM) Overview

Support Vector Machines are an advanced technique for classifying data that has multiple features. They treat those features as dimensions, and partition this higher-dimensional space using "support vectors."

13
[Activity] Using SVM to cluster people using scikit-learn

We'll use scikit-learn to easily classify people using a C-Support Vector Classifier.

Recommender Systems

1
User-Based Collaborative Filtering

One way to recommend items is to look for other people similar to you based on their behavior, and recommend stuff they liked that you haven't seen yet.

2
Item-Based Collaborative Filtering

The shortcomings of user-based collaborative filtering can be solved by flipping it on its head, and looking at relationships between items instead of relationships between people.

3
[Activity] Finding Movie Similarities

We'll use the real-world MovieLens data set of movie ratings to take a first crack at finding movies that are similar to each other, which is the first step in item-based collaborative filtering.

4
[Activity] Improving the Results of Movie Similarities

Our initial results for movies similar to Star Wars weren't very good. Let's figure out why, and fix it.

5
[Activity] Making Movie Recommendations to People

We'll implement a complete item-based collaborative filtering system that uses real-world movie ratings data to recommend movies to any user.

6
[Exercise] Improve the recommender's results

As a student exercise, try some of my ideas - or some ideas of your own - to make the results of our item-based collaborative filter even better.

More Data Mining and Machine Learning Techniques

1
K-Nearest-Neighbors: Concepts

KNN is a very simple supervised machine learning technique; we'll quickly cover the concept here.

2
[Activity] Using KNN to predict a rating for a movie

We'll use the simple KNN technique and apply it to a more complicated problem: finding the most similar movies to a given movie just given its genre and rating information, and then using those "nearest neighbors" to predict the movie's rating.
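
A stripped-down sketch of the idea, assuming each movie is described by just two made-up feature values and a known rating (the lecture's real features and distance measure may differ): find the k closest movies and average their ratings.

```python
import math

def predict_rating(query, labeled_movies, k=2):
    """Average the ratings of the k movies nearest to the query features."""
    by_distance = sorted(labeled_movies, key=lambda m: math.dist(query, m[0]))
    neighbors = by_distance[:k]
    return sum(rating for _, rating in neighbors) / len(neighbors)

# Hypothetical movies: ((feature_1, feature_2), average rating).
movies = [
    ((1.0, 0.0), 4.0),
    ((0.9, 0.1), 4.2),
    ((0.0, 1.0), 2.0),
    ((0.1, 0.9), 2.4),
    ((0.5, 0.5), 3.0),
]

predicted = predict_rating((0.95, 0.05), movies, k=2)
print(predicted)
```

The choice of k matters: too small and the prediction is noisy, too large and distant, unrelated movies start to dilute it.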

3
Dimensionality Reduction; Principal Component Analysis

Data that includes many features or many different vectors can be thought of as having many dimensions. Often it's useful to reduce those dimensions, whether to make the data easier to visualize, to compress it, or just to distill the most important information from a data set (that is, the information that contributes the most to the data's variance). Principal Component Analysis and Singular Value Decomposition do that.

4
[Activity] PCA Example with the Iris data set

We'll use scikit-learn's built-in PCA system to reduce the 4-dimensional Iris data set down to 2 dimensions, while still preserving most of its variance.

5
Data Warehousing Overview: ETL and ELT

Cloud-based data storage and analysis systems like Hadoop, Hive, Spark, and MapReduce are turning the field of data warehousing on its head. Instead of extracting, transforming, and then loading data into a data warehouse, the transformation step is now more efficiently done using a cluster after it's already been loaded. With computing and storage resources so cheap, this new approach now makes sense.

6
Reinforcement Learning

We'll describe the concept of reinforcement learning - including Markov Decision Processes, Q-Learning, and Dynamic Programming - all using a simple example of developing an intelligent Pac-Man.
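
The heart of Q-Learning is a single update rule, sketched here with invented state and action names and arbitrary learning-rate (alpha) and discount (gamma) values: nudge the stored value of a state/action pair toward the reward received plus the discounted value of the best next move.

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Q(s,a) += alpha * (reward + gamma * max Q(s', .) - Q(s,a))"""
    best_next = max(q_table[next_state].values())
    old_value = q_table[state][action]
    q_table[state][action] = old_value + alpha * (reward + gamma * best_next - old_value)

# Hypothetical two-state world with two possible moves in each state.
q_table = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 0.0, "right": 0.0},
}

# The agent moved right from s0, earned a reward of 1, and landed in s1.
q_update(q_table, "s0", "right", reward=1.0, next_state="s1")
print(q_table["s0"])
```

Repeating this update over many moves lets good outcomes propagate backward to the decisions that led to them.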

Dealing with Real-World Data

1
Bias/Variance Tradeoff

Bias and Variance both contribute to overall error; understand these components of error and how they relate to each other.

2
[Activity] K-Fold Cross-Validation to avoid overfitting

We'll introduce the concept of K-Fold Cross-Validation to make train/test even more robust, and apply it to a real model.
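
As a sketch of the bookkeeping involved (k_fold_splits is a hypothetical helper, and real code would shuffle the data first): split the sample indices into k folds, and let each fold take one turn as the test set while the rest are used for training.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) once for each of the k folds."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `remainder` folds get one extra sample each.
        stop = start + fold_size + (1 if fold < remainder else 0)
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test
        start = stop

splits = list(k_fold_splits(10, 3))
for train, test in splits:
    print("train:", train, "test:", test)
```

The model is trained and scored k times, and the k scores are averaged, so every sample is used for testing exactly once.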

3
Data Cleaning and Normalization

Cleaning your raw input data is often the most important, and time-consuming, part of your job as a data scientist!

4
[Activity] Cleaning web log data

In this example, we'll try to find the top-viewed web pages on a web site - and see how much data pollution makes that into a very difficult task!

5
Normalizing numerical data

A brief reminder: some models require input data to be normalized, or scaled to the same range as each other. Always read the documentation on the techniques you are using.

6
[Activity] Detecting outliers

A review of how outliers can affect your results, and how to identify and deal with them in a principled manner.
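
One simple rule, sketched here as an assumption rather than as the lecture's exact method, is to drop anything more than a chosen number of standard deviations away from the median. The incomes below are invented:

```python
import statistics

def reject_outliers(values, num_std=2):
    """Keep values within num_std standard deviations of the median."""
    median = statistics.median(values)
    spread = statistics.pstdev(values)
    return [v for v in values if abs(v - median) <= num_std * spread]

# Hypothetical incomes, with one billionaire skewing the data.
incomes = [27000, 31000, 29000, 35000, 33000, 30000, 1000000000]

filtered = reject_outliers(incomes)
print(statistics.mean(incomes), statistics.mean(filtered))
```

Whatever rule you use, make sure you understand why a point is an outlier before discarding it; sometimes the outliers are the interesting part.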

Apache Spark: Machine Learning on Big Data

1
Warning about Java 11 and Spark 2.4!
2
Spark installation notes for MacOS and Linux users
3
[Activity] Installing Spark - Part 1

We'll present an overview of the steps needed to install Apache Spark on your desktop in standalone mode, and get started by getting a Java Development Kit installed on your system.

4
[Activity] Installing Spark - Part 2

We'll install Spark itself, along with all the associated environment variables and ancillary files and settings needed for it to function properly.

5
Spark Introduction

A high-level overview of Apache Spark, what it is, and how it works.

6
Spark and the Resilient Distributed Dataset (RDD)

We'll go in more depth on the core of Spark - the RDD object, and what you can do with it.

7
Introducing MLLib

A quick overview of MLLib's capabilities, and the new data types it introduces to Spark.

8
[Activity] Decision Trees in Spark

We'll take the same problem from our earlier Decision Tree lecture - predicting hiring decisions for job candidates - but implement it using Spark and MLLib!

9
[Activity] K-Means Clustering in Spark

We'll take the same example of clustering people by age and income from our earlier K-Means lecture - but solve it in Spark!

10
TF / IDF

We'll introduce the concept of TF-IDF (Term Frequency / Inverse Document Frequency) and how it applies to search problems, in preparation for using it with MLLib.
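
A small plain-Python sketch of the calculation, assuming three made-up one-line documents and the common log(N / document frequency) form of IDF (MLLib's exact formula may differ):

```python
import math

# Hypothetical corpus: document id -> list of words.
docs = {
    "doc1": "spark makes big data simple".split(),
    "doc2": "big data needs big clusters".split(),
    "doc3": "python makes data science simple".split(),
}

def tf_idf(term, doc_id):
    """Term frequency in one document times inverse document frequency."""
    words = docs[doc_id]
    tf = words.count(term) / len(words)
    df = sum(1 for other in docs.values() if term in other)
    if df == 0:
        return 0.0  # term appears nowhere in the corpus
    idf = math.log(len(docs) / df)
    return tf * idf

for term in ("spark", "big", "data"):
    print(term, tf_idf(term, "doc1"))
```

A word like "data" that shows up in every document scores zero, while a word unique to one document scores highest, which is what makes TF-IDF useful for ranking search results.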

11
[Activity] Searching Wikipedia with Spark

Let's use TF-IDF, Spark, and MLLib to create a rudimentary search engine for real Wikipedia pages!

12
[Activity] Using the Spark 2.0 DataFrame API for MLLib

Spark 2.0 introduced a new API for MLLib based on DataFrame objects; we'll look at an example of using this to create and use a linear regression model.

Experimental Design

1
A/B Testing Concepts

Running controlled experiments on your website usually involves a technique called the A/B test. We'll learn how they work.

2
T-Tests and P-Values

How to determine the significance of an A/B test's results, and measure the probability of the results being just from random chance, using T-Tests, the T-statistic, and the P-value.

3
[Activity] Hands-on With T-Tests

We'll fabricate A/B test data from several scenarios, and measure the T-statistic and P-Value for each using Python.
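
To show what the T-statistic measures, here is a sketch that computes Welch's version by hand on two tiny invented groups. It stops short of the P-value, which needs the t-distribution; in practice a library routine such as scipy.stats.ttest_ind returns both.

```python
import math
import statistics

def t_statistic(group_a, group_b):
    """Welch's t: difference in means relative to the combined standard error."""
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    standard_error = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return mean_diff / standard_error

# Hypothetical order amounts for the control (A) and treatment (B) groups.
control = [25, 27, 26, 24, 28]
treatment = [30, 32, 31, 29, 33]

t = t_statistic(control, treatment)
print(t)
```

A T-statistic far from zero means the difference between the groups is large compared with the noise within them; the sign tells you which group came out ahead.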

4
Determining How Long to Run an Experiment

Some A/B tests just don't affect customer behavior one way or another. How do you know how long to let an experiment run for before giving up?

5
A/B Test Gotchas

There are many limitations associated with running short-term A/B tests - novelty effects, seasonal effects, and more can lead you to the wrong decisions. We'll discuss the forces that may result in misleading A/B test results so you can watch out for them.

Deep Learning and Neural Networks

1
Deep Learning Pre-Requisites

If you skipped ahead, I'll show you where to get the course materials for just this section. And we'll cover some pre-requisite concepts for understanding how neural networks operate: gradient descent, autodiff, and softmax.
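
Two of those pre-requisites fit in a few lines each. This is only a sketch: the scores, the toy function f(x) = (x - 3)^2, and the learning rate are all arbitrary choices for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = [s - max(scores) for s in scores]  # subtract max for stability
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print(probs)

# Gradient descent on f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x = 0.0
learning_rate = 0.1
for _ in range(100):
    gradient = 2 * (x - 3)
    x -= learning_rate * gradient  # step downhill, against the gradient

print(x)
```

Softmax is what turns a network's final-layer outputs into class probabilities, and gradient descent is the same downhill stepping applied to the network's weights, with autodiff supplying the gradients.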

2
The History of Artificial Neural Networks

We'll cover the evolution of artificial neural networks from 1943 to modern-day architectures, which is a great way to understand how they work.

3
[Activity] Deep Learning in the Tensorflow Playground

Google's Tensorflow Playground lets you experiment with deep neural networks and understand them - without writing a line of code!

4
Deep Learning Details

Let's dive into the details on how modern multi-layer perceptrons are trained and tuned.

5
Introducing Tensorflow

We'll cover Google's open-source Tensorflow Python library, and see how it can help you create and train neural networks.

6
[Activity] Using Tensorflow, Part 1

We'll use Tensorflow to create a neural network that classifies handwritten numerals from the MNIST data set. Part 1 of 2.

7
[Activity] Using Tensorflow, Part 2

We'll use Tensorflow to create a neural network that classifies handwritten numerals from the MNIST data set. Part 2 of 2.

8
[Activity] Introducing Keras

Tensorflow 1.9 offers a higher-level API called Keras, which makes it easier to construct your neural networks. We'll use Keras to solve the same handwriting recognition problem - but with much less code.

9
[Activity] Using Keras to Predict Political Affiliations

As another hands-on example, we'll use Keras to build a neural network that learns how to determine if a politician is Republican or Democrat just based on their votes.

10
Convolutional Neural Networks (CNN's)

CNN's mimic your visual cortex, and can find features in one, two, or three-dimensional data even if you're not sure where exactly that feature is.

11
[Activity] Using CNN's for handwriting recognition

CNN's are better suited to image data, and we'll prove that by using a CNN in Keras on the MNIST data.

12
Recurrent Neural Networks (RNN's)

RNN's can handle sequences of data, like events over time or words in a sentence. Learn what's different about how they work, how they are trained, and ways to optimize them.

13
[Activity] Using an RNN for sentiment analysis

Let's implement an RNN in Keras to determine positive or negative sentiments for real movie reviews from IMDb!

14
The Ethics of Deep Learning

As with any new technology, sometimes we can become overzealous in how we use it. A few cautionary tales to make sure your deep learning work does more good than harm.

15
Learning More about Deep Learning

Some suggested resources for continuing your education on deep learning, artificial intelligence, and neural networks.

Final Project

1
Your final project assignment

It's time to apply what you've learned in this course! You'll be given a real data set of mammogram masses, and training data on whether these masses were determined to be benign or malignant. Apply the different machine learning techniques you've used in this course to find the best one for automating the diagnosis for these masses.

2
Final project review

I'll walk you through my own solution to your final project, so you can compare your results against mine.

4.5 out of 5
14163 Ratings

Detailed Rating

5 stars: 7749
4 stars: 4914
3 stars: 1174
2 stars: 204
1 star: 122
30-Day Money-Back Guarantee

Includes

12 hours on-demand video
5 articles
Full lifetime access
Access on mobile and TV
Certificate of Completion
Demos
Support