Machine Learning: The Future has Arrived

October 8, 2016

I wanted to spend a good bit of time gaining deeper knowledge and more experience with machine learning and data science, but just ran out of time this week.

Why Machine Learning?

It’s the future, right now.

It’s the equivalent of finding an O(log α(n)) solution to an O(n!) problem. (That α is the inverse Ackermann function, a ridiculously slow-growing function.)

Instead of spending 6 months writing an algorithm and thousands of rules to define how to solve a very hard problem, you let machine learning learn the rules from examples. It is, in a way, writing its own algorithm from the data you give it.
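To make "learning the rules from examples" concrete, here is a minimal sketch using only the Python standard library. It is a toy, not a real machine learning method you would ship: the function name `learn_threshold` and the example data are made up for illustration. Instead of hand-picking a cutoff, the code tries every candidate cutoff and keeps the one that makes the fewest mistakes on the labeled examples.

```python
def learn_threshold(examples):
    """Learn a rule of the form 'value > threshold means True'.

    examples is a list of (value, label) pairs, where label is a bool.
    Returns the threshold that misclassifies the fewest examples.
    """
    values = sorted({value for value, _ in examples})
    if len(values) < 2:
        raise ValueError("need at least two distinct values to learn a threshold")

    # Candidate thresholds: midpoints between neighboring distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold):
        # Count examples where the rule disagrees with the given label.
        return sum((value > threshold) != label for value, label in examples)

    return min(candidates, key=errors)


# Hypothetical training data: (measurement, is_positive)
examples = [(1, False), (2, False), (3, False), (7, True), (8, True), (9, True)]

threshold = learn_threshold(examples)


def predict(value):
    """Apply the learned rule to a new, unseen value."""
    return value > threshold
```

Nobody typed the cutoff into the program. It came out of the data, and different examples would produce a different rule.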

Most of the time spent doing data science is actually spent gathering and cleaning data (made easy with a little programming).
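As a rough illustration of what "cleaning" can mean, here is a small sketch using only Python's standard library. The column names, the sample rows, and the `clean_rows` helper are all invented for this example. It normalizes the text, and drops rows whose number is missing or unparseable.

```python
import csv
import io

# Hypothetical messy input: stray spaces, mixed case, missing and junk values.
RAW = """name,weight_g
 apple ,140
orange,
Apple,130
pear,n/a
ORANGE, 170 
"""


def clean_rows(text):
    """Return a list of (name, weight) tuples, skipping unusable rows."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        name = (row["name"] or "").strip().lower()
        weight = (row["weight_g"] or "").strip()
        if not name:
            continue  # no label, nothing to learn from
        try:
            cleaned.append((name, float(weight)))
        except ValueError:
            continue  # weight was empty or not a number
    return cleaned


rows = clean_rows(RAW)
```

Real datasets have far more ways to be wrong, but most cleaning code has this shape: normalize, validate, and skip or repair.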

Google is using machine learning right now, and they see the promise and results already, but only a small percentage of Google engineers have experience with it.

Applications of machine learning at Google:

Intro Course on Machine Learning

There are many online courses on the subject, including a Self-Driving Car Engineer Nanodegree (I want that!), but the best introduction to the math, statistics, and process behind machine learning is Stanford’s Machine Learning course.

Concretely, Andrew Ng rocks.

I took this course many months ago, and I found it fascinating. It has some very serious math:

Sometimes I ask myself what I’ve gotten myself into. #MachineLearning #neuralNetworks pic.twitter.com/EvJ8mMSNSK
— John Washam (@StartupNextDoor) February 26, 2016

Don’t let it scare you.

The math builds up slowly so you can follow along, and the first week includes a review of linear algebra, which I hadn’t seen since high school.

The course uses Matlab and Octave. You get a free license of Matlab to use during the course but I used Octave, which is an open-source alternative with similar syntax. It can also read and write Matlab files.

Next Steps

Matlab and Octave are great, but although Matlab is widely used, it is very expensive and has a language of its own.

The two main languages used in data science and machine learning (other than the Matlab-compatible ones) are Python and R. Python has packages like scikit-learn and NumPy, so you can avoid implementing your own regressors and classifiers. In just a few lines of code, you can build some very cool technology. In addition, TensorFlow, an open-source package for building neural networks, gives you a neural network in just a few lines.
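Here is roughly what "a few lines" looks like with scikit-learn. This is a minimal sketch, assuming scikit-learn is installed. The fruit measurements are invented toy data, and a decision tree is just one of many classifiers the library offers.

```python
from sklearn.tree import DecisionTreeClassifier

# Toy data (made up): [weight in grams, texture] where 1 = smooth, 0 = bumpy.
features = [[140, 1], [130, 1], [150, 0], [170, 0]]
labels = ["apple", "apple", "orange", "orange"]

# Train the classifier on the labeled examples.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(features, labels)

# Ask it about a fruit it has never seen: heavy and bumpy.
prediction = clf.predict([[160, 0]])[0]
```

With four training examples this is only a demonstration, but the same `fit` and `predict` pattern carries over to scikit-learn's other classifiers and to much larger datasets.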

You can get started with machine learning today, without any knowledge of it. Here is a short playlist of tutorials by Josh Gordon to get you started. You’ll see how easy it can be:

More Learning

Books! How I love them.

Python Machine Learning

This book is a best-seller, and very well-reviewed. It will be the first book I tackle when I have the time.

Python Machine Learning by Sebastian Raschka

Data Science from Scratch

Another best-seller, by an ex-Googler, no less.

Data Science from Scratch by Joel Grus

Introduction to Machine Learning with Python

This is a preorder, but looking at the table of contents and skimming some of the content, it looks quite promising.

Introduction to Machine Learning with Python by Andreas C. Müller and Sarah Guido

Resources

Google’s Cloud Machine Learning tools (video)
TensorFlow (video)
TensorFlow Tutorials
Courses:
- Stanford: Machine Learning
  - videos only
  - see videos 12-18 for a review of linear algebra (14 and 15 are duplicates)
- Neural Networks for Machine Learning
- Google’s Deep Learning Nanodegree
- Google/Kaggle Machine Learning Engineer Nanodegree
- Self-Driving Car Engineer Nanodegree
- Metis Online Course ($99 for 2 months)

John Washam

Senior Software Engineer at nowhere right now. My opinions are my own and not those of my employer, at any time. Founder of TalkToTheManager and zKorean. I like drawing. I have several lifetimes’ worth of pencils.