7 books for Machine Learning with R

Jason Brownlee put together a list of 7 machine learning books that make use of R:

In this post I want to point out some resources you can use to get started in R for machine learning.


A Tour of Machine Learning Algorithms

After we understand the type of machine learning problem we are working with, we can think about the type of data to collect and the types of machine learning algorithms we can try. In this post we take a tour of the most popular machine learning algorithms. It is useful to tour the main algorithms to get a general idea of what methods are available.



How to Implement a Machine Learning Algorithm

Jason Brownlee published an excerpt from his “Small Projects Methodology: Learn and Practice Applied Machine Learning”, focusing on the process of implementing machine learning algorithms:

Implementing a machine learning algorithm in code can teach you a lot about the algorithm and how it works.

In this post you will learn how to be effective at implementing machine learning algorithms and how to maximize your learning from these projects.

If you think about it, the process of implementing machine learning algorithms is in many ways similar to how machine learning works.


The Machine learning skills pyramid

Created by Steve Geringer:

ML Skills Pyramid v1.0

Via Daniel Gutierrez.

Vowpal Wabbit - Open source machine learning

Found by Daniel Gutierrez from Inside BigData:

Vowpal Wabbit (aka VW) is an open source fast out-of-core learning system library and program started and led by John Langford who works at Microsoft Research New York. Vowpal Wabbit is notable as an efficient scalable implementation of online machine learning and support for a number of machine learning reductions, importance weighting, and a selection of different loss functions and optimization algorithms.

The project is on GitHub, and there is also a short wiki page and a presentation.
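
To make "online learning" and "importance weighting" a bit more concrete, here is a minimal sketch in plain Python. It is not Vowpal Wabbit's code or API: the function names, the learning rate, and the synthetic stream are all assumptions made for illustration, and the importance weight is applied by naively scaling the gradient of a squared loss. The point is only that each example is seen once and then discarded, so the full data set never has to fit in memory.

```python
import random


def online_sgd(stream, n_features, learning_rate=0.05):
    """Single pass of importance-weighted SGD on squared loss.

    `stream` yields (features, label, importance) tuples one at a time.
    Hypothetical helper for illustration; not how Vowpal Wabbit is implemented.
    """
    weights = [0.0] * n_features
    for features, label, importance in stream:
        prediction = sum(w * x for w, x in zip(weights, features))
        error = prediction - label
        # Gradient of 0.5 * importance * error**2 with respect to each weight.
        for i, x in enumerate(features):
            weights[i] -= learning_rate * importance * error * x
    return weights


def make_stream(n_examples, seed=0):
    """Synthetic, noise-free stream for label = 2 + 3 * x (assumed data)."""
    rng = random.Random(seed)
    for _ in range(n_examples):
        x = rng.uniform(-1.0, 1.0)
        features = [1.0, x]  # constant bias term plus one input
        label = 2.0 + 3.0 * x
        # Arbitrary choice: examples with positive x count twice as much.
        importance = 2.0 if x > 0 else 1.0
        yield features, label, importance


weights = online_sgd(make_stream(5000), n_features=2)
print("learned weights:", weights)
```

Because the generator produces examples lazily, the learner only ever holds the current example and the weight vector, which is the out-of-core property the description above refers to.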


Bill Gates: Four Areas of Technology I’d look into

Bill Gates in a tweet-based interview:

Q: @fesja: @BillGates if you were 20 years old now, what would you do? which area?

A: Bill Gates: When it comes to technology, there are four areas where I think a lot of exciting things will happen in the coming decades: big data, machine learning, genomics, and ubiquitous computing. So if I were 20 years old today, I’d be looking into one (or maybe more!) of those fields.

To say that Bill Gates has always had a great understanding of technology trends would be an understatement.

List of Machine Learning APIs

Below is a compilation of APIs that have benefited from Machine Learning in one way or another. We truly are living in the future, so strap into your rocketship and prepare for blastoff.


Machine Learning Cheatsheets

Created by Andreas Mueller:


Then you can head to this Quora thread to read a bit more about the pros and cons of the different classification algorithms.

Machine Learning: Interesting Problems Are Never Off the Shelf

Aria Haghighi on the present and future of products based on machine learning:

But I think there’s an even bigger barrier beyond ingenious model design and engineering skills. In the case of machine translation and speech recognition, the problem being solved is straightforward to understand and well-specified. Many of the NLP technologies that I think will revolutionize consumer products over the next decade are much vaguer. How, exactly, can we take the excellent research in structured topic models, discourse processing, or sentiment analysis and make a mass-appeal consumer product?


Skytree Launches a Machine Learning Server

Skytree Server connects to any number of existing data stores, including Hadoop, and, says Hack, is tens of thousands of times faster than existing tools, performing in minutes tasks that would have taken hours or days. As of now, it’s tuned to five specific use cases the company says are the most common — recommendation systems, anomaly/outlier identification, predictive analytics, clustering and market segmentation, and similarity search.

Skytree Server Architecture

There’s a limited but free Skytree version available on demand, so I expect to read some more about it soon.


Characteristics of Machine Learning Models

Ricky Ho published yet another great article giving a high-level summary of the algorithms used by different machine learning models:

  • decision trees
  • linear regression methods
  • neural networks
  • bayesian networks
  • support vector machines
  • nearest neighbor

For classification and regression problems, there are different choices of machine learning models, each of which can be viewed as a black box that solves the same problem. However, each model comes from a different algorithmic approach and will perform differently on different data sets. The best way is to use cross-validation to determine which model performs best on test data.
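
The cross-validation advice can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not code from Ricky Ho's article: the data is synthetic (a noisy straight line), the two candidate models are a least-squares line and a 1-nearest-neighbor predictor, and every function name is made up for the example. Both models are scored on the same held-out folds, and the one with the lower average error wins.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split the result into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_linear(xs, ys):
    """Least-squares line through (xs, ys); returns a prediction function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_nearest_neighbor(xs, ys):
    """1-nearest-neighbor regression; returns a prediction function."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def cross_val_mse(fit, xs, ys, k=5):
    """Average mean squared error over k held-out folds."""
    fold_errors = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        model = fit(train_x, train_y)
        mse = sum((model(xs[i]) - ys[i]) ** 2 for i in held_out) / len(held_out)
        fold_errors.append(mse)
    return sum(fold_errors) / len(fold_errors)


# Synthetic data (an assumption for the example): a line plus Gaussian noise.
rng = random.Random(42)
xs = [rng.uniform(0.0, 10.0) for _ in range(200)]
ys = [1.5 * x + 4.0 + rng.gauss(0.0, 1.0) for x in xs]

scores = {
    "linear regression": cross_val_mse(fit_linear, xs, ys),
    "nearest neighbor": cross_val_mse(fit_nearest_neighbor, xs, ys),
}
best = min(scores, key=scores.get)
print(scores, "best:", best)
```

On data that really is linear, the straight-line model should come out ahead because the nearest-neighbor predictor also copies the noise of its neighbor; on differently shaped data the ranking can flip, which is exactly why the comparison is worth running.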


Machine Learning, Hadoop, and Mahout

The Cloudera Data Science team (Josh Wills, Tom Pierce, Jeff Hammerbacher) gave a presentation a couple of days ago on the state of machine learning and Hadoop.

Supervised Learning Workflow