This book is an excellent primer on data science. It builds up concepts from scratch with code examples in Python. Whilst it uses some well-known libraries for utilities, the code that implements the core data science concepts is all included and explained in the book.
I particularly enjoyed the conversational, often humorous style of the book. The author gives a short introduction to NoSQL databases, then concludes: “Tomorrow’s flavour of the day might not even exist now, so I can’t do much more than let you know that NoSQL is a thing. So now you know. It’s a thing”. He doesn’t get too stuck in jargon either – one example is his definition of a greedy algorithm: “… at each step, it chooses the most immediately best option” – perfect.
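To make that definition concrete, here is a minimal sketch of a greedy algorithm. It is my own illustration, not code from the book: the function name `greedy_change` and the coin denominations are assumptions chosen for the example. At each step it simply takes the largest coin that still fits.

```python
def greedy_change(amount, denominations):
    """Make change greedily: always take the largest coin that still fits.

    Illustrative sketch only; not taken from the book.
    """
    coins = []
    # Try coins from largest to smallest.
    for coin in sorted(denominations, reverse=True):
        # Keep taking this coin while it does not overshoot the amount.
        while amount >= coin:
            coins.append(coin)
            amount -= coin
    return coins


print(greedy_change(63, [1, 5, 10, 25]))  # [25, 25, 10, 1, 1, 1]
print(greedy_change(6, [1, 3, 4]))        # [4, 1, 1]
```

The second call shows the catch with "most immediately best": with coins of 1, 3 and 4, greedy uses three coins for 6, whereas 3 + 3 would need only two.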
Some of the main topics covered are:
- Visualizing Data
- Gradient Descent
- Linear Regression
- Logistic Regression
- Neural Networks
Having covered the theory, the book extends to a few use cases – natural language processing, network analysis and collaborative filtering.

I’ve been learning Python for a few months and am starting to use it at work, so I thought it was about time to read a book about Python. This book has been excellent. Not only does it follow the format of the time-honoured ‘Effective’ series pioneered by Scott Meyers, it also features practical, useful code examples. In particular, the author’s JsonMixin class was immediately relevant to some work I’ve been doing to generically serialise to/from JSON documents – see Item 26 “Use Multiple Inheritance Only for Mix-In Utility Classes”.
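To give a flavour of the mix-in idea, here is a minimal sketch of my own. It is not the book's implementation: it assumes the class stores only JSON-friendly attributes and that its constructor accepts them as keyword arguments, and the `Point` class is a hypothetical example.

```python
import json


class JsonMixin:
    """Simplified mix-in adding JSON (de)serialisation to a class.

    Assumption: instance attributes are JSON-serialisable and match the
    constructor's keyword arguments. Not the version from the book.
    """

    @classmethod
    def from_json(cls, data):
        # Parse the JSON document and pass its keys as constructor arguments.
        return cls(**json.loads(data))

    def to_json(self):
        # vars(self) returns the instance's attribute dictionary.
        return json.dumps(vars(self))


class Point(JsonMixin):
    """Hypothetical class used only to demonstrate the mix-in."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point.from_json('{"x": 1, "y": 2}')
print(p.to_json())  # {"x": 1, "y": 2}
```

The appeal is that `Point` gains both methods just by listing the mix-in as a base class, without the mix-in needing to know anything about `Point` itself.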

I recently wrote about