What: Triad SQL Server BI User Group
Where: 4821 Koger Boulevard, Greensboro, North Carolina
When: Tuesday, April 30th, 6 PM
Admission is free. Register on Meetup.
What I’m Presenting
06:30 PM — 08:00 PM — Launching a Data Science Project: Cleaning is Half the Battle
The Triad user groups are always a pleasure to visit. I don’t have a chance to get out that way very often considering how provincial I am, but if you are in the area, stop on by.
This is a review of Jamie Dixon’s book Mastering .NET Machine Learning. For the sake of transparency, Jamie is a friend and if I hated his book I wouldn’t badmouth it in public, so adjust your priors accordingly.
I enjoyed this book. Jamie wrote it before Microsoft started on the ML.NET library, so he focuses on Accord.NET. The book starts with a little bit of C# but quickly moves to F# to cover a number of data science scenarios in .NET, including data cleansing, regression analysis, clustering, cross-validation, and a bit on neural networks and even IoT.
On the plus side, Jamie’s conversational tone is evident throughout. He makes it easy for .NET developers new to F# and the world of data science to get a toehold in the field, and I don’t think any part of the book is overly complex. Jamie also returns to a couple of consistent themes throughout the book, including Adventure Works data and open data sources such as traffic stops.
Speaking of traffic stop information, I appreciate that Jamie includes failed scenarios and not just successes. Failure to find something interesting is at least as common as success, whether that be due to technique, incorrectly specified features, or simply a lack of correlation. But this does lead me to an issue I have with the book: you get invested in some of the data sets, but I don’t think any of them ends with a really good model. For example, Jamie uses a few techniques to try to gain insights from traffic stop information, but I don’t think he ever gets to a strong conclusion.
Overall, would I recommend this book? If you’re a .NET developer who wants to learn a bit about statistical analysis without learning R or Python, this is a good book for you. You still have to learn F#, but I think the learning curve from C# to F# is shallower than the one from C# to R or Python, though only a little. I think the strongest parts of this book are where Jamie integrates data science with practical implementation, like building a .NET website to act as a front end.