Unlike the last couple of years (e.g., 2019), I’m lopping off the “Presentation” portion and focusing more on what I want to learn. Presentations will follow from some of this but there are few guarantees. I’m going to break this up into sections because if I just gave the full list I’d demoralize myself.
The Wide World of Spark
It’s a pretty good time to be working with Apache Spark, and I’m interested in deepening my knowledge considerably this year. Here’s what I’m focusing on in this area:
- Azure Databricks. I’m pretty familiar with Azure Databricks already and have spent quite a bit of time working with the Community Edition, but I want to spend more time diving into the product and gain expertise.
- .NET for Apache Spark, particularly with F#. Getting better with F# is a big part of my 2020 learning goals, so this fits two goals at once.
- SparkR and sparklyr have been on my radar for quite a while, but I’ve yet to get comfortable with them. That changes in 2020.
- Microsoft is putting a lot of money into Big Data Clusters and Azure Synapse Analytics, and I want to be at a point in 2020 where I’m comfortable working with both.
- Finally, Kafka Streams and Spark Streaming round out my list. Kafka Streams isn’t Spark-related, but I want to be proficient with both of these.
I’m eyeing a few Azure-related certifications for 2020. Aside from the elements above (Databricks, Big Data Clusters, and Synapse Analytics), I’ve got a few things I want to get better at:
- Azure Data Factory. I skipped Gen1 because it didn’t seem worth it. Gen2 is a different story and it’s about time I got into ADF for real.
- Azure Functions, especially F#. F# seems like a perfect language for serverless operations.
- Azure Data Lake Gen2. Same deal as with ADF, where Gen1 was blah, but Gen2 looks a lot better. I’ve got the hang of data lakes but really want to dive into theory and practices.
I released my first F# projects in 2019. These ranged from a couple of F#-only projects to combination C#-F# solutions. I learned a huge amount along the way, and 2020 is a good year for me to get even more comfortable with the language.
- Serverless F#. This relates to Azure Functions up above.
- Fable and the SAFE Stack. This is a part of programming where I’ve been pretty weak (that’s the downside of specializing in the data platform side of things), so it’d be nice to build up some rudimentary skills here.
- Become comfortable with .NET Core. I’ve been a .NET Framework kind of guy for a while, and I’m just not quite used to Core. That changes this year.
- Become comfortable with computation expressions and async in F#. I can use them, but I want to use them without having to think about it first.
- Finish Domain Modeling Made Functional and Category Theory for Programmers. I’ve been working through these two books and plan to complete them.
- Get more active in the community. I’ve created one simple pull request for FSharp.Data.SqlClient. I’d really like to add a few more in areas where it makes sense.
Data Science + Pipelines
I have two sets of goals in this category. The first is to become more comfortable with neural networks and boosted decision trees, and get back to where I was in grad school with regression.
The other set of goals is all about data science pipelines. I think you’ll be hearing a lot more about this over the next couple of years, but the gist is using data science-oriented version control systems (like DVC), Docker containers governed by Kubernetes, and pipeline software like MLflow to build repeatable solutions for data science problems. Basically, data science is going through the maturation phase that software development in general went through over the past decade.
This last set of goals pertains to video editing rather than a data platform or programming topic. I want to make 2020 the year of ~~the Linux desktop~~ video production, and that means sitting down and learning the software side of things. I’m including YouTube tutorials and videos as well as improving my use of OBS Studio for TriPASS’s Twitch channel. If I get good enough at it, I might do a bit more with Twitch, but we’ll see.
Looking back at the list, it’s a pretty ambitious set of goals. Still, these are the areas where I think it’s worth spending those crucial off-work hours, and I’d expect to see three or four new presentations come out of all of this.