The problem with writing a book about a platform like Hadoop is that as soon as the book gets published, the material is already outdated. Programming Hive is just two years old and already has that problem. What it has going in its favor is a comprehensive look at how Hive works (for the most part, given how much has changed since its publication). I enjoy how the authors show Hive as a lot more than a simple SQL interface to Hadoop. Being able to create and maintain indexes, partition tables, and introduce Java user-defined functions gives this language a lot of power. Pair it with a Hadoop platform (my favorite is the Hortonworks sandbox, but go with your preference if you have one) and run with it. Unfortunately, a lot of the examples won’t work exactly as written due to changes in the language, but when you run into those, you can either skip them or find out how to do it with a more recent version of Hadoop and Hive.
I have a number of presentations available right now. Most of those presentations have the same basic theme, but I’m generally okay with that: my slides aren’t flashy, but I want them to be informative and snappy over a service like WebEx or GoToMeeting.
There are a couple of changes that I’m going to look into making over the next couple of months. The first one is extremely simple but didn’t click until I read a Brent Ozar post on how presentations should end: every one of my slides needs a simple link to research I’ve done, as well as contact details (e-mail address and Twitter account). I probably will need to go back over my research notes for most of these talks, but that’s good for me: I get a chance to look back at my slides, think about how they read, and fix any problems I see.
The second fix is that I do want to introduce some graphics in meaningful locations. I’m not very good with graphics, and so I tend to focus on words rather than pictures. Mixing in a little of both should help improve my presentations and help people remember the points I’m making.
The third fix I want is to change how code blocks show up. Right now, the box is pretty narrow and doesn’t show up well on most older projectors. I want to expand it out and make it easier to read code snippets.
I think that with these three basic changes, I’ll be able to improve my slide decks considerably and help people retain more information.
I’m fortunate enough to be able to attend Brent Ozar Unlimited’s 2015 SQLPASS FreeCon. I’m definitely looking forward to it, and have a few goals around it.
First, my immediate goal is to expand my network. Being in a room with 50 other people for an entire day with this goal in mind is fantastic.
My secondary goal is to get ideas about what to learn next. Right now, I’m moving in directions away from or orthogonal to SQL Server: Hadoop, F#, BIML, and getting back into web development, for starters. I find each of these valuable, but I’d really like to pick the brains of some very smart people and see what I am missing.
My tertiary goal is to expand my network further. Yeah, this sounds like my immediate goal, but in this case, I mean something slightly different. In the upcoming year, I want to get a lot more active in SQL Saturdays up and down the eastern seaboard, start presenting at more user groups (especially those outside of the Raleigh-Durham area), and try to develop some personal brand cachet. This is another opportunity for me to pick the brains of people who’ve made the leap from smart person to smart person who people have heard of.
There’s a lot I want to do over the next year to help me support these goals, and re-focusing my portion of this blog to more technical discussions is part of that.
I’m going to start a new series in which I discuss books that I’m currently reading. In most cases, these will be books that I haven’t yet finished, so don’t spoil them…
The first book on my list is Tobias Weltner’s Mastering PowerShell. It focuses on version 2 of PowerShell and is a few years old at this point, but most of it remains relevant. It provides a nice introduction to the topic and is a good starting point for people who want to get beyond the basics of PowerShell copy-pasta. I have a few other PowerShell books on my tablet, and I recommend that if you’re going down this road, you get a few more as well: a number of things have happened to PowerShell over the past 5+ years, including improvements with asynchronous code, so you’ll want to treat this more as a foundational work than the sum of all that is PowerShell. With that in mind, it’s an easy read and generally a good book. Given that the price is right, I recommend it.
My topic of choice is a blog series by Kyle Kingsbury in which he sees what happens when you simulate partition failure on various open source data platforms. It’s an extremely interesting and well-written series, and although I cannot do it justice in my five-minute time limit, I’m going to give it my best.
The BIML series is now at a conclusion. Here’s a wrap-up of what we did:
In addition to this six-part series, we also covered a few side issues of interest:
Deleting Data In Batches
Generating An Empty Node
Tracing Foreign Key Dependencies
BIDS Helper Bug With NoMatch In Visual Studio 2013
IF Statements In Task Sections
This post will also live on as a separate page for posterity’s sake.
I’ve historically been a huge fan of Mladen Prajdic’s SSMS Tools Pack. It’s a full-featured tool kit which solves a lot of problems and I highly recommend it to anybody who uses SSMS. I’m not averse to spending money—after all, the other two products on my list are paid tools—but I wanted to see what else was out there.
The tool I landed on is SSMSBoost. SSMSBoost is a great tool with free and paid editions, and the only difference between the two is that you need to re-download the free edition every 120 days.
SSMSBoost has a huge number of features, but let me focus on a few that I absolutely love. First, in the event that SSMS crashes (say, due to running out of memory because you have too many tabs open), this product keeps track of all open tabs and re-opens them the next time Management Studio starts. You can also keep track of your history—important if you know you ran that query a day or two ago but closed the window since—and even re-open the previous tab like you can in a web browser.
The next feature I love is auto-replacements. Lots of tools have them, but this makes it easy.
In the above screenshot, I have a template for what goes inside my CATCH block. I also have a couple of commands for try-catch (trc) and try-catch inside a transaction (trct). That way, when I’m developing new code or bringing old code up to par, a few keystrokes go a long way and I don’t need to have a text file with these formats. Something nice about these templates is that they have some level of automation, using things like current timestamps and SQL Server Management Studio templates.
The third feature I love is scripting out objects. If I’m connected to a database, I just need to move my cursor to a particular database object and hit F2. SSMSBoost will read the metadata for that object and script it out for me as though I had dug through the Object Explorer, found the object, and right-clicked and selected to script as Create. This can be a huge time-saver when you would otherwise need to scroll through thousands of objects and is one more case where you can avoid having to use the mouse while you’re typing.
The final feature I want to talk about is scripting and visualizing. Quite often, people will ask for one-off reports, making Excel another of my frequently used tools. Instead of copying results to Excel manually, I can script my result set as an Excel document and send it their way. Normally I’d do a little bit of document tweaking (formatting tables, adding totals, etc.), but this removes some of that pain. I can also script a result set as a SELECT statement, an INSERT statement, and even an HTML table. Visualizers, meanwhile, take the data in a particular result set’s cell and let you open it as an image, text, or Word document. This is helpful for those mega-strings of text that SSMS isn’t really built to view.
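If you’re curious what scripting a result set as INSERT statements or an HTML table boils down to, here is a rough Python sketch of the idea. To be clear, this is not how SSMSBoost does it; the function names (sql_literal, to_insert_statements, to_html_table) and the dbo.Customer table are made up for illustration, and the literal handling is deliberately simplified.

```python
import html
from datetime import date


def sql_literal(value):
    """Render a Python value as a T-SQL literal (simplified, illustrative only)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        # bit columns: check bool before int, since bool is a subclass of int
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, date):
        # Covers datetime as well, because datetime subclasses date
        return "'" + value.isoformat() + "'"
    # Everything else is treated as a Unicode string; double embedded quotes
    return "N'" + str(value).replace("'", "''") + "'"


def to_insert_statements(table, columns, rows):
    """Return one INSERT statement per row of the result set."""
    column_list = ", ".join("[" + c.replace("]", "]]") + "]" for c in columns)
    statements = []
    for row in rows:
        values = ", ".join(sql_literal(v) for v in row)
        statements.append(
            "INSERT INTO {} ({}) VALUES ({});".format(table, column_list, values)
        )
    return statements


def to_html_table(columns, rows):
    """Return the result set as a bare HTML table, escaping cell contents."""
    header = "".join("<th>" + html.escape(c) + "</th>" for c in columns)
    body = "".join(
        "<tr>"
        + "".join(
            "<td>" + html.escape("" if v is None else str(v)) + "</td>" for v in row
        )
        + "</tr>"
        for row in rows
    )
    return "<table><tr>" + header + "</tr>" + body + "</table>"


if __name__ == "__main__":
    # Hypothetical result set: column names plus rows of values
    cols = ["Id", "Name", "Notes"]
    data = [(1, "O'Brien", None), (2, "Smith", "VIP")]
    for statement in to_insert_statements("dbo.Customer", cols, data):
        print(statement)
    print(to_html_table(cols, data))
```

The fiddly part is the quoting: single quotes get doubled for SQL and angle brackets get escaped for HTML, which is exactly the kind of tedium I’m happy to hand off to a tool.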