What I’m Reading: Programming Hive

The problem with writing a book about a platform like Hadoop is that the material is outdated almost as soon as the book is published. Programming Hive is just two years old and already has that problem. What it has going in its favor is a comprehensive look at how Hive works, or at least how it worked at publication, given how much has changed since. I enjoy how the authors show that Hive is a lot more than a simple SQL interface to Hadoop. Being able to create and maintain indexes, partition tables, and write user-defined functions in Java gives the language a lot of power. Pair the book with a Hadoop platform (my favorite is the Hortonworks sandbox, but go with your preference if you have one) and run with it. Unfortunately, a lot of the examples won’t work exactly as written due to changes in the language. When you run into those, you can either skip them or find out how to do the same thing with a more recent version of Hadoop and Hive.
