I’m taking a short break from my Polybase series to start a series on setting up a Hadoop cluster you can put in a laptop bag. For today’s post, I’m going to walk through the hardware.
My idea for my on-the-go cluster hardware comes from an Allan Hirt blog post. After seeing how powerful the Intel NUC NUC6i7KYK is, I decided to get one. It features a quad-core i7 processor and the ability to slot in 32 GB of RAM and two NVMe drives. It’s also small: barely larger than a DVD case in length and width, and about an inch thick. This thing fits easily in a laptop bag, and the power brick is about the size of a laptop power supply, so it’s not tiny but it’s still portable. Note that if you want this to be an on-the-go device, you’ll need a stable source of power; it doesn’t have a battery or any other built-in way to keep running when it’s unplugged.
The Purchases List
Here’s what I ended up buying. It’s not the same as Allan’s kit, but it does the job for me:
- Intel NUC NUC6i7KYK
- Crucial 32 GB DDR4 2133 MT/s for RAM
- Samsung 850 EVO SSD
The NUC is a bare-bones kit with a case, motherboard, and CPU. You need to buy and install the RAM and a drive yourself, and you can also attach an external video card if you’d like. Of course, you’ll need a monitor, keyboard, and mouse, but I had those lying around.
Preparations
I decided to install Ubuntu 16.04 LTS as my base operating system. It’s a reasonably new Ubuntu release but still stable. Why Ubuntu? Because Docker. Yes, Windows Server 2016 has Docker support, but I’ll stick with Linux because I have appropriate images and enough background with the operating system to set it up properly. If you’re comfortable with some other Linux distribution (CentOS, Red Hat, Arch, whatever), go for it. I’m most comfortable with Ubuntu.
I also had to grab the latest version of the NUC BIOS. You can read the install instructions as well.
Where We’re Going From Here
On Wednesday, I’m going to walk us through setting up Docker and putting together the image. On Friday, I’ll install a 5-node Hadoop cluster on my NUC. I have a couple more posts planned out in the Hadoop series as well, so stay tuned.