TIL: Docker

Last night, I went to a local .NET User Group meetup and got my first taste of Docker.

In my case, I ended up running on Elementary OS rather than Windows, but working through the tutorials was a good experience.  By the end, I had installed Solr and loaded a document for indexing.  Installing nginx was also easy.
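
For reference, here is roughly what that looks like from the command line, assuming the official solr and nginx images on Docker Hub (the container names, host ports, and sample document are just examples):

    # run the official Solr image and expose its default port
    docker run -d --name demo-solr -p 8983:8983 solr
    # create a core and index one of the sample documents that ship with Solr
    docker exec -it demo-solr bin/solr create_core -c demo
    docker exec -it demo-solr bin/post -c demo example/exampledocs/manufacturers.xml
    # nginx is a one-liner as well
    docker run -d --name demo-nginx -p 8080:80 nginx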

If you are interested in installing Docker in a Windows environment, start with Mano Marks’s article on the topic. You can also look at the Docker Tools for Visual Studio.

TIL: Kafka Queueing Strategy

I had the privilege to sit down with a couple Hortonworks solution engineers and discuss a potential Hadoop solution in our environment.  During that time, I learned an interesting strategy for handling data in Kafka.

Our environment uses MSMQ for queueing.  What we do is add items to a queue, and then consumers pop items off of the queue and consume them.  The advantage to this is that you can easily see how many items are currently on the queue and multiple consumer threads can interact, playing nice with the queue.  Queue items last for a certain amount of time—in our case, we typically expire them after one hour.

With Kafka, however, queues are handled a bit differently.  The queue manager does not know or care about which consumers read what data when (making it impossible for a consumer to tell how many items are left on the queue at a certain point in time), and the consumers have no ability to pop items off of the queue.  Instead, queue items fall off after they expire.
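
To see the difference in action, here is a rough sketch using the console tools that ship with Kafka (the topic name is made up).  Reading a message does not remove it:  each consumer tracks its own position, and a second consumer can replay anything still inside the retention window.

    # consumer A reads new messages as they arrive...
    kafka-console-consumer.sh --zookeeper localhost:2181 --topic example-clicks
    # ...while consumer B can independently replay everything that has not yet expired
    kafka-console-consumer.sh --zookeeper localhost:2181 --topic example-clicks --from-beginning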

Our particular scenario has some web servers which need to handle incoming clicks.  Ideally, we want to handle that click and dump it immediately onto a queue, freeing up the web server thread to handle the next request.  Once data gets into the queue, we want it to live until our internal systems have a chance to process that data—if we have a catastrophe and items fall off of this queue, we lose revenue.

The strategy in this case is to take advantage of multiple queues and multiple stages.  I had thought of “a” queue, into which the web server puts click data and out of which the next step pulls clicks for processing.  Instead of that, a better strategy (given what we do and our requirements) is to immediately put the data into a queue and then have consumers pull from the queue, perform some internal “enrichment” processes, and finally put the enriched data back onto a new queue.  That new queue will collect data and an occasional batch job pulls it off to write to HDFS.  This way, you don’t take the hit of streaming rows into HDFS.  As far as maintaining data goes, we’d need to set our TTL to last long enough that we can deal with an internal processing engine catastrophe but not so long that we run out of disk space holding messages.  That’s a fine line trade-off we’ll need to figure out as we go along.
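
In terms of Kafka topics, a sketch of that layout might look like the following.  The topic names, partition counts, and retention values are hypothetical; picking the real retention window is exactly the trade-off mentioned above.

    # raw clicks straight from the web servers; retained long enough to ride out
    # a failure in the enrichment tier (24 hours in this example)
    kafka-topics.sh --create --zookeeper localhost:2181 \
      --topic clicks-raw --partitions 8 --replication-factor 3 \
      --config retention.ms=86400000

    # enriched clicks waiting on the periodic batch job that writes them to HDFS
    kafka-topics.sh --create --zookeeper localhost:2181 \
      --topic clicks-enriched --partitions 8 --replication-factor 3 \
      --config retention.ms=86400000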

Summing things up, Kafka is quite a different product than MSMQ, and a workable architecture is going to look different depending upon which queue product you use.

TIL: Bots

Last night, Jamie Dixon (whose book you should buy) talked about his experience at Build this year.  His main takeaway is that Microsoft is pushing their Bot Framework pretty hard.  Jamie showed how to create a stock lookup bot and deploy it to Slack using F# code and an Azure account.

About a month ago, I went to F8, and my main takeaway from that conference was that Facebook is pushing bots on their Messenger platform pretty hard.

Based on my (extremely) limited knowledge of both, it seems that the Facebook bot platform is a bit easier to deal with, as there’s a user interface for writing messages and tokenizing, but right now, Microsoft’s platform is a bit more customizable and developer-friendly.  What’s particularly interesting about both of these is that the analytics and intelligence engines are both closed-source—neither company will let you see the wizard behind the curtain.

TIL: Auditing, Monitoring, Alerting

I’m giving a presentation on monitoring this Monday.  As part of that, I want to firm up some thoughts on the differences between auditing, monitoring, and alerting.  All three of these are vital for an organization, but they serve entirely different functions and have different requirements.  I’ll hit a bunch of bullet points for each.


Auditing is all about understanding a process and what went on.  Ideally, you would audit every business-relevant action, in order, and be able to “replay” that business action.  Let’s say we have a process which grabs a flat file from an FTP server somewhere, dumps data into a staging table, and then performs ETL and puts rows into transactional tables.  Our auditing process should be able to show what happened, when.  We want to log each activity (grab flat file, insert rows into staging table, process ETL) down to its most granular level.  If we make an external API call for each row as part of the ETL process, we should log the exact call.  If we throw away a row, we should note that.  If we modify attributes, we should note that.

Of course, this is a huge amount of data and depending upon processing requirements and available storage space, you probably have to live with something much less thorough.  So here are some thoughts:

  • Keep as much information around errors as you can, including stack traces, full parameter listings, and calling processes.
  • Build in (whenever possible) the full logging mentioned, but leave it as a debug/trace flag in your app.  You could get creative and have custom tracing—maybe turn on debugging just for one customer.  You might also think about automatically switching that debug mode back off after a certain amount of time.
  • Add logical “process run” keys.  If there are three or four systems which process the same data in a pipeline, it makes sense to track those chunks of data separately from the individual pipeline steps (see the sketch after this list).  In an extreme case, you might want to see how an individual row in a table somewhere got there, with a lineage ID that traces back to specific flat files or specific API calls or specific processes and tells you everything that happened to get to that point.  Again, this is probably more of an ideal than a practical scenario, but dream big…
  • Build an app to read your audit data.  Reading text files is okay, but once you get processes interacting with one another, audit files can get really confusing.
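
As a trivial illustration of the “process run” key idea, suppose each component stamps that key on every line it logs (the file names and key format below are made up).  Stitching one logical run back together across flat files is then a single command:

    # pull every line for one pipeline run out of all three components' logs,
    # then order them by the timestamp at the start of each line
    grep -h 'run_id=20160512-0042' ftp-fetch.log staging-load.log etl-process.log | sort

Even so, this gets unwieldy once several processes interleave, which is why the last bullet recommends a proper app for reading your audit data.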


Monitoring is all about seeing what’s going on in your system “right now.”  You want nice visualizations which give you relevant information about currently-running processes, and I put “right now” in quotation marks because you can be monitoring a process which only updates once every X minutes.

There are a couple of important things to consider with monitoring:

  • Track what’s important.  Don’t track everything “just in case,” but focus on metrics you know are important.  As you investigate problems and find new metrics which can help, add them in, but don’t be afraid to start small.
  • Monitoring should focus on aggregations, streams, and trends.  It’s your 50,000-foot view of your world.  Ideally, your monitoring system will let you drill down to more detail, but at the very least, it should let you see if there’s a problem.
  • Monitors are not directly actionable.  In other words, the purpose of a monitor is to display information so a human can observe, orient, decide, and act.  If you have an automated solution to a problem, you don’t need a monitor; you need an automated process to fix the issue!  You can monitor the automated solution to make sure it’s still running and track how frequently it’s fixing things, of course, but the end consumer of a monitor is a human.
  • Ideally, a monitor will display enough information to weed out cyclical noise.  If you have a process which runs every 60 minutes and which always slams your SAN the top 5 minutes of each hour, maybe graph the last 2 or 3 hours so you can see the cycles.  If you have enough data, you can also build baselines of “normal” behavior and plot those against current behavior to make it easier for people to see if there is a potential issue.
  • Monitors are a “pull” technology.  You, as a consumer, go to the monitor application and look at what’s going on.  The monitor does not jump out and send you messages and popups and try to get your attention.


Alerting is all about sending messages and getting your attention.  This is because an alert is telling you something that you (as a trained operator) need to act upon.  I think alerting is the hardest thing on the list to get right because there are several important considerations here:

  • Alerts need to be actionable.  If I page the guy on call at 3 AM, it’d better be because I need the guy on call to do something.
  • Alerts need to be “complete.”  The alert should provide enough information that a sleep-deprived technician can know exactly what to do.  The alert can provide links to additional documentation, how-to guides, etc.  It can also show the complete error message and even some secondary diagnostic stuff which is (potentially) related.  In other words, the alert definitely needs to be more than an e-mail alert which reads “Error:  object reference not set to an instance of an object.”
  • Alerts need to be accurate.  If you start throwing false positive alerts—alerting when there is no actual underlying problem—people will turn off the alert.  If you have false negatives—not alerting when there is an underlying problem—your technicians are living under a false sense of security.  In the worst case scenario, technicians will turn off (or ignore) the alerts and occasionally remember to check a monitor which lets them know that there was a problem two hours ago.
  • Alerts need human intervention.  If I get an alert saying that something failed and an automated process has kicked in to fix the problem, I don’t need that alert!  If the automated process fails and I need to perform some action, then I should get an alert.  Otherwise, just log the failure, have the automated process run to fix the problem, and let the technicians sleep.  If management needs figures or wants to know what things looked like overnight, create reports and digests of this information and pass it along to them, but don’t bother your technicians.
  • On a related note, alerts need to be for non-automatable issues.  If you can automate a problem away, do so.  Even if it takes a fair amount of time, there’s a lot less risk in a documented, tested, automated process than in waking up some groggy technician.  People at 3 AM make mistakes, even when they have how-to documents and clear processes.  People at all hours of the day make mistakes; we get distracted and miss steps, mis-type something, click the wrong button, follow the wrong process, think we have everything memorized but (whoopsie) forgot a piece.  Computers are less likely to have these problems.

Wrapping Up

Auditing, monitoring, and alerting solve three different sets of problems.  They also have three different sets of requirements for what kind of data to use, how frequently to refresh this data, and how people interact with them.  It’s important to keep these clearly delineated for that reason.

Alongside this, I’m also working on some toy monitoring stuff, so I hope that’ll be tomorrow’s TIL.

TIL: Installing Jupyter And R Support

I recently worked through some difficulties installing Jupyter and incorporating R support, so I wanted to write up a quick post on getting it installed on Linux.

First, install Jupyter through Anaconda.  Notes:

  1. I grabbed the Python 3.5 version.  I don’t intend to write too much Python code here, so it shouldn’t make a huge difference to me.
  2. When you install, do not run sudo.  Just run the bash script.  It will install in your home directory by default, and for a one-off installation on a VM (like my scenario), this is fine.  It also makes future steps easier.

When running Jupyter, I started by following this guide.  Notes:

  1. Starting Jupyter is as easy as running “jupyter notebook” and navigating to http://localhost:8888 in a browser.
  2. Midori does not appear to be a good browser for Jupyter; when I tried to open a new notebook, I got an error message in the console and the browser seemed not to want to open up the notebook.  Firefox worked just fine.  Maybe I’m doing this wrong, though.

As for installing R support, Andrie de Vries has a nice post on the topic.  Notes:

  1. Here’s where not running sudo above pays off.  If you did run sudo, you’ll get an error saying that you can’t install in the home directory and that you should run a command to make a copy of the files…in your home directory.  If you accidentally ran sudo, you can take ownership of all of the files in your anaconda3/ directory using “sudo chown -R [user]:[user] anaconda3/” and correct the issue.
  2. Installation is as simple as running “conda install -c r ipython-notebook r-irkernel”.  Again, note that I’m not running sudo here.
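
Putting those pieces together, the whole installation boiled down to a handful of commands (the Anaconda installer file name will vary depending on which version you download):

    # run the Anaconda installer as a regular user -- no sudo
    bash Anaconda3-4.0.0-Linux-x86_64.sh

    # start the notebook server, then browse to http://localhost:8888
    jupyter notebook

    # add R support; again, no sudo
    conda install -c r ipython-notebook r-irkernel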

Migrating A Virtual Machine To Azure

My Demo Machine

I have a reasonably powerful laptop that I use for demos.  I run my demos in VMware Player because it’s free, because I have experience with VMware, and because virtual machines keep my demonstration environment fairly well controlled.  I don’t upgrade my demo machines that often, and when I do, I’m reasonably careful about it.  This lets me repeat my demonstrations without too much bother, and it means that my futzing about with other work doesn’t affect my ability to give presentations.

My Mistake

Just about two months ago, disaster befell me at SQL Saturday Pittsburgh, when my laptop and the provided projector absolutely would not play nice.  I had a tablet with me, but there’s no way my little tablet could power SQL Server, even if I had it installed.  What that tablet can do, though, is connect to a VM running somewhere else.  I also have an Azure subscription, so I decided that one of my many safety measures would be to migrate my demo VM up to Azure so that I could spin up a VM in the event of future disaster.


There are two approaches that can work:  the Microsoft Virtual Machine Converter and manually uploading VMs.  I’ll walk through each, though only the second panned out for me.

Microsoft Virtual Machine Converter

There are a few good resources on how to get MVMC working.  I started with Carsten Lemm’s blog post on the topic because I wanted to migrate a VMware VM into Azure and I could afford to spend 30 minutes on the task.  Sadly, my experience took well over 30 minutes…

After downloading and installing the MVMC executable, I followed Carsten’s instructions and made sure that my VM obtained DNS and IP addresses automatically rather than hard-coding them.  I also turned on Remote Desktop.  At that point, Carsten’s blog post is a bit out of date, as he references an executable which no longer exists.  That’s okay, though, because the MVMC 3.0 executable now has a nice wizard.  The route is pretty simple:  on the first screen, select the “Virtual machine conversion” radio button and hit Next.  Then, select the “Migrate to Microsoft Azure” option and hit Next again.  The next tab asks you for a Subscription ID and Certificate Thumbprint.

To get the Subscription ID, you can go to the old Azure portal and click the Settings tab on the left-hand navigation bar.  That will give you a GUID which represents your subscription ID.  Copy and paste that into the Subscription ID field and you will find a bug:  this page has an off-by-one error.  If you paste in your subscription ID, you’ll see that the last character is missing and the app does not allow you to type in that last character.  Even if you delete characters, you’re still stuck.  The only way I was able to get past this was to type in my GUID manually.

As for the Certificate Thumbprint, the same applies:  I needed to type it in manually or else the app would cut off part of the thumbprint.  Don’t type in any of the spaces and you’ll be fine.  If you don’t know how to create a certificate, check out the Additional Resources section.

From here, I’m going to cut this short, because the next screen ended up being my downfall:  it wants a vCenter, ESX, or ESXi server name.  I don’t have one of those; I’m just using VMware Player and want to convert my VMDK files to VHDs so Azure can use them.  I realized at this point that MVMC was not going to do the trick.  The only reason I’m including this section is to point out the bug above, just in case anybody gets errors like “The certificate with thumbprint [thumbprint] was not found in the personal certificate store.”  If you know that you’re copy-pasting correctly but it’s still giving you that error, type the thumbprint out and see how that goes.

Manual VM Upload

From here, I decided to cut my losses and start over without MVMC.  The first step is to run sysprep on the VM.  If you don’t do this, the image will fail to provision and you might not be able to use the Azure VM you create.  Sysprep lives in the Windows\System32\Sysprep directory and has a GUI.  In my case, I worked from a copy of my VM folder; the last thing I want is for the Azure migration to mess up the local copy of my demo VM.  Anyhow, run sysprep.exe to begin.

Once sysprep’s GUI appears, select the “Enter System Out-of-Box Experience (OOBE)” drop-down option and check the “Generalize” checkbox, and then select the “Shutdown” drop-down menu item from the Shutdown Options list.  Let sysprep do its thing, and then it’ll shut down your VM.

Once sysprep was done, I needed to find a way to get the VMDK files converted to VHDs.  A blog post turned me on to StarWind Software’s V2V Converter, a free tool which converts virtual hard drive files from one format to another.  Installing this tool let me turn my set of VMDKs into one 45GB VHD.  One note:  at least on my machine, I needed to run the V2V Converter from a command prompt; launching the app directly from the Start menu caused it to appear for a moment and then disappear, as though some error killed the program.  The tool installs by default to “%programfiles(x86)%\StarWind Software\StarWind V2V Image Converter\StarV2V.exe”.  From there, I just needed to get that big image into Azure.

Make sure that you have the Microsoft Azure PowerShell cmdlets.  Then, follow these instructions to connect your Azure subscription to the local machine and upload your VHD using the Add-AzureVhd cmdlet.  Make sure that you have a storage account and a blob container, as that’s where you’re going to store the VHD file from which you’ll make a Virtual Machine image.

Once you have that image uploaded, you can create a new Azure VM from an image.  Select the “MY IMAGES” option and you can pick your demo image.  It’ll take a while for the VM to be provisioned.  Also, don’t forget that you’re going to be charged for that VM as long as it’s running, so if you’re not using it, turn that sucker off.


I wanted a nice and easy GUI that would let me point an app at my VMDK files and have it do all of the preparation, conversion, uploading, image creation, and provisioning.  You aren’t going to get that.  For these types of one-off scenarios, I accept (but am not happy with) the second approach listed.  If I were doing this in an enterprise environment, there’d be a lot more PowerShell.

Additional Resources


  • Manual VM Upload