Become A Better Speaker

That’s a big title.  Expect “(on the margin)” in tiny font after the title.

What Other People Say

Let me start off with some great resources, and then I’ll move to a few things that have helped me in the past.

Troy Hunt

Troy is a master of preparation.  Check out the prep work he did for NDC.  I don’t remember if he linked it in either of these posts, but he will schedule tweets for certain parts of his talk because he knows exactly when he’ll get to that point.  Also check out his speaker anti-patterns and fix them.  Of his anti-patterns, I most hate the Reader.  Please don’t read your slides; your audience can read faster than you can speak, and your audience can read the slides later.  Tell engaging stories, tie things together, and give people something they can remember when they do re-read those slides.  You’ll also enjoy his checklist of the 19 things he needs to do before the talk begins.

What I get from Troy:  Practice, practice, practice.  One thing I want to get better at is recording myself and forcing myself to listen to the playback.  There are verbal tics which you can only find when you listen to yourself.

Brent Ozar

Brent is a top marketer in the SQL Server world.  Here’s some great advice from him on delivering technical presentations.  He’s absolutely right about giving people a link at the end telling them where to get additional materials.  He has additional information on how to handle slides, working with the audience, and getting prepped for the session.  I think his music idea is interesting; at a couple all-day cons we’ve hosted locally, the guy who hosts tends to put on some early morning music before festivities begin.  It lightens up the mood considerably and works a lot better than a bunch of introverts staring quietly at their phones for 40 minutes until someone makes a move to begin.

What I get from Brent:  I’ve cribbed a few things from him.  All of my slide decks have two pieces of information at the end:  “To learn more, go here” and “And for help, contact me.”  The first bit points to a short-form URL I bought, which redirects you to the longer-form address.  On that link, I include slides, a link to my GitHub repo, and additional links and helpful information.  The second bit on contacting includes my e-mail address and Twitter handle.

Paul Randal

Paul has a great Pluralsight course on communication.  This course isn’t just about presentations, but there are two modules, one on writing presentations and one on giving presentations.  I highly recommend this course.

What I get from Paul:  Practice, prepare, and don’t panic.  Compare against my previous review of this course.  Things will inevitably go wrong, so take spare cables and extras of anything you can fit.  At this point, I even carry a mini projector in case of emergency.  I’ve not needed to use it but there might come a day.

Julia Evans

Julia has a fantastic blog post on improving talks and conferences.  I definitely like her point about understanding your target audience.  Her argument in favor of lightning talks is interesting, and I think for beginners, it’s a good idea.  For more experienced speakers, however, 10 minutes is barely an introduction, and sometimes I want a deep dive.  Those have to be longer talks just by their nature.

Another great point she makes is to give hard talks:  aim for beginners to grasp part of it and for experts to learn from it as well.  I absolutely love Kyle Kingsbury’s work too; he helped me get a handle on distributed systems in a way that let me re-read his posts several months later and pick out points I’d never caught before.

What I get from Julia:  Find your motivation and make your talks work better for a broader range of people.  I have one talk in particular, on the APPLY operator, whose goal is to make sure that pretty much anybody, regardless of how long they’ve been in the field, learns something new.  There are a couple of examples which are easier for new users to understand and a couple which are definitely more advanced but still straightforward enough that a new user can get there (even if it does take a little longer).  Ideally, I’d like all of my talks to be that way.

What I Recommend

Here are a few recommendations that I’ll throw out at you.  I’m going to try not to include too much overlap with the above links, as I really want you to read those posts and watch those videos.

  • Practice!  Practice until you know the first five minutes cold.  There are some presenters who will practice a talk multiple times a day for several days in a row.  I’m not one of those people, but if you’re capable of it, go for it.
  • Record yourself.  Find all of those placeholder words and beat them out of yourself.  I don’t mean just “Uh,” “Er,” “Um,” “Ah,” and so on.  In my case, I have a bad habit of starting sentences with “So…”  I’m working on eliminating that habit.  Recordings keep you honest.
  • Tell interesting stories.  To crib from one of my 18th century main men, Edmund Burke, “Example is the best school of mankind, and they will learn at no other.”  Theory ties your work together, and stories drive the audience.  Stories about failure and recovery from failure are particularly interesting; that’s one of the core tenets of drama.
  • Prep and practice your demos.  If you’re modifying anything (databases, settings, etc.) over the course of your demo, have a revert script at the end or revert your VM.  That way, you won’t forget about it at the end of a great talk, give the talk again later (after you’ve forgotten that you never rolled everything back), and have your demos fail because you forgot.  Not that this has happened to me…
  • Speaking of failure, prepare for failure.
    • Have extra cables.  I have all kinds of adapters for different types of projectors.  I have VGA, DVI (though I’ve only seen one or two projectors which required this), and HDMI adapters for both of my laptops in my bag at all times.
    • Prepare to be offline.  If your talk can be done offline, you should do it that way.  Internet connections at conferences are pretty crappy, and a lot of demo failures can be chalked up to flaky networks.  This means having your slides available locally, having your demo scripts available locally, etc.
    • Have your slides and demo scripts on a spare drive, USB stick, or something.  If all else fails, maybe you can borrow somebody else’s laptop for the talk.  I had to do this once.  It was embarrassing, but I got through it and actually got good scores.  The trick is to adapt, improvise, and overcome.  And you do that with preparation and practice.
    • Have your slides and demo scripts available online.  I know I mentioned assuming that your internet connection will flake out, but if your system flakes out and someone lends you a laptop but can’t accept USB sticks (maybe it’s a company laptop), at least you can grab the slides and code online.
    • If you do need an internet connection, have a MiFi or phone you can tether to your laptop, just in case.  If you have two or three redundant internet sources, the chances of them all failing are much lower than any single one failing.
    • Have a spare laptop if you can.  That’s hard and potentially expensive, but sometimes a computer just goes bye-bye an hour before your presentation.
  • Install updates on your VMs regularly.  Do it two nights before your presentation; that way, if an update nukes your system, you have a day to recover.  Also, it reduces the risk that Windows 10 will pick your presentation as the perfect time to install 700 updates.  Very helpful, that Windows 10 is.
  • When in doubt, draw it out.  I have embraced touchscreens on laptops, bought a nice stylus, and love drawing on the screen.  I think it helps the audience understand where you’re going better than using a laser pointer, and sometimes you don’t have a whiteboard.  If you don’t like touchscreens, ZoomIt still works with a mouse.
  • Speaking of which, learn how to use ZoomIt or some other magnification tool.  Even if you set your fonts bigger (which yes, you need to do), you will want to focus in on certain parts of text or deal with apps like SQL Server Management Studio which have fixed-size sections.

There are tomes of useful information on this topic, so a single blog post won’t have all of the answers, but hopefully this is a start.

Spark, R, And Zeppelin

We are going to combine three of my interests today:  Spark, data analysis, and notebooks.  Unlike last week’s discussion of notebooks, I’m going to use Apache Zeppelin today.  In today’s post, we will load some restaurant data into Spark and then build a notebook to visualize some of that data.

Getting Started

If you have not done so, follow through my Getting Started with Spark guide.  This will get restaurant data loaded into Spark.  I’m going to assume that you’ve created a cluster, loaded the restaurant table (but make sure you have the version with no NAs!), and are ready to create a notebook.

Create A Notebook

Our first step will be to create an R notebook.



My goal is to do some of the things that I did in my Touching on Advanced Topics post.  Originally, I wanted to replicate that analysis in its entirety using Zeppelin, but this proved to be pretty difficult, for reasons that I mention below.  As a result, I was only able to do some—but not all—of the anticipated work.  I think a more seasoned R / SparkR practitioner could do what I wanted, but that’s not me, at least not today.

With that in mind, let’s start messing around.


The Struggles

I think it might be more useful to start with the various roadblocks I hit while working on this blog post.  SparkR is a different paradigm, and for someone with a tenuous hold on R, it proved to be a lot harder than I wanted.  Here are a few of the things I had trouble with, in the hopes that the solutions will save you a bit of time in the future.

Overwritten Functions

If you install all of the packages from the Jupyter notebook, you’ll find that dplyr has a sql function which masks SparkR’s sql function and leaves you in a bad place if you don’t know what’s going on.  If you get stuck in this bad place, you can qualify the function call with its namespace:  for Spark, the sql function is SparkR::sql.
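To make the disambiguation concrete, here is a minimal sketch.  It assumes both packages are attached and uses the sqlContext variable that appears in the code later in this post:

```r
library(SparkR)
library(dplyr)  # attaching dplyr masks SparkR's sql function

# Calling sql() unqualified would now resolve to dplyr's version; qualifying
# the call with the package namespace guarantees Spark runs the query.
restaurantsDF <- SparkR::sql(sqlContext, "SELECT * FROM restaurants")
```

The same namespace trick works for any other masked function; R’s conflicts() function will list them for you.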

Unexpected Results

One of the first struggles I had was that certain operations, such as mean, return data frames rather than vectors.  I was able to use Wendy Yu’s great SparkR tutorial to figure out a way to get mean latitude and longitude from my data set.

Lack of Results

Another major problem I had was that Java is much stricter about data types than R, meaning that when I tried to perform some operation against my restaurants table (in which score is defined as a double), Java would fail with an exception saying that NA is not a valid numeric value.  No matter how I wrote my SQL statement, I was unable to ignore that NA data.  I even switched over to Python, thinking it a SparkR issue, but it turns out to be part of Spark.  I’m sure there’s some flag somewhere that I’m not setting which allows me to ignore those bad values, but I haven’t found it yet.

Data Frames Aren’t DataFrames

Another issue I ran into was when I started using ggplot2.  When I tried to hook up my Spark data frame (which is of type DataFrame), ggplot2 spat back the following error:

Error : ggplot2 doesn't know how to deal with data of class DataFrame

Fortunately, Jose Dianes has an article which discusses this exact problem.  You have to collect the distributed DataFrame before ggplot2 can understand what it is.

The Code

Okay, we’ve spent enough time complaining about my failings; now let’s actually show what we can do.  Unlike the Advanced Topics notebook, I’m going to keep this fairly compact and not meander quite as much as I did there (for pedagogical purposes, I promise).

Step one is to load the restaurant data:

restaurantsDF <- sql(sqlContext, "SELECT COUNT(1) AS numberOfResults, AVG(score) AS meanscore, ROUND(lat, 1) AS lat, ROUND(long, 1) AS long FROM restaurants GROUP BY ROUND(lat, 1), ROUND(long, 1)")

Note that in this version, I’m going to specify a narrowed-down query with the relevant attributes and filters I want.  I’ve found this to be preferable to the classic R approach of loading all the data with read.csv and importing dplyr and other tools to filter the data after the fact, particularly because of differences in data frame operations, as we’ll see below.

Step two is to install relevant packages:

install.packages("ggplot2", repos = "")
install.packages("ggmap", repos = "")
print('Finished loading libraries.')

I only need ggplot2 and ggmap because I did my transformations earlier.

Step three is where I calculate latitude and longitude.  My latitude and longitude are technically off slightly because I’m taking the mean of pre-aggregated lat-long combos rather than the mean of the population, but the difference is so slight that it doesn’t make an appreciable difference. <- collect(agg(restaurantsDF, mean=mean(restaurantsDF$lat)))$mean
r.long <- collect(agg(restaurantsDF, mean=mean(restaurantsDF$long)))$mean

With those calculated and collected, I’m able to pass them in as parameters to ggmap’s get_map function, so we can build a Google map.

#Set wakemap to work around the mean latitude and longitude values we calculated above.  We've set the zoom value to 10 so you can see the entire region.
wakemap <- get_map(location = c(lon = r.long, lat =, zoom = 10, maptype = "roadmap", scale = 2)

Once I have those, I need to convert my Spark DataFrame class object to a local R data frame.  I do that by calling the collect method.  Because I only need three of the four attributes, I’ll show off the select function as well.

restgrp <- collect(select(restaurantsDF, "meanscore", "lat", "long"))

This data frame is something ggplot2 can use.  So let’s run ggmap and fill in results:

# Now we can fill in ggmap with settings.  We want three function calls:
# scale_fill_gradient (to give us a visual cue of restaurants.  Pick good colors for low & high.)
# geom_text (to display the meanscore, giving us precise values.  Round meanscore to 1 spot after decimal)
# geom_tile (to display blocks of color.  Set alpha = 1)
ggmap(wakemap) +
  scale_fill_gradient(low="black",high="orange") +
  geom_text(data = restgrp, aes(x=long, y=lat, fill = meanscore, label = round(meanscore, 1))) +
  geom_tile(data = restgrp, aes(x=long, y=lat, alpha=1, fill=meanscore)) 

The end result looks a bit like this:


If you want to get this as a notebook, I’ve made the R script available here.  You can publish Databricks notebooks, but they’re only public for a period of six months and I don’t want to have a broken link in a few months’ time.


Why Use Zeppelin?

Throughout all of this, I didn’t even mention one of the big advantages to using Zeppelin:  the built-in visualization components.  Zeppelin has made it easier for third parties to display graphics, and so visualizing a data set is as simple as running the display command.  For example, here’s a nice-looking histogram of mean restaurant scores by lat-long pair:


That’s a one-line operation, and you can change the plot options to perform other visualizations as well.  This type of visualization is production-worthy, so I give the Databricks team plaudits for making this available to us in such an easy-to-use format.

Final Thoughts

To be honest, I wanted this to be a blog post in which I described how easy the conversion from Jupyter and local R to Zeppelin and SparkR would be.  In reality, there’s a gap between the two.  There are some good articles to help you bridge that gap—I mentioned Wendy Yu’s and Jose Dianes’s—but this certainly is not an exercise in copy-paste; there’s a lot more going on here.  Nonetheless, I recommend getting a handle on SparkR, as the ability to perform data analysis on distributed compute is powerful, especially if you can’t get your boss to approve that desktop with 64 GB of RAM…

R: Touching On Advanced Topics

Wrapping up this mini-series on R, I’m going to link to a notebook which touches on some more advanced topics, especially from the standpoint of a one-hour presentation starting from scratch.  In this last post, I plan to gloss over the details and give you the notebook links and let you see what’s going on.

Notebook Details

This notebook started as a lab assignment for the Polyglot .NET group in the Raleigh-Durham area.  Nobody got all the way through the lab, but people did get far enough that I’d call the results a success.

Basically, there are two versions of this notebook:  the lab version (which I gave to attendees) and the completed version.  If you feel like digging through the lab for a couple hours, you might get more out of it than simply looking at the completed version.

Basic Ideas

The Data Set

The original data set came from the town of Cary, North Carolina’s open data portal.  This data set includes restaurant inspections over the course of approximately one decade.  It includes some very interesting details, but is missing one huge part:  it has address information, but no latitude and longitude information.

My original plan to take address data and get latitude and longitude was to hit the Google Maps API, which offers 1000 free requests per day.  1000 hits per day is a huge number most days, but when you have over 13,000 requests, it’s not that great an option.  As a result, I ended up using the street2coordinates function inside RDSTK.  This is a bit slower than Google Maps and doesn’t have the nice geocoding methods that ggmap has for Google Maps, but I was able to let the process run overnight and get latitude and longitude details.  From there, I saved the results as a text file and made it available for general use.
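For illustration, a single call to that function looks something like the sketch below.  The address is made up, and the exact column names may differ slightly depending on your RDSTK version:

```r
library(RDSTK)

# Geocode one address through the Data Science Toolkit.  The real job
# looped over 13,000+ addresses overnight and wrote the results to disk.
result <- street2coordinates("316 N Academy St, Cary, NC")
result$latitude
result$longitude
```

Wrap that call in a loop (or an sapply) over your address column and you have the overnight batch job described above.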

The Process

This notebook takes you through my version of data analysis, including some false starts thrown in to help you understand.  I like examples that have built-in failure points because we see them in real life far more often than successes, and so I want to show how I react after hitting a brick wall and the ways I got around the issue at hand; that, in turn, helps you later on when you hit a similar wall.

Cool Things We Do

This notebook is full of things that I consider really nice about R:

  • Using ggmap to plot data on real Google Maps images, giving users immense context
  • Cleansing data using dplyr, including filtering results, grouping results, and creating new summary variables that we can plot on a map
  • Making area plots with mean restaurant scores for the whole of Wake County as well as the town of Cary


This post wraps up my R mini-series.  It barely scratches the surface of what you can do with R, but I’m hoping it inspires SQL Server developers to get in on the data analysis game.

R: Database Access

In today’s post, I’m going to use RODBC to connect to SQL Server and perform a few operations.


There are a couple of steps we’re going to have to do in order to get RODBC to work with SQL Server.

Grab The Notebook

The first step is to grab the Basic SQL Server Connection notebook from my GitHub repo.  We’ll follow along with this notebook throughout the blog post.

Grab The Database Backup

If you want to follow along, I have a backup of PASSData which you can download from the GitHub repo.  It’s about 20 MB and does require SQL Server 2016.  If you don’t want to download this file or don’t have SQL Server 2016 installed, you can still get most of the benefit from this notebook by using your own DSN and writing your own queries.

Set Up Your DSN

If You’re On Linux

First, if you’re running on Linux, you’ve got a bit of work to do.  For this post, I’m going to assume that you’re running Windows, but if you are on Linux, follow the instructions above and you should be able to follow along.  You’ll probably have to change the connection to use SQL authentication instead of Windows authentication, but that’s a minor one-line change in the conn <- odbcConnect("PASSData") block.

If You’re On Windows

Before you begin connecting to a SQL Server instance, you need to set up an ODBC connection. To do this in Windows, go to the ODBC Data Sources application and create a SYSTEM-level data source.

Once you have that data source loaded, you can use RODBC to connect to SQL Server using that DSN. In this case, I created a data source named “PASSData” which connects to a local SQL Server instance’s PASSData database.

Learning What RODBC Can Do

First, we need to get RODBC from CRAN.  RODBC is a package which allows you to connect to ODBC-compliant databases using R.  This package allows you to connect to a wide variety of database products, which has its positives and negatives.  The big positive is that you don’t need to use SQL Server on the back end:  you can set up an Oracle, MySQL, Postgres, or a large number of other products.  The big negative is that this means less effort can be put into vendor-specific extensions.  RODBC isn’t quite a least-common-denominator product, but it won’t be optimized for any particular vendor’s use cases.

We install RODBC with the following commands:

install.packages("RODBC", repos = "")

If you’d like more details on RODBC, I recommend checking out the documentation.

RShowDoc("RODBC", package="RODBC")

RODBC provides the odbcConnect function, which lets you connect to a database using a DSN or a connection string.  We’re going to use the DSN we created above.
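A minimal sketch of that connection, assuming the PASSData DSN set up earlier:

```r
library(RODBC)

# Connect through the system DSN; Windows authentication applies by default.
conn <- odbcConnect("PASSData")
```

If you skipped the DSN and prefer a connection string, odbcDriverConnect accepts one instead, but the DSN route keeps credentials out of your notebook.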

Data Retrieval

We have a couple methods for getting data.  First, we can get metadata using built-in methods.  Here’s how to get a list of all tables:

sqlTables(conn, schema = "dbo")

Once you have that, you can easily grab all of the rows in a table using the sqlFetch command:

registrations <- sqlFetch(conn, "dbo.SQLSatRegistrations")

If you want a more fine-grained query or want to write SQL to perform some of the early data cleansing, write a sqlQuery:

sessions <- sqlQuery(conn, paste("SELECT TOP(50) EventName, EventDate, SessionName",
                                 "FROM dbo.SQLSatSessions",
                                 "WHERE SatNum = 217",
                                 "ORDER BY SessionName DESC"))

Other Operations

You can also insert and update data, create tables, and manage tables using RODBC, but for the most part, aside from inserting cleansed data, I’d personally shy away from using a data analysis tool for database management.

At any rate, once you’re done with the connection, close it using the close command:
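Assuming the conn object from earlier, that looks like:

```r
# Close the ODBC connection once you're finished querying.
close(conn)
```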



Given the continued immense popularity of relational databases, it makes sense to know how to connect to a relational database using R.  RODBC is just one package which is up to the task—RJDBC and RSQLServer are two others.  If you find yourself having performance problems with RODBC on SQL Server, try the others.

R: The Basics Of Notebooks

This blog post is in anticipation of SQL Saturday Columbus, in which I’m going to give a talk introducing R to SQL Server developers.  My primary vehicle for explaining R will be notebooks.

Notebook Basics

I’ll start with two big questions:  what are notebooks and why should we use them?

Notebooks Are…

Remember chemistry class in high school or college?  You might remember having to keep a lab notebook for your experiments.  The purpose of this notebook was two-fold:  first, so you could remember what you did and why you did each step; second, so others could repeat what you did.  A well-done lab notebook has all you need to replicate an experiment, and independent replication is a huge part of what makes hard sciences “hard.”

Take that concept and apply it to statistical analysis of data, and you get the type of notebook I’m talking about here.  You start with a data set, perform cleansing activities, potentially prune elements (e.g., getting rid of rows with missing values), calculate descriptive statistics, and apply models to the data set.

Why Should We Use Them?

There are two separate reasons why we should use notebooks.  The first ties back to the reason you used them in chem lab:  the potential for independent verification.  If I want to publish a study—even if it’s just an internal analysis of company metrics—I want another analyst to be able to follow what I did and ensure that I made correct choices.

Aside from independent validation, there is a second big reason to use notebooks:  cross-language compatibility.  If your organization only uses R to perform data analysis, a notebook might not be necessary; you can get by with well-documented and thorough R scripts.  But as soon as somebody wants to write Python or Julia code, the classic technique starts to get tricky.  Notebooks help you organize analyses and write code across a number of languages.

In addition to multi-lingual capabilities, notebooks make documentation easier.  The major notebooks support Markdown, an easy way of formatting text.  This allows you to create pretty and thorough documentation.  I recently went through an EdX course on Spark, and their labs were amazing.


It’s hard to do justice to their labs in a single screenshot because of how thorough they were, but you can get the gist of what Markdown can do here:  separate sections, headings, images, code snippets, and a large number of other features without having to write a mess of HTML.

Which Notebook To Use?

There are two major notebooks out there today:  Zeppelin and Jupyter.  I’ve used both and like both.  Hoc Q. Phan has a good comparison of the two, and I’d say it’s fair:  they’re both pretty close in terms of quality, and your particular use case will probably determine which you choose.

For the purpose of my talk, I’m going to use Jupyter, as that’s something my attendees can install pretty easily.  I also have blog posts on installing it for Windows and Linux.

Working With Jupyter

Running Jupyter is easy:  open up a command prompt and type in “jupyter notebook” after you’ve installed Jupyter.  That dumps me in my home directory on Windows, so I’m going with that and storing my notebooks in Documents\Notebooks.


If you want to create a new notebook, it’s as easy as hitting the New button:


Running An Existing Notebook

In my case, I’m going to open the Basic R Test workbook, which you can download for yourself if you’d like.  Clicking on a notebook brings it up in a new tab:


My notebooks are nowhere near as beautiful as the Spark course notebooks, but they do get the point across at least.  If you want to modify existing Markdown, double-click on the section of text you’d like to edit:


This changes the pretty text into a Markdown editor block.  When you’re done making changes, press Shift+Enter to run the block.  Running Markdown blocks simply makes them display prettified HTML and moves the focus to the next block.  The next block is a code block.


The code here is extremely simple:  we’re going to create a vector with one element (an integer whose value is 500) and assign it to y.  Then, we’re going to print the contents of y.  To run the block, again, press Shift+Enter or you can hit the Play button instead.
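The cell contents look something like this:

```r
# Assign a one-element vector to y, then print its contents.
y <- 500
print(y)  # [1] 500
```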


The results of the code block appear on-screen.  One of the nice things about notebooks is that you can include any R outputs as part of the output set.  To show you what I mean, I’m going to select the Run All option from the Cell menu.


This will run all of the cells again, starting from the top.  I could select “Run All Below” and skip re-assigning y, but in this case, it doesn’t hurt anything.  After running everything, I can scroll down and see this section of results:


There are two separate commands here.  First, I’m running a head command to grab the first 3 elements from the set.  Second, I’m running a plot command to plot two variables.  In neither of these cases do I spend any time on formatting; the notebook does all of this for me.  That’s a major time savings.

Inserting New Cells

Inserting new cells is easy.  If you want to insert a new cell at the bottom of a notebook, start typing in the cell at the bottom of the page.


You can define what type of cell it is with the cell type menu:


In this case, I’ve selected Markdown, so this will be a commentary cell.  I fill in details and when I’m done, hit Shift+Enter to run the cell.  But let’s say that you want to insert something in between two already-existing cells.  That’s easy too:  in the Insert menu, you have two options:  Insert Cell Above or Insert Cell Below.


Notebooks are powerful tools for data analytics.  I wouldn’t use a notebook as my primary development environment, though—for R, I’d write the code in RStudio and once I have my process complete, then I’d transfer it to a notebook, add nice Markdown, and make it generally understandable.

The rest of this week’s R posts will use notebooks to walk through concepts, as I think that this is easier to understand than giving somebody a big script and embedding comments (or worse, not even including comments).

Pluralsight Reviews: Scala

I recently had the chance to go through Justin Pihony’s Pluralsight course on Scala.  If you want to write Spark code using Scala, this is a good starter course.

The entire premise of the course is to write a relatively simple file searcher, but Justin takes you through the entire process, including getting build tools working, building (and automatically running) unit tests, and developing code to meet evolving requirements.

I particularly appreciated his introduction of FlatSpec and writing tests like the following:

"Matcher that is passed a file matching the filter" should
  "return a list with that file name" in{
    val matcher = new Matcher("fake", "fakePath")
    val results = matcher.execute()
    assert(results == List(("fakePath", None)))

I might never have learned about FlatSpec if it weren’t for this course, so I appreciate that.

DevTeach Review

I had the pleasure last week of attending and speaking at DevTeach.  This year’s DevTeach was in downtown Montreal, and here are a few scattered, unstructured thoughts on the event:

  • On day 1, Itzik Ben-Gan gave a talk on advanced SQL Server techniques, and I found one technique so interesting that I had to blog about it.  About 70% of his presentation was typing, and that takes chutzpah.
  • One of the overarching themes of this year’s DevTeach is DevOps.  Dylan Smith gave a great talk on the concept, as did Julien Stroheker.  There were a few other sessions scattered through the day which hit on the topic as well, and on Tuesday night, the DotNetRocks guys did a panel discussion on DevOps, what it means, and what the future holds.
  • I was impressed at how interested many of the attendees were in security.  Security and developers don’t typically go together very well, but there were several security-focused talks (including one of mine).  I think part of this goes back to the DevOps theme:  as developers get more intertwined into administrative affairs, they (hopefully) become more aware of the security ramifications of their decisions.  But maybe that’s the optimism talking…
  • Joel Hebert gave a couple of great enterprisey talks, one on logging and the other on transient fault handling.  I loved the logging talk, as it hit many really important points.  For example, Joel made mention of logging anything and everything; specifically, log incoming payloads, outgoing payloads, and any mappings (transformations) which occur during the process.  This results in a lot of writes, but gives you enough information to debug without running a debugger or trying to step through code.  If you log appropriately, you might even get the ability to pick up where you left off after you fix the underlying issue, which could save a huge amount of time when dealing with large processes.  I’ve been hitting similar topics at work lately, so this talk resonated with me; I ended up taking about a page of notes.  And right after that talk, he gave another one on transient fault handling, looking at retry logic.  His main theme was that you have to be smart about retries:  it doesn’t make sense to retry a persistent error, but it also doesn’t make sense to give up when there’s a blip, so catch specific exceptions (like timeouts), try again with a custom backoff strategy (e.g., 1 second, then 3 seconds, then 30 seconds, then 90 seconds), and implement the circuit breaker pattern.
  • I’ll end on a lighter note.  Because DevTeach is in Montreal, some of the sessions were in French and a lot of the speakers are native French speakers.  I can pick out a word here and there, but make no pretense of understanding French.  On days 2 and 3, both morning speakers started out in French even though their abstracts were in English and DevTeach had a specific code to show which sessions were to be in French.  Fortunately, they both switched to English very early on, so I was able to follow along and learn a bit on a couple of interesting topics.

I’d love to hit DevTeach again next year, so I’ll have to keep an eye out for next year’s CFP.