I ran into a really frustrating error when using Key Vault in Spark. In this post, I'll cover the issue, what causes it, and how you can correct it. The Issue: A Misleading Timeout. I want to use Azure Synapse Analytics with data exfiltration protection enabled. One of the main consequences of data exfiltration protection…
Data Exfiltration Protection and Pip
Here is a warning to anybody using data exfiltration protection in Azure Synapse Analytics. Hopefully it saves you some valuable time in troubleshooting. The Setup: I have an Azure Synapse Analytics workspace which uses a managed virtual network and includes data exfiltration protection. I also have a Spark pool. My goal is to import a…
Azure Synapse Analytics with Managed VNet + DEP
The Azure Synapse Analytics Managed Virtual Network: If you've ever created an Azure Synapse Analytics workspace, you may recall seeing an entry about managed virtual networks. I'm pretty sure virtual networks are what we see in sci-fi movies from the '90s. The managed virtual network option means that any server resources within Azure Synapse Analytics…
Book Review: Learn Azure in a Month of Lunches (2nd Edition)
This is a review of Iain Foulds's Learn Azure in a Month of Lunches, Second Edition (Manning, 2020). To set up my review, you can't learn Azure in a month of lunches. You can't learn Azure in a year of lunches. And cloud services change so quickly that by the time a book comes out,…
Generating Sometimes-Anomalous Data
In yesterday's post, I needed to set the stage by showing how to create data following a normal distribution, or at least close enough to normal. I'll first show off the version in C# that I built for a Microsoft Cloud Workshop, and then follow up with a version in F# which shows off a…
Estimating a Random Distribution in .NET
Today's a fairly short post, all about building a random distribution in .NET, both C# and F#. There are a couple of ways to do it, one which involves taking a reference and one which does not. Taking a Reference: the Math.NET Approach. The Math.NET library is an excellent library for anybody doing serious numeric…
Book Review: Principles of Gestalt Psychology
This is a review of Kurt Koffka's Principles of Gestalt Psychology, written in 1935 and re-published a bunch of times, including my 2014 edition. Koffka was one of the three big players in the Berlin school of Gestalt psychology. In case you're not familiar with it, the real brief run-down is that the Gestalt school…
Interesting Resources for Chapter 3
For each chapter in Finding Ghosts in Your Data, I’ll include a few resources that I found interesting. This isn’t a bibliography, strictly speaking, as I might not use all of these in the course of writing, but they were at least worth noting. Articles: Chapter three of the book is all about the process of…
Indexes, Distributions, and Partitions in Dedicated SQL Pools
Not too long ago, I ended up taking the DP-203 certification exam for sundry reasons. On that exam, they ask a lot about Azure Synapse Analytics, including indexing, distribution, and partitioning strategies. Because these can be a bit different from on-premises SQL Server, I wanted to cover what options are available and when you might…
An Extended Movie Review: M
Let me start out by saying that Fritz Lang's M is my favorite movie of all time, so of course I'm going to recommend it. Go see it now because I'm going to spoil it like crazy. I own the Criterion Collection's version of M on Blu-ray, but you can also rent it on Amazon…