In yesterday’s post, I set the stage by showing how to create data following a normal distribution, or at least close enough to normal. Today, I’ll build on that to generate occasional anomalies. I’ll first show off the version in C# that I built for a Microsoft Cloud Workshop, and then follow up with a version in F# which shows off a really cool feature in the language.

## Anomalies in C#

As we saw last time around, the Box-Muller transform function is pretty simple to put together in C#:

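    private static double BoxMullerTransformation(Random rand, double mean, double stdDev)
    {
        // 1.0 - NextDouble() maps [0, 1) to (0, 1], so Math.Log never sees zero
        double u1 = 1.0 - rand.NextDouble();
        double u2 = 1.0 - rand.NextDouble();
        // Standard normal draw (mean 0, standard deviation 1)
        double randStdNormal = Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Sin(2.0 * Math.PI * u2);
        // Shift and scale to the requested mean and standard deviation
        return mean + stdDev * randStdNormal;
    }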

Once I have that in place, I can create another function which generates a value. Most of the time, it will pull from a normal distribution. But sometimes, it will pop out an anomaly:

    private static double GenerateDoubleValue(Random rand, double mean, double stdDev, double likelihoodOfAnomaly = 0.06)
    {
        double res = BoxMullerTransformation(rand, mean, stdDev);

        double u1 = rand.NextDouble();

        if (u1 <= likelihoodOfAnomaly / 2.0)
        {
            // Generate a negative anomaly
            res = res * 0.6;
        }
        else if (u1 > likelihoodOfAnomaly / 2.0 && u1 <= likelihoodOfAnomaly)
        {
            // Generate a positive anomaly
            res = res * 1.8;
        }

        return res;
    }

First up, we get a random number following a normal distribution described by mean `mean` and standard deviation `stdDev`. Then, we generate a double in the interval [0, 1) from a uniform distribution (`Random.NextDouble()`). We’ll use that uniformly distributed random number to determine what we send back. If I stick with the default parameter of 0.06 for likelihood of anomaly, 94% of the time, we send back `res` itself. To simplify the problem, think of a 100-sided die: if I roll anything 7 or higher, I get `res` with no modifications. If I roll a 1-3, I get a negative anomaly: 60% of the value of `res`, whatever it is. If I roll a 4-6, I get a positive anomaly: 180% of the value of `res`. If I run this a few hundred times, it’s easy to see the normal outcomes, but we can also catch some oddities.

All in all, it’s about 30 lines of C# code and pretty reasonable. But I think we can do better with F#.

## Anomalies and Active Patterns in F#

F# has a really neat thing called active patterns. Check out the documentation or this post by Brad Collins, but even in a simple example like the one I’m about to show, you can see some of the benefit of the pattern.

We will start off just like we did before, with the Box-Muller transform function:

    open System

    let boxMullerTransformation (rand:Random) mean stdev =
        let u1 = 1.0 - rand.NextDouble()
        let u2 = 1.0 - rand.NextDouble()
        let randStdNormal = Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Sin(2.0 * Math.PI * u2)
        mean + stdev * randStdNormal

But then, we’re going to create an active pattern:

    let (|AnomalyLow|Normal|AnomalyHigh|) (input, p) =
        match input with
        | x when x <= p / 2.0 -> AnomalyLow
        | x when x > p / 2.0 && x <= p -> AnomalyHigh
        | _ -> Normal

What I’m doing here is assigning three possible states of the world: you drew a low anomaly, a high anomaly, or a normal value. We take in a tuple of two inputs: the actual input itself (`u1` in our C# code) and the probability of anomaly (`likelihoodOfAnomaly` in the C# code). We then perform pattern matching. If the value of the input, which I’ve relabeled as `x` in the patterns, is less than or equal to the likelihood of anomaly divided by two, we’ll call that a low anomaly. If it’s greater than half of the likelihood but still less than or equal to the likelihood of an anomaly, we call that a high anomaly. Otherwise, we’re in the normal state.
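To make the thresholds concrete, here is a minimal sketch which pushes a few hand-picked draws through the pattern with a likelihood of 0.06, so the cutoffs sit at 0.03 and 0.06. The `classify` helper and the sample values are my own additions for illustration; the pattern itself is repeated so the snippet runs on its own.

```fsharp
// Same active pattern as above, repeated so this snippet is self-contained.
let (|AnomalyLow|Normal|AnomalyHigh|) (input, p) =
    match input with
    | x when x <= p / 2.0 -> AnomalyLow
    | x when x > p / 2.0 && x <= p -> AnomalyHigh
    | _ -> Normal

// Hypothetical helper: turn the matched case into a readable label.
let classify u p =
    match (u, p) with
    | AnomalyLow -> "low"
    | AnomalyHigh -> "high"
    | Normal -> "normal"

// Hand-picked draws on either side of the 0.03 and 0.06 cutoffs.
let labels =
    [ 0.01; 0.03; 0.05; 0.06; 0.5 ]
    |> List.map (fun u -> classify u 0.06)
// labels = ["low"; "low"; "high"; "high"; "normal"]
```

Both cutoffs are inclusive on the upper end, so a draw of exactly 0.03 counts as a low anomaly and a draw of exactly 0.06 counts as a high one.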

With this pattern, my double generator becomes:

    let generateDoubleValue (rand:Random) mean stdev likelihoodOfAnomaly =
        let res = boxMullerTransformation rand mean stdev
        let u1 = rand.NextDouble()
        match (u1, likelihoodOfAnomaly) with
        | AnomalyLow -> res * 0.6
        | AnomalyHigh -> res * 1.8
        | Normal -> res

The logic is the same as in C#, but the active pattern separates out the determination of case from the resulting action. In other words, we use the active pattern to tell if this is a low anomaly, a high anomaly, or a normal result and, separate from the process of making that determination, we return the appropriate output value.
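One way to see that separation in practice is to reuse the same pattern for a completely different action. The sketch below tallies how often each case comes up across many uniform draws instead of scaling a value. The `tally` function, the seed of 42, and the draw count of 100,000 are assumptions of mine for illustration, not part of the original workshop code.

```fsharp
open System

// Same active pattern as above, repeated so this snippet is self-contained.
let (|AnomalyLow|Normal|AnomalyHigh|) (input, p) =
    match input with
    | x when x <= p / 2.0 -> AnomalyLow
    | x when x > p / 2.0 && x <= p -> AnomalyHigh
    | _ -> Normal

// Hypothetical helper: count how many of n uniform draws land in each case.
let tally (rand: Random) p n =
    Seq.init n (fun _ -> rand.NextDouble())
    |> Seq.countBy (fun u ->
        match (u, p) with
        | AnomalyLow -> "low"
        | AnomalyHigh -> "high"
        | Normal -> "normal")
    |> Map.ofSeq

// Arbitrary seed so repeated runs on the same runtime give the same counts.
let counts = tally (Random(42)) 0.06 100000
let normalShare = float counts.["normal"] / 100000.0
```

With a likelihood of 0.06, `normalShare` should land close to 0.94, with the remainder split roughly evenly between low and high. The rule for deciding the case lives in one place, and each caller chooses what to do with the answer.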

Using a .NET Interactive notebook, I can easily generate some results:

    let r = Random()
    [1..10] |> Seq.map (fun x -> generateDoubleValue r 31.0 1.0 0.86)

In case you’re not familiar with this syntax, we use `[1..10]` to create a list with elements ranging from 1 to 10 inclusive. Then, we map a function over the list; that is, we call the function once for each element. That function does take in the list element as an input, but we don’t actually use it for anything, so I could replace `x` with `_` to make it clear we don’t care about the input. Nonetheless, we call `generateDoubleValue` with a mean of 31.0, a standard deviation of 1.0, and a likelihood of anomaly of 0.86, just to make sure that we get plenty of anomalies.
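As a sketch of that `_` variant, here are two equivalent ways to write the call. The functions from earlier in the post are repeated so the snippet runs on its own, and the seed of 42 is an arbitrary choice of mine. I’ve also used `List.map` rather than `Seq.map`: sequences in F# are lazy, so a list forces the ten values to be generated right away rather than when something enumerates them.

```fsharp
open System

let boxMullerTransformation (rand: Random) mean stdev =
    let u1 = 1.0 - rand.NextDouble()
    let u2 = 1.0 - rand.NextDouble()
    let randStdNormal = Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Sin(2.0 * Math.PI * u2)
    mean + stdev * randStdNormal

let (|AnomalyLow|Normal|AnomalyHigh|) (input, p) =
    match input with
    | x when x <= p / 2.0 -> AnomalyLow
    | x when x > p / 2.0 && x <= p -> AnomalyHigh
    | _ -> Normal

let generateDoubleValue (rand: Random) mean stdev likelihoodOfAnomaly =
    let res = boxMullerTransformation rand mean stdev
    let u1 = rand.NextDouble()
    match (u1, likelihoodOfAnomaly) with
    | AnomalyLow -> res * 0.6
    | AnomalyHigh -> res * 1.8
    | Normal -> res

// Arbitrary seed, only so reruns on the same runtime are repeatable.
let r = Random(42)

// `_` makes it explicit that the list element itself is never used.
let viaMap = [1..10] |> List.map (fun _ -> generateDoubleValue r 31.0 1.0 0.86)

// List.init builds the list from a count, skipping the throwaway [1..10].
let viaInit = List.init 10 (fun _ -> generateDoubleValue r 31.0 1.0 0.86)
```

Each list holds ten values. With a likelihood of 0.86, most of them should sit well away from 31, near 18.6 (31 × 0.6) or 55.8 (31 × 1.8).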

## Conclusion

In this post, we took a look at two separate topics: generating occasional anomalies for testing and active patterns in F#. One of the biggest benefits of building active patterns is that they’re reusable. Yeah, this particular one probably won’t be used a lot in code, but if you build a pattern to parse a regex or to determine whether a customer owes too much or has an invoice which is too old, that’s something you might use in several parts of the code. Instead of building those if statements each time, you can use active patterns as a rules engine, define the condition once, and use that condition throughout your code. It does take some time to wrap your head around things, but if you’re getting into F#, it’s definitely worth the effort to learn.
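To give a rough idea of what the customer rules might look like, here is a sketch using partial active patterns, which either match or don’t. Everything in it is hypothetical: the `Customer` record, its field names, the credit limit of 10,000, and the 90-day cutoff are placeholders I made up for illustration rather than anything from a real system.

```fsharp
open System

// Hypothetical record; the field names are illustrative only.
type Customer =
    { Name: string
      Balance: decimal
      OldestInvoice: DateTime }

// Made-up thresholds for the two rules.
let creditLimit = 10000m
let maxInvoiceAgeDays = 90.0

// Partial active pattern: matches only when the balance exceeds the limit.
let (|OwesTooMuch|_|) (c: Customer) =
    if c.Balance > creditLimit then Some c.Balance else None

// Parameterized partial active pattern: the as-of date comes from the caller.
let (|InvoiceTooOld|_|) (asOf: DateTime) (c: Customer) =
    let ageInDays = (asOf - c.OldestInvoice).TotalDays
    if ageInDays > maxInvoiceAgeDays then Some ageInDays else None

// Each rule is defined once above and simply named here.
let review (asOf: DateTime) (customer: Customer) =
    match customer with
    | OwesTooMuch balance -> sprintf "%s owes too much (balance %M)" customer.Name balance
    | InvoiceTooOld asOf days -> sprintf "%s has an invoice %.0f days old" customer.Name days
    | _ -> sprintf "%s is in good standing" customer.Name

let today = DateTime(2024, 1, 1)
let alice = { Name = "Alice"; Balance = 12000m; OldestInvoice = DateTime(2023, 12, 15) }
let bob = { Name = "Bob"; Balance = 500m; OldestInvoice = DateTime(2023, 6, 1) }
let carol = { Name = "Carol"; Balance = 500m; OldestInvoice = DateTime(2023, 12, 15) }

let results = [ alice; bob; carol ] |> List.map (review today)
```

Here Alice trips the balance rule, Bob trips the invoice-age rule, and Carol passes both. Any other function that cares about those conditions can match on `OwesTooMuch` or `InvoiceTooOld` without restating the thresholds.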