Saturday, December 5, 2015

Benchmarking C# Code Times 2.0

Getting the codez:
A couple years ago, I blogged about a little bencharking tool I threw together. I recently made some changes to make it easier to use, and to improve the accuracy when timing small, fast bits of code. I also set up a GitHub project and NuGet package for it. The goal for this project is to provide a small, simple, and fast tool for timing bits of code, that's accurate enough to do micro-benchmarking of operations that have very small run times as well as longer running code.
Benchmarking is done through the Bench class. It has two methods for timing an Action delegate. The Bench class pre-runs the delegate code so it's not run cold, then does some preliminary timing runs to determine how many iterations it should run in the final timing test. It runs the timing test on the target, then times an empty Action delegate the same number of iterations so it can compute the run time of target, excluding the overhead of the timing loop.
When creating the Bench class you can specify the minimum number of iterations, and the minimum duration of the test, and a flag indicating whether you want the timing results written to the Console.. If you don't specify them, a default of 3 iterations, 100 milliseconds, and true for Console output is used.
    var bench1 = new Bench();
    var bench2 = new Bench(20, 1000, false);
The biggest change is that users of the Bench class no longer have to bother with including the looping code in their Action delegate. The repetitive iterations are now completely handled internally by the Bench class. If you're timing a bit of code that has a run time of under 50 nanoseconds per operation, you can increase accuracy of the timing by repeating the code 2 to 10 times so its run time is not obscured by timing overhead.
There are two Time methods for doing the timing. They both take three parameters - a string that names the test, an Action delegate, and an optional parameter that specifies the number of times the operation being tested is repeated in the Action delegate (the default value of this is 1). The difference between the two methods is one takes an Action delegate with no parameters, and the other takes an Action delegate with a single TimeControl parameter that allows some additional control of the timing. The following code shows examples of the simpler Time signature, which is the one most commonly used.
    var bench = new Bench();

    // time Sleep(10)
    bench.Time("Sleep", () => { System.Threading.Thread.Sleep(10); });
            
    // time string concatenation
    string s1 = DateTime.UtcNow.ToString();
    string s2 = DateTime.UtcNow.ToString();
    string s;
    bench().Time("String Concat", () => {
        s = s1 + s2;
    });

    // time int division
    int i0 = 0;
    bench.Time("int division by 5", () => {
        i0 /= 5; i0 /= 5; i0 /= 5; i0 /= 5; i0 /= 5;
        i0 /= 5; i0 /= 5; i0 /= 5; i0 /= 5; i0 /= 5;
    }, 10);
The code above outputs the following to the Console:
Name= Sleep
Millisecs/Op= 10.291, Ops= 10, ElapsedMillisecs= 102.91
Name= String Concat
Nanosecs/Op= 59.574, Ops= 1,257,008, ElapsedMillisecs= 74.89
Name= int division by 5
Nanosecs/Op= 5.777, Ops= 16,337,850, ElapsedMillisecs= 94.38

The second Time signature that takes an Action delegate expecting a TimeControl parameter allows timing of more complex operations that contain setup, or other code whose time you want excluded from the timing results. The following code shows two examples of that, one using a lambda, and the other a method.
    static void Main(string[] args) {
        var bench = new Bench();

        // time Dictionary Remove
        const int DictCount = 100;

        bench.Time("Dictionary.Remove(lambda)", (tc) => {
            tc.Pause();
            var d = new System.Collections.Generic.Dictionary<int, string>(DictCount);
            for (int i = 0; i < DictCount; i++) d.Add(i, i.ToString());
            tc.Resume();
            for (int i = 0; i < DictCount; i++) d.Remove(i);
        }, DictCount);

        bench.Time("Dictionary.Remove(method)", DictionaryRemove, DictCount);
    }
    public static void DictionaryRemove(Bench.TimeControl tc) {
        const int DictCount = 100;
        tc.Pause();
        var d = new System.Collections.Generic.Dictionary<int, string>(DictCount);
        for (int i = 0; i < DictCount; i++) d.Add(i, i.ToString());
        tc.Resume();
        for (int i = 0; i < DictCount; i++) d.Remove(i);
    }
The code above outputs the following to the Console:
Name= Dictionary.Remove(lambda)
Nanosecs/Op= 25.234, Ops= 3,948,900, ElapsedMillisecs= 99.65
Name= Dictionary.Remove(method)
Nanosecs/Op= 25.269, Ops= 3,856,300, ElapsedMillisecs= 97.45

The Time method returns a TimeReturn objet that contains the results of the timing, and properties for examining those results. If you're not just relying on seeing the results that Time writes to the Console, you can make use of the TimeResult object.

I should mention the normal caveats for casual benchmarking. This utility uses StopWatch, which measures clock time, not CPU time. This is an important point, because your results will vary some from run to run because of the other things going on in Windows and in the .Net runtime. If you need to do serious CPU time profiling, it’s better to just go to the commercial profiling tools. I tend to use both. I use the profiler to identify where I need to optimize, and use casual benchmarking while optimizing a chunk of code, to quickly see the effect of my changes. You should also keep in mind the possible effects of optimizations made by the compiler and just-in-time compiler, which may optimize away some of the code you are trying to benchmark. You also should run your timing with a release build, optimizations enabled, and without the debugger attached. If you happen to run with the debugger attached, or with with optimizations disabled (an indication you're running a debug build), the Time method will detect that and include alerts in the Console output to point out that fact.

The Bench class also includes methods to tell you how much memory is consumed by objects or structs you pass to it. That's described in this earlier blog post. Bench is available as a NuGet package, and the source code is hosted at GitHub.