Until about a week ago Mono’s Monitor implementation, which is used for synchronized methods and the lock statement, was comparatively slow. Each Monitor.Enter and Monitor.Exit called into unmanaged code, which is very inefficient compared to a normal method call. We improved this situation by implementing fast-paths that can be called cheaply and that can handle most cases.
The fast-paths come in two varieties: Portable fast-paths, implemented in CIL code, and native fast-paths, in hand-optimized assembler code. We only have assembler fast-paths for x86 and AMD64, as those are the most-used architectures that Mono runs on.
I did some micro-benchmarking to compare the fast-paths with the old unoptimized implementation. Each of the tests does 100 million calls to synchronized methods. Two of the tests call static methods, the other two non-static ones, and two of the tests call empty non-recursive methods whereas the other two call recursive ones.
Here are the results for my x86 machine (2.16 GHz Core Duo):
The blue bars represent our old implementation. The orange bars stand for the IL fast-paths and the yellow bars for the assembler fast-paths. The green bars represent the run-times for non-synchronized methods.
The results for my AMD64 machine (1.86 GHz Core 2) look quite similar:
To finish things off I also ran the benchmark on Microsoft .NET on my x86 machine. Here is the comparison to Mono with the assembler fastpaths: