A Memory Leak brought to you by XmlSerializer

Oct. 26, 2011, 6:16 p.m.

We ran into an interesting memory leak the other day. When one of my colleagues analyzed the memory dump from production, he saw an unusually large number of types in *.GeneratedAssembly.* namespaces, one instance each:

0:007> !dumpheap -stat
              MT    Count    TotalSize Class Name
000007ff0024da98        1           24 System.Xml.Serialization.TempAssemblyCache
000007ff002474f8        1           24 System.Xml.Serialization.Configuration.RootedPathValidator
000007ff00441d98        1           40 Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializerContract
000007ff004412c8        1           40 Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializerContract
000007ff004407f8        1           40 Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializerContract
000007ff003ef9b0        1           40 Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializerContract
000007ff003eeee0        1           40 Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializerContract
... and more

He also found the method XmlSerializer.GenerateTempAssembly in call stacks and soon thereafter the explanation online:

To increase performance, the XML serialization infrastructure dynamically generates assemblies to serialize and deserialize specified types. The infrastructure finds and reuses those assemblies. This behavior occurs only when using the following constructors:


XmlSerializer.XmlSerializer(Type)

XmlSerializer.XmlSerializer(Type, String)

If you use any of the other constructors, multiple versions of the same assembly are generated and never unloaded, which results in a memory leak and poor performance. The easiest solution is to use one of the previously mentioned two constructors. Otherwise, you must cache the assemblies in a Hashtable, as shown in the following example.

from MSDN

It turned out we were (indirectly) using one of the XmlSerializer constructors that cause this documented memory leak. The problem has popped up in many other places online and can easily be reproduced with a few lines of code:

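// requires: using System; using System.IO; using System.Xml.Serialization;
public class Brick { }

    // Every call to this constructor overload generates and loads a new temp assembly.
    var s = new XmlSerializer(typeof(Brick), new Type[] {});
    s.Serialize(Stream.Null, new Brick());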

Isn't that some very fine memory-leaking code? Not only does it eat up memory, it also generates, loads (and deletes) *.dll files in your temp path (e.g. C:\Users\josef\AppData\Local\Temp), which makes it incredibly slow. The Hanselman already taught us how to make the generated code visible:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
   <system.diagnostics>
      <switches>
         <add name="XmlSerialization.Compilation" value="1" />
      </switches>
   </system.diagnostics>
</configuration>

With this switch set to 1, the C# code for each generated assembly is written to a *.cs file, so if you run the code above, your temp dir will be flooded:

C:\Users\josef\AppData\Local\Temp>dir *.cs
 Volume in drive C has no label.
 Volume Serial Number is ####-####

 Directory of C:\Users\josef\AppData\Local\Temp

10/26/2011  05:11 PM             9,260 0r2upaee.0.cs
10/26/2011  05:11 PM             9,260 0wvlt0sz.0.cs
10/26/2011  05:11 PM             9,260 12x20ris.0.cs
10/26/2011  05:11 PM             9,260 1bsq04ge.0.cs
10/26/2011  05:11 PM             9,260 1wvkuolg.0.cs
10/26/2011  05:11 PM             9,260 1y4yyetd.0.cs
10/26/2011  05:11 PM             9,260 20yixjrc.0.cs
10/26/2011  05:11 PM             9,243 23qz4yru.0.cs
10/26/2011  05:11 PM             9,260 2vt3hznp.0.cs
         101 File(s)        926,038 bytes
           0 Dir(s)  91,353,591,808 bytes free
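
The flood of files goes hand in hand with assemblies piling up inside the process. As a rough way to watch that without a memory dump, here is a minimal sketch that counts the assemblies loaded into the current AppDomain before and after a loop over the leaky constructor. The class name LeakDemo and the iteration count are made up for illustration, and the exact number printed will depend on the runtime and on what else gets loaded along the way.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Brick { }

public static class LeakDemo
{
    public static void Main()
    {
        int before = AppDomain.CurrentDomain.GetAssemblies().Length;

        for (int i = 0; i < 100; i++)
        {
            // This overload bypasses the internal assembly cache, so each
            // iteration generates and loads another temporary assembly.
            var s = new XmlSerializer(typeof(Brick), new Type[] { });
            s.Serialize(Stream.Null, new Brick());
        }

        int after = AppDomain.CurrentDomain.GetAssemblies().Length;
        Console.WriteLine("Assemblies loaded during the loop: {0}", after - before);
    }
}
```

Swapping the constructor call for new XmlSerializer(typeof(Brick)) should keep the difference small, since that overload reuses the generated assembly.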

The recommended way to get around this issue is to use one of the two safe constructors (see above) if you can, or to cache the XmlSerializer instances yourself. We went for the latter.
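
For illustration, a cache along those lines could look roughly like the sketch below. This is not our production code: the SerializerCache class, its Get method and the key format are hypothetical, and it assumes .NET 4 or later for ConcurrentDictionary. It only covers the (Type, Type[]) overload; other leaky overloads would need their arguments folded into the key as well.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Xml.Serialization;

// Hypothetical helper: hands out one XmlSerializer per distinct
// combination of root type and extra types.
public static class SerializerCache
{
    private static readonly ConcurrentDictionary<string, XmlSerializer> Cache =
        new ConcurrentDictionary<string, XmlSerializer>();

    public static XmlSerializer Get(Type type, params Type[] extraTypes)
    {
        // The key combines the root type with the extra types, in the order given.
        string key = type.AssemblyQualifiedName + "|" +
            string.Join(";", extraTypes.Select(t => t.AssemblyQualifiedName));

        // The leaky constructor now runs once per key instead of once per call.
        // Under a race GetOrAdd may invoke the factory more than once, which
        // costs a few extra assemblies at most, not unbounded growth.
        return Cache.GetOrAdd(key, _ => new XmlSerializer(type, extraTypes));
    }
}
```

A call site then asks for SerializerCache.Get(typeof(Brick)) instead of constructing a serializer itself. Sharing the instances is fine because XmlSerializer is documented as thread safe.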