Saturday, July 7, 2012

Caching XSD Schemas in Your .NET Application

If your .NET application deals with linked XSD files, re-loading the linked XSDs every time the application starts can be very inefficient. One real-world example is the XBRL taxonomy schemas, which link to numerous XSDs defining different parts of the XBRL standard. So how do you cache the XSD schemas locally? For the caching shown here to work, you need .NET Framework 4.0, because the XmlUrlResolver.CachePolicy property it relies on is only available from .NET 4.0 onwards. Let me show you how I made the caching work in Gepsio (an open source XBRL implementation):

  1. Go to the project (JeffFerguson.Gepsio project) and change the "Target Framework" to ".NET Framework 4".
  2. Add the following code to the XbrlSchema.cs file:

    using System.Net.Cache; // This line added for caching support
    
    //...
    
    public class XbrlSchema
    {
        //...
        private XmlUrlResolver thisXmlUrlResolver;
    
        //...
        internal XbrlSchema(XbrlFragment ContainingXbrlFragment,
                            string SchemaFilename, string BaseDirectory)
        {
            thisContainingXbrlFragment = ContainingXbrlFragment;
            this.Path = GetFullSchemaPath(SchemaFilename, BaseDirectory);
    
            try
            {
                var schemaReader = XmlTextReader.Create(this.Path);
                thisXmlSchema = XmlSchema.Read(schemaReader, null);
                thisXmlSchemaSet = new XmlSchemaSet();
    
                //---- START caching with XmlUrlResolver
                thisXmlUrlResolver = new XmlUrlResolver();
                thisXmlUrlResolver.CachePolicy =
                    new RequestCachePolicy(RequestCacheLevel.CacheIfAvailable);
                thisXmlSchemaSet.XmlResolver = thisXmlUrlResolver;
                //---- END caching with XmlUrlResolver
    
                thisXmlSchemaSet.Add(thisXmlSchema);
                thisXmlSchemaSet.Compile();
    
    //...
    
In fact, when I ran the Visual Studio profiler against lots of XBRL instance files, the calls on the XmlSchemaSet object (thisXmlSchemaSet) turned out to be the biggest bottleneck, because XmlSchemaSet always downloads the XSD dependencies from xbrl.org.

A side note: the CachePolicy property of the XmlUrlResolver class is not available in .NET Framework versions earlier than 4.0. That's the reason why you have to switch the project to .NET Framework 4.0.

Perhaps you're not aware of this: the call to XmlSchemaSet.Add() traverses every linked XSD schema until all of them are resolved. If the schemas are located remotely, that slows your application down considerably and wastes bandwidth. In this case, schema caching comes to the rescue.
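
If you want to try the same idea outside Gepsio, here is a minimal standalone sketch. The class and method names (CachedSchemaLoader, CreateCachingResolver, Load) are hypothetical and are not part of Gepsio; I'm also assuming .NET Framework 4 or later, and that the remote server sends responses that the HTTP cache is allowed to store.

```csharp
using System;
using System.Net.Cache;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper class for illustration only; not part of Gepsio.
public static class CachedSchemaLoader
{
    // Builds a resolver that prefers a locally cached copy of a resource
    // and only goes to the server when no cached copy is available.
    public static XmlUrlResolver CreateCachingResolver()
    {
        return new XmlUrlResolver
        {
            CachePolicy = new RequestCachePolicy(RequestCacheLevel.CacheIfAvailable)
        };
    }

    // Reads the schema at schemaUri and compiles it together with every
    // schema it imports or includes, using the caching resolver throughout.
    public static XmlSchemaSet Load(string schemaUri)
    {
        var resolver = CreateCachingResolver();

        // The resolver must be assigned before Add(), because Add() is the
        // call that walks the linked schemas.
        var schemaSet = new XmlSchemaSet { XmlResolver = resolver };
        var settings = new XmlReaderSettings { XmlResolver = resolver };

        using (var reader = XmlReader.Create(schemaUri, settings))
        {
            var schema = XmlSchema.Read(reader, null);
            schemaSet.Add(schema);
        }

        schemaSet.Compile();
        return schemaSet;
    }
}
```

You would call it as CachedSchemaLoader.Load(pathOrUrlOfYourSchema). I use the same resolver for the reader and for the schema set so that the top-level schema and its linked schemas follow the same cache policy.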