Thursday 7 March 2013

Playing with Microsoft's binary encoding

A friend asked me to help him out with a project. He was performing a penetration test on a Silverlight application but had fallen at a fairly early hurdle: communications between the Silverlight front-end and the back-end web services were in a format that he didn't recognise. It was gzipped, that much was obvious, but once uncompressed it appeared to be nonsense interspersed with bits of clear-text. To me, though, it looked suspiciously similar to something I'd seen before, when I'd been tasked with reading messages out of an MSMQ dead-letter queue.

The messages on the MSMQ queue contained a WCF (Windows Communication Foundation) payload, wrapped in a bit of mystery padding. My task at the time was to decode the WCF message and extract certain pertinent bits of information, so that the messages could be reconstituted and re-inserted onto the queue once the data problems that had landed them on the dead-letter queue in the first place had been resolved. But I digress.

Now when you're sending stuff across the wire, data packing efficiency is everything. The WCF message in this case was a SOAP message, but XML's very wordy and inefficient so Microsoft kindly, and by default, encodes it in a proprietary format based on a system of dictionary substitutions: well-known XML elements are replaced with dictionary look-up references, and other data-types are efficiently packed in various cunning ways. The encoding carries the MIME type application/msbin1, and there are some great articles out there on the tubes that describe it in great detail, but I was in a hurry and didn't have time to write my own decoder from first principles. There must be another way.
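
To make the dictionary idea a bit more concrete, here's a minimal sketch of my own rather than anything lifted from WCF. It assumes a made-up two-entry dictionary (the names DictionaryDemo, "Envelope" and the example.org namespace are purely illustrative) and writes the same tiny element twice, once with the dictionary attached to the writer and once without. WCF's real dictionary is built in and considerably bigger, so treat this only as an illustration of the substitution mechanism.

```csharp
using System;
using System.IO;
using System.Xml;

public static class DictionaryDemo
{
    // Hypothetical dictionary for illustration only; it is NOT the one WCF uses.
    static readonly XmlDictionary Dictionary = new XmlDictionary();
    static readonly XmlDictionaryString Name = Dictionary.Add("Envelope");
    static readonly XmlDictionaryString Ns = Dictionary.Add("http://example.org/demo");

    // Writes a single Envelope element in the example namespace, containing the
    // text "hello", in binary form, with or without the dictionary attached.
    public static byte[] Encode(bool useDictionary)
    {
        var ms = new MemoryStream();
        XmlDictionaryWriter writer = useDictionary
            ? XmlDictionaryWriter.CreateBinaryWriter(ms, Dictionary)
            : XmlDictionaryWriter.CreateBinaryWriter(ms);

        // A writer that knows the dictionary can emit the ids of Name and Ns;
        // otherwise it falls back to writing the strings inline.
        writer.WriteStartElement(Name, Ns);
        writer.WriteString("hello");
        writer.WriteEndElement();
        writer.Flush();
        return ms.ToArray();
    }

    // Decodes bytes produced by Encode(true); the reader needs the same dictionary
    // to turn the ids back into strings.
    public static string Decode(byte[] encoded)
    {
        using (var reader = XmlDictionaryReader.CreateBinaryReader(
            encoded, 0, encoded.Length, Dictionary, XmlDictionaryReaderQuotas.Max))
        {
            var document = new XmlDocument();
            document.Load(reader);
            return document.OuterXml;
        }
    }
}
```

Comparing Encode(true).Length with Encode(false).Length should show the dictionary version coming out smaller, since the element name and namespace are swapped for short ids instead of being spelled out.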

There is! I found a great article that described driving methods buried deep inside the System.ServiceModel namespace, where mortals don't usually need to tread. Given a binary-encoded message, they spit out a System.ServiceModel.Channels.Message object, which is great, and it worked for me in that case. While writing this, I've actually come across another implementation of the same technique that's quite a bit better than mine. You can find it here.

However, it's all a bit messy when the encoded payload isn't even a System.ServiceModel.Channels.Message in the first place, as was the case in the scenario I started this ridiculous story with. I also ran into issues (probably resolvable) where my messages were already marked as read and it was a violation to read them twice, so I couldn't access the message body directly anyway and had to just call ToString(). Rubbish. Besides, we needed to be able to encode messages as well as decode them. Creating a System.ServiceModel.Channels.Message out of thin air looked much too difficult for my very small brain. I'm only a sysadmin.

The general case, encoding and decoding arbitrary XML documents using Microsoft's binary encoding, fell out of the desire to encode, which I ended up implementing like this:

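using System.IO;
using System.Xml;

// Encodes an XML document read from inputStream into Microsoft's binary XML format.
public Stream GetBytes(Stream inputStream)
{
    MemoryStream ms = new MemoryStream(4096);

    using (var reader = XmlReader.Create(inputStream))
    {
        // The writer is flushed rather than disposed, because disposing it
        // would also close the MemoryStream we want to hand back.
        var writer = XmlDictionaryWriter.CreateBinaryWriter(ms);
        writer.WriteNode(reader, true);
        writer.Flush();
    }

    // Rewind so the caller can read the encoded bytes from the start.
    ms.Seek(0, SeekOrigin.Begin);

    return ms;
}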

This gives you back a Stream full of nicely encoded content to do with as you wish. In our case it was then gzipped and POSTed to the service we were trying to exploit.
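
For the gzip step, something along these lines should do. It's a rough sketch using the GZipStream class from System.IO.Compression; the GzipHelper class and its method names are my own invention for illustration, and the POST itself is left out because the details depend entirely on the target service.

```csharp
using System.IO;
using System.IO.Compression;

public static class GzipHelper
{
    // Gzips everything remaining in the input stream and returns the compressed bytes.
    public static byte[] Compress(Stream input)
    {
        using (var output = new MemoryStream())
        {
            // The GZipStream has to be disposed before reading the result,
            // otherwise the gzip trailer may not have been written yet.
            using (var gzip = new GZipStream(output, CompressionMode.Compress, true))
            {
                input.CopyTo(gzip);
            }
            return output.ToArray();
        }
    }

    // Gunzips the input stream into a rewound MemoryStream, ready for decoding.
    public static MemoryStream Decompress(Stream input)
    {
        var output = new MemoryStream();
        using (var gzip = new GZipStream(input, CompressionMode.Decompress, true))
        {
            gzip.CopyTo(output);
        }
        output.Seek(0, SeekOrigin.Begin);
        return output;
    }
}
```

Compress() takes a Stream like the one returned by the encoder above, and Decompress() goes the other way for captured responses.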

To decode, it's the inverse, except this time we want an XmlDocument back so we go and Load() an XmlDocument out of the XmlDictionaryReader we've created to do the decoding.

using System.IO;
using System.Xml;

// Decodes a binary-encoded XML stream back into an XmlDocument.
public XmlDocument GetXML(Stream inputStream)
{
    MemoryStream ms = new MemoryStream(4096);

    // Buffer the whole input in memory before decoding.
    byte[] buffer = new byte[4096];
    int count;
    while ((count = inputStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        ms.Write(buffer, 0, count);
    }

    ms.Seek(0, SeekOrigin.Begin);

    // XmlDictionaryReaderQuotas.Max lifts the default size limits on the reader.
    var reader = XmlDictionaryReader.CreateBinaryReader(ms, XmlDictionaryReaderQuotas.Max);

    XmlDocument document = new XmlDocument();
    document.Load(reader);

    return document;
}

What's going to catch you out is that although the classes that do all the magic, XmlDictionaryReader and XmlDictionaryWriter, are in the System.Xml namespace, they actually live in the System.Runtime.Serialization assembly, not System.Xml as you'd expect.
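
If you want to convince yourself that the two halves really are inverses, a quick round trip is an easy check. This is a minimal, self-contained sketch rather than the exact code we used: the RoundTrip class and the sample XML are made up for illustration, and it works on strings and byte arrays instead of streams to keep it short. On the full .NET Framework you'll need that reference to System.Runtime.Serialization for it to compile.

```csharp
using System;
using System.IO;
using System.Xml;

public static class RoundTrip
{
    // Encodes an XML string into the binary format (no dictionary supplied).
    public static byte[] ToBinary(string xml)
    {
        var ms = new MemoryStream();
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            var writer = XmlDictionaryWriter.CreateBinaryWriter(ms);
            writer.WriteNode(reader, true);
            writer.Flush();
        }
        return ms.ToArray();
    }

    // Decodes binary-encoded XML back into its textual form.
    public static string FromBinary(byte[] encoded)
    {
        using (var reader = XmlDictionaryReader.CreateBinaryReader(
            encoded, XmlDictionaryReaderQuotas.Max))
        {
            var document = new XmlDocument();
            document.Load(reader);
            return document.OuterXml;
        }
    }
}
```

Calling RoundTrip.FromBinary(RoundTrip.ToBinary(xml)) on a small document should hand back XML equivalent to what went in, which makes it a handy sanity check before you start tampering with real payloads.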

And there we had the makings of arbitrarily manipulating payloads for the target application which, as it turned out, had data-validation holes you could drive a truck through.

It's actually pretty simple to do this work in PowerShell as well. If anyone cares, leave a comment.
