Parsing JSON into dynamic objects using MGrammar and C# 4.0  

Thursday, February 11 2010

I gave a talk last week at the New Jersey .NET User Meetup Group on designing textual DSLs using MGrammar. You can download the slides here: Textual DSLs with MGrammar. The code portion of the project focused on developing a parser that turns JSON into dynamic objects in C# 4.0. JSON has become an increasingly popular data format over the last several years for its flexibility and simplicity. The entire JSON specification can be summarized in just five words: objects, arrays, values, numbers, and strings. Each of these has a syntax diagram with a clear representation in MGrammar, shown below:

Objects

json-object


syntax Object 
  = ObjectStart first:KeyValuePair rest:ObjectPart* ObjectEnd => Object { Pairs { first, valuesof(rest) } };

syntax ObjectPart
  = Comma pair:KeyValuePair => pair;

syntax KeyValuePair
  = key:String Colon value:Value => Pair { Key { key }, Value { value } };

Arrays

json-array


syntax Array
  = ArrayStart first:Value rest:ArrayPart* ArrayEnd => Array [ first, valuesof(rest) ];

syntax ArrayPart
  = Comma value:Value => value;  

Values

json-value


syntax Value 
  = string:String => string
  | number:Number => Number { number }
  | object:Object => object
  | array:Array => array
  | primitive:Primitive => Primitive { primitive };

Numbers

json-number


token Number = Minus? Digit+ ('.' Digit+)? Exponent?;
token Exponent = ("e" | "E") Sign? Digit+;
token Digit = "0" .. "9";
token Sign = Plus | Minus;
token Minus = "-";
token Plus = "+";    
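
To make it easier to see what the Number token accepts, here is a rough C# regular expression that transcribes it by hand. This is only a sketch for illustration, not part of the project: the class and method names are hypothetical, and the M tools do not generate or use this regex.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the jsonm project.
public static class NumberTokenSketch
{
    // Hand transcription of:
    //   token Number   = Minus? Digit+ ('.' Digit+)? Exponent?;
    //   token Exponent = ("e" | "E") Sign? Digit+;
    // \A and \z anchor the match to the whole input string.
    private static readonly Regex NumberToken =
        new Regex(@"\A-?[0-9]+(\.[0-9]+)?([eE][+-]?[0-9]+)?\z");

    // Returns true when the entire text has the shape of the Number token.
    public static bool IsNumberToken(string text)
    {
        return NumberToken.IsMatch(text);
    }
}
```

With this sketch, IsNumberToken("-1.4e1") and IsNumberToken("42") return true, while IsNumberToken("1.") and IsNumberToken("+1") return false. Because the token uses Digit+ for the integer part, a string with leading zeros such as "007" also matches, which is looser than the official JSON grammar.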

Strings

json-string


syntax String = '"' text:(text:StringText | empty) '"' => String { valuesof(text) } ;
token StringText = !('\u0022')+;    

Inside the Parser

Much of the actual parsing is done using simple LINQ expressions. First, though, you have to install Visual Studio 2010 and the SQL Modeling November 2009 CTP (formerly “Oslo”) and create a new “Oslo Library.” This is important because it sets up all the M and MGrammar dependencies for you.

After you’ve created your Oslo Library, you must add your *.mg file and set the build action to “MCompile.” This will use the M tools to compile the *.mg file into a *.mx image. After that’s completed we can create our parser at runtime:


MImage grammar = new MImage(@"jsonm.mx");
grammarParser = grammar.ParserFactories["jsonm.jsonm"].Create();
grammarParser.GraphBuilder = new NodeGraphBuilder();

Now that we have our parser, we can use it to parse the raw text into MGraph:


var inputText = new StringTextStream(sourceText);
var errorReporter = new ParserErrorReporter();
Node rootNode = (Node)grammarParser.Parse(inputText, errorReporter);
JsonmObject obj = ParseObject(rootNode);
return obj;

The rest of the process is a series of LINQ expressions built on some extension methods (included with the project). Here’s an example of how objects are parsed:

private static JsonmObject ParseObject(Node objectNode)
{
    JsonmObject jsonmObject = new JsonmObject();

    List<Tuple<string, object>> keyValuePairs = objectNode
        .Edges.FindNodeWithBrand("Pairs")
        .Edges.FindNodesWithBrand("Pair")
        .Select(node => ParseKeyValuePair(node))
        .ToList();

    foreach (Tuple<string, object> pair in keyValuePairs)
    {
        jsonmObject.TrySetMember(
            new DynamicDictionaryMemberBinder(pair.Item1, false),
            pair.Item2);
    }

    return jsonmObject;
}

You can see in the above example that in addition to parsing the MGraph, it sets members on our dynamic dictionary. This is how we can parse some sample JSON like this:


{
    "Object" : 
    {
        "String" : "StringValue",
        "Number" : -1.4e1,
        "null"   : null,
        "true"   : true,
        "false"  : false
    },
    "Array"  : [ "StringValue", 42, false, true, null ],
    "String" : "StringValue",
    "Empty"  : "",
    "Number" : -1.4e1,
    "null"   : null,
    "true"   : true,
    "false"  : false
}

Into a dynamic object using our parser like so:


dynamic result = JsonmParser.Parse(new Uri(Environment.CurrentDirectory + @"\sample.json"));
Console.Out.WriteLine(result.Object.String); // ==> "StringValue"
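
Nested member access on the parsed result works because each JSON object becomes a dynamic dictionary. As a minimal sketch of that idea, assuming the project's JsonmObject derives from System.Dynamic.DynamicObject (the real class may differ), a dictionary-backed dynamic object could look like this. The class name DynamicDictionarySketch is hypothetical.

```csharp
using System.Collections.Generic;
using System.Dynamic;

// Hypothetical stand-in for JsonmObject; the project's real class may differ.
public class DynamicDictionarySketch : DynamicObject
{
    // Backing store for the members set at runtime.
    private readonly Dictionary<string, object> members =
        new Dictionary<string, object>();

    // Called by the DLR for "obj.Name = value".
    public override bool TrySetMember(SetMemberBinder binder, object value)
    {
        members[binder.Name] = value;
        return true;
    }

    // Called by the DLR for "obj.Name"; returning false makes the
    // runtime binder report the member as missing.
    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        return members.TryGetValue(binder.Name, out result);
    }

    // Lets debuggers and other tools enumerate the dynamic members.
    public override IEnumerable<string> GetDynamicMemberNames()
    {
        return members.Keys;
    }
}
```

If you assign one such object to a member of another (for example outer.Object = inner after setting inner.String = "StringValue", with both variables declared as dynamic), then outer.Object.String resolves through two TryGetMember calls and yields "StringValue". Using dynamic from C# requires a reference to the Microsoft.CSharp assembly.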

As always, this code is on GitHub. I look forward to seeing more parsers in MGrammar in the future; send any MGrammar hacks my way by email or in the comments!


  • Posted by Charlie Robbins
