The Making Of Sarcasm (1) - Design Goals And Grammar

Introduction

This is not a tutorial on how to use Irony.net. When I am done with this series of articles, hopefully we will never need to deal with Irony directly ever again.

In case you didn’t know what Irony is, here is the introduction on its official site:

Irony is a development kit for implementing languages on .NET platform. Unlike most existing yacc/lex-style solutions Irony does not employ any scanner or parser code generation from grammar specifications written in a specialized meta-language. In Irony the target language grammar is coded directly in c# using operator overloading to express grammar constructs. Irony’s scanner and parser modules use the grammar encoded as c# class to control the parsing process.

Looks fantastic. However, after trying for days to implement the CoffeeScript grammar with it, I ran into some issues:

  • While constructing a grammar directly in C# sounds cool, the syntax is just not as clean and efficient as a specially designed DSL would be.
  • There is absolutely no compile-time checking of the grammar. You have to compile it into a DLL first, then load it with Irony.GrammarExplorer.
  • It is extremely hard, if not impossible, to trace grammar errors back to the source code.
  • On top of that, the debug information on Shift-Reduce and Reduce-Reduce conflicts is almost unreadable for a complex grammar.

It’s a nice concept with poor tooling, which makes it scale poorly as the complexity of the grammar grows. After some painstaking effort to get my CoffeeScript parser to work, I finally decided to do something about it. I decided to create:

Sarcasm, an EBNF-like DSL that generates Irony.

The design goals are to:

  • Implement a DSL that allows developers to define grammars in a cleaner, more efficient syntax that looks very much like EBNF notation.
  • Generate an Irony grammar implementation (in C#) and a nicely formatted grammar specification document (in MarkDown)
  • Enable compile-time error checking and grammar validation
  • Trace any errors back to the source code
  • Improve the readability of debug information for grammar conflicts
  • Provide the necessary Visual Studio language services, templates and tools

Sarcasm Workflow

  1. The developer writes a grammar specification file (.sarc).
  2. The compiler checks for syntax errors and generates both the Irony grammar class (in C#) and the spec docs (in MarkDown).
  3. VS continues the build process.
  4. If the build fails, the Sarcasm tools filter through all error messages and map related errors back to specific tokens in the .sarc file.
  5. If the build succeeds, the Sarcasm tools load the assembly and validate the grammar.
  6. The Sarcasm tools translate any grammar conflicts and errors into a readable format and trace them back to the specific rule in the .sarc file.

The entire workflow should be seamlessly integrated with Visual Studio.
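
As a rough illustration of step 4, here is a minimal Python sketch of how a build error reported against the generated C# file could be pointed back at the .sarc file. It is not Sarcasm's actual implementation: the names `map_diagnostic`, `SourcePos` and `line_map` are hypothetical, and it assumes the generator records which .sarc position each generated line came from. The only outside fact it relies on is the usual MSBuild diagnostic layout, `file(line,col): error CODE: message`.

```python
import re
from typing import Dict, NamedTuple, Optional

# MSBuild-style diagnostic: path(line,col): error CODE: message
DIAGNOSTIC = re.compile(
    r"^(?P<file>.+?)\((?P<line>\d+),(?P<col>\d+)\):\s+"
    r"(?P<severity>error|warning)\s+(?P<code>\w+):\s+(?P<message>.*)$"
)


class SourcePos(NamedTuple):
    """A position in the (hypothetical) .sarc source file."""
    line: int
    column: int


def map_diagnostic(text: str,
                   line_map: Dict[int, SourcePos],
                   sarc_file: str) -> Optional[str]:
    """Rewrite one compiler diagnostic so it points at the .sarc file.

    line_map is assumed to map a line of the generated C# file to the
    .sarc position it was generated from. Returns None when the text is
    not a diagnostic or the generated line has no recorded origin.
    """
    match = DIAGNOSTIC.match(text.strip())
    if match is None:
        return None
    origin = line_map.get(int(match["line"]))
    if origin is None:
        return None
    return (f"{sarc_file}({origin.line},{origin.column}): "
            f"{match['severity']} {match['code']}: {match['message']}")


# Example: line 42 of the generated class came from line 12 of the grammar.
line_map = {42: SourcePos(line=12, column=1)}
raw = ("SarcasmGrammar.cs(42,17): error CS0103: "
       "The name 'STRNG' does not exist in the current context")
mapped = map_diagnostic(raw, line_map, "Sarcasm.sarc")
```

Keeping the diagnostic in the same `file(line,col)` shape means an IDE error list could, in principle, jump to the grammar file the same way it jumps to ordinary source files.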

Sarcasm Grammar

In a nutshell, the Sarcasm grammar is a hybrid of MarkDown and modified EBNF notation. Here’s a quick snippet:

# H1

/*
Block comment
*/

// Single Line Comment

// Directive
@class SarcasmGrammar

## H2

// Declarations
ID = new IdentifierTerminal("ID");
STRING = new StringLiteral("STRING", "\"", StringOptions.AllowsAllEscapes);

// Production Rules
SimpleValue := STRING | ID;

// Repeat
Ids := ID{};
Ids := ID*;
Ids := ID?;
Ids := ID+;

// Repeat with delimiters
Ids := ID{","};
Ids := ID*(".");
Ids := ID+(",");
### H3

As you can see, the grammar consists of:

  • MarkDown headers (start with one or more #). Directly used for outlining.
  • Comments (single line and block). All other text contents go into comments. MarkDown syntax can be used in comments.
  • Directives (starting with @). Configures compiler behaviors like generated class names.
  • Declarations. Declare and initialize grammar terminals.
  • Production rules. Specifies the grammar rules.
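
To make the repeat operators in the snippet more concrete, here is a small Python sketch that expands a single repeat rule into plain BNF alternatives. It is only an illustration under stated assumptions, not how Sarcasm itself works: it assumes `{}` means zero or more (as braces do in conventional EBNF), that `*`, `+` and `?` have their usual meanings, and that a list is expanded left-recursively. The function name `desugar`, the helper rule suffix `_items` and the `<empty>` placeholder are all made up for this example.

```python
import re
from typing import List

# Matches rules such as:  Ids := ID*;   Ids := ID+(",");   Ids := ID{","};
RULE = re.compile(
    r'^(?P<name>\w+)\s*:=\s*(?P<item>\w+)'
    r'(?:(?P<op>[*+?])(?:\((?P<delim>"[^"]*")\))?'
    r'|\{(?P<brace>"[^"]*")?\})\s*;$'
)


def desugar(rule: str) -> List[str]:
    """Expand one repeat rule into plain BNF rules (illustrative only)."""
    m = RULE.match(rule.strip())
    if m is None:
        raise ValueError(f"not a repeat rule: {rule!r}")
    name, item = m["name"], m["item"]
    op = m["op"] or "*"                 # assumption: braces mean zero or more
    delim = m["delim"] or m["brace"]    # optional quoted delimiter
    if op == "?":
        if delim:
            raise ValueError("an optional item cannot take a delimiter")
        return [f"{name} := <empty> | {item};"]
    sep = f" {delim} " if delim else " "
    # One-or-more is a left-recursive list; zero-or-more wraps that list
    # in a rule that also allows nothing at all.
    items = name if op == "+" else f"{name}_items"
    rules = [f"{items} := {item} | {items}{sep}{item};"]
    if op == "*":
        rules.insert(0, f"{name} := <empty> | {items};")
    return rules


plus_rules = desugar('Ids := ID+(",");')
star_rules = desugar('Ids := ID{","};')
```

Under these assumptions, `plus_rules` holds the single rule `Ids := ID | Ids "," ID;`, while `star_rules` adds a wrapper rule that also accepts an empty list.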

I won’t go into full detail here, but you can see for yourself:

Here is the full grammar of Sarcasm, written in Sarcasm.

And here is the MarkDown specification documentation generated from that file.

While the project is still at an early stage of development, the grammar is mostly complete. I should be able to bootstrap it in a day or two.