CS CODEDOM Parser is a utility that parses C# source code and
creates a CODEDOM tree of the code (CODEDOM is a set of general classes that represent code,
part of the .NET Framework in the System.CodeDom namespace).
The current version (0.1) is limited: it parses code down to type members
and their parameters, it has very limited support for expressions, and it
does not parse the statements inside members. There are two main reasons
why I have stayed at this level for now:
- First, it was enough for my needs (I wanted to do some code analysis
to enforce coding standards).
- Second, CODEDOM is limited and cannot fully express C# code; for
more details see the section CODEDOM Limitations below.
On the other hand it also parses source code comments, so it can be used
to analyze the interdependencies of code and comments.
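As an illustration of that kind of analysis, here is a minimal sketch of a coding-standard check that walks a CODEDOM tree and reports members with no comments attached. The tree is built by hand as a stand-in for parser output, and the names CommentCheck, FindUncommentedMembers and BuildSampleTree are hypothetical and not part of this package. It uses only System.CodeDom classes (on newer .NET versions these come from the System.CodeDom NuGet package).

```csharp
using System;
using System.CodeDom;
using System.Collections.Generic;

public static class CommentCheck
{
    // Returns "Type.Member" for every type member that has no comment statements.
    public static List<string> FindUncommentedMembers(CodeCompileUnit unit)
    {
        var result = new List<string>();
        foreach (CodeNamespace ns in unit.Namespaces)
            foreach (CodeTypeDeclaration type in ns.Types)
                foreach (CodeTypeMember member in type.Members)
                    if (member.Comments.Count == 0)
                        result.Add(type.Name + "." + member.Name);
        return result;
    }

    // Hand-built stand-in for a tree that a parser would produce (assumption).
    public static CodeCompileUnit BuildSampleTree()
    {
        var unit = new CodeCompileUnit();
        var ns = new CodeNamespace("Sample");
        unit.Namespaces.Add(ns);

        var type = new CodeTypeDeclaration("Account");
        ns.Types.Add(type);

        var documented = new CodeMemberMethod();
        documented.Name = "Deposit";
        documented.Comments.Add(new CodeCommentStatement("Adds money to the account."));
        type.Members.Add(documented);

        var undocumented = new CodeMemberMethod();
        undocumented.Name = "Withdraw";
        type.Members.Add(undocumented);

        return unit;
    }

    public static void Main()
    {
        foreach (string name in FindUncommentedMembers(BuildSampleTree()))
            Console.WriteLine("Missing comment: " + name);
    }
}
```

Because the check only touches CodeDom classes, the same routine could in principle run on a tree produced from any .NET language.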
Also, the stability of this version is low; it is a kind of alpha version.
If anybody wants to help take this further, they are welcome.
The parser is based on the Mono CSharp Compiler code. I looked around
a little for available C# parsers and C# parser building tools (I
wanted a C# parser written in C#) and finally decided on Mono. For more details
about how the Mono parser is used and the other possibilities I explored,
see the section C# parser Tools.
CODEDOM Limitations
At first I thought it was a great idea to use a language-independent syntax tree,
and CodeDom looked nice. If a code analysis tool is built on it, it can
work for any .NET language: you just change the parser and the rest stays the same.
Sounds cool. But after I got into CodeDom, I found that a lot of
language features (not just for C#, but for basically any language) are missing,
so it is not possible to parse source code fully. The main problem is
in expressions and statements, where CodeDom has a very limited set of classes:
there is, for instance, no support for unary operations, and there are many more issues.
I decided to continue with CodeDom, even with its limitations, because it
was enough for the purpose of analyzing code for coding standards (at least
for what I need now; it also allows keeping comments and code in one tree,
which is something I liked), but it remains an
open issue for future development.
Here is a list of issues I've found (and there are more):
- CodeCompileUnit does not have a place for using directives or namespace members, so
they are now placed into the first default namespace
- using_alias_directive - no support found
- nested namespaces - no support found (so the parser flattens the namespace hierarchy)
- variable declaration list (int i,j,k;) - no support - transformed to
individual variable declarations
- pointer_type - no support found
- "jagged" array type (array of arrays) - MS CSharpCodeProvider reverses the order
- params keyword - not supported - params is omitted in parsing and the parameter is
then an ordinary array type parameter
- private modifier on a nested delegate is not shown by CSharpCodeProvider (all
other nested types work fine)
- unsafe modifier - no support found
- readonly modifier - no support found
- volatile modifier - no support found
- explicit interface implementation - not implemented yet (I think this can be
- add and remove accessors for Event - no support found
- virtual and override modifiers do not work in MS CSharpCodeProvider for
- Operator members and Destructors - no support found
- Expressions - no unary expressions (operations) at all, only one-dimensional
arrays, some operators not supported, and more
- Attribute targets: no support found
- Attributes on accessors: no support found
- If CompileUnit contains custom attributes in global scope,
CSharpCodeProvider prints them before global using directives (this is because
the using directives have to be in the first namespace)
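Two of these limitations can be made concrete with a small sketch. Since CodeDom has no unary operator node, one possible workaround is to model -x as the binary expression 0 - x; a declaration list such as int i,j,k; becomes one field node per name. The class LimitationsDemo and its methods are hypothetical, and the 0 - x rewrite is only an illustration, not necessarily what this parser does (the text above does not say how it handles unary expressions). System.CodeDom is assumed to be available (on newer .NET versions via the System.CodeDom NuGet package).

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;

public static class LimitationsDemo
{
    // No unary-operator class exists in CodeDom, so "-x" is modeled as "0 - x" (assumed workaround).
    public static CodeExpression Negate(CodeExpression operand)
    {
        return new CodeBinaryOperatorExpression(
            new CodePrimitiveExpression(0),
            CodeBinaryOperatorType.Subtract,
            operand);
    }

    // "int i, j, k;" has no single CodeDom node: emit one field declaration per name.
    public static CodeMemberField[] SplitDeclaration(Type type, params string[] names)
    {
        var fields = new CodeMemberField[names.Length];
        for (int n = 0; n < names.Length; n++)
            fields[n] = new CodeMemberField(type, names[n]);
        return fields;
    }

    // Renders a single expression back to C# source text.
    public static string Render(CodeExpression expression)
    {
        using (var provider = new CSharpCodeProvider())
        using (var writer = new StringWriter())
        {
            provider.GenerateCodeFromExpression(expression, writer, new CodeGeneratorOptions());
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Render(Negate(new CodeVariableReferenceExpression("x"))));
        foreach (CodeMemberField field in SplitDeclaration(typeof(int), "i", "j", "k"))
            Console.WriteLine(field.Type.BaseType + " " + field.Name);
    }
}
```

Note that such a rewrite does not round-trip: the regenerated source contains a subtraction where the original had a unary minus, which is one reason a fully faithful parse is not possible with plain CodeDom.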
C# parser Tools
I wanted to use an existing tool, so I looked around and found these
interesting projects:
- Mono project
They are implementing a complete open source .NET platform (they modified the
jay parser generator and used it to generate the parser).
- Compiler Writing Tools using C#, from Malcolm
Crowe of the University of Paisley
Mr. Crowe has created a parser and lexer generator in C#. I played with these
tools quite a bit, but when I wanted to do something bigger, I've got
- C# grammar for
flex/bison, written by James Power of the National University of Ireland
Contains scripts for the well-known tools bison and flex, which can generate a C
parser. I thought I could use them in some C# port of those tools, but I was
not able to, so I finally used the grammar from Mono.
This is a port of JB Parser and Lexer Generation for Java (which itself is a port of
bison and flex). But the current version is alpha and I was not able to make
even their calculator example work (which authors claim it was
- CsLex from Brad Merrill
It is a lexer generator.
I've also looked at the MS Rotor project; the C#
parser there is written in C++ (and it is not under an open source license).
So finally I decided to use the Mono source: I used their lexer, jay, and
their jay grammar to generate my parser. In the jay grammar I use my own code to
create CodeDom objects.
Description of package
CS CODEDOM Parser package consists of:
- CodeDom parser itself (/ directory)
- NUnit tests for the parser (/NUnitTests directory)
Contains a bunch of tests I've used to check the functionality of the parser - if you want to run them you should
- testParser (/testParser directory)
Simple command line utility that tests the parser - it parses a file
(name supplied as a command line parameter) and writes to stdout the code
generated by CSharpCodeProvider (the C# CodeDom provider class, Microsoft.CSharp.CSharpCodeProvider).
- CodeTreeView (/CodeTreeView directory)
Simple Windows application that opens a file and displays the CODEDOM tree in
the left part (treeview control) and the original source in the right part (textbox
control). When you click on a tree node, the textbox scrolls to show the corresponding code.
It is something like a very simple source code viewer.
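The output half of what testParser does can be sketched as follows: given a CodeCompileUnit, CSharpCodeProvider writes the regenerated C# source to stdout. This is only a sketch under assumptions: the unit is built by hand because the parser's own entry point is not shown here, and the names PrintTree and BuildSampleUnit are hypothetical. System.CodeDom is assumed to be available (on newer .NET versions via the System.CodeDom NuGet package).

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using Microsoft.CSharp;

public static class PrintTree
{
    // Stand-in for parser output; the real tool would obtain the unit
    // by parsing the file named on the command line (assumption).
    public static CodeCompileUnit BuildSampleUnit()
    {
        var unit = new CodeCompileUnit();
        var ns = new CodeNamespace("Sample");
        ns.Imports.Add(new CodeNamespaceImport("System"));
        unit.Namespaces.Add(ns);

        var type = new CodeTypeDeclaration("Greeter");
        type.IsClass = true;
        ns.Types.Add(type);

        var method = new CodeMemberMethod();
        method.Name = "Greet";
        method.Attributes = MemberAttributes.Public | MemberAttributes.Final;
        method.Parameters.Add(new CodeParameterDeclarationExpression(typeof(string), "name"));
        type.Members.Add(method);

        return unit;
    }

    public static void Main()
    {
        using (var provider = new CSharpCodeProvider())
        {
            var options = new CodeGeneratorOptions();
            options.BracingStyle = "C"; // opening braces on their own line
            provider.GenerateCodeFromCompileUnit(BuildSampleUnit(), Console.Out, options);
        }
    }
}
```

Comparing this regenerated text with the original source file is a quick way to spot constructs that were lost or altered on the way through CodeDom.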
CS CODEDOM Parser and the tools included in this package are distributed
under the
25.10.2002 - new updated version available, containing fixes to some bugs:
- Parsing of inner classes
- Parsing of event declarations, where there are several events in one
- Default access modifiers
- Type attributes for the first type in a source file, where there is no
- Comments parsing (now we trim EOL from line comments)
You can download the source code here.
Debug binaries are also part of the package. You can check for the latest
version at http://www.sweb.cz/ivan.zderadicka/csparser.html.
CS CodeDom Parser is also available at sourceforge.net, where you can get
the latest sources and also join the project as a developer. Project page is at
The basic idea for future development is to extend CodeDom to support
all language features, so that sources can be completely parsed. (An alternative
is to leave CodeDom and have my own syntax tree, but I still like the idea of
a language-independent tree structure, which can be used in different
Reporting of errors and warnings should be improved (unify codes and
messages, unify error reporting, Report class should store reported errors).
Also, the parser should be improved to indicate the location of syntax elements
in the source file more exactly.
Better separation between the parser and the CODEDOM builder is also needed.
If somebody likes the tool and wants to help with its improvements, he is