Ethereal-dev: [ethereal-dev] Protocol Description Language

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Richard Sharpe <sharpe@xxxxxxxxxx>
Date: Sat, 31 Jul 1999 01:36:40 +0900
Hi,

after receiving some encouragement from Andrew to keep going on the
protocol description language project I am working on for generating
Ethereal decode modules for the SMB protocol and other things, I thought I
would put down some thoughts and ask for ideas.

Currently I have a number of problems.  Some of these are parser problems
that can be fixed by building a cleaner parser.  That is, they are simply a
matter of programming.

However, a big problem I currently have is with the language itself.

An example might show the problems.  Here is an example SMB description:

SMB funny-smb { # An SMB is a clause
  andx;     # Which can contain an andx marker that triggers the parser to
            # generate code to handle ANDXs
  request { # There can be one or more requests
    UCHAR Word Count (WCT) = 2;  
    USHORT Some Funny Field;
    BITFIELD 16 A Funny Field = {
      0x01 = { "This value not set" , "This value set" };
    };
    USHORT Byte Count (BCC);
    STRING A String Field;
  }
  response { # There can be one or more responses ...
    # etc ...
  }
}
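
To make the description above concrete, here is a minimal Python sketch of what a decoder for the request part might do by hand. It is only an illustration, not the code the parser generates. It assumes little-endian fields (as SMB uses on the wire) and, purely as an assumption, that the STRING field is NUL-terminated ASCII. The function name decode_funny_request and the dictionary keys are made up for this example.

```python
import struct

# Labels for bit 0x01 of "A Funny Field", copied from the description above.
FUNNY_BITS = {0x01: ("This value not set", "This value set")}


def decode_funny_request(data: bytes) -> dict:
    """Decode the hypothetical funny-smb request laid out above."""
    # "<" = little-endian, no padding; B = UCHAR, H = USHORT.
    layout = "<BHHH"
    wct, some_field, flags, bcc = struct.unpack_from(layout, data, 0)
    offset = struct.calcsize(layout)  # 1 + 2 + 2 + 2 = 7 bytes

    # Assumption: the STRING field is NUL-terminated ASCII.
    end = data.index(b"\x00", offset)
    text = data[offset:end].decode("ascii")

    # Pick the "not set" or "set" label for each known bit.
    flag_text = {
        mask: labels[1 if flags & mask else 0]
        for mask, labels in FUNNY_BITS.items()
    }
    return {
        "wct": wct,
        "some_funny_field": some_field,
        "a_funny_field": flag_text,
        "bcc": bcc,
        "a_string_field": text,
    }


# Build a sample request and decode it again.
sample = struct.pack("<BHHH", 2, 0x1234, 0x0001, 6) + b"hello\x00"
decoded = decode_funny_request(sample)
```

Everything in that function is mechanical, which is why generating it from a description looks attractive.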

Now, the first big problem that I see is that the language and parser are
tied to generating Ethereal decode modules.  I would like something a
little more general and am looking for ideas on this.

I think I need to separate declaration from actions.

Something that comes to mind is that each statement could have a
declaration part and an action part.  When combined with:

  Include files
  Functions?
  Type definitions

I could have a language that generates Ethereal decode modules as well as
one that generates code to do a differential comparison of Samba with, say,
NT4.0 SP6 or NT 2000?

A master include file could specify the code generated by each action?
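
One way to picture the master include file is as a table per backend that maps each action name to a code generator, while the declarations stay the same. The Python sketch below is only an illustration of that split. The emitted calls (show, read_u16, check_equal) are placeholder names invented here, not Ethereal or Samba functions, and the backend names "decode" and "compare" are likewise assumptions.

```python
# Hypothetical generators for the action "gen-ushort", one per backend.
def decode_ushort(name, offset):
    # Placeholder output for a decode module.
    return f'show("{name}", read_u16(buf, {offset}));'


def compare_ushort(name, offset):
    # Placeholder output for a differential comparison of two captures.
    return f'check_equal("{name}", read_u16(a, {offset}), read_u16(b, {offset}));'


# The "master include": backend name -> action name -> generator.
BACKENDS = {
    "decode": {"gen-ushort": decode_ushort},
    "compare": {"gen-ushort": compare_ushort},
}

# Declarations shared by every backend: (type, field name, action name).
FIELDS = [
    ("USHORT", "Field 1", "gen-ushort"),
    ("USHORT", "Field 2", "gen-ushort"),
]

SIZES = {"UCHAR": 1, "USHORT": 2}  # field widths in bytes


def generate(backend, fields):
    """Walk the declarations and call the backend's action for each field."""
    actions = BACKENDS[backend]
    offset = 0
    lines = []
    for ftype, name, action in fields:
        lines.append(actions[action](name, offset))
        offset += SIZES[ftype]
    return lines


decode_lines = generate("decode", FIELDS)
compare_lines = generate("compare", FIELDS)
```

Swapping the table changes what is generated without touching the protocol description itself.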

So, the syntax might look like:

proto SMB = {
unit funny-smb {
  UCHAR Word Count (WCT) [wct] %gen-ushort();
  # Which means there is a field of type UCHAR and that the action
  # gen-ushort should be called with the parse tree element for this 
  # item to generate code to handle the field.
  UCHAR AndXCommand [andxc];
  # Declare this item for later use, ie, name the node in the parse tree
  UCHAR AndXReserved;  # Ignore this field
  USHORT AndXOffset [andxoff];
  request {
    %gen-if(wct, "=", 4);  # Generate code to compare wct to 4
    USHORT Field 1 %gen-ushort();
    USHORT Field 2 %gen-ushort();
    etc ...
    %end-gen-if();         # End of code to display one format
    %gen-if(wct, "=", 3);
    USHORT Field 1 %gen-ushort();
    USHORT Field 2 %gen-ushort();
    etc ...
    %end-gen-if();
    etc ...
  };
  response {
    USHORT Count [cnt] %gen-ushort(cnt);
    ... some fields
    REPEAT cnt of {  # Indicates repeats based on cnt
      USHORT Resp 1 %gen-ushort();
      USHORT Resp 2 %gen-ushort();
    };
  };
  %gen-andx(andxc);  # Generate code to do the andX
}
}
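
As a rough sketch of how one field statement in this syntax could be split into its declaration part and its action part, here is a small Python example. The grammar it accepts is only my reading of the example above (type, free-text label, optional [name], optional %action(args), then a semicolon), and the function name parse_field is invented for the illustration. It does not handle statements that are only an action, such as %gen-if(...), or block constructs such as REPEAT.

```python
import re

# type, free-text label, optional [name], optional %action(args), then ";"
FIELD_RE = re.compile(
    r"(?P<type>[A-Z]+)\s+"
    r"(?P<label>.*?)\s*"
    r"(?:\[(?P<name>[\w-]+)\])?\s*"
    r"(?:%(?P<action>[\w-]+)\((?P<args>[^)]*)\))?\s*;"
)


def parse_field(line):
    """Split one field statement into declaration and action parts.

    Returns None when the line is not a field statement.
    """
    m = FIELD_RE.match(line.strip())
    if m is None:
        return None
    args = m.group("args") or ""
    return {
        "type": m.group("type"),
        "label": m.group("label"),
        "name": m.group("name"),      # None when the node is not named
        "action": m.group("action"),  # None when the field is ignored
        "args": [a.strip() for a in args.split(",") if a.strip()],
    }


wct = parse_field("UCHAR Word Count (WCT) [wct] %gen-ushort();")
reserved = parse_field("UCHAR AndXReserved;  # Ignore this field")
count = parse_field("USHORT Count [cnt] %gen-ushort(cnt);")
```

A field with no action, like AndXReserved, comes back with action set to None, which matches the "ignore this field" reading.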

There are some other ideas lurking around ... What do you folks think?

Regards
-------
Richard Sharpe, sharpe@xxxxxxxxxx, NS Computer Software and Services P/L,
Samba (Team member www.samba.org), Ethereal (Team member www.zing.org)
Co-author, SAMS Teach Yourself Samba in 24 Hours