- Subject: Re: Protocol Specification in Lua
- From: "Robert G. Jakabosky" <bobby@...>
- Date: Fri, 1 Oct 2010 13:40:11 -0700
On Friday 01, John Passaniti wrote:
> I'm looking for a little design inspiration from others. I have a
> basic approach (which I'll describe, below), but chances are good that
> someone here has already implemented something like this or can see a
> better solution.
>
> I have a UDP-based communications protocol for controlling a device.
> The protocol is implemented in C and specified in a tedious Microsoft
> Word document. The base protocol is simple and extensible and isn't
> interesting. What is interesting is there are a large number of
> binary messages, each with a unique structure. I'm bothered that when
> I change the C code to add a new message, I also have to update the
> Word document. The two are separate things.
>
> I'd like to write a tool that could parse these messages and display
> them in human readable form. And I'd like another tool that does the
> reverse-- specify a message in a human-readable form, and get back out
> a message. And I'd like to be able to write a Wireshark dissector for
> these messages. And I'd like... well, more stuff.
>
> The solution would seem to be to express the protocol in some
> abstract, high-level representation (like... a Lua table). Once in
> this form, I can write a tool that will dump out a description of the
> protocol in HTML. I could also have it generate optimized C code to
> implement the protocol parser (such as converting to a finite state
> machine). The table would also be at the heart of the Wireshark
> dissector and the tools to display and create packets. And so on.
>
> Now whenever I make changes to the protocol, I edit the Lua table that
> describes the messages, and I'm done. Everything flows from that.
>
> So how about something like this:
>
> {
> OPT_MUTE = {
> option = 0x02,
> read = true,
> write = true,
> message = {
> {
> name = "type",
> type = OutputInput,
> desc = "Channel Type"
> },
> {
> name = "number",
> type = ChannelNumber,
> desc = "Channel Number"
> },
> {
> name = "mute",
> type = OffOn,
> desc = "Mute Status"
> }
> }
> }
> }
>
> So I have a message named OPT_MUTE that has a binary value of 0x02,
> it's both readable and writable, and has three items in the payload.
> The three items are named type, number, and mute and are of specific
> types (OutputInput, ChannelNumber, and OffOn). Each of these types is
> itself a table that describes them further:
>
> ChannelNumber = {
> bytes = 1,
> low = 0,
> high = 23,
> }
>
> There are (potentially) a variety of other fields to describe messages
> (which devices implement them, numeric ranges), and the payload
> section can have repetitions of an item or a larger structure, or
> optional data at the end of a message.
>
> What I'm looking for is if anyone has done this kind of thing before
> and what kind of representation you came up with. The biggest
> stumbling block I have is the representation of repeated and optional
> data in the payload.
>
> Suggestions?
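For the repeated and optional payload data, here is a minimal sketch of one possible representation, in the same table style as your example. The keys "count", "group" and "optional", the OPT_MUTE_LIST message, and the big-endian byte order are all assumptions made up for this illustration, not part of your protocol:

```lua
-- Hypothetical type tables, modelled on the ChannelNumber example above.
local OutputInput   = { bytes = 1, low = 0, high = 1 }
local ChannelNumber = { bytes = 1, low = 0, high = 23 }
local OffOn         = { bytes = 1, low = 0, high = 1 }

-- Made-up keys for this sketch:
--   group + count = "n"  repeat the group as many times as field "n" says
--   optional = true      the item may be missing at the end of the message
local OPT_MUTE_LIST = {
  option = 0x03, -- hypothetical option value
  message = {
    { name = "type", type = OutputInput },
    { name = "n", type = ChannelNumber },
    { name = "mutes", count = "n", group = {
      { name = "number", type = ChannelNumber },
      { name = "mute", type = OffOn },
    } },
    { name = "flags", type = OffOn, optional = true },
  },
}

-- Read an unsigned integer of t.bytes bytes at pos (big-endian is assumed).
-- Returns nil and the unchanged position if the data ends too early.
local function read_uint(data, pos, t)
  if pos + t.bytes - 1 > #data then return nil, pos end
  local v = 0
  for i = pos, pos + t.bytes - 1 do
    v = v * 256 + data:byte(i)
  end
  assert(v >= t.low and v <= t.high, "value out of range")
  return v, pos + t.bytes
end

-- Walk the item list and build a table of decoded values.
local function decode(items, data, pos)
  local out = {}
  for _, item in ipairs(items) do
    if item.group then
      local list = {}
      for i = 1, out[item.count] do
        list[i], pos = decode(item.group, data, pos)
      end
      out[item.name] = list
    else
      local v, npos = read_uint(data, pos, item.type)
      if v == nil and not item.optional then
        error("truncated message at field " .. item.name)
      end
      out[item.name], pos = v, npos
    end
  end
  return out, pos
end

-- type = 1, n = 2, then two (number, mute) pairs; the optional "flags" is absent.
local msg = decode(OPT_MUTE_LIST.message, string.char(1, 2, 5, 1, 6, 0), 1)
print(msg.type, msg.n, msg.mutes[2].number, msg.flags)
```

The same table could drive an encoder or an HTML dump by walking the item list in the same way.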
Take a look at how the XCB [1] (X protocol C-language Binding) project
generates its C code for parsing the wire protocol. They describe the
protocol in XML [2] (not pretty, but you can take some ideas from it). The
XML format is explained here [3].
Here are a few more protocol languages: Avro [4], Protocol Buffers [5], Thrift
[6], and Etch [7].
Also, here is a message-based protocol [8] that is used over UDP. I even wrote
a Wireshark protocol dissector for it in Lua [9].
I have done a lot of research into this subject over the past year and even
created my own protocol language where the definitions are written as Lua
code (see the attached test_records.lua file for an example). I wrote a code
generator that takes a definition file and generates C code to handle
encoding & decoding those records into binary data. The main reason I used
Lua code for the definition files is that I didn't want to create a full
parser for a new definition language that would be changing a lot as I worked
on it. I can describe it in more detail if you are interested.
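To give a rough idea of why plain Lua works as a definition language, here is a small sketch of how calls like field "type" "name" { ... } can be built from chained functions. It relies only on Lua's call syntax for string and table arguments. The function bodies and the "Point" record are hypothetical and written for this illustration; this is not the code from my generator:

```lua
-- field "type" "name" { options } is three chained calls:
-- field("type")("name")({ options })
local function field(ftype)
  return function(name)
    return function(opts)
      return { kind = "field", type = ftype, name = name, default = opts.default }
    end
  end
end

-- record "Name" { ... } collects the field descriptions from its body table.
local function record(name)
  return function(body)
    local rec = { kind = "record", name = name, fields = {} }
    for _, item in ipairs(body) do
      if item.kind == "field" then
        rec.fields[#rec.fields + 1] = item
      end
    end
    return rec
  end
end

-- A definition then reads almost like a declaration.
local rec = record "Point" {
  field "int32" "x" { default = 0 },
  field "int32" "y" { default = 0 },
  field "string" "label" {},
}

-- A code generator would walk rec.fields; here we just print them.
for _, f in ipairs(rec.fields) do
  print(f.type, f.name, f.default)
end
```

Since the definition is ordinary Lua, the Lua parser does all the parsing work, and new keywords are just new functions.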
I also used parts of that code generator in a Lua bindings generator for all
the C code in my main project. You can see an example bindings definition in
the attached gd_module.lua file. The bindings generated from gd_module.lua
would be used like this:
require("gd")
local img = gd.gdImage(200,200) -- create image 200x200
local red = img:color_allocate(255,0,0) -- create red color
img:line(10,10,50,50, red) -- draw red line
img:toPNG("line.png") -- output png image
I have been thinking about releasing the bindings generator. Right now I call
it "Lua API Gen", but I am not sure if that is a good name or not. If anyone
is interested in it let me know.
1. http://xcb.freedesktop.org/
2. http://cgit.freedesktop.org/xcb/proto/tree/src/xproto.xml
3. http://cgit.freedesktop.org/xcb/proto/tree/doc/xml-xcb.txt
4. http://avro.apache.org/docs/current/
5. http://code.google.com/p/protobuf/
6. http://incubator.apache.org/thrift/
7. https://cwiki.apache.org/ETCH/index.html
8. http://wiki.secondlife.com/wiki/Message_Layout
9. http://opensimulator.org/wiki/LLUDP_Dissector
10. http://library.gnome.org/devel/glib/stable/
--
Robert G. Jakabosky
package "TestRecs" "ntest" {
-- test enum.
enum "TestEnum" {
TYPE_INVALID = 0,
TYPE_BOOL = 1,
TYPE_INT8 = 2,
TYPE_INT16 = 3,
TYPE_INT32 = 4,
TYPE_INT64 = 5,
TYPE_VAR32 = 6,
TYPE_VAR64 = 7,
TYPE_UINT8 = 8,
TYPE_UINT16 = 9,
TYPE_UINT32 = 10,
TYPE_UINT64 = 11,
TYPE_UVAR32 = 12,
TYPE_UVAR64 = 13,
TYPE_FLOAT = 14,
TYPE_DOUBLE = 15,
},
-- test records
record "TestRecord" {
version { 0, 1 },
-- sub-type structure
struct "TestBlock1" {
field "uint32" "Test1" { default = 543 },
},
struct "NeighborBlock" {
field "uint32" "Test0" { default = 0 },
field "uint32" "Test1" { default = 1 },
field "uint32" "Test2" { default = 2 },
},
-- example of usage of enum type.
field "TestEnum" "enum_val" {},
field "TestEnum" "enum_val_int32" { default = 'TYPE_INT32' },
-- bool fields are packed together at the start of the record.
field "bool" "bool_val",
field "bool" "bool_val_true" { default = true },
-- fixed length signed integers of length 8/16/32/64-bits
field "int8" "int8_val" { default = -21 },
field "int16" "int16_val" { default = -4321 },
field "int32" "int32_val" { default = -4321 },
field "int64" "int64_val" { default = -4321 },
-- variable length signed 32-bit & 64-bit integers (encoded like protobufs)
field "var32" "var32_val" { default = -4321 },
field "var64" "var64_val" { default = -4321 },
-- fixed length unsigned integers of length 8/16/32/64-bits
field "uint8" "uint8_val" { default = 21 },
field "uint16" "uint16_val" { default = 4321 },
field "uint32" "uint32_val" { default = 4321 },
field "uint64" "uint64_val" { default = 4321 },
-- variable length unsigned 32-bit & 64-bit integers (encoded like protobufs)
field "uvar32" "uvar32_val" { default = 4321 },
field "uvar64" "uvar64_val" { default = 4321 },
field "float" "float_val" { default = 4.321 },
field "double" "double_val" { default = 4.321 },
-- strings are encoded with the string length first, encoded as a uvar32 type.
field "string" "string_val" { default = "this is a test" },
-- embedded TestBlock1 struct (i.e. this is the same as a fixed length array
-- of TestBlock1 structs with length 1)
field "TestBlock1" "testblock1" {},
-- Arrays can be fixed or variable length. The length of a variable length array is
-- encoded as a uvar32 type.
-- fixed length array of 4 NeighborBlock structs
array "NeighborBlock" "neighborblock" { 4 },
-- variable length array of NeighborBlock structs
array "NeighborBlock" "neighborblock_var" {},
-- fixed length array of 4 bools (encoded as 4 bits, padded to 8 bits)
array "bool" "bitarray_4" { 4 },
-- variable length array of bools (padded to a multiple of 8 bits)
array "bool" "bitarray_var" {},
-- fixed length array of 4 strings
array "string" "strarray_4" { 4 },
-- variable length array of strings.
array "string" "strarray_var" {},
}
}
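The comments in test_records.lua mention variable length unsigned integers "encoded like protobufs" and strings prefixed with a uvar32 length. As a minimal sketch, assuming this means the usual base-128 varint layout (7 data bits per byte, low group first, high bit set on every byte except the last), the encoding could look like the following. The function names are hypothetical, signed values are not covered, and this is not the C code the generator produces:

```lua
-- Encode a non-negative integer as a base-128 varint.
local function encode_uvar(n)
  local bytes = {}
  repeat
    local low = n % 128
    n = math.floor(n / 128)
    if n > 0 then low = low + 128 end -- continuation bit: more bytes follow
    bytes[#bytes + 1] = string.char(low)
  until n == 0
  return table.concat(bytes)
end

-- Decode a varint starting at pos; returns the value and the next position.
local function decode_uvar(data, pos)
  local n, scale = 0, 1
  while true do
    local b = data:byte(pos)
    assert(b, "truncated varint")
    pos = pos + 1
    n = n + (b % 128) * scale
    if b < 128 then return n, pos end
    scale = scale * 128
  end
end

-- A string is its length as a varint, followed by the raw bytes.
local function encode_string(s)
  return encode_uvar(#s) .. s
end

local function decode_string(data, pos)
  local len, start = decode_uvar(data, pos)
  assert(start + len - 1 <= #data, "truncated string")
  return data:sub(start, start + len - 1), start + len
end

-- Round trip of the default values used in the record above.
print(decode_uvar(encode_uvar(4321), 1))
print(decode_string(encode_string("this is a test"), 1))
```

A variable length array would be handled the same way as a string: the element count as a varint, followed by the encoded elements.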
c_module "gd" {
hide_meta_info = true,
include "gd.h",
object "gdImage" {
method_new {
c_call "gdImage *" "gdImageCreate" { "int", "sx", "int", "sy" }
},
method_delete {
c_call "void" "gdImageDestroy" {}
},
method "color_allocate" {
c_call "int" "gdImageColorAllocate"
{ "int", "r", "int", "g", "int", "b" }
},
method "line" {
c_call "void" "gdImageLine"
{ "int", "x1", "int", "y1", "int", "x2", "int", "y2", "int", "colour" }
},
method "toPNG" {
var_in { "const char *", "name" },
c_source [[
FILE *pngout = fopen( ${name}, "wb");
gdImagePng(${this}, pngout);
fclose(pngout);
]]
},
}
}