ScriptX was a Dylan-like language that had a simple, more traditional syntax (kind of like C, but with a few quirks and differences). It had a parser that created parse trees (abstract syntax trees), a macro system that operated on parse trees, and a compiler that compiled the parse trees into byte code. Dan Borenstein wrote an alternative parser for ScriptX: a Scheme syntax front-end (ScriptX was very similar to Scheme internally).

Dylan also has both a lispy s-expression-istic and an alternative algol-holic syntax:

http://en.wikipedia.org/wiki/Dylan_programming_language

Python takes a similar approach, but the parse trees its parser produced were very low level and did not make much sense for macros to process: they included many extremely uninteresting intermediate nodes from the BNF that you had to travel through before getting to the interesting part of each expression, they had many special cases, and dealing with them required a lot of interpretation and simplification. But the latest version of Python has a totally rewritten AST parser that I hope is more macro-programmer-friendly! I haven't had a chance to look at the code or play around with it yet, but it sounds like quite a useful improvement.

Access to the AST is useful not only for compiling an alternative text syntax, but also for a visual programming language! It makes it easier to parse a text expression and edit it visually, and compile visual code directly without going through the text syntax.
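Here is a minimal sketch of what "compiling without going through the text syntax" can look like. It assumes the public `ast` module that later Python releases added on top of `_ast` (it is not part of the 2.5 release quoted below), and the helper name `build_sum_module` and the "<visual-editor>" filename are made up for illustration. The tree is built directly from node objects, the way a visual editor might build it, and is then handed to compile():

```python
import ast

def build_sum_module(name, left, right):
    """Build the AST for `name = left + right` without any source text.

    Hypothetical helper: a visual editor could assemble nodes like this
    from its own widgets instead of generating a string to parse.
    """
    assign = ast.Assign(
        targets=[ast.Name(id=name, ctx=ast.Store())],
        value=ast.BinOp(
            left=ast.Constant(value=left),
            op=ast.Add(),
            right=ast.Constant(value=right),
        ),
    )
    module = ast.Module(body=[assign], type_ignores=[])
    # Hand-built nodes have no line/column info; the compiler requires it.
    ast.fix_missing_locations(module)
    return module

tree = build_sum_module("total", 2, 3)
namespace = {}
exec(compile(tree, "<visual-editor>", "exec"), namespace)
print(namespace["total"])
```

The same tree could just as well have come from a parser for some alternative text syntax; the compiler only sees the nodes.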

   -Don

http://www.python.org/download/releases/2.5/NEWS.txt

- A new AST parser implementation was completed. The abstract
 syntax tree is available for read-only (non-compile) access
 to Python code; an _ast module was added.

http://docs.python.org/dev/whatsnew/node16.html

The design of the bytecode compiler has changed a great deal, to no longer generate bytecode by traversing the parse tree. Instead the parse tree is converted to an abstract syntax tree (or AST), and it is the abstract syntax tree that's traversed to produce the bytecode.

It's possible for Python code to obtain AST objects by using the compile() built-in and specifying _ast.PyCF_ONLY_AST as the value of the flags parameter:

from _ast import PyCF_ONLY_AST
ast = compile("""a=0
for i in range(10):
   a += i
""", "<string>", 'exec', PyCF_ONLY_AST)

assignment = ast.body[0]

for_loop = ast.body[1]
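As a rough sketch of what can be done with the resulting nodes, the snippet below parses the same source and looks at the node types. It assumes the public `ast` module from later Python releases, whose `ast.parse()` is documented as equivalent to calling compile() with the `PyCF_ONLY_AST` flag. The variable names `top_level` and `aug_targets` are only for illustration:

```python
import ast

SOURCE = """a=0
for i in range(10):
   a += i
"""

# Equivalent to compile(SOURCE, "<string>", "exec", ast.PyCF_ONLY_AST).
tree = ast.parse(SOURCE, filename="<string>", mode="exec")
assignment, for_loop = tree.body

# Node class names of the two top-level statements.
top_level = [type(node).__name__ for node in tree.body]

# ast.walk() visits every node in the tree, so it finds the `a += i`
# statement nested inside the loop body.
aug_targets = [
    node.target.id
    for node in ast.walk(tree)
    if isinstance(node, ast.AugAssign)
]

print(top_level)
print(aug_targets)
```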

No documentation has been written for the AST code yet. To start learning about it, read the definition of the various AST nodes in Parser/Python.asdl. A Python script reads this file and generates a set of C structure definitions in Include/Python-ast.h. The PyParser_ASTFromString() and PyParser_ASTFromFile(), defined in Include/pythonrun.h, take Python source as input and return the root of an AST representing the contents. This AST can then be turned into a code object by PyAST_Compile(). For more information, read the source code, and then ask questions on python-dev.

The AST code was developed under Jeremy Hylton's management, and implemented by (in alphabetical order) Brett Cannon, Nick Coghlan, Grant Edwards, John Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters, Armin Rigo, and Neil Schemenauer, plus the participants in a number of AST sprints at conferences such as PyCon.

http://www.amk.ca/diary/2005/10/the_ast_branch_lands.html

The AST branch lands
Last week, a historic event happened in the Python source tree: the AST branch was merged into the trunk. I've only followed this work from python-dev e-mails, and with only half my brain, so take the following notes with a grain of salt; I've probably made some silly errors.

To start, what's the AST branch? In all previous versions of CPython, Python source code is parsed and turned into a parse tree. Code then loops over this parse tree to generate Python bytecode. The problem with this design is that it's very difficult to make modifications or analyses of the parse tree, because the parse tree records details that aren't relevant from the point of view of the code's semantics. For example, the parse trees for these two statements are different, even though they implement the same computation: [...]

AST stands for Abstract Syntax Tree. The AST more closely matches the semantics of Python and cleans up special cases. [...]
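The example elided above is not reproduced here. As a separate illustration of the same point (my own, not the diary's), the sketch below assumes the later public `ast` module and a hypothetical helper `same_ast`. It shows that redundant parentheses, which a concrete parse tree has to record, leave no trace in the AST:

```python
import ast

def same_ast(source_a, source_b):
    """Return True if both sources produce structurally identical ASTs.

    ast.dump() omits line and column attributes by default, so only the
    tree structure and node fields are compared.
    """
    return ast.dump(ast.parse(source_a)) == ast.dump(ast.parse(source_b))

# The parentheses change the concrete syntax but not the meaning.
print(same_ast("x = (1 + 2)", "x = 1 + 2"))

# Swapping the operands is a real structural difference.
print(same_ast("x = 1 + 2", "x = 2 + 1"))
```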

The primary benefit is that it should now be easier to write optimization passes and tools such as PyChecker or refactoring browsers, particularly once there's a Python interface to the AST and once everything is documented. There's also some hope that the AST interface can be shared across Python implementations such as Jython and IronPython. We'll see where things go from here.
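To give a feel for the kind of optimization pass this makes easier, here is a toy sketch. It assumes the Python-level `ast` module that arrived in later releases, and the `FoldAdd` class and `fold` function are invented for this example. It folds additions of numeric constants and is not how CPython's own optimizer works:

```python
import ast

class FoldAdd(ast.NodeTransformer):
    """Toy pass: replace `<number> + <number>` with its computed value."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold nested operands first
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, (int, float))
                and isinstance(node.right.value, (int, float))):
            folded = ast.Constant(value=node.left.value + node.right.value)
            return ast.copy_location(folded, node)
        return node

def fold(source):
    """Parse source, run the pass, and return a compilable tree."""
    tree = FoldAdd().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return tree

# `1 + 2 + y` parses as `(1 + 2) + y`, so only the left part is folded.
tree = fold("x = 1 + 2 + y")
namespace = {"y": 4}
exec(compile(tree, "<folded>", "exec"), namespace)
print(ast.dump(tree.body[0].value))
print(namespace["x"])
```

A checker in the style of PyChecker would look much the same, with an `ast.NodeVisitor` that records findings instead of a transformer that rewrites nodes.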