Thursday, December 4, 2014

The quest for a formal language

As I started to indicate in the previous post, the project I am working on demands a formal language of its own. Before I lose my mind implementing it in a generic yacc, I need to play with it to get an idea of what I need. So, for the time being, I am just coding a long Python function that takes a grammar (a dictionary of rules) and generates the text.

(there is an XKCD for everything)

In terms of formal languages and parsers, I have never really done more than play around. There is probably some language out there that solves the problem, but (to keep with programmer-related clichés) it is always fun to reinvent the wheel.

So, this is the syntax the polished version accepts so far (all passing my hand-written unit tests!). With a good thesaurus, you can already be dangerous. ;)


Feature                                       Example                        Acceptable results for "@test@" (one per line)

Basic literal                                 test: "abc"                    abc

Sequence                                      test: "a::b::c"                a
                                                                             b
                                                                             c

In-line sequence                              test: "[a|b|c]"                a
                                                                             b
                                                                             c

Literal function                              test: "@letter@"               a
                                              letter: "a::b::c"              b
                                                                             c

Declaration                                   test: "@l<<letter@ and @l@"    a and a
                                              letter: "a::b::c"              b and b
                                                                             c and c

Post-processing on literal                    test: "text->capital"          Text

Post-processing on sequence elements          test: "a::b->upper"            a
                                                                             B

Post-processing on in-line sequence elements  test: "[A->lower|B]"           a
                                                                             B
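To make the first five rows of the table concrete, here is a minimal sketch of how such a grammar could be expanded. It is not the actual function from the project: the names `TOKEN`, `expand` and `expand_rule` are made up for illustration, the regular expression is inferred only from the examples above, and post-processing (`->`) and nested brackets are left out. Instead of picking one result at random, it lists every acceptable result, which makes it easy to compare with the table.

```python
import re

# Hypothetical token pattern, inferred from the examples in the table:
#   [a|b|c]        in-line sequence
#   @name@         reference to another rule (or to a declared variable)
#   @var<<name@    declaration: expand rule `name` and remember it as `var`
TOKEN = re.compile(r"\[([^\]]*)\]|@(?:(\w+)<<)?(\w+)@")


def expand(grammar, text, bound=None):
    """Return every string that `text` can expand to, scanning left to right."""
    bound = bound or {}
    match = TOKEN.search(text)
    if match is None:
        return [text]  # plain literal, nothing left to expand

    head, tail = text[:match.start()], text[match.end():]
    inline, var, name = match.groups()

    # Each option is (expanded value, variable bindings to use for the tail).
    options = []
    if inline is not None:
        for alternative in inline.split("|"):
            for value in expand(grammar, alternative, bound):
                options.append((value, bound))
    elif var is None and name in bound:
        # A previously declared variable: reuse the remembered value.
        options.append((bound[name], bound))
    else:
        for value in expand_rule(grammar, name, bound):
            new_bound = {**bound, var: value} if var else bound
            options.append((value, new_bound))

    results = []
    for value, env in options:
        for rest in expand(grammar, tail, env):
            results.append(head + value + rest)
    return results


def expand_rule(grammar, name, bound=None):
    """Expand the rule `name`, whose alternatives are separated by `::`."""
    results = []
    for alternative in grammar[name].split("::"):
        results.extend(expand(grammar, alternative, bound))
    return results


grammar = {"test": "@l<<letter@ and @l@", "letter": "a::b::c"}
print(expand_rule(grammar, "test"))  # ['a and a', 'b and b', 'c and c']
```

A generator in the spirit of the post would then return just one of these, for example with `random.choice(expand_rule(grammar, "test"))`. An unknown rule name simply raises a `KeyError` here, and a rule that refers to itself would recurse forever.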

The post-processing functions already defined are lower, upper, title, and capital.
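As a rough idea of what these four functions could look like, here is a small sketch. The table mapping names to functions and the helper `post_process` are hypothetical, and so is the exact behaviour of `capital`: I am assuming it uppercases only the first character and leaves the rest alone, which is consistent with "text->capital" giving "Text". Only a single `->name` suffix per element is handled.

```python
def capital(text):
    # Assumption: uppercase the first character only, keep the rest untouched.
    return text[:1].upper() + text[1:]


# Hypothetical lookup table from the name after `->` to a Python function.
POST = {
    "lower": str.lower,
    "upper": str.upper,
    "title": str.title,
    "capital": capital,
}


def post_process(element):
    """Apply a `->name` suffix, if present, to a single element."""
    text, arrow, name = element.partition("->")
    if not arrow:
        return element  # no post-processing requested
    return POST[name](text)


print(post_process("text->capital"))                           # Text
print([post_process(e) for e in "a::b->upper".split("::")])    # ['a', 'B']
print(post_process("new york->title"))                         # New York
print(post_process("new york->capital"))                       # New york
```

The last two lines show why both `title` and `capital` are worth having: the first capitalizes every word, the second only the start of the text.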
