Plof Grammar

Ok, hold on to your horses, because a lot will be introduced in this tutorial. First, let's take a look at a notation for specifying grammars called Backus-Naur Form (BNF). BNF is perhaps the most logical way to define a grammar, and the syntax for altering Plof's grammar is very similar to it. BNF is composed of a series of facts (well, actually, productions). Each production defines a nonterminal in terms of terminals and other nonterminals. For those of you who have no idea what I'm talking about, a nonterminal is an abstract name, similar to a variable, and a terminal is a concrete value (a literal). Let's look at a simple example of how BNF works, defining the grammar of a letter:

< letter > ::= < greeting > < body > < footer >
< greeting > ::= "Dear " < text > "," < newlines > | "To " < text > ":" < newlines > | < text > ":" < newlines >
< text > ::= < notnewline > < text > | < notnewline >
< newlines > ::= "\n" < newlines > | "\n" | ""
< body > ::= < text > < newlines >
< footer > ::= "From, " < text > | "Love, " < text >

Nonterminals are placed inside < and > (note that BNF doesn't require a space before and after the name of a nonterminal -- the Plof wiki wasn't showing the text inside unless I added the spaces). ::= defines the nonterminal on its left as whatever is on its right. "|" is basically "or", meaning that the nonterminal can match any one of the alternatives. This BNF example also uses recursion, which is very common in BNF; for example, < text > is defined as either < notnewline > followed by < text >, or just < notnewline >. So the parser keeps consuming characters until a newline is found. By the way, < notnewline > hasn't been defined here -- that's just because I'm lazy. In reality it would be defined as < notnewline > ::= "a" | "b" | "c" ...
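If the recursion is hard to picture, here is a rough sketch of it in Python (not Plof, and not how any real BNF tool is implemented). The function names are made up for illustration, and < notnewline > is assumed to mean "any single character except a newline". Each function returns the position just past what it matched, or None if it didn't match:

```python
# Hypothetical recognizers mirroring productions from the letter grammar.
# Each returns the index just past the match, or None on failure.

def match_notnewline(s, i):
    # < notnewline > : assumed to be any single character other than "\n"
    if i < len(s) and s[i] != "\n":
        return i + 1
    return None

def match_text(s, i):
    # < text > ::= < notnewline > < text > | < notnewline >
    j = match_notnewline(s, i)
    if j is None:
        return None
    rest = match_text(s, j)                  # first alternative: recurse
    return rest if rest is not None else j   # second alternative: stop here

def match_newlines(s, i):
    # < newlines > ::= "\n" < newlines > | "\n" | ""
    if i < len(s) and s[i] == "\n":
        return match_newlines(s, i + 1)
    return i                                 # the empty alternative always succeeds

def match_body(s, i):
    # < body > ::= < text > < newlines >
    j = match_text(s, i)
    return None if j is None else match_newlines(s, j)
```

Calling match_text on "Hi there\nmore" stops at the newline, which is exactly the "keeps going until a newline is found" behaviour described above.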

So now you've seen a glimpse of how BNF works. If you're interested, you can see BNF's grammar defined in BNF itself at Wikipedia. Now let's relate all this to Plof. Grammar alteration is always put inside a grammar block:

grammar  {
  ...
}

like so. All productions (which are similar to BNF's) are put inside a grammar block. The main differences between Plof and BNF are that Plof uses "=" instead of "::=" and that "|" doesn't exist in Plof; multiple productions must be used instead (you'll see why in a second). Also, nonterminals aren't placed inside < and > like they are in BNF. Here's an example from plofcore/src/pul/boolean_g.plof defining the logical "or" operator:

plof_or = plof_or "\|\|" plof_or_next => plof {
    $0.opOr($2)
}
plof_or = plof_or_next => plof { $0 }
plof_or => plof_and

Now I should probably explain a few things. plof_or and plof_or_next are nonterminals. The terminal "\|\|" doesn't match those exact characters, because string terminals are actually regexes: "\" is the escape character, so "\|\|" matches the literal text "||", which is the "or" operator in Plof.
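To see why the escaping matters, here is a quick check using Python's re module as a stand-in. I'm assuming Plof's regex flavour treats "\|" and a bare "|" the same way Python's does, which the specification would have to confirm:

```python
import re

# Escaped: each "\|" is a literal pipe, so the pattern matches the text "||".
escaped = re.compile(r"\|\|")

# Unescaped: "|" means alternation, so "||" is three empty alternatives
# and can only match the empty string.
unescaped = re.compile(r"||")

found = escaped.search("a || b")
matches_operator = escaped.fullmatch("||") is not None
unescaped_matches_operator = unescaped.fullmatch("||") is not None
unescaped_matches_empty = unescaped.fullmatch("") is not None
```

Here found.span() points at the two pipe characters, matches_operator is true, and the unescaped pattern only ever matches the empty string.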

The second thing to notice is the "=>". This causes code to be executed whenever the production is matched. This is why there is no "|": each alternative gets its own production so that it can have its own code to run. Next, notice where it says "plof { ... }". This is the code that is run when the production is matched. The word "plof" must be specified for Plof code because the code is in PSL (Plof Stack Language) by default, which is described in detail in the specification. Also, $0, $1, $2, etc. are special variables that refer to the subproductions, counted from zero. So in the first production shown, $0 would be the parsed plof_or, $1 would be the parsed "\|\|", and $2 would be the parsed plof_or_next.
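One way to think about $0, $1 and $2 is as positions in a list of already-parsed pieces that gets handed to the production's code. The Python below is only an analogy with made-up names (or_action, passthrough_action, run_action); it says nothing about how Plof stores productions internally, and Python's "or" merely stands in for opOr:

```python
# Hypothetical stand-in for productions with attached code.
# `parts` plays the role of $0, $1, $2: one entry per right-hand-side symbol.

def or_action(parts):
    left, _operator, right = parts   # $0, $1, $2
    return left or right             # stands in for $0.opOr($2)

def passthrough_action(parts):
    return parts[0]                  # stands in for plof { $0 }

# Each entry pairs a right-hand side with the code to run when it matches.
productions = [
    (("plof_or", r"\|\|", "plof_or_next"), or_action),
    (("plof_or_next",), passthrough_action),
]

def run_action(index, parts):
    symbols, action = productions[index]
    # There must be exactly one parsed piece per symbol in the production.
    assert len(parts) == len(symbols)
    return action(parts)
```

For example, run_action(0, [False, "||", True]) hands the three pieces to or_action, just as the first production hands $0, $1 and $2 to its plof block.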

Now I should probably give an explanation of just how this all works. The third part, plof_or => plof_and, is a shortcut for plof_or_next = plof_and. This "=>" is unrelated to the "=>" described previously. Why have *_next, you might ask? Well, this is very important for Plof's grammar hierarchy. plof_or sits, as shown, higher in the hierarchy than plof_and (which means "||" binds less tightly than whatever plof_and matches). Both nonterminals on the right-hand side work their way down the hierarchy until something matches; for example, in (5 > 3) || (2 > 3) the parser looks for a plof_or on the left and a plof_or_next on the right. (5 > 3) is tried as plof_or, then as plof_or_next, getting lower in the hierarchy each time until a match is found.

The first production ties into this concept of a hierarchy recursively. Since operators at the same level are evaluated left to right, we define plof_or as plof_or "\|\|" plof_or_next, so that the recursion is on the left and the nonterminal from lower in the hierarchy is on the right. Of course, recursion cannot go on forever. That's why we have the second production, which just returns the expression. By the way, productions are always tried top to bottom, meaning that the second production is only used when the first production does not match.
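Here is a small Python sketch of the same hierarchy idea, with two levels ("||" above "&&") over the tokens true and false. It is an assumption-laden toy, not Plof's parser: the token set and the function names are invented, and the left recursion is unrolled into a loop, which is the usual way to get left-to-right grouping in a hand-written parser:

```python
import re

# Hypothetical token set: two operators and two literals.
TOKEN = re.compile(r"\s*(\|\||&&|true|false)")

def tokenize(src):
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse_level(tokens, pos, operator, parse_next):
    # level = level operator next | next, with the left recursion as a loop
    left, pos = parse_next(tokens, pos)
    while pos < len(tokens) and tokens[pos] == operator:
        right, pos = parse_next(tokens, pos + 1)
        left = (operator, left, right)   # nest to the left: left to right order
    return left, pos

def parse_atom(tokens, pos):
    if pos < len(tokens) and tokens[pos] in ("true", "false"):
        return tokens[pos], pos + 1
    raise SyntaxError("expected true or false")

def parse_and(tokens, pos):              # lower in the hierarchy
    return parse_level(tokens, pos, "&&", parse_atom)

def parse_or(tokens, pos):               # higher in the hierarchy
    return parse_level(tokens, pos, "||", parse_and)

def parse(src):
    tokens = tokenize(src)
    tree, pos = parse_or(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree
```

Parsing "true || false && true" puts the "&&" node underneath the "||" node, because the "or" level asks the level below it for each operand.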

Now that you (hopefully) understand how this works, let's try a "Hello, world!" example (yes, I know, I'm obsessed with this theme =D). We'll change Plof's grammar so that "greeting@@thing" will print the greeting to standard output. Here goes:

grammar
{
    plof_or => plof_greeting
    plof_greeting = plof_greeting "@@" plof_greeting_next => plof {
        Stdout.write($0 ~ ", " ~ $2 ~ "!\n")
    }
    plof_greeting = plof_greeting_next => plof {$0}
}

"Hello"@@"world"

Okay, this should be understandable save for one thing: plof_greeting_next doesn't seem to be defined anywhere. It works because plof_or_next was already defined (as plof_and). When we redefined it with plof_or => plof_greeting, plof_greeting_next became the previous plof_or_next (plof_and). This allows new operators to be added into the hierarchy easily.
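The rewiring can be modelled in a few lines of Python. This is just a mental model with an invented helper (insert_below) and a plain dictionary; it only mirrors the behaviour described above, not how Plof actually stores its grammar:

```python
# Hypothetical model: each level records which level comes next.
next_level = {"plof_or": "plof_and"}

def insert_below(hierarchy, parent, new_level):
    """Model `parent => new_level`: new_level inherits parent's old next level."""
    hierarchy[new_level] = hierarchy[parent]   # plof_greeting_next = old plof_or_next
    hierarchy[parent] = new_level              # plof_or_next = plof_greeting

insert_below(next_level, "plof_or", "plof_greeting")
```

Afterwards plof_or points at plof_greeting and plof_greeting points at plof_and, so the new operator sits between the two existing levels.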

Okay, so now you know how to define new elements in Plof's grammar. This is (at least for now) the last of my Plof tutorials. You should now have a firm grasp of Plof and its flexibility. Happy coding!