Possum DevLog 3: The Grammar

I started this thread to discuss some questions I have about the grammar :slight_smile:

There is

variableDecl : 
	  VAR identifier (COMMA identifier)* (EQUAL expression)? TERMINATOR

which means you could not write

     VAR d1 = 123, d2 = 4345, d3 = 456

Is that intentional?

Yeah. I’m trying to keep the language usable but simple. The grammar permits multiple variables to be assigned to a value but not multiple variables to multiple values on the same line. Although not required, Possum does recognise semicolons. So you could do this:

var d1 = 123
var d2 = 4345
var d3 = 456

Or:

var d1 = 123; var d2 = 4345; var d3 = 456

just double checking :slight_smile:

Much appreciated. You have a knack for spotting bugs in my code! :laughing:

I think I have found some unexpected issues

For instance, this seems legal according to the grammar so far:

var foo, bar = += %=

by following these rules:

program : declaration* EOF
declaration : 
      variableDecl
variableDecl : 
	  VAR identifier (COMMA identifier)* (EQUAL expression)? TERMINATOR
expression : 
	assignment
assignment : 
	  nilCoalesce (EQUAL | PLUS_EQUAL | MINUS_EQUAL | STAR_EQUAL | SLASH_EQUAL | PERCENT_EQUAL) assignment
	| nilCoalesce

Because assignment references itself, it permits an endless list of += %= += etc.

I don't think that's intended.

It seems there needs to be a distinction in some places: a while, an if, or the ternary would use a boolean_expression, while others such as assignment could use more general kinds of expression.

Unless your intent is to do something like C, where any kind of expression has a “value” that can be treated as true / false?

Bingo. Like Ruby. Only false and nothing are “falsy”, everything else is “truthy”.
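
A minimal sketch of that truthiness rule, written in Python purely for illustration. The helper name is_truthy is hypothetical, and Python's None and False stand in for Possum's "nothing" and false; the real Possum runtime may represent them differently.

```python
# Hypothetical helper: None and False stand in for Possum's "nothing" and
# false. This is an assumption for illustration, not Possum's actual runtime.
def is_truthy(value):
    """Return False only for False and None; every other value is truthy."""
    return value is not None and value is not False


# Unlike C or Python itself, zero and empty collections count as true here.
for sample in (False, None, 0, "", [], 42):
    print(repr(sample), is_truthy(sample))
```

With a rule like this, a while or an if can accept any expression as its condition, so the grammar does not need a separate boolean_expression production.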

Nope, it shouldn’t. assignment doesn’t reference itself; it cascades into nilCoalesce and onwards. Eventually it’ll trickle down to primary, which means that at least a primary production would need to appear before EQUAL, PLUS_EQUAL, MINUS_EQUAL, STAR_EQUAL, SLASH_EQUAL or PERCENT_EQUAL. It has to match nilCoalesce or a higher precedence rule before it consumes another assignment token.

At least that’s the intention :thinking:

I was just going by this grammar rule

(EQUAL | PLUS_EQUAL | MINUS_EQUAL | STAR_EQUAL | SLASH_EQUAL | PERCENT_EQUAL) assignment
| nilCoalesce

so it seems to refer to itself and permit an endless stream of += = %= etc

fwiw this is why I like this all in one relatively compact plain text file
it makes scooting around to check things out really quick

But the rule is:

nilCoalesce (EQUAL | PLUS_EQUAL | MINUS_EQUAL | STAR_EQUAL | SLASH_EQUAL | PERCENT_EQUAL) assignment | nilCoalesce

Note the leading nilCoalesce rule before the assignment terminals.

It’s probably my fault - I may not have chosen the most conventional way of writing out the productions.

Or, quite possibly, I’m wrong :slight_smile:

In plain speak, there are two options for this rule:

Do the nilCoalesce rule, then expect one of the assignment tokens, then another assignment rule

or

Just do the nilCoalesce rule.

Right, but nilCoalesce (EQUAL, PLUS_EQUAL etc) assignment allows
nilCoalesce followed by one of the = += etc
then another assignment

which sounds to me very clearly NOT what you intended
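
To make the two readings concrete, here is a minimal recognizer sketch in Python for the rule as quoted (nilCoalesce, then an assignment token, then assignment). The single-token parse_nil_coalesce stand-in and the whitespace-split token list are assumptions for illustration only; the real nilCoalesce rule cascades down to primary, and this is not how Possum itself is implemented.

```python
ASSIGN_OPS = {"=", "+=", "-=", "*=", "/=", "%="}


def parse_nil_coalesce(tokens, pos):
    """Stand-in for nilCoalesce (assumption): accept exactly one operand token."""
    if pos < len(tokens) and tokens[pos] not in ASSIGN_OPS:
        return pos + 1
    raise SyntaxError(f"expected an operand at token {pos}")


def parse_assignment(tokens, pos=0):
    """assignment : nilCoalesce ASSIGN_OP assignment | nilCoalesce"""
    pos = parse_nil_coalesce(tokens, pos)
    if pos < len(tokens) and tokens[pos] in ASSIGN_OPS:
        # Right recursion: the right-hand side is itself an assignment,
        # so another operator may follow its leading operand.
        return parse_assignment(tokens, pos + 1)
    return pos


def accepts(source):
    """True if the whole whitespace-separated source matches assignment."""
    tokens = source.split()
    try:
        return parse_assignment(tokens) == len(tokens)
    except SyntaxError:
        return False
```

Under those assumptions, accepts("= += %=") is false because every operator needs an operand in front of it, while accepts("a = b += c %= d") is true, so operators can chain for as long as operands sit between them.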

You are of course right :man_facepalming:

I think I mean this:

nilCoalesce ((EQUAL | PLUS_EQUAL | MINUS_EQUAL | STAR_EQUAL | SLASH_EQUAL | PERCENT_EQUAL) nilCoalesce)?

As in: "do the nilCoalesce rule, then optionally consume at most one assignment token and another nilCoalesce rule."
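
A minimal Python sketch of how that revised rule could be recognized. As an assumption for illustration, a single operand token stands in for the whole nilCoalesce rule and the input is split on whitespace; none of this is taken from Possum's actual parser.

```python
ASSIGN_OPS = {"=", "+=", "-=", "*=", "/=", "%="}


def parse_nil_coalesce(tokens, pos):
    """Stand-in for nilCoalesce (assumption): accept exactly one operand token."""
    if pos < len(tokens) and tokens[pos] not in ASSIGN_OPS:
        return pos + 1
    raise SyntaxError(f"expected an operand at token {pos}")


def parse_assignment(tokens, pos=0):
    """assignment : nilCoalesce (ASSIGN_OP nilCoalesce)?"""
    pos = parse_nil_coalesce(tokens, pos)
    if pos < len(tokens) and tokens[pos] in ASSIGN_OPS:
        # At most one operator: the right-hand side is a nilCoalesce,
        # not another assignment, so no further operator is consumed.
        pos = parse_nil_coalesce(tokens, pos + 1)
    return pos


def accepts(source):
    """True if the whole whitespace-separated source matches assignment."""
    tokens = source.split()
    try:
        return parse_assignment(tokens) == len(tokens)
    except SyntaxError:
        return False
```

In this sketch accepts("a = b") is true and accepts("a = b += c") is false. Note that accepts("a = b = c") is false as well, so the revised rule also gives up plain chained assignment, which may or may not be what is wanted.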

I’ll have another read
I might have more suggestions :slight_smile:

One thing I noticed is that you use a “regex” style for writing the grammar.
That's a tad harder than it needs to be to write code almost directly from.
Often, for “a list of” something, many grammars use a construction like the following (here for a Possum var declaration line):

variableDecl :
	  VAR varDeclList optInitializer TERMINATOR

varDeclList :
	  identifier
	| varDeclList ',' identifier

optInitializer :
	  /* empty string */
	| '=' initializer

which would make the following legal (and decently easy to write code for)

    var x   
    var y, z = 123

and I think this is consistent with the design you’re shooting for ?
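
As a rough idea of how those productions map onto hand-written code, here is a small Python sketch. The left-recursive varDeclList becomes a loop, since a recursive descent parser cannot call a left-recursive rule directly. Several things are assumptions made for illustration: the function name parse_variable_decl, the regex tokenizer, treating the keyword as case-insensitive, accepting only a single token as the initializer, and treating end of line or ';' as the TERMINATOR.

```python
import re

# Assumed token shapes; characters that match nothing are silently skipped.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[=,;]")
IDENT = re.compile(r"[A-Za-z_]\w*")


def parse_variable_decl(line):
    """variableDecl : VAR varDeclList optInitializer TERMINATOR

    Returns (names, initializer); initializer is one token or None.
    """
    tokens = TOKEN.findall(line)
    pos = 0

    def take(matches):
        """Consume the current token if matches(token) holds, else fail."""
        nonlocal pos
        tok = tokens[pos] if pos < len(tokens) else None
        if tok is None or not matches(tok):
            raise SyntaxError(f"unexpected token {tok!r} at position {pos}")
        pos += 1
        return tok

    def is_ident(tok):
        return IDENT.fullmatch(tok) is not None and tok.lower() != "var"

    take(lambda tok: tok.lower() == "var")         # VAR keyword
    names = [take(is_ident)]                       # varDeclList : identifier
    while pos < len(tokens) and tokens[pos] == ",":
        pos += 1                                   #   | varDeclList ',' identifier
        names.append(take(is_ident))

    initializer = None                             # optInitializer : /* empty */
    if pos < len(tokens) and tokens[pos] == "=":
        pos += 1                                   #   | '=' initializer
        initializer = take(lambda tok: tok not in {"=", ",", ";"})

    if pos < len(tokens) and tokens[pos] == ";":   # explicit TERMINATOR
        pos += 1
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r} at position {pos}")
    return names, initializer
```

With this sketch, parse_variable_decl("var y, z = 123") yields the names y and z sharing the initializer 123, while "var d1 = 123, d2 = 4345" raises SyntaxError at the comma, matching the one-initializer-per-declaration design discussed earlier in the thread.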

This is what happens when you teach yourself language design!

I cobbled together the RegEx nomenclature just to have a compact representation but it might be easier to refactor it your way. Thanks.

You’ve done a decent job so far :slight_smile:

There are some fun “how I designed and implemented this language” tutorials around
Dunno if those are of interest ?

Thanks!

Yeah I’ve gobbled up a lot of them. I even went so far as to buy and “read” the dragon compiler book. Man, that’s a little dry…

:laughing: :laughing: :laughing: :laughing: :laughing:
dry ??

Although it's not considered “current best practices”, it's good for the theory of compiler construction.