Bugfix! Nested syntactic lookahead now works correctly!

Fixed a Longstanding Bug (a.k.a. Known Issue) in JavaCC: Nested syntactic lookahead now works!

There is a quite serious bug (let’s just call it by what it is, shall we?) of very long standing in JavaCC. Basically, syntactic lookaheads do not nest. This bug is documented in Theodore Norvell’s JavaCC FAQ, which, by the way, is the best documentation for JavaCC – at least of what is available for free. It is question 4.8 in Professor Norvell’s FAQ:

Are nested syntactic lookahead specifications evaluated during syntactic lookahead?

In a word, the answer to the question is no, and Theo gives a concrete example that serves as a nice little test case for our purposes here. I’ve simplified it a little bit, since there is no need for the trivial productions w(), x(), and y(). Here it is:

PARSER_BEGIN(TestParser)
    import java.io.*;
    public class TestParser {
        static public void main(String[] args) throws Exception {
            TestParser parser = new TestParser(new StringReader("wxy"));
            parser.start();
        }
    }
PARSER_END(TestParser)

void start( ) : { } 
{
    LOOKAHEAD ( a() )
    a() {System.out.println("Nested lookahead successful!");}
    < EOF >
    |
    "w"
    <EOF>
}

void a( ) : { } 
{
    (
        LOOKAHEAD ( "w" "y" ) "w"
        |
        "w" "x"
    )
    "y"
}

Since the above example uses entirely legacy syntax, it builds with either legacy JavaCC or JavaCC21. Just drop the file (Test.jj or whatever you want to call it) in an empty directory and execute the following commands:

$ javacc Test.jj
$ javac *.java
$ java TestParser

If you use legacy JavaCC in the above, it fails just as Theo describes in his FAQ. I just tried it with JavaCC 7.0.10 and it results in this:

Exception in thread "main" ParseException: Encountered " "x" "x "" at line 1, column 2.
Was expecting:
    <EOF>

        at TestParser.generateParseException(TestParser.java:377)
        at TestParser.jj_consume_token(TestParser.java:243)
        at TestParser.start(TestParser.java:19)
        at TestParser.main(TestParser.java:7)
    

If you do the above steps with the latest version of JavaCC21, it gives you this output:

 Nested lookahead successful!
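
To see why the two tools disagree on the input "wxy", here is a minimal hand-written sketch in Java. It is not the code that either tool generates: the class NestedLookaheadDemo, its methods, and its error-message format are hypothetical, made up purely for illustration. It assumes, as the FAQ entry describes, that legacy JavaCC skips the nested LOOKAHEAD ( "w" "y" ) while it is busy evaluating LOOKAHEAD ( a() ), so the first alternative inside a() is taken as soon as a "w" is seen.

```java
public class NestedLookaheadDemo {

    /** True if the character at index p of s equals c. */
    static boolean at(String s, int p, char c) {
        return p < s.length() && s.charAt(p) == c;
    }

    /**
     * Scan-only model of a(), used to evaluate LOOKAHEAD ( a() ).
     * Returns the index just past a(), or -1 if a() does not match.
     * honorNested = false models the assumed legacy behavior,
     * honorNested = true models evaluating the nested lookahead.
     */
    static int scanA(String s, int p, boolean honorNested) {
        boolean takeFirst = honorNested
                ? at(s, p, 'w') && at(s, p + 1, 'y')  // nested LOOKAHEAD ( "w" "y" ) is evaluated
                : at(s, p, 'w');                       // nested LOOKAHEAD is skipped: "w" alone decides
        if (takeFirst) {
            p += 1;                                    // first alternative: "w"
        } else if (at(s, p, 'w') && at(s, p + 1, 'x')) {
            p += 2;                                    // second alternative: "w" "x"
        } else {
            return -1;
        }
        return at(s, p, 'y') ? p + 1 : -1;             // the trailing "y"
    }

    /** Models start(); returns a message instead of printing or throwing. */
    public static String parse(String s, boolean honorNested) {
        int end = scanA(s, 0, honorNested);
        if (end >= 0) {
            // First alternative of start(): a() <EOF>. For this grammar a successful
            // scan and the real parse of a() cover the same characters, so the scan
            // result is reused here.
            return end == s.length()
                    ? "Nested lookahead successful!"
                    : "error at column " + (end + 1) + ": expected <EOF>";
        }
        // Second alternative of start(): "w" <EOF>
        if (!at(s, 0, 'w')) {
            return "error at column 1: expected \"w\"";
        }
        return s.length() == 1 ? "ok" : "error at column 2: expected <EOF>";
    }

    public static void main(String[] args) {
        System.out.println("nested lookahead skipped:   " + parse("wxy", false));
        System.out.println("nested lookahead evaluated: " + parse("wxy", true));
    }
}
```

With the nested lookahead skipped, the scan of a() commits to the lone "w", fails to find "y" at the "x", and start() falls through to its second alternative, which then trips over the "x" at column 2 while expecting <EOF>. That is the same place the legacy stack trace above complains about. With the nested lookahead evaluated, the scan picks "w" "x" followed by "y", and the first alternative of start() succeeds.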

Yes, JavaCC 21 evaluates nested syntactic lookahead! So, I guess that Theo, or whoever is maintaining the FAQ nowadays, really ought to replace the answer to question 4.8. The answer should change from:

 No.

to:

If you are using *legacy JavaCC*, no, nested
syntactic lookahead is not evaluated. BUT if
you are using the updated version, JavaCC 21,
nested syntactic lookahead is evaluated.
(*Hallelujah!*)

Tangent: When does a bug become a known issue?

I reckon that, once a bug reaches a certain age in a well-known software tool, it graduates from being a bug to a known issue. This particular bug (a.k.a. known issue) in JavaCC is about 24 years old. It always worked this way and nobody ever fixed it. Granted, when Professor Norvell wrote the FAQ entry on that, the bug was perhaps (just guessing…) only 10 years old. It was still an issue that had existed from the very beginning with JavaCC and there was absolutely zero prospect (or so it seemed) of anybody ever fixing this.

What I found noteworthy about all of this is the way Professor Norvell approaches it. At no point in the FAQ entry does he say straightforwardly: “This is a pretty major bug and somebody really ought to fix this!”

No, he simply documents the behavior (as if it were completely normal!) and provides some possible workarounds for various cases where this is a problem. But he certainly does not use the dreaded B-word in describing the situation. Actually, I noted that the word “bug” only occurs twice in the JavaCC FAQ, once in answer to FAQ 1.7 where he says:

If you found a bug in JavaCC, please open an issue.

If you found a bug

Well, gee whiz Theo, what about the fact that nested syntactic lookahead doesn’t work? Duhhh…

Well, granted, if he (or anybody) did “open an issue”, the ostensible maintainers of the legacy JavaCC project would surely respond:

You see.... this is a known issue.

(Right? Been there… Done that, eh?)

You see, this is actually a key rhetorical trick, a feature of the sociocultural phenomenon that I call nothingburgerism. Once you call something that is obviously a bug a known issue, and that “issue” is documented, then it’s no longer a bug, since the software is behaving exactly as documented.

See the sleight of hand there?

Well, over the last few months, I’ve thought a lot about nothingburgerism and the way people seem to enable and foster it. The above-mentioned FAQ entry is an example of this. The FAQ maintainer, Professor Norvell, surely knows that this is a severe bug in JavaCC, but he makes the decision (consciously or not, but I suspect it’s not even conscious) never to refer to it simply as a bug that needs to be fixed. He simply documents this screwy behavior, offers some convoluted workarounds to what is obviously a bug, but you can be sure that he would never call out the ostensible maintainers of the project for not having fixed such an obvious bug for so many years! So...

If you find a bug in JavaCC, report it to Homeland Security!

If you see anything suspicious at the airport, report it to the JavaCC devs...

Well, this is not about Professor Norvell in particular, mind you. It is really part of a much, much larger cultural phenomenon, where people won’t tell the truth straightforwardly about things. To say that this is just a bug that ought to be fixed would just be too raw and it might hurt somebody’s feelings…

This is understandable, and we have all (including me even) declined to tell the full truth about certain things because it might offend somebody. However, I would point out that this is precisely what makes a phenomenon like nothingburgerism possible.

Food for thought…

Notable Replies

  1. First of all, there was an answer to this post, but it disappeared. The reason is that the database somehow got corrupted and I had to re-install the Discourse forum. (Well, hey, this is to be expected when your website has so many concurrent connections, eh?) The last automated backup was from 13 July, so any responses after that point were lost. Anyway, the note was from user Mama (a.k.a. Marc Mazas):

    Well… Congratulations for having fixed this somewhere in the desert… but shame if you do not fix it in JavaCC (you would be rewarded by some consideration and listening by the JavaCC community)

    A rather strange comment really. Well, looking on the positive side, it seems that Mama (Marc) realizes that this is indeed a pretty major bugfix. He does not start some nonsense trying to say that the way this worked before in JavaCC actually made any sense!

    So I am at least thankful for that!

    On the other hand, it is a rather unsettling comment on another level, because it just seems that, on some fundamental level, you just don’t quite get it, Marc. (I really see no way to put this any more diplomatically.) In the following, I shall address Marc Mazas…

    I was trying to put my finger on what the issue is and I guess the best way I could characterize it is that you seem to have what one might call a “Charlie Brown issue.” Surely, despite being a Frenchman, you understand the allusion to the famous Peanuts or Snoopy comic strip. One of the running jokes is that there is this nasty little girl, Lucy, who is always encouraging Charlie to run up and kick a football and she always pulls it away at the last moment. And, naturally, Charlie Brown falls flat on his ass. The running joke is that Charlie Brown never learns the lesson. So, every time, he believes that, this time, Lucy is going to let him kick the football!

    If anything, this situation is more glaring than even the Charlie Brown example. Charlie Brown, after all, is just a little kid, who presumably grows up after a certain number of years. (Though he never does grow up, because he is a comic book little kid. But never mind that.)

    But, here we have a situation where this particular bug, nested lookaheads not working, is about 24 years old. That is quite a bit older than my teenage daughter. Well, okay, JavaCC has only been open source since 2003, and that would maybe be the point from which to start the count. So, only 17 years then…

    If the maintainers of the legacy JavaCC project were really interested in fixing bugs like this, or really, in doing much of anything at all, surely in 17 years, they would have done something, no? If it’s been 17 years and Lucy never let Charlie Brown kick that football, even once, then shouldn’t your baseline assumption be that she has no intention of ever letting him kick the ball?

    You see, there are technical issues and then there are non-technical issues – the ones that relate to people’s psychology or culture… I would have to admit that the latter kind of issue is hardly my strong point. It is very difficult for me to understand the mentality of these sorts of people. Like, why would one get involved in an open source project like JavaCC, and adopt this kind of gatekeeper stance?

    I just don’t really understand this. Not really… But one thing I do understand is that you can’t just go there and offer to do some work in a project like that, a nothingburger project basically – fix longstanding bugs, implement much-needed features like INCLUDE or INJECT… – not any more than Charlie Brown can run up and kick the football!

    It’s just not going to happen!

    Basically, if you really don’t understand this, Marc, it’s high time that you do understand it.

    Now, that said, you do make a valid point, which is that, yes, it is a pity that (at least in the near-term) the work I do on this, like fixing this longstanding bug, will benefit relatively few people. I implemented INCLUDE and INJECT over a decade ago and very few people ever got the benefit of that work.

    And that really is very annoying!

    But it is hardly my fault. I offered this work to the JavaCC community back in 2008 and, not only did they decline the proposal, but they mounted a character assassination campaign against me! Of course, it’s quite understandable in a way. It had already been 5 years since JavaCC was open sourced and they had done absolutely nothing with it. To have somebody show up and implement these sorts of new features does not make them look very good.

    Now, another 12 years have passed and the legacy JavaCC project still does not have an INCLUDE directive. This is particularly shocking since, of all possible things on a wish list, this is surely about the easiest to implement. It’s probably a one-day mini-project, if that.

    But, anyway, getting back to your comment, that “it would be a shame if I did not fix it in JavaCC.”

    This is really a very obtuse, provocative comment, and I really have to say that you must refrain from this sort of behavior.

    I DID FIX IT IN JAVACC!!!

    This project, JavaCC 21, hosted on https://javacc.com IS JavaCC!!!

    That is my position. Well, okay, the legacy JavaCC project, hosted on https://javacc.org/ is also JavaCC, and this version of JavaCC has the same relationship to that version of JavaCC that the regular olympics has to the “special” olympics. They are both the “Olympic Games” but…

    There are no real objective, technical grounds for saying that the obsolete codebase at https://github.com/javacc/javacc, in the pitiful neglected state that it is in, is the real JavaCC any more than the actively developed, cleaned-up codebase at https://github.com/javacc21/javacc21 is.

    Not only are there zero technical grounds for arguing that their project is the one, sole, real JavaCC, but there are zero legal grounds either. None of them ever registered JavaCC as a trademark. When Sun Microsystems put Sreeni in charge of transitioning the thing to open source, he was not being given the project in perpetuity. He was being designated as the custodian of a public good. And then, as you surely know, he did not do the absolutely minimal amount of work that that role would entail!

    There is no reason for any person in the know to use legacy JavaCC when they can use JavaCC 21, which is simply the more advanced, and actively maintained and developed version of the same tool!

    As for my work having relatively little visibility at the moment, my “being in the desert”, this is a valid point, but we should all do the best we can to make the situation temporary. Regardless, it is unfortunate that this whole body of work (INJECT, INCLUDE, fixing very basic bugs…) benefits relatively few people. But again,

    This is NOT MY FAULT!

    There is a very clear electronic record of this that can be found here

    The electronic record is absolutely clear. I tried and tried to donate my ongoing work to the JavaCC project. They simply were not interested. And let’s be clear about this. The character assassination attack they mounted was a smokescreen. They were not specifically uninterested in my work because of my deplorable character. They were not interested in any ongoing work by anybody because it would make them look bad and this was (and is) a classic nothingburger project. If you really think it is my fault that nobody got the benefits of all that work I did (and have resumed) then, unfortunately, you are a fool. In that case, the comic book character Charlie Brown is a genius next to you!
