Tips and Tricks

Context-Sensitive Tokenizing, Part Deux: Lexical States

(For some background on this topic, it might be a good idea to read this earlier blog post on context-sensitive tokenization from three months ago.) The Lay of the Land: There are two quite useful ideas that have been in JavaCC from the very beginning: lookahead (particularly syntactic lookahead) and lexical states. Syntactic lookahead …

Tree Building Redux: Nailing another Dmitry Dmitriyevich problem

Greetings, comrades! My name is Vladimir Vladimirovich Vladimirov! Hey, whassup, Vlad! As a follow-on to my blog post of a couple of days ago, there were a couple of t’s that needed crossing and an ‘i’ or two that needed dotting. Let’s see… First of all, I misspoke a little bit in that post. …

Tastes just like home-made! (Some more tree building enhancements)

Before getting into what the minor enhancements to tree building are, I guess I should write a quick synopsis of the current state of affairs. When you have TREE_BUILDING_ENABLED set to true (this is the default in JavaCC21) the tree building machinery will build a Node if the production results in the creation of more …

“You can’t get there from here!” — The Problem of Context-Sensitive Tokenization

Since I picked up my work on the JavaCC codebase at the end of 2019, various people have raised with me the question of strings that can (or should) be broken into tokens differently depending on the context in which they are encountered. I have to admit that it took me a while to grasp just …

Token Hooks (CommonTokenAction) Revisited

In the beginning there was… CommonTokenAction. Legacy JavaCC had (and still has) a means of applying whatever adjustments (a.k.a. kludges) to a Token just before it is handed off to the parser machinery. You could define a method called CommonTokenAction in your Lexer TokenManager class and this method is invoked when you get another Token …
