Tips and Tricks

Context-Sensitive Tokenization, Next Installment, Activating and De-activating Tokens

Sometimes, when you complete a major code cleanup, features that were previously pie in the sky become low-hanging fruit. The new feature that I describe here, the ability to activate and deactivate tokens, is such a case. It resulted from my rewriting of the lexical code generation that I describe here. In an …

Context-Sensitive Tokenizing, Part Deux: Lexical States

(To get some prerequisite understanding of this topic, it might be a good idea to read this earlier blog post on context-sensitive tokenization from three months ago.) The Lay of the Land: There are two quite useful ideas that have been in JavaCC from the very beginning: lookahead (particularly syntactic lookahead) and lexical states. Syntactic lookahead …

Tree Building Redux: Nailing another Dmitry Dmitriyevich problem

Greetings, comrades! My name is Vladimir Vladimirovich Vladimirov! Hey, whassup, Vlad! As a follow-on to my blog post of a couple of days ago, there were a couple of t’s that needed crossing and an ‘i’ or two that needed a dot. Let’s see… First of all, I misspoke a little bit in that post. …

Tastes just like home-made! (Some more tree building enhancements)

Before getting into the minor enhancements to tree building, I guess I should write a quick synopsis of the current state of affairs. When you have TREE_BUILDING_ENABLED set to true (this is the default in JavaCC21), the tree building machinery will build a Node if the production results in the creation of more …

“You can’t get there from here!” — The Problem of Context-Sensitive Tokenization

(Note added 13 June 2021: This article is useful for understanding how to add token hooks to code. However, as a solution to the specific problem outlined, the article is obsolete. See here for the updated solution.) Since I picked up my work on the JavaCC codebase at the end of 2019, …

Token Hooks (CommonTokenAction) Revisited

In the beginning there was… CommonTokenAction. Legacy JavaCC had (and still has) a means of applying adjustments (a.k.a. kludges) to a Token just before it is handed off to the parser machinery. You can define a method called CommonTokenAction in your TokenManager class, and this method is invoked when you get another Token …
