This page is part of the multi-step Custom Language Support Tutorial. All previous steps must be completed in sequence for the code to work.
The lexical analyzer defines how the contents of a file are broken into tokens, which is the basis for supporting custom language features. The easiest way to create a lexer is to use JFlex.
Define a Lexer

Define a Simple.flex file with rules for the Simple Language lexer in package org.intellij.sdk.language.

    // Copyright 2000-2022 JetBrains s.r.o. and other contributors. Use of this source code is governed by the Apache 2.0 license that can be found in the LICENSE file.
    package org.intellij.sdk.language;

    import com.intellij.lexer.FlexLexer;
    import com.intellij.psi.tree.IElementType;
    import org.intellij.sdk.language.psi.SimpleTypes;
    import com.intellij.psi.TokenType;

    %%

    %class SimpleLexer
    %implements FlexLexer
    %unicode
    %function advance
    %type IElementType
    %eof{  return;
    %eof}

    CRLF=\R
    WHITE_SPACE=[\ \n\t\f]
    FIRST_VALUE_CHARACTER=[^ \n\f\\] | "\\"{CRLF} | "\\".
    VALUE_CHARACTER=[^\n\f\\] | "\\"{CRLF} | "\\".
    END_OF_LINE_COMMENT=("#"|"!")[^\r\n]*
    SEPARATOR=[:=]
    KEY_CHARACTER=[^:=\ \n\t\f\\] | "\\ "

    %state WAITING_VALUE

    %%

    <YYINITIAL> {END_OF_LINE_COMMENT}                           { yybegin(YYINITIAL); return SimpleTypes.COMMENT; }

    <YYINITIAL> {KEY_CHARACTER}+                                { yybegin(YYINITIAL); return SimpleTypes.KEY; }

    <YYINITIAL> {SEPARATOR}                                     { yybegin(WAITING_VALUE); return SimpleTypes.SEPARATOR; }

    <WAITING_VALUE> {CRLF}({CRLF}|{WHITE_SPACE})+               { yybegin(YYINITIAL); return TokenType.WHITE_SPACE; }

    <WAITING_VALUE> {WHITE_SPACE}+                              { yybegin(WAITING_VALUE); return TokenType.WHITE_SPACE; }

    <WAITING_VALUE> {FIRST_VALUE_CHARACTER}{VALUE_CHARACTER}*   { yybegin(YYINITIAL); return SimpleTypes.VALUE; }

    ({CRLF}|{WHITE_SPACE})+                                     { yybegin(YYINITIAL); return TokenType.WHITE_SPACE; }

    [^]                                                         { return TokenType.BAD_CHARACTER; }

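To make the roles of the YYINITIAL and WAITING_VALUE states easier to follow, here is a minimal hand-written sketch of the same idea in plain Java. It is not the generated SimpleLexer and does not use any IntelliJ Platform classes. The class name SimpleLexerSketch and the TYPE(text) output format are illustrative assumptions, and the sketch deliberately ignores escapes, form feeds, and multi-line input, so it only approximates the rules above for a single line.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written approximation of the two lexer states; not the generated SimpleLexer. */
class SimpleLexerSketch {
    enum State { YYINITIAL, WAITING_VALUE }

    // Simplification: only space and tab count as whitespace within a line.
    static boolean isBlank(char c) {
        return c == ' ' || c == '\t';
    }

    // Mirrors KEY_CHARACTER without the "\\ " escape alternative.
    static boolean isKeyChar(char c) {
        return ":= \t\f\n\\".indexOf(c) < 0;
    }

    /** Tokenizes one line and returns entries of the form TYPE(text). */
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        State state = State.YYINITIAL;
        int pos = 0;
        while (pos < line.length()) {
            int start = pos;
            char c = line.charAt(pos);
            String type;
            if (isBlank(c)) {
                // Blanks inside a line leave the current state unchanged.
                while (pos < line.length() && isBlank(line.charAt(pos))) pos++;
                type = "WHITE_SPACE";
            } else if (state == State.WAITING_VALUE) {
                // After a separator, the rest of the line is the value.
                pos = line.length();
                type = "VALUE";
                state = State.YYINITIAL;
            } else if (c == '#' || c == '!') {
                // A comment runs to the end of the line.
                pos = line.length();
                type = "COMMENT";
            } else if (c == ':' || c == '=') {
                pos++;
                type = "SEPARATOR";
                state = State.WAITING_VALUE;
            } else if (isKeyChar(c)) {
                while (pos < line.length() && isKeyChar(line.charAt(pos))) pos++;
                type = "KEY";
            } else {
                pos++;
                type = "BAD_CHARACTER";
            }
            tokens.add(type + "(" + line.substring(start, pos) + ")");
        }
        return tokens;
    }

    public static void main(String[] args) {
        tokenize("language = English").forEach(System.out::println);
    }
}
```

The separator is the only token that switches the sketch into WAITING_VALUE, which is why the same characters can be read as a key before the separator and as a value after it.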
Generate a Lexer Class

Now generate a lexer class via Run JFlex Generator from the context menu on the Simple.flex file.
Users from China, please see important configuration.
The Grammar-Kit plugin uses JFlex for lexer generation. When running for the first time, JFlex prompts for a destination folder to download the JFlex library and skeleton. Choose the project root directory, for example code_samples/simple_language_plugin.
After that, the IDE generates the lexer under the gen directory, for example in simple_language_plugin/src/main/gen/org/intellij/sdk/language/SimpleLexer.
Alternatively, the Gradle Grammar-Kit Plugin can be used.
Define a Lexer Adapter

The JFlex lexer needs to be adapted to the IntelliJ Platform Lexer API. Implement SimpleLexerAdapter by subclassing FlexAdapter.

    import com.intellij.lexer.FlexAdapter;

    public class SimpleLexerAdapter extends FlexAdapter {

      public SimpleLexerAdapter() {
        super(new SimpleLexer(null));
      }

    }

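The adapter idea can be sketched without the IntelliJ Platform. In the example below, GeneratedStyleLexer, CursorLexer, WordLexer, and AdapterSketch are all hypothetical names invented for illustration; they are not the real FlexLexer, Lexer, or FlexAdapter types and do not reproduce their signatures. The sketch only shows the general shape: a lexer that hands out one token per call is wrapped so that callers can inspect the current token and advance explicitly.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a generated lexer: next() returns a token type, or null at the end. */
interface GeneratedStyleLexer {
    void reset(CharSequence text);
    String next();
}

/** Hypothetical stand-in for a cursor-style lexer API. */
abstract class CursorLexer {
    abstract void start(CharSequence text);
    abstract String currentType();   // null once the input is exhausted
    abstract void advance();
}

/** Toy lexer that splits text into runs of spaces and runs of non-spaces. */
class WordLexer implements GeneratedStyleLexer {
    private CharSequence text = "";
    private int pos;

    @Override
    public void reset(CharSequence text) {
        this.text = text;
        this.pos = 0;
    }

    @Override
    public String next() {
        if (pos >= text.length()) return null;
        boolean space = text.charAt(pos) == ' ';
        while (pos < text.length() && (text.charAt(pos) == ' ') == space) pos++;
        return space ? "SPACE" : "WORD";
    }
}

/** Adapts the pull-style lexer to the cursor-style API by caching the current token type. */
class AdapterSketch extends CursorLexer {
    private final GeneratedStyleLexer delegate;
    private String current;

    AdapterSketch(GeneratedStyleLexer delegate) {
        this.delegate = delegate;
    }

    @Override
    void start(CharSequence text) {
        delegate.reset(text);
        current = delegate.next();
    }

    @Override
    String currentType() {
        return current;
    }

    @Override
    void advance() {
        current = delegate.next();
    }

    /** Drives the adapter over the text and collects the token types it reports. */
    static List<String> typesOf(String text) {
        CursorLexer lexer = new AdapterSketch(new WordLexer());
        List<String> types = new ArrayList<>();
        for (lexer.start(text); lexer.currentType() != null; lexer.advance()) {
            types.add(lexer.currentType());
        }
        return types;
    }
}
```

Code written against the cursor-style API never sees the wrapped lexer, so the wrapped implementation can be regenerated without touching its callers.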
Define a Root File

The SimpleFile implementation is the top-level node of the tree of PsiElements for a Simple Language file.
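
    import com.intellij.extapi.psi.PsiFileBase;
    import com.intellij.openapi.fileTypes.FileType;
    import com.intellij.psi.FileViewProvider;
    import org.jetbrains.annotations.NotNull;

    public class SimpleFile extends PsiFileBase {

      public SimpleFile(@NotNull FileViewProvider viewProvider) {
        super(viewProvider, SimpleLanguage.INSTANCE);
      }

      @NotNull
      @Override
      public FileType getFileType() {
        return SimpleFileType.INSTANCE;
      }

      @Override
      public String toString() {
        return "Simple File";
      }

    }
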
Define Token Sets

Define all sets of related token types from SimpleTypes in SimpleTokenSets.

    import com.intellij.psi.tree.TokenSet;
    import org.intellij.sdk.language.psi.SimpleTypes;

    public interface SimpleTokenSets {

      TokenSet IDENTIFIERS = TokenSet.create(SimpleTypes.KEY);

      TokenSet COMMENTS = TokenSet.create(SimpleTypes.COMMENT);

    }

Define a Parser

The Simple Language parser is defined in SimpleParserDefinition by implementing ParserDefinition. To avoid unnecessary classloading when initializing the extension point implementation, all TokenSet return values should use constants from a dedicated $Language$TokenSets class.
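
    import com.intellij.lang.ASTNode;
    import com.intellij.lang.ParserDefinition;
    import com.intellij.lang.PsiParser;
    import com.intellij.lexer.Lexer;
    import com.intellij.openapi.project.Project;
    import com.intellij.psi.FileViewProvider;
    import com.intellij.psi.PsiElement;
    import com.intellij.psi.PsiFile;
    import com.intellij.psi.tree.IFileElementType;
    import com.intellij.psi.tree.TokenSet;
    import org.intellij.sdk.language.psi.SimpleTypes;
    import org.jetbrains.annotations.NotNull;

    final class SimpleParserDefinition implements ParserDefinition {

      public static final IFileElementType FILE = new IFileElementType(SimpleLanguage.INSTANCE);

      @NotNull
      @Override
      public Lexer createLexer(Project project) {
        return new SimpleLexerAdapter();
      }

      @NotNull
      @Override
      public TokenSet getCommentTokens() {
        return SimpleTokenSets.COMMENTS;
      }

      @NotNull
      @Override
      public TokenSet getStringLiteralElements() {
        return TokenSet.EMPTY;
      }

      @NotNull
      @Override
      public PsiParser createParser(final Project project) {
        return new SimpleParser();
      }

      @NotNull
      @Override
      public IFileElementType getFileNodeType() {
        return FILE;
      }

      @NotNull
      @Override
      public PsiFile createFile(@NotNull FileViewProvider viewProvider) {
        return new SimpleFile(viewProvider);
      }

      @NotNull
      @Override
      public PsiElement createElement(ASTNode node) {
        return SimpleTypes.Factory.createElement(node);
      }

    }
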
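The classloading advice can be observed with standard Java class initialization, independent of the IntelliJ Platform. The sketch below is one plausible reading of that advice, not a description of how the platform loads extensions. InitLog, HeavyTypes, TokenSetsSketch, and DefinitionSketch are hypothetical names; HeavyTypes stands in for a generated token-type holder such as SimpleTypes, and plain strings stand in for token types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Records the order in which the other classes are initialized. */
class InitLog {
    static final List<String> EVENTS = new ArrayList<>();
}

/** Hypothetical stand-in for a generated token-type holder that is costly to initialize. */
class HeavyTypes {
    static final String KEY;
    static final String COMMENT;

    static {
        InitLog.EVENTS.add("HeavyTypes");
        KEY = "KEY";
        COMMENT = "COMMENT";
    }
}

/** Dedicated holder for token sets, playing the role of a $Language$TokenSets class. */
class TokenSetsSketch {
    // Initializing this field is what first touches HeavyTypes.
    static final Set<String> COMMENTS = Set.of(HeavyTypes.COMMENT);

    static {
        InitLog.EVENTS.add("TokenSetsSketch");
    }
}

/** Definition class that keeps no token-set fields of its own. */
class DefinitionSketch {
    static {
        InitLog.EVENTS.add("DefinitionSketch");
    }

    Set<String> getCommentTokens() {
        // The holder (and HeavyTypes behind it) is initialized on first access here.
        return TokenSetsSketch.COMMENTS;
    }
}
```

Creating a DefinitionSketch initializes only that class. TokenSetsSketch and HeavyTypes are initialized later, when getCommentTokens() is first called. If COMMENTS were a static field of DefinitionSketch itself, constructing the definition would initialize HeavyTypes immediately.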
Register the Parser Definition

Registering the parser definition in the plugin.xml file makes it available to the IntelliJ Platform. Use the com.intellij.lang.parserDefinition extension point for registration. For example, see simple_language_plugin/src/main/resources/META-INF/plugin.xml.

    <extensions defaultExtensionNs="com.intellij">
      <lang.parserDefinition
          language="Simple"
          implementationClass="org.intellij.sdk.language.SimpleParserDefinition"/>
    </extensions>

Run the Project

Run the plugin by using the Gradle runIde task.
Create a test.simple file with the following content:

    # You are reading the ".properties" entry.
    ! The exclamation mark can also mark text as comments.
    website = https://en.wikipedia.org/
    language = English
    # The backslash below tells the application to continue reading
    # the value onto the next line.
    message = Welcome to \
              Wikipedia!
    # Add spaces to the key
    key\ with\ spaces = This is the value that could be looked up with the key "key with spaces".
    # Unicode
    tab : \u0009

Use the PsiViewer plugin or the built-in PSI viewer to check how the lexer breaks the content of the file into tokens and how the parser transforms the tokens into PSI elements.
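When inspecting the tokens, the message entry is worth a closer look, because its value continues onto the next line after a backslash. The sketch below translates the FIRST_VALUE_CHARACTER and VALUE_CHARACTER macros from Simple.flex into a java.util.regex pattern to show why such a value stays one token. The class and method names are illustrative assumptions, and a backtracking regex is only an approximation of the JFlex-generated automaton, which also weighs the competing whitespace rules.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Approximates the VALUE rule of Simple.flex with java.util.regex; not the generated lexer. */
class ValuePatternSketch {
    // FIRST_VALUE_CHARACTER=[^ \n\f\\] | "\\"{CRLF} | "\\".
    static final String FIRST_VALUE_CHARACTER = "(?:[^ \\n\\f\\\\]|\\\\\\R|\\\\.)";

    // VALUE_CHARACTER=[^\n\f\\] | "\\"{CRLF} | "\\".
    static final String VALUE_CHARACTER = "(?:[^\\n\\f\\\\]|\\\\\\R|\\\\.)";

    // {FIRST_VALUE_CHARACTER}{VALUE_CHARACTER}*
    static final Pattern VALUE_TOKEN = Pattern.compile(FIRST_VALUE_CHARACTER + VALUE_CHARACTER + "*");

    /** Returns the text a VALUE token would cover at the start of the input, or null if none matches. */
    static String matchValue(String input) {
        Matcher matcher = VALUE_TOKEN.matcher(input);
        return matcher.lookingAt() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        // Backslash followed by a line break: the value continues onto the next line.
        System.out.println(matchValue("Welcome to \\\n          Wikipedia!"));
        // Plain line break: the value ends before it.
        System.out.println(matchValue("Welcome to \nWikipedia!"));
    }
}
```

The alternative that pairs a backslash with a line break is what lets the match run past the end of the line; without the backslash, the character class excludes the line feed and the value stops there.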
16 April 2025