Add a Pygments Lexer to Chroma
- Introduction
- Setup
- Convert a Pygments lexer to a Chroma lexer with pygments2chroma_xml.py
- Highlight some code with a Chroma lexer
- Add test data
- Record test output
- Run tests
- Conclusion
- Bonus!: Use local Pygments with pygments2chroma_xml.py
Introduction
Gitea uses Chroma for syntax highlighting. Chroma is based on the Python syntax highlighter, Pygments, and includes a script to help convert Pygments lexers for use with Chroma. We describe how below.
Setup
We’re going to be using the python and golang Docker images. Docker Desktop is not required.
$ docker pull python
$ docker pull golang
Let’s set up some aliases to make running the commands easier.
$ alias docker-run='docker run --rm -it -w /opt -v $PWD:/opt'
$ alias docker-run-go='docker-run golang'
$ alias docker-run-py='docker-run python'
Convert a Pygments lexer to a Chroma lexer with pygments2chroma_xml.py
$ git clone https://github.com/alecthomas/chroma.git
$ cd chroma
In the Chroma root directory, we run:
$ docker-run-py bash -c \
"pip install pystache pygments && \
python _tools/pygments2chroma_xml.py \
pygments.lexers.scripting.LuaLexer > lexers/embedded/lua.xml && \
pip list"
We should see this in the output:
Package  Version
-------- -------
pip      25.0.1
Pygments 2.19.2
pystache 0.6.8
The pip list output tells us which version of Pygments we generated our lexer from. The file lexers/embedded/lua.xml should now contain all the tokenization rules for the Lua language.
lexers/embedded/lua.xml
<lexer>
<config>
<name>Lua</name>
...
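Before moving on, it can be worth confirming that the generated file is well-formed XML and carries the lexer name we expect. The following is a minimal sketch, not part of Chroma: it assumes only the <lexer><config><name> nesting shown in the snippet above, and the lexerFile type and lexerName function are hypothetical names chosen for illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"os"
)

// lexerFile mirrors only the part of the generated XML shown above:
// <lexer><config><name>...</name></config>...</lexer>.
// Every other element in the file is ignored by the decoder.
type lexerFile struct {
	XMLName xml.Name `xml:"lexer"`
	Name    string   `xml:"config>name"`
}

// lexerName parses a lexer XML document and returns its <name> value.
// It returns an error if the root element is not <lexer> or if the
// decoder hits malformed XML while reading it.
func lexerName(data []byte) (string, error) {
	var lf lexerFile
	if err := xml.Unmarshal(data, &lf); err != nil {
		return "", err
	}
	return lf.Name, nil
}

func main() {
	// Stand-in for the generated file; pass a path as the first
	// argument to check a real one instead.
	data := []byte(`<lexer><config><name>Lua</name></config></lexer>`)
	if len(os.Args) > 1 {
		var err error
		data, err = os.ReadFile(os.Args[1])
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
	}
	name, err := lexerName(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not parse lexer XML:", err)
		os.Exit(1)
	}
	fmt.Println("lexer name:", name)
}
```

Run with the path to lexers/embedded/lua.xml as its only argument, this prints the lexer name or reports the first parse error. It does not validate the tokenization rules themselves.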
Highlight some code with a Chroma lexer
Chroma provides a simple example test file we can modify to see what syntax highlighting with our new lexer looks like. First, though, we need to create a new Go module by running go mod init:
$ cd ..
$ docker-run-go go mod init main
go: creating new go.mod: module main
go: to add module requirements and sums:
go mod tidy
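Our program will depend on Chroma, so let’s go ahead and run go mod tidy as the output suggests. Note that go mod tidy derives the requirements from the imports in the module’s Go source files, so if main.go (created below) does not exist yet, the command needs to be run again after creating it.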
$ docker-run-go go mod tidy
We should now have 2 additional files, go.mod and go.sum. go.sum has some package hashes while go.mod should look like this:
go.mod
module main
go 1.25
require github.com/alecthomas/chroma/v2 v2.18.0
require github.com/dlclark/regexp2 v1.11.5 // indirect
Now we can create a main.go file and copy over the code from Chroma’s example test file, but we set the code variable to some Lua, print("hello"), and change the lexer we pass into the Highlight function to "lua":
main.go
package main

import (
	"log"
	"os"

	"github.com/alecthomas/chroma/v2/quick"
)

func main() {
	code := `print("hello")`
	err := quick.Highlight(os.Stdout, code, "lua", "html", "monokai")
	if err != nil {
		log.Fatal(err)
	}
}
Now we can try running our main.go like this:
$ docker-run-go go run main.go
go: downloading github.com/alecthomas/chroma/v2 v2.18.0
go: downloading github.com/dlclark/regexp2 v1.11.5
<html>
<style type="text/css">
...
That should print the markup (and styles) for highlighting that block of Lua code to the console. Notice, though, that Go downloaded the Chroma package from the GitHub repo. If we want to use our local copy of Chroma, which contains the lexer we just generated, we have to add a replace directive that points the import at our local directory:
$ docker-run-go go mod edit -replace \
github.com/alecthomas/chroma/v2@v2.18.0=./chroma
This adds the following line to our go.mod file:
go.mod
...
replace github.com/alecthomas/chroma/v2 v2.18.0 => ./chroma
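To double-check which modules a go.mod redirects to local paths, the replace lines can be read back programmatically. This is a rough sketch using only the standard library: it handles just the single-line form shown above (replace old [version] => new), ignores the parenthesized block form, and the replacements function is a hypothetical helper, not part of the Go toolchain.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// replacements returns a map from replaced module path to its
// replacement target, for single-line replace directives only.
func replacements(gomod string) map[string]string {
	out := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(gomod))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Shortest valid form: replace <old> => <new>
		if len(fields) < 4 || fields[0] != "replace" {
			continue
		}
		for i, f := range fields {
			// The target is the field right after "=>".
			if f == "=>" && i >= 2 && i+1 < len(fields) {
				out[fields[1]] = fields[i+1]
			}
		}
	}
	return out
}

func main() {
	// Sample contents matching the go.mod shown in this article.
	gomod := `module main
go 1.25
require github.com/alecthomas/chroma/v2 v2.18.0
require github.com/dlclark/regexp2 v1.11.5 // indirect
replace github.com/alecthomas/chroma/v2 v2.18.0 => ./chroma
`
	for old, target := range replacements(gomod) {
		fmt.Printf("%s -> %s\n", old, target)
	}
}
```

For the sample above, the only entry maps github.com/alecthomas/chroma/v2 to ./chroma, which is what we want before running main.go again.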
Now, when we run main.go, we should no longer see Chroma being downloaded, because Go is using our local copy:
$ docker-run-go go run main.go
go: downloading github.com/dlclark/regexp2 v1.11.5
<html>
<style type="text/css">
...
We should also see a list of styles followed by the HTML markup for highlighting our Lua code (formatted for legibility):
<pre class="chroma">
<code>
<span class="line">
<span class="cl">
<span class="n">print</span>
<span class="p">(</span>
<span class="s2">"hello"</span>
<span class="p">)</span>
</span>
</span>
</code>
</pre>
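Reading the spans by eye works for one line of Lua, but for longer samples it can help to list each token's CSS class next to its text. The sketch below is one way to do that with a regular expression. It assumes the markup uses the flat span layout shown above, and the tokens function and token type are hypothetical names. In Pygments-style class names, n, p, and s2 stand for name, punctuation, and double-quoted string.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
)

// leafSpan matches a span whose body contains no nested tags, which
// skips the wrapping "line" and "cl" spans and keeps only token spans.
var leafSpan = regexp.MustCompile(`<span class="([^"]+)">([^<]*)</span>`)

// token pairs a CSS class from the highlighted markup with its text.
type token struct {
	Class string
	Text  string
}

// tokens extracts (class, text) pairs from highlighted HTML markup,
// decoding HTML entities such as &#34; back into plain characters.
func tokens(markup string) []token {
	var out []token
	for _, m := range leafSpan.FindAllStringSubmatch(markup, -1) {
		out = append(out, token{Class: m[1], Text: html.UnescapeString(m[2])})
	}
	return out
}

func main() {
	// Sample markup modeled on the output above, written on one line
	// with the quotes entity-encoded.
	markup := `<pre class="chroma"><code><span class="line"><span class="cl">` +
		`<span class="n">print</span><span class="p">(</span>` +
		`<span class="s2">&#34;hello&#34;</span><span class="p">)</span>` +
		`</span></span></code></pre>`
	for _, t := range tokens(markup) {
		fmt.Printf("%-3s %s\n", t.Class, t.Text)
	}
}
```

A regular expression is enough here only because the token spans are not nested. Markup with nested token spans would need a real HTML parser.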
Add test data
If we want to add our lexer to Chroma, we will need to create some test data for it. We can create a file in lexers/testdata called lua.actual and add sample Lua source code to it that exercises the language’s tokens.
Record test output
Once we have test data, we need to record the expected output. We create another file called lexers/testdata/lua.expected. This is the file we will record to by running the following command from the Chroma root directory:
$ docker-run -e RECORD=true golang go test ./lexers
Once test output is recorded in lexers/testdata/lua.expected, we should visually inspect and verify that the expected data is correct.
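Visual inspection can be supplemented with a quick tally of token types, which makes unexpected types stand out. The sketch below assumes the recorded file is a JSON array of objects with "type" and "value" fields. That layout is an assumption to confirm against the recorded file, and countByType and recordedToken are hypothetical names. A non-zero count for a type named Error, if one appears, would suggest input the lexer could not match.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"sort"
)

// recordedToken models one entry of the recorded output, assuming
// each entry is an object with "type" and "value" fields.
type recordedToken struct {
	Type  string `json:"type"`
	Value string `json:"value"`
}

// countByType decodes the recorded tokens and counts them per type.
func countByType(data []byte) (map[string]int, error) {
	var toks []recordedToken
	if err := json.Unmarshal(data, &toks); err != nil {
		return nil, err
	}
	counts := make(map[string]int)
	for _, t := range toks {
		counts[t.Type]++
	}
	return counts, nil
}

func main() {
	// Illustrative sample; pass the path to a .expected file as the
	// first argument to summarize real recorded output.
	data := []byte(`[
		{"type":"Name","value":"print"},
		{"type":"Punctuation","value":"("},
		{"type":"Punctuation","value":")"}
	]`)
	if len(os.Args) > 1 {
		var err error
		data, err = os.ReadFile(os.Args[1])
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
	}
	counts, err := countByType(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not decode tokens:", err)
		os.Exit(1)
	}
	// Sort the type names so the summary is stable between runs.
	types := make([]string, 0, len(counts))
	for typ := range counts {
		types = append(types, typ)
	}
	sort.Strings(types)
	for _, typ := range types {
		fmt.Printf("%-20s %d\n", typ, counts[typ])
	}
}
```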
Run tests
As a final confirmation, we can run the tests to make sure we have not broken anything:
$ docker-run-go go test ./lexers
Conclusion
If we followed all these steps correctly, our lexer should be ready to be pushed to a git repo and for us to open a pull request!
Bonus!: Use local Pygments with pygments2chroma_xml.py
These lines in pygments2chroma_xml.py,
import pystache
from pygments import lexer as pygments_lexer
from pygments.token import _TokenType
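import whichever pygments package Python finds first on its module search path, which is normally the copy installed from the Python Package Index. But, if we want to convert a Pygments lexer from a local git repo, we can mount the pygments2chroma_xml.py script into the repo root directory and run it from there, because Python searches the script’s own directory first.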
$ git clone https://github.com/pygments/pygments.git
$ cd pygments
$ docker-run \
-v ../chroma/_tools/pygments2chroma_xml.py:/opt/pygments2chroma_xml.py \
python bash -c \
"pip install pystache && \
python pygments2chroma_xml.py pygments.lexers.scripting.LuaLexer && \
pip list"
We should see the lexer output followed by
Package  Version
-------- -------
pip      25.0.1
pystache 0.6.8
which indicates no remote pygments package was installed.