Welcome to pyparsing-highlighting’s documentation!
Syntax highlighting with pyparsing, supporting both HTML output and prompt_toolkit–style terminal output. The PPHighlighter
class can also be used as a lexer for syntax highlighting as you type in prompt_toolkit. It is compatible with existing Pygments styles.
The main benefit of pyparsing-highlighting over Pygments is that pyparsing parse expressions are both more powerful and easier to understand than Pygments lexers. pyparsing implements parsing expression grammars using parser combinators: higher-level parse expressions are built up in Python code out of lower-level parse expressions, in a way that is straightforward to construct, readable, modular, well-structured, and easy to maintain.
See the official pyparsing documentation or my unofficial (epydoc) documentation.
Requirements
- Python 3.5+
Note that PyPy, a JIT compiler implementation of Python, is often able to achieve around 5x the performance of CPython, the reference Python implementation.
- pyparsing
- prompt_toolkit 2.0+
- Pygments (optional; needed to use Pygments styles)
Installation
pip3 install -U pyparsing-highlighting
Or, after cloning the repository on GitHub:
python3 setup.py install
(or, with PyPy):
pypy3 setup.py install
Examples
The following code demonstrates the use of PPHighlighter:
from pp_highlighting import PPHighlighter
from prompt_toolkit.styles import Style
import pyparsing as pp
from pyparsing import pyparsing_common as ppc
def parser_factory(styler):
    a = styler('class:int', ppc.integer)
    return pp.delimitedList(a)
pph = PPHighlighter(parser_factory)
style = Style([('int', '#528f50')])
pph.print('1, 2, 3', style=style)
This prints 1, 2, 3 to the terminal, with each integer displayed in the color #528f50 and the commas unstyled.
The following code generates HTML:
pph.highlight_html('1, 2, 3')
The output is:
<pre class="highlight"><span class="int">1</span>, <span class="int">2</span>, <span class="int">3</span></pre>
There is also a lower-level API: pph.highlight('1, 2, 3') returns the following:
FormattedText([('class:int', '1'), ('', ', '), ('class:int', '2'), ('', ', '), ('class:int', '3')])
A FormattedText instance can be passed to prompt_toolkit.print_formatted_text(), along with a Style mapping the class names to colors, for display on the terminal. See the prompt_toolkit formatted text documentation and formatted text API documentation.
PPHighlighter can also be passed to a prompt_toolkit.PromptSession as the lexer argument, which will perform syntax highlighting as you type. For examples of this, see examples/calc.py, examples/json_pph.py, examples/repr.py, and examples/sexp.py. The examples can be run from the project root directory:
python3 -m examples.calc
python3 -m examples.json_pph
python3 -m examples.repr
python3 -m examples.sexp
Error Handling
If the parse expression should fail to match, it will be tried again at successive locations until it succeeds. Text encountered during retrying will be passed through unstyled. For example:
from pp_highlighting import PPHighlighter
import pyparsing as pp
from pyparsing import pyparsing_common as ppc
def parser_factory(styler):
    return styler('ansicyan', ppc.integer) + styler('ansired', ppc.identifier)
pph = PPHighlighter(parser_factory)
pph.print('1a 2b three 4c')
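The output is 1a 2b three 4c, with the integers in cyan, the identifiers a, b, and c in red, and the word three unstyled.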
Note that this parse expression does not explicitly match more than one integer/identifier pair. After it matches, it is retried on the space after the first pair, which fails, and then it is retried again starting on the first character of the second pair, which succeeds. It is then retried at each successive location, failing throughout the word three, until it reaches 4c, where it succeeds.
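The retry behavior described above can be sketched without pyparsing. The following is a minimal illustration using only the standard library's re module, not pyparsing-highlighting's implementation. The PAIR regular expression and the highlight_with_retry function are hypothetical stand-ins for the parse expression and the highlighter, and, unlike pyparsing, the regular expression does not skip whitespace between the integer and the identifier.

```python
import re

# Hypothetical stand-in for the parse expression: an integer followed by an identifier.
PAIR = re.compile(r'(\d+)([A-Za-z_]\w*)')


def highlight_with_retry(text):
    """Return (style, text) fragments, retrying the match at each successive location."""
    fragments = []
    unstyled = []  # characters skipped while the match keeps failing
    pos = 0
    while pos < len(text):
        match = PAIR.match(text, pos)
        if match is None:
            # The expression failed here: pass the character through and retry one step later.
            unstyled.append(text[pos])
            pos += 1
            continue
        if unstyled:
            # Flush the skipped text as a single unstyled fragment.
            fragments.append(('', ''.join(unstyled)))
            unstyled = []
        fragments.append(('ansicyan', match.group(1)))
        fragments.append(('ansired', match.group(2)))
        pos = match.end()
    if unstyled:
        fragments.append(('', ''.join(unstyled)))
    return fragments


result = highlight_with_retry('1a 2b three 4c')
print(result)
```

Joining the text of every fragment reproduces the input exactly, which is the property that makes this approach usable on incomplete input while the user is still typing.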
It is often possible to take advantage of pyparsing-highlighting’s error handling to write a simplified parse expression that does not parse a language fully but which still does ‘lexer-like’ analysis in a way that is robust to errors, and which continues to work even while the user is still typing. examples/repr.py is an example along these lines.
Testing
(From the project root directory):
To run the unit tests:
python3 -m unittest
To run the regression benchmark:
python3 -m tests.benchmark
Module pp_highlighting
Syntax highlighting for prompt_toolkit and HTML with pyparsing.
pp_highlighting.dummy_styler = <pp_highlighting.pp_highlighter.DummyStyler object>
    An importable instance of DummyStyler to pass to parser factories.
    Type: DummyStyler
class pp_highlighting.DummyStyler
    Bases: pp_highlighting.pp_highlighter.Styler
    A drop-in replacement for Styler which, when called, merely returns a copy of the given parse expression without capturing text or applying styles. To aid in testing whether a parser factory has been passed a DummyStyler object, bool(DummyStyler()) is False.
    __call__(style, expr)
        Returns a copy of the given parse expression.
        Parameters:
        - style (Union[pygments.token.Token, str]) – Ignored.
        - expr (Union[pyparsing.ParserElement, str]) – Copied, unless it is a string literal, in which case it will be wrapped by pyparsing.ParserElement._literalStringClass (default pyparsing.Literal).
        Returns: pyparsing.ParserElement – A copy of the input parse expression.
class pp_highlighting.PPHighlighter(parser_factory, *, uses_pygments_tokens=False)
    Bases: prompt_toolkit.lexers.base.Lexer
    Syntax highlighting for prompt_toolkit and HTML with pyparsing.
    This class can be used to highlight text via its highlight() method (for prompt_toolkit.print_formatted_text(); see the prompt_toolkit documentation for details), its highlight_html() method, its print() method, and by passing it as the lexer argument to a prompt_toolkit.PromptSession.
    __init__(parser_factory, *, uses_pygments_tokens=False)
        Constructs a new PPHighlighter.
        You should supply a parser factory, a function that takes one argument and returns a parse expression. PPHighlighter will pass a Styler object as the argument (see Styler for more details).
        Examples
        >>> def parser_factory(styler):
        ...     a = styler('class:int', ppc.integer)
        ...     return pp.delimitedList(a)
        >>> pph = PPHighlighter(parser_factory)
        >>> pph.highlight('1, 2, 3')
        FormattedText([('class:int', '1'), ('', ', '), ('class:int', '2'), ('', ', '), ('class:int', '3')])
        FormattedText instances can be passed to prompt_toolkit.print_formatted_text().
        Parameters:
        - parser_factory (Callable[[Styler], pyparsing.ParserElement]) – The parser factory.
        - uses_pygments_tokens (bool) – Whether or not the parser is styled using Pygments tokens.
        Raises: ImportError – If uses_pygments_tokens is True and Pygments is not installed.
    highlight(s)
        Highlights a string, returning a list of fragments suitable for prompt_toolkit.print_formatted_text().
        Parameters: s (str) – The input string.
        Returns: prompt_toolkit.formatted_text.FormattedText – The resulting list of prompt_toolkit text fragments.
    lex_document(document)
        Takes a Document and returns a callable that takes a line number and returns a list of (style_str, text) tuples for that line.
        XXX: Note that in the past, this was supposed to return a list of (Token, text) tuples, just like a Pygments lexer.
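To illustrate the shape of the callable described above, here is a minimal standard-library sketch that splits a flat list of (style_str, text) fragments into per-line lists. It is not the library's implementation: the name lines_from_fragments is hypothetical, and dropping the newline characters and returning an empty list for out-of-range line numbers are assumptions of this sketch.

```python
def lines_from_fragments(fragments):
    """Group (style_str, text) fragments by line and return a line-number lookup."""
    lines = [[]]
    for style, text in fragments:
        for i, piece in enumerate(text.split('\n')):
            if i > 0:
                # Every newline inside a fragment starts a new line.
                lines.append([])
            if piece:
                lines[-1].append((style, piece))

    def get_line(lineno):
        # Assumption: out-of-range line numbers yield an empty list.
        if 0 <= lineno < len(lines):
            return lines[lineno]
        return []

    return get_line


get_line = lines_from_fragments([('class:int', '1'), ('', ',\n'), ('class:int', '2')])
print(get_line(0))
print(get_line(1))
```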
    highlight_html(s, *, css_class='highlight')
        Highlights a string, returning HTML.
        Only CSS class names are currently supported. Parts of the style string that do not begin with class: will be ignored. If there are dots in the class name, they will be turned into hyphens.
        Parameters:
        - s (str) – The input string.
        - css_class (str) – The CSS class for the wrapping tag.
        Returns: str – The generated HTML.
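The class-name handling described above can be illustrated with a small standard-library sketch. It is not the library's code: fragments_to_html is a hypothetical helper, and HTML-escaping the text and splitting the style string on whitespace are assumptions of this sketch.

```python
from html import escape


def fragments_to_html(fragments, css_class='highlight'):
    """Render (style_str, text) fragments as a <pre> block of <span> elements."""
    parts = []
    for style, text in fragments:
        # Keep only the 'class:' parts of the style string; dots become hyphens.
        classes = [
            part[len('class:'):].replace('.', '-')
            for part in style.split()
            if part.startswith('class:')
        ]
        if classes:
            parts.append('<span class="{}">{}</span>'.format(
                escape(' '.join(classes)), escape(text)))
        else:
            # Fragments without a class are emitted as plain (escaped) text.
            parts.append(escape(text))
    return '<pre class="{}">{}</pre>'.format(escape(css_class), ''.join(parts))


fragments = [('class:int', '1'), ('', ', '), ('class:int', '2'), ('', ', '), ('class:int', '3')]
markup = fragments_to_html(fragments)
print(markup)
```

With the fragments from the earlier highlight('1, 2, 3') example, this sketch produces the same markup as shown for highlight_html('1, 2, 3').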
    print(*values, file=sys.stdout, **kwargs)
        Highlights and prints the values to a stream, or to sys.stdout by default. It calls prompt_toolkit.print_formatted_text() internally and accepts the same keyword arguments, which are compatible with the builtin print().
        Default values of keyword-only arguments:
        print(*values, sep=' ', end='\n', file=sys.stdout, flush=False, style=None, output=None, color_depth=None, style_transformation=None, include_default_pygments_style=None)
class pp_highlighting.PPValidator(expr, *, multiline=True, move_cursor_to_end=False)
    Bases: prompt_toolkit.validation.Validator
    A prompt_toolkit Validator for pyparsing.
    __init__(expr, *, multiline=True, move_cursor_to_end=False)
        Constructs a new PPValidator.
        Parameters:
        - expr (pyparsing.ParserElement) – The parser to use for validation.
        - multiline (bool) – Whether to include the line number in the error message.
        - move_cursor_to_end (bool) – Whether to move the cursor to the end of the input if a non-pyparsing exception was raised during parsing.
class pp_highlighting.Styler
    Bases: object
    Wraps pyparsing parse expressions to capture styled text fragments.
    __call__(style, expr)
        Wraps the given parse expression to capture the original text it matched, and returns the modified parse expression. The style argument can be either a prompt_toolkit style string or a Pygments token.
        Parameters:
        - style (Union[pygments.token.Token, str]) – The style to set for this text fragment, as a string or a Pygments token.
        - expr (Union[pyparsing.ParserElement, str]) – The pyparsing parser to wrap. If a literal string is specified, it will be wrapped by pyparsing.ParserElement._literalStringClass (default pyparsing.Literal).
        Returns: pyparsing.ParserElement – The wrapped parser.
    delete(loc)
        Removes the styled text fragment starting at a given location if it exists.
        Parameters: loc (int) – The start location of the styled text fragment to delete.