---
language: en
highlight-style: native
---

# Formátítko 2.0
A Python program based on [pandoc](https://pandoc.org/) and its Python library
[panflute](http://scorreia.com/software/panflute) for converting Markdown to TeX
and HTML, with added features such as image processing, Python-based macros and
much more.

## Requirements
This project requires `panflute 2.3.0`, which itself requires `pandoc 3.0`. If the
version of `pandoc` doesn't match, very weird things can happen. ImageMagick and
Inkscape are used for image processing. Node.js is used for KaTeX.

## Usage
```
usage: formatitko.py [-h] [-l IMG_LOOKUP_DIRS [IMG_LOOKUP_DIRS ...]] [-p IMG_PUBLIC_DIR] [-i IMG_WEB_PATH] [-w OUTPUT_HTML] [-t OUTPUT_TEX] input_filename

positional arguments:
  input_filename        The markdown file to process.

options:
  -h, --help            show this help message and exit
  -l IMG_LOOKUP_DIRS [IMG_LOOKUP_DIRS ...], --img-lookup-dirs IMG_LOOKUP_DIRS [IMG_LOOKUP_DIRS ...]
                        Image lookup directories. When processing images, the program will try to find the image in them first. Always looks for images in the same folder as the markdown
                        file. (default: [])
  -p IMG_PUBLIC_DIR, --img-public-dir IMG_PUBLIC_DIR
                        Directory to put processed images into. The program will not overwrite existing images. (default: public)
  -i IMG_WEB_PATH, --img-web-path IMG_WEB_PATH
                        Path where the processed images are available on the website. (default: /)
  -w OUTPUT_HTML, --output-html OUTPUT_HTML
                        The HTML file (for Web) to write into. (default: output.html)
  -t OUTPUT_TEX, --output-tex OUTPUT_TEX
                        The TEX file to write into. (default: output.tex)
```

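The help text above can be read back into an `argparse` definition. The following is a rough reconstruction from that help text only, not the project's actual source; the function name `build_parser` and the shortened help strings are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild a parser that matches the documented options and defaults."""
    parser = argparse.ArgumentParser(
        prog="formatitko.py",
        # Appends "(default: ...)" to option help, as seen in the usage text.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("-l", "--img-lookup-dirs", nargs="+", default=[],
                        help="Image lookup directories.")
    parser.add_argument("-p", "--img-public-dir", default="public",
                        help="Directory to put processed images into.")
    parser.add_argument("-i", "--img-web-path", default="/",
                        help="Path where the processed images are available on the website.")
    parser.add_argument("-w", "--output-html", default="output.html",
                        help="The HTML file (for Web) to write into.")
    parser.add_argument("-t", "--output-tex", default="output.tex",
                        help="The TEX file to write into.")
    parser.add_argument("input_filename", help="The markdown file to process.")
    return parser


# Two lookup directories, a custom HTML output, and the default TeX output.
args = build_parser().parse_args(
    ["-l", "img", "assets", "-w", "index.html", "README.md"]
)
print(args.img_lookup_dirs, args.output_html, args.output_tex, args.input_filename)
# prints: ['img', 'assets'] index.html output.tex README.md
```

Because `-l` takes one or more values, it has to be followed by another option (here `-w`) before the positional `input_filename`, otherwise the file name would be consumed as a lookup directory.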
## Format
Formátítko uses all the default pandoc markdown extensions except for
definition lists and citations. It also adds its own custom features.

## Features

### Hiding and showing elements based on flags

Flags can be set in the Front Matter or with Python code. Elements with the
`if` attribute are only shown if the flag is set to True, and elements with the
`ifn` attribute are only shown if the flag is not set to True.

**Example:**

```markdown {.group}
---
flags:
  foo: true
---
[This will be shown]{if=foo}

[This will not be shown]{if=bar}

[This will be shown]{ifn=bar}
```

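The rule for `if` and `ifn` can be sketched as a small predicate. This is only an illustration of the behaviour described above, assuming flags are stored in a plain dictionary; the name `is_shown` is hypothetical and does not come from Formátítko.

```python
def is_shown(attributes: dict, flags: dict) -> bool:
    """Decide whether an element is kept, following the `if`/`ifn` rules."""
    if "if" in attributes and flags.get(attributes["if"]) is not True:
        return False  # `if`: the named flag must be set to True
    if "ifn" in attributes and flags.get(attributes["ifn"]) is True:
        return False  # `ifn`: the named flag must not be set to True
    return True


flags = {"foo": True}
print(is_shown({"if": "foo"}, flags))   # True
print(is_shown({"if": "bar"}, flags))   # False
print(is_shown({"ifn": "bar"}, flags))  # True
```

An unset flag behaves like a flag that is not True, which is why `ifn=bar` shows its element in the example.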
### Including other files

There are two ways of including files.

#### Importing
The first is importing, which takes only the state (defined commands, metadata,
etc.) from the file and omits any content. This is useful for creating
libraries of commands. The syntax is as follows:

[#test/empty.md]{}

The curly braces are required for pandoc to parse the import properly and should
be left empty.

#### Partials
Partials are the very opposite of imports: they have their own context, which
inherits everything from the context they're included in but gets reset after
the file ends.

:::{partial=test/empty.md}
:::

If the `untrusted` attribute is set to True, the partial and all its children
will not be able to define commands or run inline blocks (but they will still be
able to run commands defined in the parent). ^[Please don't trust this for any
security though, we're playing with *eval* fire, this will never be secure.]

You can also include raw HTML and TeX if you set the `type` attribute of the
partial to `tex` or `html`.

### Groups

Groups are pieces of markdown with their own sandboxed context, in other words,
inline partials. They function exactly the same as partials; notably, they can
have their own front matter.

```markdown {.group}
---
language: cs
---
OOOoo český mód
```

If you need to nest groups or have code blocks inside groups, you can increase
the number of backticks around the outer block:

````markdown {.group}
```go
fmt.Println("owo")
```
````

Groups and partials are also enclosed in `\begingroup` and `\endgroup` in the
output TeX.

### Raw HTML and TeX ^[This is a pandoc feature]
If raw HTML or TeX is included in the markdown file, it will automagically pop
out into the respective output file.

<em style="color: red">red text</em>

\vskip1em

This has the advantage and disadvantage of being very *"automagic"*, which means
that, for instance, markdown inside HTML will still get interpreted as markdown.
It is, however, very unreliable, so in most cases you should use explicit
raw blocks with the unnamed attribute set to either `html` or `tex`. ^[Still a
pandoc feature.]

``` {=html}
<span style="color: red">red text</span>
```

### Running python code

Formátítko allows you to run Python code directly from your MD file. Any
`python` code block with the class `run` will be executed.

#### Context

You can access the current context using the `ctx` variable. The context
provides read/write access to the FrontMatter metadata. The context has the
following methods:

`ctx.get_metadata(key: str, simple: bool=True, immediate: bool=False)`

- `key`: The key of the metadatum you want to get. Separate child keys with
  dots: `ctx.get_metadata("flags.foo")`
- `simple`: Whether to use Python's simple builtin types or panflute's
  MetaValues. MetaValues can contain formatted text; simple values lose all
  formatting.
- `immediate`: Only get the metadatum from the current context, not from its
  parents.

`ctx.set_metadata(key: str, value)`

- `key`: The key of the metadatum you want to set. Separate child keys with
  dots: `ctx.set_metadata("flags.foo", True)`
- `value`: Any value you want to assign to the metadatum

`ctx.unset_metadata(key: str)`

Delete the metadatum in the current context and allow it to inherit the value
from the parent context.

- `key`: The key of the metadatum you want to delete. Separate child keys with
  dots: `ctx.unset_metadata("flags.foo")`

Helper functions for flags exist which work the same as for metadata:

`ctx.is_flag_set(flag: str) -> bool`

`ctx.set_flag(flag: str, val: bool)`

`ctx.unset_flag(flag: str)`

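How dotted keys, inheritance from the parent context and `immediate` fit together can be illustrated with a toy model. The class below is a hypothetical stand-in written for this README, not Formátítko's real context: it stores plain Python values, omits the `simple` parameter and treats a missing key as `None`.

```python
class SketchContext:
    """Toy stand-in for `ctx`: nested metadata with lookup in parent contexts."""

    def __init__(self, parent=None):
        self.parent = parent
        self.metadata = {}

    def get_metadata(self, key, immediate=False):
        value = self.metadata
        for part in key.split("."):  # walk "flags.foo" one level at a time
            if isinstance(value, dict) and part in value:
                value = value[part]
            else:
                value = None
                break
        if value is None and not immediate and self.parent is not None:
            return self.parent.get_metadata(key)  # fall back to the parent
        return value

    def set_metadata(self, key, value):
        *parents, last = key.split(".")
        node = self.metadata
        for part in parents:
            node = node.setdefault(part, {})  # create missing levels
        node[last] = value

    def unset_metadata(self, key):
        *parents, last = key.split(".")
        node = self.metadata
        for part in parents:
            node = node.get(part, {})
        node.pop(last, None)  # the parent's value becomes visible again


root = SketchContext()
root.set_metadata("flags.foo", True)
group = SketchContext(parent=root)
group.set_metadata("flags.foo", False)
print(group.get_metadata("flags.foo"))                  # False (own value)
group.unset_metadata("flags.foo")
print(group.get_metadata("flags.foo"))                  # True (inherited)
print(group.get_metadata("flags.foo", immediate=True))  # None (parents ignored)
```

Setting a value only ever touches the current context, so a group or partial cannot change what its parent sees.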
#### Writing output

There are two modes of writing output, plaintext and element-based.

Plaintext mode uses the `print(text: str)` and `println(text: str)` functions,
which append text to a buffer that is then interpreted as markdown input.

Element-based mode uses the `appendChild(element: pf.Element)` and
`appendChildren(*elements: List[pf.Element])` functions, which allow you to
append `panflute` elements to a list that is then again interpreted as input.
The `panflute` library is available as `pf`.

When one of these functions is called, the mode is set and functions from the
other mode cannot be called within the same block of code.

**Examples:**

````markdown {.group}
---
title: Foo
---
```python {.run}
println("*wooo*")
println()
println("The title of this file is: " + ctx.get_metadata("title"))
```
````

```python {.run}
appendChild(pf.Para(pf.Strong(pf.Str("foo"))))
```

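The rule that the first output call fixes the mode for the rest of the block can be pictured with a small model. The class `OutputBuffer` below is hypothetical and uses plain strings in place of `panflute` elements; the error type and message are assumptions, and the real behaviour on a mode switch may differ.

```python
class OutputBuffer:
    """Toy model of one code block's output: the first write fixes the mode."""

    def __init__(self):
        self.mode = None    # becomes "text" or "elements" on the first write
        self.text = ""
        self.elements = []

    def _lock(self, mode):
        if self.mode is None:
            self.mode = mode
        elif self.mode != mode:
            raise RuntimeError(
                f"block already writes {self.mode}, cannot switch to {mode}"
            )

    def println(self, text=""):
        self._lock("text")
        self.text += text + "\n"

    def appendChild(self, element):
        self._lock("elements")
        self.elements.append(element)


buffer = OutputBuffer()
buffer.println("*wooo*")        # the block is now in plaintext mode
try:
    buffer.appendChild("foo")   # element-based call in the same block
except RuntimeError as error:
    print(error)  # block already writes text, cannot switch to elements
```

Each code block would start with a fresh buffer, so different blocks in one document can use different modes.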
### Defining and running commands

Code blocks can also be saved and executed later. Defining is done using the
`define` attribute:

**Example:**

```python {define=commandname}
print("foo")
```

If you try to define the same command twice, you will get an error. To redefine
a command, use the `redefine` attribute instead of `define`.

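The difference between `define` and `redefine` can be sketched as a registry that refuses duplicates. This is a hypothetical illustration, not Formátítko's implementation: the class name, the choice of `NameError` and the assumption that `redefine` simply overwrites are all made up for the example.

```python
class CommandRegistry:
    """Toy registry of saved code blocks, keyed by command name."""

    def __init__(self):
        self.commands = {}

    def define(self, name, source):
        # `define`: a second definition of the same name is an error.
        if name in self.commands:
            raise NameError(f"command {name!r} is already defined, use redefine")
        self.commands[name] = source

    def redefine(self, name, source):
        # `redefine`: replace whatever was stored under this name.
        self.commands[name] = source


registry = CommandRegistry()
registry.define("commandname", 'print("foo")')
try:
    registry.define("commandname", 'print("bar")')
except NameError as error:
    print(error)  # command 'commandname' is already defined, use redefine
registry.redefine("commandname", 'print("bar")')
print(registry.commands["commandname"])  # print("bar")
```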
### Running defined commands

There are multiple ways of running commands. There is the shorthand way:

[!commandname]{}

Or using the `c` attribute on a span or a div:

[Some content]{c=commandname}

:::{c=commandname}
Some content
:::

To access the content or attributes of the div or span the command has been
called on, the `element` variable is available, which contains the `panflute`
representation of the element.

**Example:**

```python {define=index}
appendChild(element.content[int(element.attributes["i"])])
```

[Pick the third element from this span]{c=index i=2}

### Direct metadata print
Metadata can be printed directly using a shorthand. The advantage of this is that
it keeps the formatting from the metadatum's definition.

```markdown {.group}
---
a:
  b: some text with **strong**
---
[$a.b]{}
```

### Syntax highlighting
Formátítko uses [pygments](https://pygments.org/) to highlight syntax in code
blocks. To turn it off for a single block, don't specify a language or set the
`highlight` attribute to `False`. You can also set the metadatum `highlight` to
`false` in the FrontMatter to disable it in a given Group. To change the [highlighting
style](https://pygments.org/styles/), you have to set the `highlight-style`
metadatum in the **top-level document**; this is to prevent the need for many
inline style definitions.

**Examples:**
```python
print("cool")
```

```zsh {highlight=False}
./formatitko.py README.md
```

### Language awareness
Formátítko is language aware, which means that the `language` metadatum is
somewhat special. When set using the front matter, it is also popped out to TeX
as a `\languagexx` macro. Currently, `cs` and `en` are the values supported for
internal uses, but the metadatum can be set to anything.

### NBSP
Formátítko automatically inserts no-break spaces according to its sorta smart
rules (see the `whitespace.py` file for more info). These rules **depend on the
chosen language** (`cs` has some additional rules).

To insert a literal no-break space, you can either insert the Unicode no-break
space or use the HTML escape.

Enforcing a breakable space is not as painless: you should insert a zero-width
space beside the normal​ space.

### Smart quotes
Quotes get automatically converted to the typographic ones according to the
current language.

**Examples:**

```markdown {.group}
---
language: cs
---
"Uvozovky se v českém textu píší 'jinak' než v angličtině."
```

"In Czech texts, quotes are written 'differently' than in English."

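To show what language-dependent quotes mean in practice, here is a deliberately naive sketch. It is not how Formátítko does the conversion: the table `QUOTES` and the function `smarten` are hypothetical, quotes are assumed to be properly paired, and apostrophes are not handled at all.

```python
# Opening and closing marks per language: (opening, closing).
QUOTES = {
    "en": {'"': ("\u201c", "\u201d"), "'": ("\u2018", "\u2019")},
    "cs": {'"': ("\u201e", "\u201c"), "'": ("\u201a", "\u2018")},
}


def smarten(text: str, language: str) -> str:
    """Replace straight quotes by alternating opening and closing marks."""
    pairs = QUOTES[language]
    is_open = {'"': False, "'": False}
    out = []
    for char in text:
        if char in pairs:
            opening, closing = pairs[char]
            out.append(closing if is_open[char] else opening)
            is_open[char] = not is_open[char]
        else:
            out.append(char)
    return "".join(out)


print(smarten("\"Uvozovky 'jinak'\"", "cs"))      # „Uvozovky ‚jinak‘“
print(smarten("\"Quotes 'differently'\"", "en"))  # “Quotes ‘differently’”
```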
### Math
Math blocks get automatically converted to HTML using $Ka\TeX$ and fall out
unchanged into TeX output.

To make KaTeX as consistent with TeX as possible, the `\begingroup` and
`\endgroup` that are produced by [Groups](#groups) are also emulated in the
KaTeX environment, so macro definitions should be isolated as you expect.

### Images

#### Figures
Pandoc's [implicit
figures](https://pandoc.org/MANUAL.html#extension-implicit_figures) are enabled,
so images which are alone in a paragraph are automatically converted to figures:

![A single pixel image, wow!](test/1px.png "This is the alt text shown to screen readers (it defaults to the caption)"){width=10em}

To prevent this, add a backslash at the end of the line with the image:

![A single pixel image, wow!](test/1px.png "This is the alt text shown to screen readers"){width=10em}\

#### Image gathering
Images are automatically searched for in the directory where each markdown file is
(including partials) and also in directories listed in the `--img-lookup-dirs`
command line parameter. After processing, they're all put into the folder
specified with `--img-public-dir`.

#### Image processing
Images are automatically processed so that they can be successfully used in both
output formats. This includes generating multiple sizes and providing a
[srcset](https://developer.mozilla.org/en-US/docs/Learn/HTML/Multimedia_and_embedding/Responsive_images).

To customize this, the `file-width`, `file-height`, `file-dpi`, `file-quality`
and `no-srcset` attributes are available. All but the last one should be
integers.

Keep in mind that the processing tries to be as lazy as possible, so it never
overwrites any files and if it finds the right format or resolution (only
judging by the filenames) in the lookup directories it will just copy that. This
means that any automatic attempts at conversion can be overridden by converting
the file yourself, naming it accordingly and placing it either in the public or
one of the lookup directories.

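The lazy gathering described above can be pictured as "reuse, then copy, never overwrite". The sketch below is an assumption-laden illustration, not the real image pipeline: `gather_image` is a hypothetical name, files are matched by exact file name only, no conversion or resizing happens, and the search order (public directory, then lookup directories, then the markdown file's folder) is one reading of the help text.

```python
import shutil
import tempfile
from pathlib import Path


def gather_image(name, source_dir, lookup_dirs, public_dir):
    """Return the image's path in public_dir, copying it there only if missing."""
    public_dir = Path(public_dir)
    target = public_dir / name
    if target.exists():
        return target  # already present: existing files are never overwritten
    for directory in [*map(Path, lookup_dirs), Path(source_dir)]:
        candidate = directory / name
        if candidate.is_file():
            public_dir.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(candidate, target)
            return target
    raise FileNotFoundError(name)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "lookup").mkdir()
    (root / "src" / "logo.png").write_bytes(b"from source")
    (root / "lookup" / "logo.png").write_bytes(b"from lookup")
    result = gather_image("logo.png", root / "src", [root / "lookup"], root / "public")
    print(result.read_bytes())  # b'from lookup'
```

Dropping a hand-converted file into the public directory before running the tool therefore wins over anything the lookup would find.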
## Working with the produced output

### HTML
The HTML should be almost usable as-is. The styles for syntax highlighting are
added automatically. The styles for KaTeX, however, are not and should be added in
your `<head>`^[This is taken directly from [KaTeX's docs](https://katex.org/docs/browser.html)]:

```html
<link rel='stylesheet' href='https://cdn.jsdelivr.net/npm/katex@0.16.4/dist/katex.min.css' integrity='sha384-vKruj+a13U8yHIkAyGgK1J3ArTLzrFGBbBc0tDp4ad/EyewESeXE/Iv67Aj8gKZ0' crossorigin='anonymous'>
```

Also, the output HTML is not intended as a standalone file but should be included
as part of a larger template (one that includes a doctype, other CSS, etc.).

### TeX
The TeX output is not usable as-is. Many of the elements are just converted to
macros, which you have to define yourself. There is an example implementation in
`formatitko.tex`, which uses LuaTeX and the ucwmac package, but you should
customize it to your needs (and to the context in which the output is used).