diff --git a/README.md b/README.md index 9bdc0ff..ce594bd 100644 --- a/README.md +++ b/README.md @@ -16,7 +16,9 @@ Inkscape are used for image processing. Nodejs is used for KaTeX. ## Usage ``` -usage: formatitko.py [-h] [-l IMG_LOOKUP_DIRS [IMG_LOOKUP_DIRS ...]] [-p IMG_PUBLIC_DIR] [-i IMG_WEB_PATH] [-w OUTPUT_HTML] [-t OUTPUT_TEX] input_filename +usage: formatitko [-h] [-l IMG_LOOKUP_DIRS [IMG_LOOKUP_DIRS ...]] [-p IMG_PUBLIC_DIR] [-c IMG_CACHE_DIR] [-i IMG_WEB_PATH] [-w OUTPUT_HTML] [-t OUTPUT_TEX] [-m OUTPUT_MD] + [-j OUTPUT_JSON] [--katex-server] [-k KATEX_SOCKET] [--debug] + input_filename positional arguments: input_filename The markdown file to process. @@ -24,16 +26,27 @@ positional arguments: options: -h, --help show this help message and exit -l IMG_LOOKUP_DIRS [IMG_LOOKUP_DIRS ...], --img-lookup-dirs IMG_LOOKUP_DIRS [IMG_LOOKUP_DIRS ...] - Image lookup directories. When processing images, the program will try to find the image in them first. Always looks for images in the same folder as the markdown - file. (default: []) + Image lookup directories. When processing images, the program will try to find the image in them first. Always looks for images in the same folder + as the markdown file. (default: []) -p IMG_PUBLIC_DIR, --img-public-dir IMG_PUBLIC_DIR - Directory to put processed images into. The program will not overwrite existing images. (default: public) + Directory to put processed images into. The program will overwrite images, whose dependencies are newer. (default: public) + -c IMG_CACHE_DIR, --img-cache-dir IMG_CACHE_DIR + Directory to cache processed images and intermediate products. The program will overwrite files, whose dependencies are newer. (default: cache) -i IMG_WEB_PATH, --img-web-path IMG_WEB_PATH Path where the processed images are available on the website. (default: /) -w OUTPUT_HTML, --output-html OUTPUT_HTML - The HTML file (for Web) to write into. (default: output.html) + The HTML file (for Web) to write into. (default: None) -t OUTPUT_TEX, --output-tex OUTPUT_TEX - The TEX file to write into. (default: output.tex) + The TEX file to write into. (default: None) + -m OUTPUT_MD, --output-md OUTPUT_MD + The Markdown file to write into. (Uses pandoc to generate markdown) (default: None) + -j OUTPUT_JSON, --output-json OUTPUT_JSON + The JSON file to dump the pandoc-compatible AST into. (default: None) + --katex-server Starts a KaTeX server and prints the socket filename onto stdout. Useful for running formatitko many times without starting the KaTeX server each + time. (default: False) + -k KATEX_SOCKET, --katex-socket KATEX_SOCKET + The KaTeX server socket filename obtained by running with `--katex-server`. (default: None) + --debug ``` ## Format @@ -69,12 +82,28 @@ There are two ways of including files. #### Importing The first is importing, which only takes the state (defined commands, metadata, etc.) from the file and any content is omitted. This is useful for creating -libraries of commands. The syntax is as follows: +libraries of commands. -[#test/empty.md]{} +There are three types of imports: -The curly braces are required for pandoc to parse the import properly and should -be left empty. +##### Python Module (the default) +```markdown +[#ksp_formatitko as ksp]{} +``` +or +```markdown +[#ksp_formatitko]{} +``` +with an optional `type=module` in the curly brackets, tries to import a python +module as a set of formatitko commands. See below for more details about +commands. + +##### JSON Metadata +[#test/test.json]{type=metadata key=orgs} + +This will import metadata from a JSON file. THe optional `key` argument sets the +key under which the whole JSON file will be placed. Dictionaries are merged, +others overwritten. #### Partials Partials are the very opposite of imports, they have their own context, which @@ -95,12 +124,19 @@ partial to `tex` or `html`. ### Groups Groups are pieces of markdown with their own sandboxed context, in other words, -inline partials. They function exactly the same as partials, namely can have -their own front matter. +inline partials. Syntax-wise they are pandoc Divs with the `.group` class. All +attributes of the Div will be passed down as metadata to the group. + +::: {.group lang=cs} +OOOoo český mód +::: + +If you want to have more fancy metadata, that can only be specified in a front +matter, you can use the following syntax: ```markdown {.group} --- -language: cs +lang: cs --- OOOoo český mód ``` @@ -114,6 +150,9 @@ fmt.Pritln("owo") ``` ```` +Note however, that when this syntax is used, pandoc is executed for each of +these blocks which could get slow. Using divs is preferred. + Groups and partials are also enclosed in `\begingroup` and `\endgroup` in the output TeX. @@ -138,15 +177,38 @@ pandoc feature.] ### Running python code Formátítko allows you to run Python code directly from your MD file. Any -`python` code block with the class `run` will be executed: +`python` code block with the class `run` will be executed. -#### Context +#### Command environment -You can access the current context using the `ctx` variable. The context +The commands will be executed as functions with the following signature: +```python +def command(element: Command, context: Context) -> list[Element]: +``` +some global variables may be available, and are defined in `command_env.py`: +```python +import panflute as pf +import formatitko.elements as fe +from formatitko.util import import_md_list +from formatitko.util import parse_string + +from formatitko.context import Context +from formatitko.command import Command +from panflute import Element +``` +##### `element` parameter + +The `element` parameter holds the element the command is currently being executed +on. In the case of running python blocks directly, it is probably not +interesting but will get interesting later. + +##### `context` parameter + +You can access the current context using the `context` parameter. The context provides read/write access to the FrontMatter metadata. The context has the following methods: -`ctx.get_metadata(key: str, simple: bool=True, immediate: bool=False)` +`context.get_metadata(key: str, simple: bool=True, immediate: bool=False)` - `key`: The key of the metadatum you want to get. Separate child keys with dots: `ctx.get_metadata("flags.foo")` @@ -156,13 +218,13 @@ following methods: - `immediate`: Only get metadatum from the current context, not from its parents. -`ctx.set_metadata(key: str, value)` +`context.set_metadata(key: str, value)` - `key`: The key of the metadatum you want to get. Separate child keys with dots: `ctx.get_metadata("flags.foo")` - `value`: Any value you want to assign to the metadatum -`ctx.unset_metadata(key: str)` +`context.unset_metadata(key: str)` Delete the metadatum in the current context and allow it to inherit the value from the parent context. @@ -172,26 +234,31 @@ from the parent context. Helper functions for flags exist which work the same as for metadata: -`ctx.is_flag_set(flag: str) -> bool` +`context.is_flag_set(flag: str) -> bool` -`ctx.set_flag(flag: str, val: bool)` +`context.set_flag(flag: str, val: bool)` -`ctx.unset_flag(flag: str)` +`context.unset_flag(flag: str)` -#### Writing output +There are also other useful functions, which you can see for yourself in +`context.py`. -There are two modes of writing output, plaintext and element-based. +> **WARNING**: Writing to metadata should **only** be done **at the beginning** +> of the document or a group (before any printable content). Writing to metadata +> in other places in the document might cause undefined behaviour (mostly some +> elements might behave as if the metadata was set elsewhere). -Plaintext mode uses the `print(text: str)` and `println(text: str)` functions, -that append text to a buffer which is then interpreted as markdown input. +##### Return value +The function **must** return a list of valid Elements. This list may be empty. +These elements will be placed in the document in the location where the command +was invoked. -Element-based mode uses the `appendChild(element: pf.Element)` and -`appendChildren(*elements: List[pf.Element])` functions which allow you to -append `panflute` elements to a list which is then again interpreted as input. -The `panflute` library is available as `pf`. +The `parse_string` function might be useful, it turns a simple string into a +list of panflute's `Str`s and `Space`s (without any formatting). If you want to +use markdown in your function output, you have to convert it yourself using +`import_md` but beware this calls pandoc, is potentially slow and is +discouraged. -When one of these functions is called, the mode is set and functions from the -other mode cannot be called within the same block of code. **Examples:** @@ -200,14 +267,15 @@ other mode cannot be called within the same block of code. title: Foo --- ```python {.run} -println("*wooo*") -println() -println("The title of this file is: " + ctx.get_metadata("title")) +return [ + pf.Para(pf.Emph(pf.Str("wooo"))), + pf.Para(*parse_string("The title of this file is: " + context.get_metadata("title"))) +] ``` ```` ```python {.run} -appendChild(pf.Para(pf.Strong(pf.Str("foo")))) +return [pf.Strong(*parse_string("Hello world!"))] ``` ### Defining and running commands @@ -218,7 +286,7 @@ Code blocks can be also saved and executed later. Defining is done using the **Example:** ```python {define=commandname} -print("foo") +return [pf.Str("foo")] ``` If you try to define the same command twice, you will get an error. To redefine @@ -230,7 +298,7 @@ There are multiple ways of running commands. There is the shorthand way: [!commandname]{} -Or using the `c` attribute on a span or a div: +Or using the `c` attribute on a span or a div (new: or a codeblock!): [Some content]{c=commandname} @@ -238,6 +306,16 @@ Or using the `c` attribute on a span or a div: Some content ::: +```python {define=bash} +import subprocess +c = subprocess.run(["bash", "-c", element.text], stdout=subprocess.PIPE, check=True, encoding="utf-8") +return [pf.Para(pf.Str(c.stdout))] +``` + +```bash {c=bash} +cat /etc/hostname +``` + To access the content or attributes of the div or span the command has been called on, the `element` variable is available, which contains the `panflute` representation of the element. @@ -245,7 +323,7 @@ representation of the element. **Example:** ```python {define=index} -appendChild(element.content[int(element.attributes["i"])]) +return [element.content[int(element.attributes["i"])]] ``` [Pick the third element from this span]{c=index i=2} @@ -268,23 +346,24 @@ blocks. To turn it off for a single block, don't specify a language or set the `highlight` attribute to `False`. You can also set the metadatum `highlight` to `false` in the FrontMatter to disable it in a given Group. To change the [highlighting style](https://pygments.org/styles/), you have to set the `highlight-style` -metadatum in the **top-level document** this is to prevent the need for many -inline style definitions. +metadatum or the `style` attribute directly on the element. **Examples:** ```python print("cool") ``` +```python {style=manni} +print("freezing") +``` + ```zsh {highlight=False} ./formatitko.py README.md ``` ### Language awareness -Formátítko is language aware, this means that the `language` metadatum is -somewhat special. When set using the front matter, it is also popped out to TeX -as a `\languagexx` macro. Currently supported values are `cs` and `en` for -internal uses but can be set to anything. +Formátítko is language aware, this means that the `lang` metadatum is +somewhat special. (It is also special for pandoc) ### NBSP Formátítko automatically inserts no-break spaces according to its sorta smart @@ -303,12 +382,9 @@ language. **Examples:** -```markdown {.group} ---- -language: cs ---- +::: {.group lang=cs} "Uvozovky se v českém testu píší 'jinak' než v angličtině." -``` +::: "In Czech texts, quotes are written 'differently' than in English" @@ -339,6 +415,9 @@ Images are automatically searched for in the directory where each markdown file command line parameter. After processing, they're all put into the folder specified with `--public-dir`. +Formátítko also does dependency management, which means that all images will be +regenerated only when their dependencies are newer. + #### Image processing Images are automatically processed so that they can be successfully used in both output formats. This includes generating multiple sizes and providing a @@ -348,12 +427,22 @@ To customize this, the `file-width`, `file-height`, `file-dpi`, `file-quality` and `no-srcset` attributes are available. All but the last one should be integers. -Keep in mind that the processing tries to be as lazy as possible, so it never -overwrites any files and if it finds the right format or resolution (only -judging by the filenames) in the lookup directories it will just copy that. This -means that any automatic attempts at conversion can be overridden by converting -the file yourself, naming it accordingly and placing it either in the public or -one of the lookup directories. +Processing also includes Asymptote images -- you can simply include an asymptote +program as an image and formátítko handles the rest for you. + +#### Content headers and footers + +If you want formatitko to generate fully formed html files for you, you might +want to add a HTML partial with the starting tags and `
`. This would +normally not work, because the entire document is wrapped with `