joeldueckdotcom

I’m working on a Racket #lang called Punct that serves the same purpose as Pollen — a programming environment for published artifacts. Punct combines Markdown and free-form Racket code, producing a format-independent AST.

Why would I do this? I have enjoyed using Pollen very much. But after seven years of improving my Racket skills, my “Pollen projects” have been making less and less actual use of Pollen’s facilities and features. I’ve started to wish for a DSL that has only the pieces I need: something with fewer moving parts and a different approach to markup and rendering.

Punct is meant to be a good fit when you want very lightweight markup and want the language to handle paragraph detection and footnotes for you, but still want the ability to use functions as markup for things Markdown doesn’t provide.

I have designed this for my own use, perhaps only for a single project. I plan to make it publicly available, but not as a package on the package server, thus contributing to the Lisp Curse.

Differences from Pollen

New Features

Metadata block

Sources can optionally add metadata using key: value lines delimited by lines consisting only of consecutive hyphens:

  #lang punct
    
  ---
  title: Prepare to be amazed
  date: 2020-05-07
  ---

This is a syntactic convenience that comes with a few rules and limitations:

If you want to use non-string values, or the results of expressions, in your metadata, you can use the set-meta function anywhere in the document or in code contained in other modules. Within the document body you can also use the ? macro as shorthand for set-meta.
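To make the shape of the metadata block concrete, here is a minimal sketch in plain Racket of how such a block could be read. It is not Punct's actual reader: `hyphen-line?` and `split-meta` are hypothetical names, and storing keys as symbols with values as trimmed strings is an assumption made for illustration. The sketch also assumes any blank lines before the opening delimiter have already been skipped.

```racket
#lang racket/base
(require racket/string)

;; Hypothetical helpers for illustration only; not part of Punct.

;; A delimiter is a line consisting only of consecutive hyphens.
(define (hyphen-line? ln)
  (regexp-match? #px"^-+$" ln))

;; Takes a list of lines. If the first line is a delimiter, collects
;; "key: value" lines up to the closing delimiter.
;; Returns two values: a hash of metadata and the remaining body lines.
(define (split-meta lines)
  (if (and (pair? lines) (hyphen-line? (car lines)))
      (let loop ([more (cdr lines)] [meta (hasheq)])
        (cond
          ;; No closing delimiter found: treat everything as body text.
          [(null? more) (values (hasheq) lines)]
          [(hyphen-line? (car more)) (values meta (cdr more))]
          [else
           (define m (regexp-match #px"^([^:]+):\\s*(.*)$" (car more)))
           (loop (cdr more)
                 (if m
                     ;; Assumption: every value stays a flat string.
                     (hash-set meta
                               (string->symbol (string-trim (cadr m)))
                               (string-trim (caddr m)))
                     meta))]))
      (values (hasheq) lines)))

(define-values (meta body)
  (split-meta '("---"
                "title: Prepare to be amazed"
                "date: 2020-05-07"
                "---"
                ""
                "Body text.")))

(hash-ref meta 'title) ; "Prepare to be amazed"
(hash-ref meta 'date)  ; "2020-05-07", a string rather than a date value
```

Because every value in this sketch comes out as a string, anything richer (a number, a list, the result of an expression) would have to be set from code, which is the role set-meta plays in Punct.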

Integrating Markdown

Markdown itself is too limited, but it would be nice to be able to use it as a starting point and add “tag functions” where richer markup is needed.

The language uses the commonmark package for Markdown processing. This package is ideal since it implements a thoroughly specified standard, is fast, and parses to an AST rather than directly to HTML.

Here’s how Markdown and Racket are combined:

  1. The language parses the source file with a Pollen-like reader first. All the code is expanded and evaluated, and the results of all the expressions are converted to strings. X-expressions get special handling during this conversion so they can be reassembled later.
  2. After all expressions are evaluated, the document is concatenated into a string that is parsed by commonmark, producing a document struct.
  3. The document struct is then transformed into another AST in the form of an X-expression, using a custom renderer. During this process, any X-expressions from step 1 are recognized and reconstituted in place.
  4. This final AST becomes the doc provided by the source document.
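Steps 2 through 4 can be approximated with the commonmark package on its own. This is a rough sketch rather than Punct's implementation: the `pieces` list is a made-up stand-in for the strings produced in step 1, and the stock `document->xexprs` renderer stands in for Punct's custom renderer. It assumes the package is installed (`raco pkg install commonmark`).

```racket
#lang racket/base
(require commonmark)
(provide doc)

;; Step 1 stand-in (hypothetical data): the strings that evaluating
;; the document's expressions might have produced.
(define pieces
  (list "This is a paragraph with "
        "**bold text**"
        ".\n\n"
        "> A quotation.\n"))

;; Step 2: concatenate everything and parse it in a single pass,
;; yielding commonmark's document struct.
(define parsed (string->document (apply string-append pieces)))

;; Step 3 stand-in: commonmark's stock renderer produces a list of
;; X-expressions. Punct uses a custom renderer here so that packed
;; X-expressions from step 1 can be recognized and rebuilt.
(define doc (document->xexprs parsed))

;; Step 4: `doc` is what the module provides (see the provide above).
```

Parsing only after concatenation means Markdown constructs can span the boundary between literal text and the output of an expression, which would not work if each piece were parsed separately.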

Layering two independent syntaxes on top of each other like this is tricky. The hard part is handling tag functions that emit something other than a stringish value, such as an X-expression that itself contains text that should be parsed as Markdown. For example, there might be a footnote reference in part of the caption of a figure tag.

The solution is to “flat pack” X-expressions before the CommonMark parsing pass; that is, to transform them recursively into flat strings delimited by HTML-style tags that preserve their attributes and elements. When parsing the combined string content, CommonMark will parse the delimiting tags as html and html-block values. Then, during the AST transformation (step 3 above), these values are recognized and used to reassemble the original X-expressions in place, with their original string content replaced by the parsed CommonMark elements.
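The packing half of that idea can be sketched in a few lines of plain Racket. This is an illustration under assumptions, not Punct's code: `flat-pack` and `attrs->string` are hypothetical names, the tag format is a guess at what "HTML-style" might look like, and attribute values are not escaped.

```racket
#lang racket/base
(require racket/string)

;; Hypothetical sketch; Punct's real encoding may differ.

;; Render an attribute list like ((class "wide")) as: class="wide"
;; (with a leading space). Values are NOT escaped in this sketch.
(define (attrs->string attrs)
  (string-append*
   (for/list ([a (in-list attrs)])
     (format " ~a=\"~a\"" (car a) (cadr a)))))

;; Recursively turn an X-expression into one flat string. Strings pass
;; through untouched, so any Markdown inside them is still visible to
;; the later CommonMark pass.
(define (flat-pack x)
  (cond
    [(string? x) x]
    [(and (pair? x) (symbol? (car x)))
     (define tag (car x))
     ;; An attribute list is a list of (symbol value) pairs in second position.
     (define has-attrs?
       (and (pair? (cdr x))
            (list? (cadr x))
            (andmap (lambda (a) (and (pair? a) (symbol? (car a))))
                    (cadr x))))
     (define attrs (if has-attrs? (cadr x) '()))
     (define kids (if has-attrs? (cddr x) (cdr x)))
     (format "<~a~a>~a</~a>"
             tag
             (attrs->string attrs)
             (string-append* (map flat-pack kids))
             tag)]
    ;; Numbers and other atoms are simply displayed.
    [else (format "~a" x)]))

(define packed
  (flat-pack '(figure ((class "wide"))
                      (caption "See the note[^1]."))))
;; packed is:
;; <figure class="wide"><caption>See the note[^1].</caption></figure>
```

The footnote reference inside the caption survives as ordinary text between the tags, so the Markdown parser can still see it. The unpacking half, which matches the opening and closing tags in the parsed output and rebuilds the X-expression around the parsed children, is omitted here.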

Name ideas

Example syntax

Racket code in Punct uses Scribble’s @ syntax, but with the “bullet” character (Unicode U+2022) as the control character rather than @.

This is an example of syntax only; the note, attrib and other functions shown are not actually provided by the language (yet).

  #lang punct "my-additional-tags.rkt"
    
  ---
  title: Prepare to be amazed
  date: 2020-05-07
  ---

  This is a paragraph. **Bold text**, etc. — you know the Markdown drill.

  The `my-additional-tags.rkt` above is an example of an optional module path 
  that will be `require`d into the current document for additional tag functions.

  > Famous quotation.
  >
  > •attrib{Surly Buster, [_Fight to Win_][ftw] (2008)}

  The above is an example of a Markdown blockquote containing a tag function which
  in turn contains Markdown.

  •note[#:date "2020-05-07" #:by "A Reader" #:bylink "foo@msn.com"]{

    This is a note added to the document[^1].

    •poem[#:title "Institutions"]{
      ‘Ləh’
    }

    [^1]: It can contain its own footnotes and link references.
  }

  [ftw]: https://surly.guy/fight-to-win/ 'Book website'

HTML Rendering

To try out the HTML renderer, run a Punct program (in DrRacket, for example), then enter the following in the REPL:

  (require punct/render/html)
  (doc->string doc)

Prior Art

How to Create a Pollen Markup Alternative in 61 Lines by Sage Gerard. Sage’s take was more about a text markup format that you could send through eval rather than creating a proper #lang where the sources would behave like first-class modules. Still a very useful experiment.