3 Feed Constructs

 (require splitflap/constructs) package: splitflap-lib

The format of feeds is specified by the Atom 1.0 and RSS 2.0 specifications (and, for all practical purposes, by Apple’s Podcast feed requirements in the case of podcasts). These in turn reference other RFCs to specify the format of many individual elements: timestamps, domain names, email addresses, people, identifiers, and languages.

Splitflap makes heavy use of custom contracts to ensure conformity to the spec at every level. In cases where it makes things simpler, Splitflap is a bit more strict than the actual spec.

The bindings documented in this section are provided by the main splitflap module as well as by splitflap/constructs.

3.1 Tag URIs

Feeds, and items contained in feeds, require some globally unique identifier. Although any kind of reasonably unique identifier can be used in a feed, Splitflap takes the unreasonably opinionated stance of allowing only tag URIs, which are easy to create and read, and which can remain stable even if the resource’s URL changes.

A tag URI is an identifier of the form tag:authority,date:specific. The authority is a domain name (or email address) held by you as of date; together, the authority and the date form a unique tagging entity, which acts kind of like a namespace. The specific is a string uniquely identifying a particular resource within the tagging entity.

The tag URI scheme is formalized in RFC 4151.
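
To make the tag:authority,date:specific layout concrete, here is a minimal sketch that pulls the three parts out of a tag URI string with a regular expression. The helper split-tag-uri is hypothetical and not part of Splitflap; it only illustrates the layout and performs no validation of the individual parts.

```racket
#lang racket/base
(require racket/match)

;; split-tag-uri is a hypothetical helper for illustration only.
;; It splits on the first comma and the first colon after it, and
;; does not check that the parts are valid.
(define (split-tag-uri str)
  (match (regexp-match #px"^tag:([^,]+),([^:]+):(.*)$" str)
    [(list _ authority date specific) (list authority date specific)]
    [_ #f]))

(split-tag-uri "tag:rclib.example.com,2012-04-01:Marian'sBlog")
;; => '("rclib.example.com" "2012-04-01" "Marian'sBlog")
(split-tag-uri "https://rclib.example.com/blog")
;; => #f
```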

procedure

(mint-tag-uri authority date specific) → tag-uri?

  authority : (or/c dns-domain? email-address?)
  date : tag-entity-date?
  specific : tag-specific-string?
Returns a tag URI struct for use as a unique identifier in a feed-item, feed, episode or podcast.

The date must be any date on which you had ownership or assignment of the domain or email address at 00:00 UTC (the start of the day). (See tag-entity-date?.)

The specific is a string that must be reliably and permanently unique within the set of things that your feed is serving. See tag-specific-string? for information about what characters are allowed here.

Examples:
> (mint-tag-uri "rclib.example.com" "2012-04-01" "Marian'sBlog")

#<tag-uri "tag:rclib.example.com,2012-04-01:Marian'sBlog">

> (mint-tag-uri "diveintomark.example.com" "2003" "3.2397")

#<tag-uri "tag:diveintomark.example.com,2003:3.2397">

procedure

(tag-uri->string tag) → non-empty-string?

  tag : tag-uri?
Converts a tag-uri into a string.

Examples:
> (define rclib-id (mint-tag-uri "rclib.example.com" "2012-04-01" "Marian'sBlog"))
> (tag-uri->string rclib-id)

"tag:rclib.example.com,2012-04-01:Marian'sBlog"

procedure

(append-specific tag suffix) → tag-uri?

  tag : tag-uri?
  suffix : tag-specific-string?
Returns a copy of tag with suffix appended to the “specific” (last) portion of the tag URI. This allows you to append to a feed’s tag URI to create unique identifiers for the items within that feed.

Examples:
> (define kottke-id (mint-tag-uri "kottke.example.com" "2005-12" "1"))
> kottke-id

#<tag-uri "tag:kottke.example.com,2005-12:1">

> (append-specific kottke-id "post-slug")

#<tag-uri "tag:kottke.example.com,2005-12:1.post-slug">

procedure

(tag=? tag1 tag2) → boolean?

  tag1 : tag-uri?
  tag2 : tag-uri?

The tag URI spec defines tags as being equal when their byte-strings are indistinguishable.

Returns #t if the tag-uri->string representations of tag1 and tag2 are equal?, #f otherwise.
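
A short usage sketch, assuming the splitflap package is installed. The identifiers feed-id, same-id and item-id are names chosen for this example only.

```racket
#lang racket/base
(require splitflap)

(define feed-id (mint-tag-uri "rclib.example.com" "2012-04-01" "Marian'sBlog"))
(define same-id (mint-tag-uri "rclib.example.com" "2012-04-01" "Marian'sBlog"))
(define item-id (append-specific feed-id "first-post"))

(tag=? feed-id same-id) ; => #t, the string forms are identical
(tag=? feed-id item-id) ; => #f, the specific parts differ
```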

procedure

(tag-entity-date? str) → boolean?

  str : string?
Returns #t if str is a string of the form "YYYY[-MM[-DD]]"; that is, an acceptable date format for a tag URI according to RFC 4151.

Examples:
; Equivalent to January 1, 2012
> (tag-entity-date? "2012")

#t

; Equivalent to June 1, 2012
> (tag-entity-date? "2012-06")

#t

; take a guess on this one
> (tag-entity-date? "2012-10-21")

#t

> (tag-entity-date? "2012-1-1")

#f
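
The "YYYY[-MM[-DD]]" shape can be approximated with a regular expression. This is only a sketch of the shape, not Splitflap's implementation: entity-date-shape? is a hypothetical name, and it does not check that the month or day falls in a sensible range.

```racket
#lang racket/base

;; Hypothetical shape check: four digits, optionally followed by -MM,
;; which is in turn optionally followed by -DD.
(define (entity-date-shape? str)
  (regexp-match? #px"^\\d{4}(-\\d{2}(-\\d{2})?)?$" str))

(entity-date-shape? "2012")       ; => #t
(entity-date-shape? "2012-06")    ; => #t
(entity-date-shape? "2012-10-21") ; => #t
(entity-date-shape? "2012-1-1")   ; => #f, month and day need two digits
```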

procedure

(tag-specific-string? str) → boolean?

  str : string?
Returns #t if str is an acceptable string for the “specific” portion of a tag URI as specified in RFC 4151: a string comprised only of the characters in the range a–z, A–Z, 0–9 or in the set -._~!$&'()*+,;=:@/?.

Examples:
> (tag-specific-string? "abcdABCD01923")

#t

> (tag-specific-string? "-._~!$&'()*+,;=:@/?")

#t

> (tag-specific-string? "")

#t

> (tag-specific-string? "^")

#f

procedure

(tag-uri? v) → boolean?

  v : any/c
Returns #t when v is a tag-uri struct.

3.2 Persons

procedure

(person name email [url]) → person?

  name : non-empty-string?
  email : email-address?
  url : (or/c valid-url-string? #f) = #f
Returns a #<person> struct for use in a feed-item, feed, episode or podcast.

The Atom 1.0 and RSS 2.0 specs both have opinions about how people should be referenced in feeds. Atom requires only a name but also allows up to one email address and up to one URI. RSS requires one email address optionally followed by anything. So person requires both a name and an email, and the url is optional.
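
As a small illustration of that signature, the sketch below builds a person with the optional URL supplied. It assumes the splitflap package is installed; the name, email address and URL are made-up values.

```racket
#lang racket/base
(require splitflap)

;; Name and email are required; the URL is the optional third argument.
(define marian
  (person "Marian Paroo"
          "marian@rclib.example.com"
          "https://rclib.example.com"))

(person? marian) ; => #t
```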

procedure

(person->xexpr p entity dialect) → txexpr?

  p : person?
  entity : symbol?
  dialect : (or/c 'rss 'atom 'itunes)
Converts p into a tagged X-expression using entity as the enclosing tag name.

Examples:
> (define frank (person "Frankincense Pontipee" "frank@example.com"))
> (person->xexpr frank 'author 'atom)

'(author (name "Frankincense Pontipee") (email "frank@example.com"))

> (person->xexpr frank 'contributor 'atom)

'(contributor (name "Frankincense Pontipee") (email "frank@example.com"))

> (person->xexpr frank 'author 'rss)

'(author "frank@example.com (Frankincense Pontipee)")

> (person->xexpr frank 'itunes:owner 'itunes)

'(itunes:owner
  (itunes:name "Frankincense Pontipee")
  (itunes:email "frank@example.com"))

procedure

(person? v) → boolean?

  v : any/c
Returns #t when v is a person struct, #f otherwise.

3.3 Date and time information

Feeds and feed items must be timestamped, and these values must include timezone information. Splitflap leans on the gregor library for this functionality (in particular, Moments and Time Zones and UTC Offsets) and provides a couple of helper functions to make things a bit more ergonomic.

procedure

(infer-moment [str]) → moment?

  str : string? = ""
Parses str and returns a precise moment, inferring time information where omitted and using current-timezone as the time zone for the moment.

If str is "", then the result of now/moment is returned. Otherwise str must be in the form "YYYY-MM-DD [hh:mm[:ss]]" or an exception is raised. If the seconds are omitted, 00 is assumed, and if the hours and minutes are omitted, 00:00:00 (the very start of the date) is assumed.

Examples:
> (infer-moment "2012-08-31")

#<moment 2012-08-31T00:00:00-05:00[America/Chicago]>

> (infer-moment "2012-08-31 13:34")

#<moment 2012-08-31T13:34:00-05:00[America/Chicago]>

> (infer-moment "2015-10-02 01:03:15")

#<moment 2015-10-02T01:03:15-05:00[America/Chicago]>

> (parameterize ([current-timezone -14400])
    (infer-moment "2015-10-02 01:03:15"))

#<moment 2015-10-02T01:03:15-04:00>

> (infer-moment "2012-09-14 12")

#<moment 2012-09-14T00:00:00-05:00[America/Chicago]>

> (infer-moment)

#<moment 2024-11-12T10:28:41.516962891-06:00[America/Chicago]>

Changed in version 1.2 of package splitflap-lib: Added no-argument form for current moment

procedure

(moment->string m dialect) → non-empty-string?

  m : moment?
  dialect : (or/c 'atom 'rss)
Converts m into a timestamp in the format required by the chosen dialect: RFC 3339 for Atom and RFC 822 for RSS.

Examples:
> (define m1 (infer-moment "2012-10-01"))
> (moment->string m1 'atom)

"2012-10-01T00:00:00-05:00"

> (moment->string m1 'rss)

"Mon, 1 Oct 2012 00:00:00 -0500"

> (parameterize ([current-timezone 0])
    (moment->string (infer-moment "2012-10-01") 'atom))

"2012-10-01T00:00:00Z"

3.4 Enclosures and MIME types

An enclosure is an arbitrary resource related to a feed item that is potentially large in size and may require special handling. The canonical example is an MP3 file containing the audio for a podcast episode.

struct

(struct enclosure (url mime-type size))

  url : valid-url-string?
  mime-type : (or/c non-empty-string? #f)
  size : exact-nonnegative-integer?
A structure type for enclosures.

The mime-type, if provided and not set to #f, must be a useable MIME type, but is not currently validated to ensure this. The size should be the resource’s size in bytes.

This struct qualifies as food, so it can be converted to XML with express-xml.

procedure

(file->enclosure file base-url) → enclosure?

  file : path-string?
  base-url : valid-url-string?
Returns an enclosure for file, with a MIME type matching the file’s extension (if it can be determined), the URL set to file appended onto base-url, and the size set to the file’s actual length in bytes.

This procedure accesses the filesystem; if file does not exist, an exception is raised.

Examples:
; Make a temporary file
> (define audio-file (make-temporary-file "audio-~a.m4a"))
> (display-to-file (make-bytes 100 66) audio-file #:exists 'truncate)
; Pass the temp file to an enclosure
> (display
   (express-xml (file->enclosure audio-file "http://example.com") 'atom))

<link rel="enclosure" href="http://example.com/audio-17314289211731428921526.m4a" length="100" type="audio/mp4" />

; Cleanup
> (delete-file audio-file)

value

mime-types-by-ext

A promise that, when forced, yields a hash table mapping file extensions (in lowercase symbol form) to MIME types.

This table is built directly from the list maintained in the Apache SVN repository.

Example:
> (hash-ref (force mime-types-by-ext) 'epub)

"application/epub+zip"

procedure

(path/string->mime-type path) → (or/c string? #f)

  path : path-string?
Parses a file extension from path and returns its corresponding MIME type if one exists in mime-types-by-ext, #f otherwise. This function does not access the file system.

Examples:
> (path/string->mime-type ".m4a")

"audio/mp4"

> (path/string->mime-type "SIGIL_v1_21.wad")

"application/x-doom"

> (path/string->mime-type "mp3") ; No period, so no file extension!

#f
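
Because path/string->mime-type does not touch the file system, it can be combined with the enclosure struct when the resource is not available locally and file->enclosure cannot be used. The sketch below assumes the splitflap package is installed; the URL and byte size are made-up values.

```racket
#lang racket/base
(require splitflap)

;; Build an enclosure by hand for a file that only exists remotely.
;; The URL and the size in bytes are invented for this example.
(define episode
  (enclosure "http://example.com/audio/episode-1.m4a"
             (path/string->mime-type "episode-1.m4a")
             4815162))

(enclosure-mime-type episode) ; => "audio/mp4"
(enclosure-size episode)      ; => 4815162
```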

3.5 Domains, URLs and email addresses

procedure

(dns-domain? v) → boolean?

  v : any/c
Returns #t if v is a string whose entire contents are a valid DNS domain according to RFC 1035:

Examples:
> (dns-domain? "a")

#t

> (dns-domain? "rclib.org")

#t

> (dns-domain? "a.b.c.d.e-f")

#t

> (dns-domain? "a.b1000.com")

#t

> (define longest-valid-label (make-string 62 #\a))
> (define longest-valid-domain
    (string-append longest-valid-label "." ; 63 bytes (including length header)
                   longest-valid-label "." ; 126
                   longest-valid-label "." ; 189
                   longest-valid-label "." ; 252
                   "aa")) ; 255 bytes
> (dns-domain? longest-valid-label)

#t

> (dns-domain? longest-valid-domain)

#t

> (dns-domain? (string-append longest-valid-label "a"))

#f

> (dns-domain? (string-append longest-valid-domain "a"))

#f
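
The running byte totals in the comments of the example above can be reproduced with a few lines of arithmetic. This sketch assumes, as those comments do, that each label costs its own length plus one length-header byte; domain-byte-count is a hypothetical helper, not how Splitflap performs the check.

```racket
#lang racket/base
(require racket/string)

;; Hypothetical helper: sum of (label length + 1 header byte) over all
;; dot-separated labels in the domain.
(define (domain-byte-count domain)
  (for/sum ([label (in-list (string-split domain "."))])
    (add1 (string-length label))))

(define label62 (make-string 62 #\a))

(domain-byte-count
 (string-join (list label62 label62 label62 label62 "aa") "."))
;; => 255
(domain-byte-count "rclib.org") ; => 10
```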

procedure

(valid-url-string? v) → boolean?

  v : any/c
Returns #t if v is a “valid URL” for use in feeds. For this library’s purposes, a valid URL is one which, when parsed with string->url, includes a valid scheme part (e.g. "http://"), and in which the host is a dns-domain? (and not, say, an IP address).

Examples:
> (valid-url-string? "http://rclib.example.com")

#t

> (valid-url-string? "telnet://rclib.example.com")

#t

> (valid-url-string? "gonzo://example.com") ; scheme need not be registered

#t

> (valid-url-string? "https://user:p@example.com:8080") ; includes user/password/port

#t

> (valid-url-string? "file://C:\\home\\user?q=me") ; Look, you do you

#t

; Valid URIs but not URLs:
> (valid-url-string? "news:comp.servers.unix") ; no host given, only path

#f

> (valid-url-string? "http://subdomain-.example.com") ; invalid label

#f

; Valid URLs but not allowed by this library for use in feeds
> (valid-url-string? "ldap://[2001:db8::7]/c=GB?objectClass?one") ; Host is not a DNS domain

#f

> (valid-url-string? "telnet://192.0.2.16:80/") ; ditto

#f

procedure

(email-address? v) → boolean?

  v : any/c
Returns #t if v is a valid email address according to what is essentially a common-sense subset of RFC 5322:

Examples:
> (email-address? "test-email.with+symbol@example.com")

#t

> (email-address? "#!$%&'*+-/=?^_{}|~@example.com")

#t

; See also dns-domain? which applies to everything after the @ sign
> (email-address? "email@123.123.123.123")

#f

> (email-address? "λ@example.com")

#f

procedure

(validate-email-address addr) → string?

  addr : string?
Returns addr if it is a valid email address (according to the same rules as for email-address?); otherwise, an exception is raised whose message explains the reason the address is invalid.

Examples:
> (validate-email-address "marian@rclib.example.com")

"marian@rclib.example.com"

> (validate-email-address "@")

validate-email-address: domain is missing

  domain: ""

  in: "@"

> (validate-email-address "me@myself@example.com")

validate-email-address: address must not contain more than one @ sign

  address: "me@myself@example.com"

  in: "me@myself@example.com"

> (validate-email-address ".marian@rclib.example.com")

validate-email-address: local part must not start with a period

  local part: ".marian"

  in: ".marian@rclib.example.com"

> (validate-email-address "λ@example.com")

validate-email-address: local part may only include a–z, A–Z, 0–9, or !#$%&'*+/=?^_`{|}~-.

  local part: "λ"

  in: "λ@example.com"

> (validate-email-address "lambda@1.example.com")

validate-email-address: domain must be a valid RFC 1035 domain name

  domain: "1.example.com"

  in: "lambda@1.example.com"
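
When validating user-supplied input, it can be handy to capture the explanation rather than let the exception propagate. The sketch below assumes the splitflap package is installed and that the raised exception satisfies exn:fail?, which this page does not state explicitly; email-problem is a hypothetical helper.

```racket
#lang racket/base
(require splitflap)

;; email-problem is a hypothetical helper. It returns #f for a valid
;; address, or the exception message describing what is wrong.
;; Assumption: the exception raised satisfies exn:fail?.
(define (email-problem addr)
  (with-handlers ([exn:fail? exn-message])
    (validate-email-address addr)
    #f))

(email-problem "marian@rclib.example.com") ; => #f
(email-problem "me@myself@example.com")    ; a string explaining the problem
```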

3.6 Language codes

value

system-language

A promise that, when forced, yields a two-letter symbol corresponding to the default language in use for the current user account/system. On Unix and Mac OS, the first two characters of the value returned by system-language+country are used. On Windows, the first two characters of the value in the registry key HKEY_CURRENT_USER\Control Panel\International\LocaleName are used. If the system language cannot be determined, an exception is raised the first time the promise is forced.

procedure

(iso-639-language-code? v) → boolean?

  v : any/c
Returns #t if v is a two-character lowercase symbol matching a two-letter ISO 639-1 language code.

value

language-codes

A list of symbols that qualify as iso-639-language-code?.
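
A brief sketch of how these two bindings relate, assuming the splitflap package is installed and that 'en (English, a standard ISO 639-1 code) appears in language-codes:

```racket
#lang racket/base
(require splitflap)

(iso-639-language-code? 'en)       ; => #t
(iso-639-language-code? "en")      ; => #f, a string rather than a symbol
(and (memq 'en language-codes) #t) ; => #t
```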