3 Feed Constructs
(require splitflap/constructs)    package: splitflap-lib
The format of feeds is specified by the Atom 1.0 and RSS 2.0 specifications (and, for all practical purposes, by Apple’s Podcast feed requirements in the case of podcasts). These in turn reference other RFCs to specify the format of many individual elements: timestamps, domain names, email addresses, people, identifiers, and languages.
Splitflap makes heavy use of custom contracts to ensure conformity to the spec at every level. In cases where it makes things simpler, Splitflap is a bit more strict than the actual spec.
The bindings documented in this section are provided by the main splitflap module as well as by splitflap/constructs.
3.1 Tag URIs
Feeds, and items contained in feeds, require some globally unique identifier. Although any kind of reasonably unique identifier can be used in a feed, Splitflap takes the unreasonably opinionated stance of allowing only tag URIs, which are easy to create and read, and which can remain stable even if the resource’s URL changes.
A tag URI is an identifier of the form tag:‹authority›,‹date›:‹specific›. The ‹authority› is a domain name (or email address) held by you as of ‹date›; together, the authority and the date form a unique tagging entity, which acts kind of like a namespace. The ‹specific› is a string uniquely identifying a particular resource within the tagging entity.
The tag URI scheme is formalized in RFC 4151.
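As a minimal sketch of the tag:‹authority›,‹date›:‹specific› layout described above, the following base-Racket code assembles such a string and splits it back into its three parts. The names sketch-tag-string and sketch-split-tag are hypothetical and are not part of Splitflap; no validation of the parts is attempted here.

```racket
#lang racket/base

;; Hypothetical helper: join the three parts into a tag URI string.
(define (sketch-tag-string authority date specific)
  (format "tag:~a,~a:~a" authority date specific))

;; Hypothetical helper: split a tag URI string into
;; (list authority date specific), or return #f if the shape is wrong.
;; The authority ends at the first comma, the date at the next colon.
(define (sketch-split-tag str)
  (define m (regexp-match #rx"^tag:([^,]+),([^:]+):(.*)$" str))
  (and m (cdr m)))

(sketch-split-tag (sketch-tag-string "rclib.example.com" "2012-04-01" "Marian'sBlog"))
;; → '("rclib.example.com" "2012-04-01" "Marian'sBlog")
```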
procedure
(mint-tag-uri authority date specific) → tag-uri?
authority : (or/c dns-domain? email-address?) date : tag-entity-date? specific : tag-specific-string?
The date must be any date on which you had ownership or assignment of the domain or email address at 00:00 UTC (the start of the day). (See tag-entity-date?.)
The specific is a string that must be reliably and permanently unique within the set of things that your feed is serving. See tag-specific-string? for information about what characters are allowed here.
> (mint-tag-uri "rclib.example.com" "2012-04-01" "Marian'sBlog") #<tag-uri "tag:rclib.example.com,2012-04-01:Marian'sBlog">
> (mint-tag-uri "diveintomark.example.com" "2003" "3.2397") #<tag-uri "tag:diveintomark.example.com,2003:3.2397">
procedure
(tag-uri->string tag) → non-empty-string?
tag : tag-uri?
> (define rclib-id (mint-tag-uri "rclib.example.com" "2012-04-01" "Marian'sBlog")) > (tag-uri->string rclib-id) "tag:rclib.example.com,2012-04-01:Marian'sBlog"
procedure
(append-specific tag suffix) → tag-uri?
tag : tag-uri? suffix : tag-specific-string?
> (define kottke-id (mint-tag-uri "kottke.example.com" "2005-12" "1")) > kottke-id #<tag-uri "tag:kottke.example.com,2005-12:1">
> (append-specific kottke-id "post-slug") #<tag-uri "tag:kottke.example.com,2005-12:1.post-slug">
The tag URI spec defines tags as being equal when their byte-strings are indistinguishable.
Returns #t if the tag-uri->string representations of tag1 and tag2 are equal?, #f otherwise.
procedure
(tag-entity-date? str) → boolean?
str : string?
; Equivalent to January 1, 2012 > (tag-entity-date? "2012") #t
; Equivalent to June 1, 2012 > (tag-entity-date? "2012-06") #t
; take a guess on this one > (tag-entity-date? "2012-10-21") #t
> (tag-entity-date? "2012-1-1") #f
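The three accepted shapes (YYYY, YYYY-MM, YYYY-MM-DD) can be approximated with a regular expression. This is only a sketch with hypothetical names (sketch-entity-date?, sketch-expand-date); it checks the shape of the string and does not verify that the month and day are real calendar values, which the library's predicate may well do.

```racket
#lang racket/base

;; Hypothetical shape check: four-digit year, optional two-digit month,
;; optional two-digit day.
(define (sketch-entity-date? str)
  (and (string? str)
       (regexp-match? #px"^[0-9]{4}(-[0-9]{2}(-[0-9]{2})?)?$" str)))

;; Hypothetical helper: expand a short form to the full date it stands for,
;; so "2012" becomes "2012-01-01" and "2012-06" becomes "2012-06-01".
(define (sketch-expand-date str)
  (case (string-length str)
    [(4) (string-append str "-01-01")]
    [(7) (string-append str "-01")]
    [else str]))
```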
procedure
(tag-specific-string? str) → boolean?
str : string?
> (tag-specific-string? "abcdABCD01923") #t
> (tag-specific-string? "-._~!$&'()*+,;=:@/?") #t
> (tag-specific-string? "") #t
> (tag-specific-string? "^") #f
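The character set exercised by the examples above can be sketched as a single regular expression. The name sketch-specific-string? is hypothetical, and the character class is inferred only from these examples; RFC 4151 is the authority on what is actually permitted.

```racket
#lang racket/base

;; Hypothetical check: letters, digits, and the punctuation shown in the
;; examples above. The empty string is accepted, as in the examples.
(define (sketch-specific-string? str)
  (and (string? str)
       (regexp-match? #rx"^[a-zA-Z0-9._~!$&'()*+,;=:@/?-]*$" str)))
```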
3.2 Persons
procedure
(person name email [url]) → person?
name : non-empty-string? email : email-address? url : (or/c valid-url-string? #f) = #f
The Atom 1.0 and RSS 2.0 specs both have opinions about how people should be referenced in feeds. Atom requires only a name but also allows up to one email address and up to one URI. RSS requires one email address optionally followed by anything. So person requires both a name and an email, and the url is optional.
procedure
(person->xexpr p entity dialect) → txexpr?
p : person? entity : symbol? dialect : (or/c 'rss 'atom 'itunes)
> (define frank (person "Frankincense Pontipee" "frank@example.com")) > (person->xexpr frank 'author 'atom) '(author (name "Frankincense Pontipee") (email "frank@example.com"))
> (person->xexpr frank 'contributor 'atom) '(contributor (name "Frankincense Pontipee") (email "frank@example.com"))
> (person->xexpr frank 'author 'rss) '(author "frank@example.com (Frankincense Pontipee)")
> (person->xexpr frank 'itunes:owner 'itunes)
'(itunes:owner
(itunes:name "Frankincense Pontipee")
(itunes:email "frank@example.com"))
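To make the Atom/RSS difference in this section concrete, here is a rough sketch that builds the two shapes from a name and an email using plain lists. The function sketch-person->xexpr is hypothetical and is not how Splitflap is implemented; it ignores the optional url and the 'itunes dialect.

```racket
#lang racket/base

;; Hypothetical: Atom uses child elements, RSS packs everything into one
;; string of the form "email (name)".
(define (sketch-person->xexpr name email entity dialect)
  (case dialect
    [(atom) (list entity (list 'name name) (list 'email email))]
    [(rss) (list entity (format "~a (~a)" email name))]
    [else (error 'sketch-person->xexpr "unsupported dialect: ~a" dialect)]))

(sketch-person->xexpr "Frankincense Pontipee" "frank@example.com" 'author 'rss)
;; → '(author "frank@example.com (Frankincense Pontipee)")
```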
3.3 Date and time information
Feeds and feed items must be timestamped, and these values must include timezone information.
Splitflap leans on the gregor library for this functionality.
procedure
(infer-moment [str]) → moment?
str : string? = ""
If str is "", then the result of now/moment is returned. Otherwise str must be in the form "YYYY-MM-DD [hh:mm[:ss]]" or an exception is raised. If the seconds are omitted, 00 is assumed, and if the hours and minutes are omitted, 00:00:00 (the very start of the date) is assumed.
> (infer-moment "2012-08-31") #<moment 2012-08-31T00:00:00-05:00[America/Chicago]>
> (infer-moment "2012-08-31 13:34") #<moment 2012-08-31T13:34:00-05:00[America/Chicago]>
> (infer-moment "2015-10-02 01:03:15") #<moment 2015-10-02T01:03:15-05:00[America/Chicago]>
> (parameterize ([current-timezone -14400]) (infer-moment "2015-10-02 01:03:15")) #<moment 2015-10-02T01:03:15-04:00>
> (infer-moment "2012-09-14 12") #<moment 2012-09-14T00:00:00-05:00[America/Chicago]>
> (infer-moment) #<moment 2024-11-12T10:28:41.516962891-06:00[America/Chicago]>
Changed in version 1.2 of package splitflap-lib: Added no-argument form for current moment
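The defaulting rules for omitted time fields can be sketched without gregor by parsing the string into six numbers. The name sketch-parse-timestamp is hypothetical. This sketch is stricter than the examples above suggest the library is: it returns #f for a partial time such as "2012-09-14 12" instead of falling back to midnight, and it does not attach a timezone.

```racket
#lang racket/base

;; "YYYY-MM-DD", optionally followed by " hh:mm" and then ":ss".
(define timestamp-rx
  #px"^([0-9]{4})-([0-9]{2})-([0-9]{2})(?: ([0-9]{2}):([0-9]{2})(?:[:]([0-9]{2}))?)?$")

;; Hypothetical: returns (list year month day hour minute second),
;; with any omitted time field defaulting to 0, or #f on a mismatch.
(define (sketch-parse-timestamp str)
  (define m (regexp-match timestamp-rx str))
  (and m
       (for/list ([part (in-list (cdr m))])
         (if part (string->number part) 0))))

(sketch-parse-timestamp "2012-08-31 13:34")
;; → '(2012 8 31 13 34 0)
```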
procedure
(moment->string m dialect) → non-empty-string?
m : moment? dialect : (or/c 'atom 'rss)
> (define m1 (infer-moment "2012-10-01")) > (moment->string m1 'atom) "2012-10-01T00:00:00-05:00"
> (moment->string m1 'rss) "Mon, 1 Oct 2012 00:00:00 -0500"
> (parameterize ([current-timezone 0]) (moment->string (infer-moment "2012-10-01") 'atom)) "2012-10-01T00:00:00Z"
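The visible difference between the two dialects in these examples is partly in how the UTC offset is written. A small sketch of that one detail, assuming the offset is given in seconds as with current-timezone above; sketch-offset->string is a hypothetical name, and the "+0000" it produces for a zero offset in the 'rss case is an assumption rather than documented library output.

```racket
#lang racket/base

;; Hypothetical: render a UTC offset in seconds the way the examples above
;; show it, "-05:00" (or "Z" for zero) for Atom and "-0500" for RSS.
(define (sketch-offset->string seconds dialect)
  (define sign (if (negative? seconds) "-" "+"))
  (define-values (h m) (quotient/remainder (quotient (abs seconds) 60) 60))
  (define (pad n) (if (< n 10) (format "0~a" n) (number->string n)))
  (case dialect
    [(atom) (if (zero? seconds)
                "Z"
                (string-append sign (pad h) ":" (pad m)))]
    [(rss) (string-append sign (pad h) (pad m))]
    [else (error 'sketch-offset->string "unsupported dialect: ~a" dialect)]))
```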
3.4 Enclosures and MIME types
An enclosure is an arbitrary resource related to a feed item that is potentially large in size and may require special handling. The canonical example is an MP3 file containing the audio for a podcast episode.
struct
(struct enclosure (url mime-type size))
url : valid-url-string? mime-type : (or/c non-empty-string? #f) size : exact-nonnegative-integer?
The mime-type, if provided and not #f, should be a usable MIME type; Splitflap does not currently validate it. The size should be the resource’s size in bytes.
This struct qualifies as food, so it can be converted to XML with express-xml.
procedure
(file->enclosure file base-url) → enclosure?
file : path-string? base-url : valid-url-string?
This procedure accesses the filesystem; if file does not exist, an exception is raised.
; Make a temporary file
> (define audio-file (make-temporary-file "audio-~a.m4a"))
> (display-to-file (make-bytes 100 66) audio-file #:exists 'truncate)
; Pass the temp file to an enclosure
> (display (express-xml (file->enclosure audio-file "http://example.com") 'atom))
<link rel="enclosure" href="http://example.com/audio-17314289211731428921526.m4a" length="100" type="audio/mp4" />
; Cleanup
> (delete-file audio-file)
value
mime-types-by-ext is a promise that, when forced, yields a hash table mapping file extensions (as lowercase symbols) to MIME types.
This table is built directly from the list maintained in the Apache SVN repository.
> (hash-ref (force mime-types-by-ext) 'epub) "application/epub+zip"
procedure
(path/string->mime-type path) → (or/c string? #f)
path : path-string?
> (path/string->mime-type ".m4a") "audio/mp4"
> (path/string->mime-type "SIGIL_v1_21.wad") "application/x-doom"
> (path/string->mime-type "mp3") ; No period, so no file extension! #f
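The lookup behaviour shown above (extension taken after the last period, matched as a lowercase symbol against a lazily built table) can be sketched as follows. Both sketch-mime-types and sketch-path->mime-type are hypothetical names; the three-entry table stands in for the full Apache list, and the sketch accepts strings only, not path values.

```racket
#lang racket/base
(require racket/promise)

;; Hypothetical stand-in table, built only when first forced.
(define sketch-mime-types
  (delay (hash 'm4a "audio/mp4"
               'wad "application/x-doom"
               'epub "application/epub+zip")))

;; Hypothetical lookup: take the text after the last period, lowercase it,
;; and look it up as a symbol. No period means no extension, hence #f.
(define (sketch-path->mime-type str)
  (define m (regexp-match #rx"[.]([^./]+)$" str))
  (and m
       (hash-ref (force sketch-mime-types)
                 (string->symbol (string-downcase (cadr m)))
                 #f)))
```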
3.5 Domains, URLs and email addresses
procedure
(dns-domain? v) → boolean?
v : any/c
Must contain one or more labels separated by .
Each label must consist of only the characters A–Z, a–z, 0–9, or -.
Labels may not start with a digit or a hyphen, and may not end in a hyphen.
No individual label may be longer than 63 bytes (including an extra byte for a length header), and the entire domain may not be longer than 255 bytes.
> (dns-domain? "a") #t
> (dns-domain? "rclib.org") #t
> (dns-domain? "a.b.c.d.e-f") #t
> (dns-domain? "a.b1000.com") #t
> (define longest-valid-label (make-string 62 #\a))
> (define longest-valid-domain
    (string-append longest-valid-label "." ; 63 bytes (including length header)
                   longest-valid-label "." ; 126
                   longest-valid-label "." ; 189
                   longest-valid-label "." ; 252
                   "aa"))                  ; 255 bytes
> (dns-domain? longest-valid-label) #t
> (dns-domain? longest-valid-domain) #t
> (dns-domain? (string-append longest-valid-label "a")) #f
> (dns-domain? (string-append longest-valid-domain "a")) #f
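The rules listed for dns-domain? can be approximated in a few lines of base Racket. This is a sketch under the assumption, taken from the examples above, that each label costs its length plus one header byte; sketch-dns-domain? is a hypothetical name and not the library's implementation.

```racket
#lang racket/base

;; A label starts with a letter, ends with a letter or digit, and may
;; contain hyphens in between.
(define label-rx #px"^[A-Za-z]([A-Za-z0-9-]*[A-Za-z0-9])?$")

;; Hypothetical check: every label is at most 63 bytes and the whole name
;; at most 255 bytes, in both cases counting one extra length-header byte.
(define (sketch-dns-domain? v)
  (and (string? v)
       (<= (+ 1 (string-length v)) 255)
       (for/and ([label (in-list (regexp-split #rx"[.]" v))])
         (and (<= (+ 1 (string-length label)) 63)
              (regexp-match? label-rx label)))))
```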
procedure
(valid-url-string? v) → boolean?
v : any/c
> (valid-url-string? "http://rclib.example.com") #t
> (valid-url-string? "telnet://rclib.example.com") #t
> (valid-url-string? "gonzo://example.com") ; scheme need not be registered #t
> (valid-url-string? "https://user:p@example.com:8080") ; includes user/password/port #t
> (valid-url-string? "file://C:\\home\\user?q=me") ; Look, you do you #t
; Valid URIs but not URLs:
> (valid-url-string? "news:comp.servers.unix") ; no host given, only path #f
> (valid-url-string? "http://subdomain-.example.com") ; invalid label #f
; Valid URLs but not allowed by this library for use in feeds
> (valid-url-string? "ldap://[2001:db8::7]/c=GB?objectClass?one") ; Host is not a DNS domain #f
> (valid-url-string? "telnet://192.0.2.16:80/") ; ditto #f
procedure
(email-address? v) → boolean?
v : any/c
Must be in the format ‹local-part›@‹domain›
The ‹local-part› must be no longer than 65 bytes and may include only a–z, A–Z, 0–9, or characters in the set !#$%&'*+/=?^_`{|}~-. (the trailing period is part of the set).
The ‹domain› must be valid according to dns-domain?.
The entire email address must be no longer than 255 bytes.
> (email-address? "test-email.with+symbol@example.com") #t
> (email-address? "#!$%&'*+-/=?^_{}|~@example.com") #t
; See also dns-domain? which applies to everything after the @ sign
> (email-address? "email@123.123.123.123") #f
> (email-address? "λ@example.com") #f
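Putting the email rules together, a rough sketch might look like the following. All names are hypothetical, the domain check is a simplified version of the dns-domain? rules (label syntax only, no per-label length limit), and the backtick in the local-part character class assumes the RFC 5322 character is what the rule above intends.

```racket
#lang racket/base

;; Characters allowed in the local part, and the shape of a domain label.
(define local-rx #rx"^[a-zA-Z0-9!#$%&'*+/=?^_`{|}~.-]+$")
(define label-rx #px"^[A-Za-z]([A-Za-z0-9-]*[A-Za-z0-9])?$")

;; Simplified domain check: every dot-separated label must match label-rx.
(define (sketch-domain? str)
  (for/and ([label (in-list (regexp-split #rx"[.]" str))])
    (regexp-match? label-rx label)))

;; Hypothetical predicate: exactly one @ sign, a local part of at most
;; 65 bytes that does not start with a period, a valid domain, and at
;; most 255 bytes overall.
(define (sketch-email-address? v)
  (and (string? v)
       (<= (bytes-length (string->bytes/utf-8 v)) 255)
       (let ([parts (regexp-split #rx"@" v)])
         (and (= (length parts) 2)
              (let ([local (car parts)]
                    [domain (cadr parts)])
                (and (<= (string-length local) 65)
                     (regexp-match? local-rx local)
                     (not (regexp-match? #rx"^[.]" local))
                     (sketch-domain? domain)))))))
```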
procedure
(validate-email-address addr) → string?
addr : string?
> (validate-email-address "marian@rclib.example.com") "marian@rclib.example.com"
> (validate-email-address "@")
validate-email-address: domain is missing
  domain: ""
  in: "@"
> (validate-email-address "me@myself@example.com")
validate-email-address: address must not contain more than one @ sign
  address: "me@myself@example.com"
  in: "me@myself@example.com"
> (validate-email-address ".marian@rclib.example.com")
validate-email-address: local part must not start with a period
  local part: ".marian"
  in: ".marian@rclib.example.com"
> (validate-email-address "λ@example.com")
validate-email-address: local part may only include a–z, A–Z, 0–9, or !#$%&'*+/=?^_‘{|}~-.
  local part: "λ"
  in: "λ@example.com"
> (validate-email-address "lambda@1.example.com")
validate-email-address: domain must be a valid RFC 1035 domain name
  domain: "1.example.com"
  in: "lambda@1.example.com"
3.6 Language codes
> (force system-language) 'en
procedure
(iso-639-language-code? v) → boolean?
v : any/c
> (iso-639-language-code? 'fr) #t
> (iso-639-language-code? 'FR) #f
value