The reader is a fundamental part of a Lisp. It’s the function that
parses the textual representation of Lisp objects. Once a program is read, it
is fed to eval, which evaluates it.
Let’s take a look at the Emacs reader, which lives in src/lread.c in
the read1 function. The first syntaxes are pretty standard but it gets
weirder.
If you see something wrong or know something I don’t, feel free to leave a comment or contact me.
Comment
There are only end-of-line comments in Elisp as far as I know.
; this is a comment
Number
Numbers can be read in binary, octal, decimal and hexadecimal.
#b101010 #o52 42 #x2a
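As a quick check, here is a small sketch that hands each spelling to read as a string. It assumes an interactive session (M-: or ielm) and relies only on read accepting a string as its input.

```elisp
;; Each string holds the same number written in a different radix.
(mapcar #'read '("#b101010" "#o52" "42" "#x2a"))
;; => (42 42 42 42)
```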
Character
Returns the integer corresponding to a character in Emacs internal representation. Problematic characters can be escaped.
?c
?\n ;; newline
?\  ;; space
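To see the integers behind these, one option is to read the syntax from strings, as sketched below. The backslashes are doubled only because the character syntax sits inside a Lisp string here.

```elisp
;; "?\\n" is the two-character escape ?\n, and "?\\ " is backslash-space.
(list (read "?c") (read "?\\n") (read "?\\ "))
;; => (99 10 32)
```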
String
"foo bar"
"foo\"bar"
Symbol
Any printable character other than whitespace and the following characters can be used in a symbol:
"';()[]`,
Cons
A cons is a cell with a head (aka car) and a tail (aka cdr).
(a . b)
List
What would be Lisp without lists?
(a b c)
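The two syntaxes are closely related: a list is a chain of conses ending in nil. The sketch below checks this by reading both spellings and comparing the results.

```elisp
;; The dotted form spells out every cons cell of the list (a b c).
(equal (read "(a . (b . (c . nil)))") (read "(a b c)"))
;; => t

;; car and cdr take a single cell apart again.
(let ((cell (read "(a . b)")))
  (list (car cell) (cdr cell)))
;; => (a b)
```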
Vector
Vectors are fixed-size arrays with O(1) lookup.
[a b c]
Hashtable
Hash tables are fast lookup tables (O(1) in the best case, O(n) in the worst). The data type has been around for some time, but the reader syntax is a recent addition in Emacs 23.
The first element is the hash-table symbol. The rest is a
plist. There are more possible keys but most of them are optional.
#s(hash-table size 2 test equal data (k1 v1 k2 v2))
#s(hash-table data (k1 v1))
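A minimal sketch of what comes out of the reader, assuming an Emacs recent enough to support this syntax. The keys k1 and k2 are read as plain symbols, so they are looked up with quoted symbols.

```elisp
(let ((table (read "#s(hash-table test equal data (k1 v1 k2 v2))")))
  (list (hash-table-p table)        ; it really is a hash table
        (gethash 'k1 table)         ; value stored under k1
        (hash-table-count table)))  ; number of entries
;; => (t v1 2)
```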
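Char-table
Some sort of vector indexed by character codes. #^[ starts a char-table and #^^[ a sub-char-table. The forms below only illustrate the prefixes; a real char-table has many more slots.
#^[a b c]
#^^[a b c]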
Bool-vector
The code mentions boolean vectors: arrays whose elements are all t or nil. The syntax is #& followed by the length and a string whose characters hold the bits.
#&3"\5"
Compiled functions
Read a vector and make it bytecode.
#[1 2 3]
String property list
String + metadata. Used for syntax highlighting.
#("aaa" 0 1 (k1 v1 k2 v2))
#("aaa")
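The numbers are the start and end of the range the property list applies to. As a rough illustration, the sketch below reads such a string and looks up a property inside and outside that range; k1 and v1 are just the placeholder names from the example above.

```elisp
(let ((s (read "#(\"aaa\" 0 1 (k1 v1 k2 v2))")))
  (list (get-text-property 0 'k1 s)    ; position 0 is inside the range 0..1
        (get-text-property 1 'k1 s)))  ; position 1 is outside it
;; => (v1 nil)
```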
Skip characters
From src/lread.c:
#@NUMBER is used to skip NUMBER following characters. That's used in .elc files to skip over doc strings and function definitions.
Unix shebang
Although it’s intended for skipping the first line of a Unix script, there is no check on the position.
(read "#!dfj 1 {z [}]\nafter-newline") => after-newline
Filename
Return the current file name as a string when reading a file, nil
otherwise.
#$
Quote
Wraps the following form in a two-element list whose head is a special symbol (quote, function, ` or ,). Useful for macros.
(read "'a") => (quote a)
(read "#'a") => (function a)
(read "`(cake ,foo)") => (\` (cake (\, foo)))
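Since recent Emacs versions may print (quote a) back as 'a, comparing against explicitly built lists is a more reliable way to see what the reader produced. A small sketch:

```elisp
;; The shorthand and the explicit list are the same object structure.
(equal (read "'a") (list 'quote 'a))
;; => t
(equal (read "#'car") (list 'function 'car))
;; => t

;; Evaluating a quoted form just hands back the form itself.
(eval (read "'(+ 1 2)"))
;; => (+ 1 2)
```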
Uninterned symbol
Every symbol read by the reader is passed to the intern function
which among other things stores it in the obarray variable. intern
is also callable from Lisp. Pass it a string and it will dynamically
create/get the symbol you asked for. Similarly, you can make an
uninterned symbol with the make-symbol function.
There is a special reader syntax to create a symbol without interning
it. Since such a symbol is not in the obarray, the only way to access it
is through a direct reference to the symbol itself. This is sometimes
used to hide things from the global namespace (i.e. the obarray).
#:foo ;; is the uninterned symbol named foo.
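The difference shows up when comparing with eq. The sketch below reads the name foo both ways; it assumes nothing beyond read, eq and symbol-name.

```elisp
(let ((interned (read "foo"))
      (uninterned (read "#:foo")))
  (list (eq interned 'foo)                ; the very symbol stored in the obarray
        (eq uninterned 'foo)              ; a different symbol with the same name
        (symbol-name uninterned)
        (eq uninterned (read "#:foo"))))  ; each read makes a fresh symbol
;; => (t nil "foo" nil)
```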
Thanks to chad for the explanation.
Empty symbol
Returns –you guessed it– the empty symbol which fortunately has the same print syntax.
(read "##") => ##
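The empty symbol is the interned symbol whose name is the empty string, which can be checked with a short sketch like this one:

```elisp
(let ((empty (read "##")))
  (list (symbol-name empty)        ; its name has no characters
        (eq empty (intern ""))))   ; same object intern returns for ""
;; => ("" t)
```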
Reusable numbered forms
A way to make self-referencing forms.
From the Emacs Lisp manual:
To represent shared or circular structures within a complex of Lisp
objects, you can use the reader constructs ‘#n=’ and ‘#n#’.
Use #n= before an object to label it for later reference;
subsequently, you can use #n# to refer to the same object in another
place. Here, n is some integer. For example, here is how to make a
list in which the first element recurs as the third element:
(#1=(a) b #1#)
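To confirm that the first and third elements are one shared object rather than two equal copies, a sketch along these lines compares them with eq. It assumes read-circle is left at its default value.

```elisp
(let ((shared (read "(#1=(a) b #1#)"))
      (copied (read "((a) b (a))")))
  (list (eq (nth 0 shared) (nth 2 shared))    ; same cons cell
        (eq (nth 0 copied) (nth 2 copied))))  ; equal, but distinct cells
;; => (t nil)
```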