Custom inlined CSS in Org-mode HTML export

When you export an Org-mode document to HTML the default CSS style is:

  • inlined in the file, which is rather handy: a single file for your whole document.
  • not exactly pretty, but you can change this.

So I naturally looked at the Org-mode documentation on customizing the CSS, only to find that the simplest and recommended way of doing it is to add a special keyword at the top of your document:

#+HTML_HEAD: <link rel="stylesheet" type="text/css" href="style1.css" />

Which means that I won’t have a single file anymore. Plus I have to add this line each time I want custom CSS. Not cool.

What I want is to change the default inlined style for every document I export to HTML. And also a nicer way to set a new style for a single document.

The documentation does mention the org-html-style-default and org-html-head-include-default-style variables, so let’s have a look at org-mode/lisp/ox-html.el.

Note: I’m using a recent version of Org-mode which has a new parse/export system. If you’re using a version ≥8 you should be fine.

The docstring of org-html-style-default reads:

The default style specification for exported HTML files. You can use `org-html-head’ and `org-html-head-extra’ to add to this style. If you don’t want to include this default style, customize `org-html-head-include-default-style’.

We just have to set org-html-head-include-default-style to nil and place our own style in org-html-head. I’ve added a function to org-export-before-processing-hook to set up these variables before exporting.

Here’s what I put in my init file:

(defun my-org-inline-css-hook (exporter)
  "Insert custom inline css"
  (when (eq exporter 'html)
    (let* ((dir (ignore-errors (file-name-directory (buffer-file-name))))
           (path (concat dir "style.css"))
           (homestyle (or (null dir) (null (file-exists-p path))))
           (final (if homestyle "~/.emacs.d/org-style.css" path)))
      (setq org-html-head-include-default-style nil)
      (setq org-html-head (concat
                           "<style type=\"text/css\">\n"
                           "<!--/*--><![CDATA[/*><!--*/\n"
                           (with-temp-buffer
                             (insert-file-contents final)
                             (buffer-string))
                           "/*]]>*/-->\n"
                           "</style>\n")))))

(eval-after-load 'ox
  '(progn
     (add-hook 'org-export-before-processing-hook 'my-org-inline-css-hook)))

I’ve settled on inlining the content of style.css if that file exists in the same directory as my document, and a default style from ~/.emacs.d/org-style.css otherwise.

In hindsight I think I should make the 2 variables buffer-local before setting them but it works well like this.
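
Here is a minimal sketch of that buffer-local variant. It assumes the hook runs in the buffer being exported, which is how I understand the new exporter to behave. The names my-org-css-as-style-tag and my-org-inline-css-local-hook are made up for this example.

```elisp
(defun my-org-css-as-style-tag (file)
  "Return the contents of FILE wrapped in an HTML <style> element."
  (concat "<style type=\"text/css\">\n"
          "<!--/*--><![CDATA[/*><!--*/\n"
          (with-temp-buffer
            (insert-file-contents file)
            (buffer-string))
          "/*]]>*/-->\n"
          "</style>\n"))

(defun my-org-inline-css-local-hook (exporter)
  "Like `my-org-inline-css-hook', but set the variables buffer-locally."
  (when (eq exporter 'html)
    (let* ((dir (ignore-errors (file-name-directory (buffer-file-name))))
           (path (concat dir "style.css"))
           ;; Prefer style.css next to the document, else the default.
           (final (if (and dir (file-exists-p path))
                      path
                    "~/.emacs.d/org-style.css")))
      ;; `setq-local' only touches the current buffer, so the global
      ;; values stay as they were for everything else.
      (setq-local org-html-head-include-default-style nil)
      (setq-local org-html-head (my-org-css-as-style-tag final)))))
```

To try it, register my-org-inline-css-local-hook on org-export-before-processing-hook instead of the original function.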

Raw strings

Six months ago, back when I was reading the C source of the Emacs reader, I tried to implement raw strings in Emacs. This post was supposed to be published earlier, but I had a lot of work in between, I’m still not very comfortable writing in English, and I had a hosting problem. Anyway, here it is.

A raw string is just a special syntax for a string literal where the content is interpreted literally (especially the character \) i.e. nothing can be escaped or interpolated. Several programming languages handle them e.g.:

Python: r"aa\naa"    r"""aa\n"aa"""
Perl:   'aa\naa'     q{aa\n'aa}
C++11:  R"(aa\naa)"  R"foo(aa\n)aa)foo"

It’s very useful for regexes because every time you need to match a character that also happens to be a meta-character (like + or \) you have to escape it. And since the regex is written in a string literal you have to escape the escape character because they both use \ as the escape character. This process can be painful and error-prone. Google backslash hell or backslashitis for some examples.
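
To make the problem concrete in Emacs Lisp, here is a small illustration using only built-in functions. The sample strings are arbitrary.

```elisp
;; The regexp \\ (two characters) matches one literal backslash.
;; Inside a Lisp string literal every backslash is doubled again,
;; so we end up typing four of them.
(string-match "\\\\" "a\\b")   ; → 1, the index of the backslash in a\b

;; Same story for a literal +: the regexp \+ is typed as "\\+".
(string-match "\\+" "1+1")     ; → 1

;; `regexp-quote' does the regexp-level escaping for you, but the
;; printed result shows how quickly the backslashes pile up.
(regexp-quote "a+b\\c")        ; → "a\\+b\\\\c"
```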

Back to Emacs. I actually wrote a working proof of concept in the form of 2 patches to the reader function:

  • Triple-quoted strings (à la Python) (diff)
  • Custom-delimiter strings (à la Perl/sed) (diff)

The code is not very clean and may be buggy since most of it comes from the regular string syntax code but it works:

# Python
$ ./emacs -Q -batch --eval '(message #r"""ha"\nha""")'
ha"\nha

# Perl
$ ./emacs -Q -batch --eval '(message #r,ha"\nha,)'
ha"\nha
$ ./emacs -Q -batch --eval '(message #r~ha"\nha~)'
ha"\nha

Although the reader works, some minor parts of Emacs are broken in the presence of raw strings (sexp navigation, font-locking, C-x C-e, …). These other parts of the environment need to be aware of the new syntax and shouldn’t be too hard to fix.

At this point I posted my results to the emacs-devel mailing-list, which led to an interesting discussion. There was no clear consensus but I think most people realized that raw strings are not a satisfying solution to the regex problem. Some would rather have a way to write custom syntax readers in Lisp, which is nice but hard to implement. Others said you’re better off using rx.

rx is a macro that lets you write readable regexes in the form of s-expressions:

(rx (+ "abc") "foo" (group (or "zob" "foo")))
=> "\\(?:abc\\)+foo\\(\\(?:foo\\|zob\\)\\)"

I personally think raw strings have their use outside of regexes and would be a nice addition to the Emacs Lisp language. As for regexes, I now write mine with rx all the time. I just wish there was a built-in way to use rx in interactive search/replace functions. I will work on this eventually if someone hasn’t done this already.
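
As an illustration of rx in ordinary code, here is a small sketch. The function my-parse-range and its input format are invented for the example.

```elisp
(defun my-parse-range (s)
  "Return (START END) as integers if S contains \"N-M\", else nil."
  ;; The rx form expands to a plain regexp string at macro-expansion
  ;; time, so it can be passed straight to `string-match'.
  (when (string-match (rx (group (+ digit)) "-" (group (+ digit))) s)
    (list (string-to-number (match-string 1 s))
          (string-to-number (match-string 2 s)))))

(my-parse-range "lines 10-42")  ; → (10 42)
(my-parse-range "no range")     ; → nil
```

No backslashes to count, and the two capture groups are easy to spot.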

That’s all for today.

Compilation in Emacs

When people new to Emacs ask how to get line numbers in a buffer, they usually want them to find the position of a compilation error. This is not the right way to do it.

Emacs has a built-in compilation mode, defined in compile.el. It uses a very generic text-based approach.

This is a guide for new Emacs users. The more adventurous/experienced ones can scroll to the end for a tutorial on how to add new rules to the error parser.

Usage

  • Call M-x compile.
  • It prompts you for a command to run the compilation. Type your command and RET.
  • Emacs runs the compilation command, parses its output and splits the frame to pop up the result buffer.

If there are errors, there are several ways to get to them.

If you press C-x ` (next-error), Emacs jumps to the source location of the next error. You don’t have to look up the line or anything.

Alternatively, if you switch to the result buffer (called *compilation*), you will see you can hit RET on an error line to jump to it.

Once you have run M-x compile at least once, you can use M-x recompile to compile with the same command. You can call compile again if you want to change the command.

Also, like most prompts in Emacs you can cycle through your input history with M-n and M-p.
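
The command offered at the prompt comes from the compile-command variable, so you can pre-fill it. Here is a rough sketch for C buffers. The helper my-c-compile-command and the gcc command line are assumptions for the example, so adapt them to your own build.

```elisp
(require 'compile)

(defun my-c-compile-command (file)
  "Return a hypothetical gcc command line for the C source FILE."
  (concat "gcc -Wall -o "
          (shell-quote-argument (file-name-sans-extension file))
          " "
          (shell-quote-argument file)))

;; Pre-fill the `M-x compile' prompt in C buffers that visit a file.
(add-hook 'c-mode-hook
          (lambda ()
            (when buffer-file-name
              (setq-local compile-command
                          (my-c-compile-command
                           (file-name-nondirectory buffer-file-name))))))
```

With this in place, M-x compile in foo.c should offer a gcc command for that file instead of the default make command.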

Useful keybindings

While in the compilation buffer:

  • g – recompile
  • q – kill the window
  • TAB or M-n – move cursor to the next error
  • S-TAB (shift tab or backtab) or M-p – move cursor to the previous error.
  • C-c C-k – kill the compilation process

Common customization

If you are used to a single keypress to compile you can bind recompile to the F9 key for example.

;; eval this or place it in your .emacs
(global-set-key (kbd "<f9>") 'recompile)

How does it work?

The variable compilation-error-regexp-alist is a list of rules to parse error messages (to extract the filename, line, etc).

Each element of this list is either a symbol or another list.

  • When it’s a symbol, it’s looked up in compilation-error-regexp-alist-alist which contains the actual rule. The symbol name is generally used when the rule matches the output of a specific program.
  • When it’s a list, it’s the rule. Use this if your rule is generic or if you’re lazy.

A basic rule is a list of the form:

(REGEXP FILE [LINE COLUMN])

  • REGEXP is a regular expression string which is matched against the output.
  • the remaining elements are integers which correspond to the capture group numbers in the regex. You must at least provide the filename, the rest is optional.

There are other useful fields you can provide.

You can look up the variable documentation for more information (C-h v compilation-error-regexp-alist RET).

Add a new rule

If you compile something which Emacs can’t parse properly you can add your own rule. Let’s say an error in the compile command output looks like this:

Error in the file foobar.c at the line 42

We can extract the file and the line number with the following regex:

"^Error in the file \\(.+\\) at the line \\([0-9]+\\)$"

The filename is the first capture group and the line number is the second. Thus, the final rule is:

("^Error in the file \\(.+\\) at the line \\([0-9]+\\)$" 1 2)
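
Before registering the rule, it can be worth checking the regex against a sample line. A quick sketch, where my-parse-error-line is just a throwaway helper name:

```elisp
(defun my-parse-error-line (line)
  "Return (FILE LINE-NUMBER) if LINE matches the rule's regex, else nil."
  (when (string-match
         "^Error in the file \\(.+\\) at the line \\([0-9]+\\)$" line)
    ;; Group 1 is the filename and group 2 the line number, matching
    ;; the 1 and 2 used in the rule.
    (list (match-string 1 line)
          (string-to-number (match-string 2 line)))))

(my-parse-error-line "Error in the file foobar.c at the line 42")
;; → ("foobar.c" 42)
```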

To use this new rule you can eval this in the *scratch* buffer or in M-:

(add-to-list
  'compilation-error-regexp-alist
  '("^Error in the file \\(.+\\) at the line \\([0-9]+\\)$" 1 2))

Write it to your .emacs file to make it permanent.
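
The same rule can also be registered under a symbol, which is the first form described in the previous section. In this sketch my-tool is a made-up rule name.

```elisp
(require 'compile)

;; Store the actual rule under a name...
(add-to-list 'compilation-error-regexp-alist-alist
             '(my-tool
               "^Error in the file \\(.+\\) at the line \\([0-9]+\\)$"
               1 2))

;; ...then enable it by adding just the name to the list of rules.
(add-to-list 'compilation-error-regexp-alist 'my-tool)
```

The named form makes it easy to turn the rule off again: remove the symbol from compilation-error-regexp-alist and the definition stays available.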

If you add a rule for a tool that could be useful for other people don’t hesitate to send it to the Emacs mailing-list!

A look at Emacs Lisp reader

The reader is a fundamental part of a Lisp. It’s the function that parses the textual representation of Lisp objects. Once a program is read, it is fed to eval which evaluates it.

Let’s take a look at the Emacs reader, which lives in src/lread.c in the read1 function. The first syntaxes are pretty standard but it gets weirder.

If you see something wrong or know something I don’t, feel free to leave a comment or contact me.

Comment

There are only end-of-line comments in Elisp as far as I know.

; this is a comment

Number

Numbers can be read in binary, octal, decimal and hexadecimal.

#b101010
#o52
42
#x2a
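
A quick check that these are all the same integer. The last form uses the general #RADIXr syntax, which is not in the list above.

```elisp
;; Every literal below reads as the integer 42.
;; #24r1i is base 24: 1*24 + 18 (the digit i).
(list #b101010 #o52 42 #x2a #24r1i)
;; → (42 42 42 42 42)
```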

Character

Returns the integer corresponding to a character in Emacs internal representation. Problematic characters can be escaped.

?c
?\n ;; newline
?\  ;; space
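
Since characters are plain integers, you can evaluate them to see the codes. A small sketch, using ?\s as an alternative spelling of the space character:

```elisp
(list ?c ?\n ?\s)      ; → (99 10 32)
(eq ?A 65)             ; → t, a character is just its code
(char-to-string ?c)    ; → "c"
```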

String

"foo bar"
"foo\"bar"

Symbol

Any printable character can be used in a symbol, except whitespace and these characters:

"';()[]`,

Cons

A cons is a cell with a head (aka car) and a tail (aka cdr).

(a . b)

List

What would be Lisp without lists?

(a b c)
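
The list syntax is shorthand for a chain of conses ending in nil, which is easy to verify:

```elisp
;; The dotted form spells out the cons cells behind (a b c).
(equal '(a . (b . (c . nil))) '(a b c))  ; → t

(car '(a b c))    ; → a
(cdr '(a b c))    ; → (b c)
(cons 'a '(b c))  ; → (a b c)
```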

Vector

Vectors are fixed-size arrays with O(1) lookup.

[a b c]

Hashtable

Hash tables are fast lookup tables (best case O(1), worst case O(n)). The data type has been around for some time but the reader syntax is a recent addition in Emacs 23.

The first element is the hash-table symbol. The rest is a plist. There are more possible keys but most of them are optional.

#s(hash-table size 2 test equal data (k1 v1 k2 v2))
#s(hash-table data (k1 v1))
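
A short sketch showing that the reader really hands back a usable table. The keys and values are the placeholder ones from above.

```elisp
(let ((table (read "#s(hash-table test equal data (k1 v1 k2 v2))")))
  (list (hash-table-p table)
        (hash-table-count table)
        (gethash 'k1 table)
        ;; The third argument is returned when the key is absent.
        (gethash 'missing table 'default)))
;; → (t 2 v1 default)
```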

Char-table?

Some sort of vector…

#^[a b c]
#^^[a b c]

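Bool-vector

This is the read syntax for bool-vectors, arrays whose elements are all t or nil, stored as bits. The length in bits comes right after #&, followed by a string that holds the bits.

#&3"\7"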
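
A small round-trip sketch for bool-vectors. It lets the printer produce the #& form instead of typing it by hand.

```elisp
(let* ((bv (make-bool-vector 3 t))      ; three bits, all set
       (text (prin1-to-string bv))      ; printed form starts with #&3
       (copy (read text)))              ; read it back
  (list (string-prefix-p "#&3" text)
        (bool-vector-p copy)
        (length copy)
        (aref copy 0)
        (equal bv copy)))
;; → (t t 3 t t)
```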

Compiled functions

Read a vector and make it bytecode.

#[1 2 3]

String property list

String + metadata. Used for syntax highlighting.

#("aaa" 0 1 (k1 v1 k2 v2))
#("aaa")
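
The numbers are a start and end position, and the plist applies to that range. A sketch with a face property, chosen only as an example:

```elisp
(let ((s (read "#(\"aaa\" 0 1 (face bold))")))
  (list (get-text-property 0 'face s)    ; inside the range 0..1
        (get-text-property 1 'face s)    ; outside of it
        (substring-no-properties s)))    ; the bare string
;; → (bold nil "aaa")
```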

Skip characters

From src/lread.c:

#@NUMBER is used to skip NUMBER following characters.
That's used in .elc files to skip over doc strings
and function definitions.

Unix shebang

Although it’s intended for skipping the first line of a Unix script, there is no check on the position.

(read "#!dfj 1 {z [}]\nafter-newline") => after-newline

Filename

Return the current file name as a string when reading a file, nil otherwise.

#$

Quote

Wraps the following form in a list headed by quote, function, etc. Useful for macros.

(read "'a")  => (quote a)
(read "#'a") => (function a)
(read "`(cake ,foo)") => (\` (cake (\, foo)))

Uninterned symbol

Every symbol read by the reader is passed to the intern function which among other things stores it in the obarray variable. intern is also callable from Lisp. Pass it a string and it will dynamically create/get the symbol you asked for. Similarly, you can make an uninterned symbol with the make-symbol function.

There is a special reader syntax to create a symbol without interning it. To access it, you need a reference of some sort to the uninterned symbol directly. This is sometimes used to hide things from the global namespace (i.e. the obarray).

#:foo ;; is the uninterned symbol named foo.
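
A quick way to see the difference between interned and uninterned symbols:

```elisp
(list (eq 'foo (intern "foo"))            ; same interned symbol
      (eq 'foo (make-symbol "foo"))       ; fresh symbol, not in the obarray
      (eq (read "#:foo") (read "#:foo"))  ; each read makes a new symbol
      (symbol-name (read "#:foo")))       ; the name is still "foo"
;; → (t nil nil "foo")
```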

Thanks to chad for the explanation.

Empty symbol

Returns –you guessed it– the empty symbol which fortunately has the same print syntax.

(read "##") => ##

Reusable numbered forms

A way to make self-referencing forms.

From the Emacs Lisp manual:

To represent shared or circular structures within a complex of Lisp
objects, you can use the reader constructs ‘#n=’ and ‘#n#’.

Use #n= before an object to label it for later reference;
subsequently, you can use #n# to refer the same object in another
place. Here, n is some integer. For example, here is how to make a
list in which the first element recurs as the third element:

     (#1=(a) b #1#)
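
To see that the labelled form produces one shared object and not two equal copies, compare it with the plain version:

```elisp
(let ((shared (read "(#1=(a) b #1#)"))
      (plain  (read "((a) b (a))")))
  (list (eq (nth 0 shared) (nth 2 shared))  ; one and the same cons
        (eq (nth 0 plain) (nth 2 plain))    ; two separate conses
        (equal shared plain)))              ; same structure either way
;; → (t nil t)
```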