
Raw strings

Six months ago, while I was reading the C source of the Emacs reader, I tried to implement raw strings in Emacs. This post was supposed to be published earlier, but I had a lot of work in between, I’m still not very comfortable writing in English, and I had a hosting problem. Anyway, here it is.

A raw string is just a special syntax for a string literal whose content is taken literally (in particular the \ character), i.e. nothing can be escaped or interpolated. Several programming languages support them, e.g.:

Python: r"aa\naa"    r"""aa\n"aa"""
Perl:   'aa\naa'     q{aa\n'aa}
C++11:  R"(aa\naa)"  R"foo(aa\n)aa)foo"

Raw strings are very useful for regexes. Every time you need to match a character that also happens to be a meta-character (like + or \) you have to escape it. And since the regex is written in a string literal, you have to escape the escape character as well, because the regex syntax and the string syntax both use \ for escaping. This process is painful and error-prone. Search for "backslash hell" or "backslashitis" for some examples.
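To make the doubling concrete, here is a minimal sketch in Python, chosen only because it has both regular and raw literals side by side. The sample text and the variable names are made up for illustration.

```python
import re

text = r"a+b\c"   # the five characters: a + b \ c

# Regular literals: every backslash meant for the regex engine must be
# escaped once more for the string syntax.
plus_plain = re.search("a\\+b", text)    # regex seen by re: a\+b
slash_plain = re.search("\\\\", text)    # regex seen by re: \\

# Raw literals: the pattern is written exactly as the regex engine sees it.
plus_raw = re.search(r"a\+b", text)
slash_raw = re.search(r"\\", text)

# Both spellings denote the same pattern and find the same match.
print(plus_plain.span() == plus_raw.span())
print(slash_plain.start() == slash_raw.start())
```

Matching a single literal backslash takes four backslashes in the regular literal and two in the raw one.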

Back to Emacs. I actually wrote a working proof of concept in the form of 2 patches to the reader function:

  • Triple-quoted strings (à la Python) (diff)
  • Custom-delimiter strings (à la Perl/sed) (diff)

The code is not very clean and may be buggy, since most of it is adapted from the regular string syntax code, but it works:

# Python
$ ./emacs -Q -batch --eval '(message #r"""ha"\nha""")'
ha"\nha

# Perl
$ ./emacs -Q -batch --eval '(message #r,ha"\nha,)'
ha"\nha
$ ./emacs -Q -batch --eval '(message #r~ha"\nha~)'
ha"\nha

Although the reader works, some minor parts of Emacs are broken in the presence of raw strings (sexp navigation, font-locking, C-x C-e, …). These parts of the environment need to be made aware of the new syntax, which shouldn’t be too hard to do.

At this point I posted my results to the emacs-devel mailing list, which led to an interesting discussion. There was no clear consensus, but I think most people realized that raw strings are not a satisfying solution to the regex problem. Some would rather have a way to write custom syntax readers in Lisp, which is nice but hard to implement. Others said you’re better off using rx.

rx is a macro that lets you write readable regexes in the form of s-expressions:

(rx (+ "abc") "foo" (group (or "zob" "foo")))
=> "\\(?:abc\\)+foo\\(\\(?:foo\\|zob\\)\\)"

I personally think raw strings have their uses outside of regexes and would be a nice addition to the Emacs Lisp language. As for regexes, I now write mine with rx all the time. I just wish there was a built-in way to use rx in the interactive search/replace functions. I will work on this eventually if someone hasn’t done it already.

That’s all for today.
