John Pearson

Regular Expressions Cookbook by Jan Goyvaerts and Steven Levithan; O'Reilly Media

15 February 2014

Some people, when confronted with a problem, think, “I know, I’ll use regular expressions.” Now they have two problems.

— Old Regex Proverb

Disclaimer: I received a free review copy of this work through the O’Reilly Blogger Review Program.

Among casual programmers, regular expressions maintain something like the status of the old Celtic ogham: forbidding to the uninitiated, yet redolent of secrets and deep magic. Like Perl, which helped popularize them, they are among the clearest examples of “write-only” code — devastatingly effective for those who concoct them, unparseable by everyone else.

But it doesn’t have to be this way. Goyvaerts and Levithan, the minds behind regular-expressions.info, have written a regular expressions cookbook that not only lays out clear recipes for the most common regular expressions tasks, but serves as a concise (and precise) introduction to the features of modern regex across eight different languages.

That’s right. If you are coding in C#, VB.NET, Java, JavaScript, PHP, Perl, Python, Ruby, or a language that shares one of their regex flavors, explicit code is available to help you validate common inputs, parse URLs and markup languages, and search text. As other reviewers have noted, however, if you are using POSIX regex, you will have to muddle through, since this variant is not covered.
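To give a flavor of the "validate common inputs" kind of task, here is a minimal sketch in Python. It is my own illustration, not a recipe from the book: the pattern, the `ISO_DATE` name, and the `parse_iso_date` helper are all hypothetical, and the check is deliberately loose (it does not know that February lacks a 31st).

```python
import re

# Hypothetical pattern for illustration only; not taken from the book.
# re.VERBOSE lets the pattern carry its own comments, and re.ASCII
# restricts \d to the digits 0-9.
ISO_DATE = re.compile(
    r"""
    (?P<year>\d{4})                 # four-digit year
    -
    (?P<month>0[1-9]|1[0-2])        # month, 01 through 12
    -
    (?P<day>0[1-9]|[12]\d|3[01])    # day, 01 through 31 (no per-month check)
    """,
    re.VERBOSE | re.ASCII,
)


def parse_iso_date(text):
    """Return (year, month, day) as ints, or None if text is not YYYY-MM-DD."""
    match = ISO_DATE.fullmatch(text)
    if match is None:
        return None
    return tuple(int(match.group(name)) for name in ("year", "month", "day"))
```

Calling `parse_iso_date("2014-02-15")` yields a tuple of three integers, while a string such as `"2014-13-01"` is rejected with `None`. Writing the pattern in verbose mode with named groups is one way to keep a regex from becoming write-only.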

As someone who had written a handful of regular expressions with great difficulty, I got a lot of benefit from reading chapters 2 and 3, which cover basic regular expressions (a lot more than basic, in fact) and syntax for using regex in various programming languages. This alone is probably worth the price of admission. Some might prefer a gentler introduction, but no real background in regex theory is required (though some is helpful for understanding optimizations suggested in specific recipes), and the straight-ahead problem-solution format will likely be enough to get those with programming experience up to speed.

Chapters 4 through 9 consist of recipes for various common regex tasks, complete with useful variants, illustrative examples, and workarounds for regex dialects lacking some useful modern features (we’re looking at you, JavaScript). As the authors note, this section is best used as reference or sipped occasionally, rather than read straight through.

The authors do have products to flog, but some of them are free, and all of that is harmlessly relegated to the first chapter.

I found the book tremendous both as a self-contained introduction and as a cookbook, though those with very limited programming experience may want to start with an introductory text (the authors’ own introduction at regular-expressions.info is pretty decent). In my experience, careful reading of the early chapters (2 and 3) got me a long way toward being able to “sight-read” the later recipes, which I consider a success. At the very least, I have a solid reference to grab next time I have a problem requiring regular expressions, one that I hope will prevent me from creating that second problem.

Regular Expressions Cookbook, 2nd ed.