Regular Expressions - regex
#"pattern" is the literal representation of a regular expression in Clojure, where
pattern is the regular expression.
(re-pattern pattern) compiles a regex pattern given as a string and returns a java.util.regex.Pattern, the same type that the #"pattern" literal creates.
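A minimal sketch of where re-pattern is useful: building a pattern at runtime from a string. The names word and word-pattern are hypothetical and chosen only for illustration.

```clojure
;; Hypothetical value that is only known at runtime
(def word "cat")

;; re-pattern compiles a string into a java.util.regex.Pattern.
;; Inside an ordinary string the backslashes must be doubled.
(def word-pattern (re-pattern (str "\\b" word "\\b")))

(type word-pattern)                      ;;=> java.util.regex.Pattern

;; \b is a word boundary, so the "cat" inside "concat" is not matched
(re-seq word-pattern "cat concat cat.")  ;;=> ("cat" "cat")
```

When the pattern is known in advance, the #"pattern" literal is usually the simpler choice.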
The regular expression syntax cheatsheet by Mozilla is an excellent reference for regular expression patterns.
Regular expressions overview
Double escaping not required
Clojure's regex literal syntax means you do not need to double escape special characters, e.g. writing \\d for \d, which keeps patterns clean and simple to read. In other languages, and in ordinary Clojure strings, backslashes intended for consumption by the regex compiler must be doubled.
(java.util.regex.Pattern/compile "\\d") ;;=> #"\d"
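As a small sketch of the same point, the literal and the string-compiled pattern can be compared by their pattern text. The names literal-digit and compiled-digit are hypothetical.

```clojure
;; Regex literal: a single backslash is enough
(def literal-digit #"\d")

;; Compiled from a string: the backslash must be doubled
(def compiled-digit (java.util.regex.Pattern/compile "\\d"))

;; Both hold the same pattern text
(= (str literal-digit) (str compiled-digit))  ;;=> true

;; Pattern objects do not define value equality, so compare their text instead
(= literal-digit compiled-digit)              ;;=> false

;; Both find the same matches
(re-seq literal-digit "a1b22")                ;;=> ("1" "2" "2")
(re-seq compiled-digit "a1b22")               ;;=> ("1" "2" "2")
```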
Host platform support
Clojure runs on the Java Virtual Machine and uses Java regular expressions.
Regular expressions in Clojure create a java.util.regex.Pattern type:
(type #"pattern") ;;=> java.util.regex.Pattern
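Regular expression option flags can make a pattern case-insensitive or enable multiline mode. A Clojure regex literal that starts with an embedded flag expression such as (?i) sets that mode for the rest of the pattern.
For example, #"(?i)yo" matches the strings "yo", "yO", "Yo" and "YO".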
The flags that can be used in Clojure regular-expression patterns, with their long names in Java, are d (UNIX_LINES), i (CASE_INSENSITIVE), x (COMMENTS), m (MULTILINE), s (DOTALL) and u (UNICODE_CASE). See Java's documentation for the java.util.regex.Pattern class for more details.
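The following sketch shows three of the embedded flags in action, each compared with the unflagged pattern where that helps. The input strings are made up for illustration.

```clojure
;; (?i) case-insensitive matching
(re-seq #"(?i)yo" "yo yO Yo YO")            ;;=> ("yo" "yO" "Yo" "YO")

;; (?m) multiline mode: ^ matches at the start of every line
(re-seq #"(?m)^\w+" "one two\nthree four")  ;;=> ("one" "three")

;; Without (?m), ^ only matches at the start of the whole input
(re-seq #"^\w+" "one two\nthree four")      ;;=> ("one")

;; (?s) dotall mode: . also matches line terminators
(re-seq #"(?s)a.b" "a\nb")                  ;;=> ("a\nb")

;; Without (?s), . does not match the newline, so there is no match
(re-seq #"a.b" "a\nb")                      ;;=> nil
```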
The re-seq function is Clojure's regex workhorse. It returns a lazy seq of all matches in a string, which means it can be used to efficiently test whether a string matches or to find all matches in a string or a mapped file:
(re-seq #"\w+" "one-two/three") ;;=> ("one" "two" "three")
The preceding regular expression has no capturing groups, so each match in the returned seq is a string. A capturing group (a subsegment of the match that can be accessed separately) in the regex causes each returned item to be a vector holding the whole match followed by the text of each group:
(re-seq #"\w*(\w)" "one-two/three") ;;=> (["one" "e"] ["two" "o"] ["three" "e"])
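Because each item is a vector when the pattern has capturing groups, the matches can be destructured directly. This is a minimal sketch; the settings string and the name settings-map are hypothetical.

```clojure
;; Hypothetical input: space-separated key=value pairs
(def settings "host=localhost port=8080")

;; Two capturing groups, so each match is [whole-match group-1 group-2]
(re-seq #"(\w+)=(\w+)" settings)
;;=> (["host=localhost" "host" "localhost"] ["port=8080" "port" "8080"])

;; Destructure each match vector, ignoring the whole match, to build a map
(def settings-map
  (into {} (for [[_ k v] (re-seq #"(\w+)=(\w+)" settings)]
             [k v])))
;; settings-map ;;=> {"host" "localhost", "port" "8080"}

;; re-seq is lazy, so asking for the first match does not force the rest
(first (re-seq #"\d+" "a1 b22 c333"))  ;;=> "1"
```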
Things to avoid
Java's regular-expression engine includes a Matcher object that mutates in a non-thread-safe way as it walks through a string finding matches. This object is exposed by Clojure via the re-matcher function and can be used as an argument to re-groups and the single-parameter form of re-find. Avoid these unless you're certain you know what you're doing. These dangerous functions are used internally by the implementations of some of the recommended functions described earlier, but in each case they're careful to disallow access to the Matcher object they use. Use matchers at your own risk, or better yet don't use them directly at all.
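To make the hazard concrete, here is a small sketch of how a Matcher changes state between calls. The names m and results are hypothetical, and the code is shown only to illustrate the mutation, not as a recommended style.

```clojure
;; A Matcher carries hidden position state for one input string
(def m (re-matcher #"\w+" "one two"))

;; The same call returns a different value each time, because every
;; call to the single-parameter re-find advances the matcher
(def results [(re-find m) (re-find m) (re-find m)])
;; results ;;=> ["one" "two" nil]

;; re-seq gives the same matches without exposing any mutable state
(re-seq #"\w+" "one two")  ;;=> ("one" "two")
```

If two threads shared m, each would see only some of the matches, which is why the matcher-based forms are best avoided.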