Regular Expressions
Revision as of 11:49, 19 September 2025
Regex is short for Regular Expression and is a syntax that allows for powerful pattern matching.
One important use of regular expressions is in multi-line, multi-file editing. For example, let's say you have 10,000 files and you want to edit similar (but not exact) occurrences of strings within those files. Regular expressions could help you isolate the target strings, and with precision, edit just the parts you need to edit while retaining the parts you need to keep.
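As a rough illustration of that idea, the sketch below renames a tag across many files while keeping the parts that differ between occurrences. The tag names, the "*.txt" glob, and the function names are assumptions chosen for the example, not something this page prescribes. The capture group holds the optional slash, so opening and closing tags are both handled by one pattern.

```python
import re
from pathlib import Path

# Hypothetical pattern: "<source" or "</source"; group 1 keeps the optional slash.
TAG = re.compile(r"<(/?)source\b")


def rename_tags(text: str) -> str:
    """Replace the tag name but retain whatever group 1 captured."""
    return TAG.sub(r"<\1syntaxhighlight", text)


def rewrite_tree(root: str, glob: str = "*.txt") -> int:
    """Apply rename_tags to every matching file under root; return files changed."""
    changed = 0
    for path in Path(root).rglob(glob):
        old = path.read_text(encoding="utf-8")
        new = rename_tags(old)
        if new != old:
            path.write_text(new, encoding="utf-8")
            changed += 1
    return changed
```

Running rename_tags on a sample string before calling rewrite_tree on a real directory is a cheap way to check the pattern, since rewrite_tree overwrites files in place.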
Multiline Edits
Most graphical text editors and word processors offer only a single-line input in the Search/Replace dialog. This is unsuitable for the many editing situations where the string you want to replace spans multiple lines. In VSCode, the search and replace dialog is also single-line, but it expands dynamically to multiple lines if you copy and paste multiple lines into the 'find' field. You can also enable the 'regex' option and use line-terminator escapes in your pattern (e.g. "this \n will \n search across \n multiple lines").
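The same line-terminator idea works outside an editor. Here is a minimal Python sketch, with made-up sample text, in which the pattern names the newlines explicitly; "\r?\n" is an assumption added so that Windows line endings would also match.

```python
import re

# Sample text is invented for the example.
text = "alpha\nthis\nwill\nsearch across\nmultiple lines\nomega"

# Each \r?\n in the pattern matches one line break in the target.
pattern = re.compile(r"this\r?\nwill\r?\nsearch across\r?\nmultiple lines")

match = pattern.search(text)
replaced = pattern.sub("one line", text)
```

Here match spans four lines of the input, and replaced collapses them into a single line.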
Single Line
Many editors and utilities are line-based, so even though they support a regular expression syntax for pattern matching, it only helps when the target fits on a single line. This is a real problem for code, XML, or HTML, where content almost always comes in multiline "blocks" such as function definitions, nodes, or paragraphs.
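The difference can be sketched in Python. The HTML fragment below is invented for the example. Applying the pattern one line at a time imitates a line-based tool, while searching the whole text with re.DOTALL lets "." cross line breaks.

```python
import re

# Invented sample: the first paragraph spans two lines.
html = "<div>\n<p>first\nparagraph</p>\n<p>second</p>\n</div>"

PARAGRAPH = r"<p>.*?</p>"

# Line-based approach: the pattern only ever sees one line at a time.
line_based = [m for line in html.splitlines() for m in re.findall(PARAGRAPH, line)]

# Whole-text approach: DOTALL makes "." match newlines too.
whole_text = re.findall(PARAGRAPH, html, flags=re.DOTALL)
```

The line-based pass finds only the paragraph that fits on one line; the whole-text pass finds both.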
Background
Most of the implementations we are concerned with use PCRE, the Perl Compatible Regular Expressions library.
The PCRE library is a set of functions that implement regular expression pattern matching using the same syntax and semantics as Perl 5. PCRE has its own native API, as well as a set of wrapper functions that correspond to the POSIX regular expression API. The PCRE library is free, even for building commercial software.
PCRE was originally written for the Exim MTA, but is now used by many high-profile open source projects, including Apache, PHP, KDE, Postfix, Analog, and Nmap.
Should I use perl, bash, php, awk, sed, ...?
PHP has rich regular expression support, and so does Perl. So when you're at the command line in Bash, what's the best way to quickly search some content for a pattern using a rich regular expression? Doing it in Bash alone can be hard because of all the quoting and interpolation. Let's look at a couple of examples that search a PHP configuration file for a variable assignment. Using perl, it's easy to print only the parenthesized sub-expression (the capture group):
perl -ne 'print $1 if /\$wgDBuser.*"(.*)"/' ./LocalSettings.php
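With grep, the -P option enables Perl-compatible patterns and -o prints only the matched text, but grep cannot print a capture group on its own. \K can serve as a variable-length look-behind by discarding everything matched before it, but it may not be available on older systems. Thus, you may need a fixed-length look-behind plus cut to pull out the quoted value: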
grep -Po '(?<=\$wgDBuser).*"(.*)"' ./LocalSettings.php | cut -d \" -f 2
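For comparison, a rough Python equivalent of the perl one-liner is sketched below. It reuses the same pattern and reads the text line by line, as perl -n does. The function name and the sample settings are hypothetical.

```python
import re
from typing import Optional

# Same pattern as the perl example: group 1 is the quoted value.
DBUSER = re.compile(r'\$wgDBuser.*"(.*)"')


def find_db_user(settings_text: str) -> Optional[str]:
    """Return the first quoted value assigned to $wgDBuser, or None if absent."""
    for line in settings_text.splitlines():
        m = DBUSER.search(line)
        if m:
            return m.group(1)
    return None


# Hypothetical file contents for illustration only.
sample = '$wgDBname = "wiki";\n$wgDBuser = "wikiuser";\n'
user = find_db_user(sample)
```

To run it against a real file, pass in the contents of ./LocalSettings.php read as text.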
Resources
- Regex in Python (and Ansible since it's written in Python)
- https://docs.python.org/2.7/library/re.html
- Regex in Javascript
- http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Global_Objects:RegExp
- Regex in PHP
- http://us3.php.net/manual/en/ref.pcre.php
- Limitations and Manual
- http://www.pcre.org/pcre.txt
- Regex on Windows
- http://weitz.de/regex-coach/
- RegExr.com for live testing of regex (uses GitHub login to save)