<p>Raw link: <a href="https://www.youtube.com/watch?v=TxYGHjKBMUg">https://www.youtube.com/watch?v=TxYGHjKBMUg</a></p>

<p>In this video tutorial I show how to use regular expression syntax to
solve various practical problems in Emacs.</p>

<p>Knowledge of regexp notation is not a prerequisite to using Emacs
effectively. In fact, you can be very productive without knowing
anything about regular expressions. However, learning the notation
will certainly boost your productivity and make Emacs an even more
powerful tool in your hands.</p>

<p>See <a href="https://protesilaos.com/emacs/dotemacs">my dotemacs</a> for the
documentation and package declarations I provide.</p>

<hr />

<p>This is the full text of my presentation, which was done using
<code>org-mode</code> (check my dotemacs for presentations with Org).</p>

<pre><code>* Emacs regular expressions in practice

Emacs has a few ways to operate on regexp matches, such as:

+ =isearch=
+ =query-replace=
+ =keep-lines=
+ =flush-lines=

To make our life easier, we can practice with the built-in
=regexp-builder= or the third-party package =visual-regexp=. This demo
will rely on the latter.

If you have the Emacs manual installed, you can run =C-h r i regexp=
to get to the relevant chapter. *Do it!*

** Line boundaries

The caret =^= denotes the beginning of the line.

The dollar sign =$= marks the end.

Match all lines that start with a space:

Emacs
Emacs
Emacs
Emacs
Emacs

And all that end with a capital =S=:

emacs emacS
emacS emacs
emacs emacs
emacS emacS

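As a minimal sketch of the two exercises above, the anchors can also
be tried from Emacs Lisp with =string-match-p=. The =my-demo-= names
are hypothetical helpers introduced for this illustration; they are
not part of Emacs or of the original presentation.

#+begin_src emacs-lisp
(defun my-demo-starts-with-space-p (line)
  "Return t if LINE begins with a space."
  ;; `^' anchors the match at the beginning of the line.
  (and (string-match-p "^ " line) t))

(defun my-demo-ends-with-capital-s-p (line)
  "Return t if LINE ends with a capital S."
  ;; `$' anchors the match at the end of the line.  Binding
  ;; `case-fold-search' to nil keeps the test case-sensitive.
  (let ((case-fold-search nil))
    (and (string-match-p "S$" line) t)))

(my-demo-starts-with-space-p " Emacs")
(my-demo-ends-with-capital-s-p "emacs emacS")
#+end_src

Interactively, the same patterns can be typed into a regexp search
such as =C-M-s=.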
** Remove or keep lines

Remove the empty lines. Then keep the ones that contain "username".

<username><![CDATA[name]]></username>
emacs emacS
emacS emacs
emacs emacs
emacS emacS



<userName><![CDATA[nom]]></userName>
emacs emacS
emacS emacs
emacs emacs
emacS emacS




<username><![CDATA[name]]></username>
emacs emacS
emacS emacs
emacs emacs
emacS emacS

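The two steps of this exercise can be sketched non-interactively with
=flush-lines= and =keep-lines=. The helper =my-demo-clean-lines= is a
hypothetical name used only for this illustration, and the choice to
fold case (so that "userName" is kept as well) is an assumption about
what the exercise intends.

#+begin_src emacs-lisp
(defun my-demo-clean-lines (text)
  "Return TEXT without empty lines, keeping only lines with \"username\"."
  (with-temp-buffer
    (insert text)
    ;; Called from Lisp without a region, both commands work from
    ;; point to the end of the buffer, so start at the top each time.
    (goto-char (point-min))
    (flush-lines "^$")
    (goto-char (point-min))
    ;; With `case-fold-search' non-nil, a lower case pattern also
    ;; matches "userName".
    (let ((case-fold-search t))
      (keep-lines "username"))
    (buffer-string)))

(my-demo-clean-lines "emacs\n\n\nusername\n\nemacS\nuserName\n")
#+end_src

Interactively, the equivalent is =M-x flush-lines RET ^$ RET= followed
by =M-x keep-lines RET username RET=.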
** The dot character

The dot or full stop =.= matches any single character except the
newline.

Match these words using their common part =ired= as a string.

dired
fired
mired
tired
wired

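One possible answer is the pattern =.ired=. The sketch below checks it
from Emacs Lisp; =my-demo-ired-word-p= is a hypothetical helper, and
the =^= and =$= anchors are added here only to test whole words.

#+begin_src emacs-lisp
(defun my-demo-ired-word-p (word)
  "Return t if WORD is exactly one character followed by \"ired\"."
  ;; `.' stands for exactly one character, so "ired" alone fails.
  (and (string-match-p "^.ired$" word) t))

;; Every word from the list above should pass the test.
(mapcar #'my-demo-ired-word-p '("dired" "fired" "mired" "tired" "wired"))
#+end_src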
** Character sets and ranges

A set of individual characters is marked between brackets =[]=.

Sets can be written as ranges:

| Range      | Scope                                      |
|------------+--------------------------------------------|
| [a-z]      | all lower case alphabetic characters       |
| [A-Za-z]   | all upper or lower case letters            |
| [a-z0-9]   | lower case alphabet or numbers 0 through 9 |
| [abcd1234] | letters a,b,c,d and numbers 1,2,3,4        |

Match both of those using a character set for the first letter:

emacs
Emacs

Match those that end with a number:

Emacs
emacs-27
emacs-26
GNU emacs

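A minimal sketch of both exercises, assuming the patterns =[Ee]macs=
and =[0-9]$= as answers. The =my-demo-= function names are
hypothetical and exist only for this illustration.

#+begin_src emacs-lisp
(defun my-demo-emacs-either-case-p (string)
  "Return t if STRING is \"emacs\" or \"Emacs\"."
  ;; The set [Ee] accepts either letter in the first position.
  ;; Case folding is turned off so that the set does the work.
  (let ((case-fold-search nil))
    (and (string-match-p "^[Ee]macs$" string) t)))

(defun my-demo-ends-with-digit-p (string)
  "Return t if STRING ends with a digit."
  ;; The range [0-9] is shorthand for the set [0123456789].
  (and (string-match-p "[0-9]$" string) t))

(my-demo-emacs-either-case-p "Emacs")
(my-demo-ends-with-digit-p "emacs-27")
#+end_src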
** Difference between postfix operators ?, +, *

"Postfix" means that the operator comes after a given term and
controls how many times that term may repeat.

+ =?= matches the previous term zero or one time.
+ =+= matches the previous term one or more times.
+ =*= matches the previous term zero or more times.

Match the =s= optionally:

day
days

Use =prote= followed by a postfix:

prot
prote
proteeee

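The difference between the three operators can be sketched as
follows. Note that the postfix applies only to the character right
before it, here the final =e= or =s=. The helper
=my-demo-whole-match-p= is a hypothetical name, and anchoring the
pattern to the whole string is a choice made for this illustration.

#+begin_src emacs-lisp
(defun my-demo-whole-match-p (regexp string)
  "Return t if REGEXP matches all of STRING.
REGEXP is assumed to be a simple pattern without alternation."
  (let ((case-fold-search nil))
    ;; "\\`" and "\\'" anchor at the start and end of the string.
    (and (string-match-p (concat "\\`" regexp "\\'") string) t)))

(my-demo-whole-match-p "days?" "day")        ; zero or one "s"
(my-demo-whole-match-p "prote?" "prot")      ; zero or one "e"
(my-demo-whole-match-p "prote+" "proteeee")  ; one or more "e"
(my-demo-whole-match-p "prote*" "prot")      ; zero or more "e"
#+end_src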
** Grouped matches

A group is enclosed inside escaped parentheses =\(GROUP\)=.

Match both of these, including the optional suffix =ig=:

conf
config

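One possible answer is =conf\(ig\)?=, where the postfix applies to the
whole group. In a Lisp string each backslash is doubled. The helper
=my-demo-conf-parts= is a hypothetical name used only for this sketch.

#+begin_src emacs-lisp
(defun my-demo-conf-parts (word)
  "Return (WHOLE-MATCH SUFFIX) if WORD is \"conf\" or \"config\", else nil.
SUFFIX is nil when the optional group did not take part in the match."
  (when (string-match "\\`conf\\(ig\\)?\\'" word)
    (list (match-string 0 word) (match-string 1 word))))

(my-demo-conf-parts "conf")
(my-demo-conf-parts "config")
#+end_src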
** Greedy versus non-greedy

Postfix operators are greedy by default. "Greedy" matches the
longest possible part. Whereas "non-greedy" corresponds to the
shortest.

A non-greedy variant is used when the postfix is followed by =?=.

Using the =.*= construct, match items both greedily and not:

Hello world
Hello world world world world

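As a small sketch, the patterns =Hello.*world= and =Hello.*?world= can
be compared on the second line above. The helper =my-demo-match= is a
hypothetical name introduced for this illustration.

#+begin_src emacs-lisp
(defun my-demo-match (regexp string)
  "Return the text of the first match of REGEXP in STRING, or nil."
  (let ((case-fold-search nil))
    (when (string-match regexp string)
      (match-string 0 string))))

;; Greedy: runs up to the last "world" on the line.
(my-demo-match "Hello.*world" "Hello world world world world")

;; Non-greedy: stops at the first "world".
(my-demo-match "Hello.*?world" "Hello world world world world")
#+end_src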
** Multiple groups

Match the alphabetic and numeric parts in two separate groups.

emacs27
emacs26
emacs25
emacs24

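One possible answer is =\([a-z]+\)\([0-9]+\)=. The sketch below reads
each group back with =match-string=; the function name
=my-demo-split-name-version= is hypothetical.

#+begin_src emacs-lisp
(defun my-demo-split-name-version (string)
  "Return (LETTERS DIGITS) for a STRING such as \"emacs27\", or nil."
  (let ((case-fold-search nil))
    ;; Group 1 collects the letters, group 2 the digits after them.
    (when (string-match "\\([a-z]+\\)\\([0-9]+\\)" string)
      (list (match-string 1 string) (match-string 2 string)))))

(my-demo-split-name-version "emacs27")
#+end_src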
** Literal hyphen and dot

Match the hyphen as part of the alphabetic group and the dot as part
of the numeric one.

emacs-27.1
emacs-26.3
emacs-25.2

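A minimal sketch, assuming the answer =\([a-z-]+\)\([0-9.]+\)=. A
hyphen placed last inside a set is literal, and a dot inside a set
loses its special meaning. The function name
=my-demo-split-dotted-version= is hypothetical.

#+begin_src emacs-lisp
(defun my-demo-split-dotted-version (string)
  "Return (NAME VERSION) for a STRING such as \"emacs-27.1\", or nil."
  (let ((case-fold-search nil))
    ;; [a-z-] is the letters plus a literal hyphen.
    ;; [0-9.] is the digits plus a literal dot.
    (when (string-match "\\([a-z-]+\\)\\([0-9.]+\\)" string)
      (list (match-string 1 string) (match-string 2 string)))))

(my-demo-split-dotted-version "emacs-27.1")
#+end_src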
** Exclude sets

To exclude a set you prepend a caret sign: =[^SET]=

Match every line except those that start with a capital letter.

GNU
Emacs
org-mode
regexp
emacs_lisp
Linux
guix

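One possible answer is =^[^A-Z]=, sketched below over the same words.
The helper =my-demo-drop-capitalised= is a hypothetical name. Case
folding is switched off here on purpose, because with
=case-fold-search= enabled a range such as =[A-Z]= also covers lower
case letters.

#+begin_src emacs-lisp
(require 'seq)

(defun my-demo-drop-capitalised (lines)
  "Return the members of LINES that do not start with a capital letter."
  (let ((case-fold-search nil))
    ;; The first `^' anchors at the line start, the one inside the
    ;; brackets negates the set.
    (seq-filter (lambda (line) (string-match-p "^[^A-Z]" line))
                lines)))

(my-demo-drop-capitalised
 '("GNU" "Emacs" "org-mode" "regexp" "emacs_lisp" "Linux" "guix"))
#+end_src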
** Alternative groups with literal brackets

Use a character set that matches =name= and =nom=.

name
nom

Then:

1. Match the =username= variants' =[name]= or =[nom]=.
2. Replace the match with =[PROT]=.

<username><![CDATA[name]]></username>
<nameuser><![CDATA[nam]]></nameuser>
<userName><![CDATA[nom]]></userName>
<nameuser><![CDATA[nome]]></nameuser>

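A possible solution is sketched below, assuming =n[ao]me?= as the
pattern for =name= and =nom=. The function
=my-demo-replace-username-cdata= is a hypothetical helper, and
anchoring on the =username= tag is one way among several to leave the
=nameuser= lines alone.

#+begin_src emacs-lisp
(defun my-demo-replace-username-cdata (line)
  "Replace [name] or [nom] with [PROT] in LINE, for username tags only."
  (let ((case-fold-search nil))
    (replace-regexp-in-string
     ;; Group 1 keeps the tag and the "<![CDATA" prefix.  An opening
     ;; bracket is escaped as "\\[", while a closing "]" outside a
     ;; character set is already literal.
     "\\(<user[nN]ame><!\\[CDATA\\)\\[n[ao]me?]"
     "\\1[PROT]"
     line t)))

(my-demo-replace-username-cdata "<username><![CDATA[name]]></username>")
#+end_src

With =query-replace-regexp= the same idea would use the pattern
=\(<user[nN]ame><!\[CDATA\)\[n[ao]me?]= and the replacement
=\1[PROT]=.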
** Either match

To target either set, use =\|=.

Prepend =vr/= to the first =group= and =match= on each line.

`(group-0 ((group (:inherit modus-theme-intense-blue))))
`(group-1 ((group (:inherit modus-theme-intense-magenta))))
`(group-2 ((group (:inherit modus-theme-intense-green))))
`(match-0 ((match (:inherit modus-theme-refine-yellow))))
`(match-1 ((match (:inherit modus-theme-refine-yellow))))

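A minimal sketch of this exercise, assuming the alternation
=\(group\|match\)= anchored at the start of the line. The helper
=my-demo-prefix-vr= is a hypothetical name used for illustration.

#+begin_src emacs-lisp
(defun my-demo-prefix-vr (line)
  "Prepend \"vr/\" to the first group or match symbol in LINE."
  (replace-regexp-in-string
   ;; The anchor `^' limits the match to the symbol that opens the
   ;; line; later occurrences of "group" or "match" are left alone.
   "^`(\\(group\\|match\\)"
   "`(vr/\\1"
   line t))

(my-demo-prefix-vr
 "`(group-0 ((group (:inherit modus-theme-intense-blue))))")
#+end_src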
** Running elisp functions on groups

Run elisp in the replacement text by escaping the comma =\,= and then
following it with an expression inside parentheses:
=\,(FUNCTION ARGUMENT)=.

Using the =.ired= pattern from earlier, run a replace command where you
must execute the =upcase= function on the second/middle match. Keep the
rest intact.

direddireddired
firedfiredfired
miredmiredmired
tiredtiredtired
wiredwiredwired

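Interactively, one possible answer is to run =query-replace-regexp=
with the pattern =\(.ired\)\(.ired\)\(.ired\)= and the replacement
=\1\,(upcase \2)\3=. The sketch below does the equivalent from Lisp,
where the replacement is a function instead of a =\,= form. The name
=my-demo-upcase-middle= is hypothetical.

#+begin_src emacs-lisp
(defun my-demo-upcase-middle (string)
  "Upcase the second of three consecutive \".ired\" words in STRING."
  (replace-regexp-in-string
   "\\(.ired\\)\\(.ired\\)\\(.ired\\)"
   (lambda (match)
     ;; Inside this function the match data refers to MATCH itself,
     ;; so the three groups can be read back from it.
     (concat (match-string 1 match)
             (upcase (match-string 2 match))
             (match-string 3 match)))
   string t t))

(my-demo-upcase-middle "direddireddired")
#+end_src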
</code></pre>