emacs/var/elfeed/db/data/d8/d8b326abc8e41187390539fafedef4ceb09c7e7b
2022-01-03 12:49:32 -06:00


<p>
I recently took over the maintenance of <a href="https://github.com/sachac/subed">subed</a>, an Emacs mode for
editing subtitles. One of the things on my TODO list was to figure out
how to handle generic and format-specific functions instead of relying
on defalias. For example, there are SubRip files (.srt), WebVTT files
(.vtt), and Advanced SubStation Alpha (.ass). I also want to add
support for Audacity labels and other formats.
</p>
<p>
There are some functions that will work across all of them once you
have the appropriate format-specific functions in place, and there are
some functions that have to be very different depending on the format
that you're working with. Now, how do you do those things in Emacs
Lisp? There are several ways of making general functions and specific
functions.
</p>
<p>
For example, the <code>forward-paragraph</code> and <code>backward-paragraph</code> commands
use variables to figure out the paragraph separators, so buffer-local
variables can change the behaviour.
</p>
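<p>
As a minimal sketch of the variable-based approach, a format-specific mode could set the paragraph variables buffer-locally so that the standard commands stop at its own separators. The mode name <code>my-cue-mode</code> and the <code>--</code> separator line are hypothetical and not taken from subed.
</p>
<div class="org-src-container">
<pre class="src src-emacs-lisp">;; Hypothetical toy mode: cues are separated by lines containing only "--".
(define-derived-mode my-cue-mode text-mode "MyCue"
  "Toy major mode whose paragraphs are separated by \"--\" lines."
  ;; forward-paragraph and backward-paragraph consult these variables,
  ;; so setting them buffer-locally changes where the commands stop.
  (setq-local paragraph-start "--[ \t]*$")
  (setq-local paragraph-separate "--[ \t]*$"))
</pre>
</div>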
<p>
However, I needed a bit more than regular expressions. An approach
taken in some packages like <a href="https://github.com/Fuco1/smartparens/blob/master/smartparens.el">smartparens</a> is to have buffer-local
variables hold the actual functions to be called, like
<code>sp-forward-bound-fn</code> and <code>sp-backward-bound-fn</code>.
</p>
<div class="org-src-container">
<pre class="src src-emacs-lisp">(<span class="org-keyword">defvar-local</span> <span class="org-variable-name">sp-forward-bound-fn</span> nil
  <span class="org-doc">"Function to restrict the forward search"</span>)

(<span class="org-keyword">defun</span> <span class="org-function-name">sp--get-forward-bound</span> ()
  <span class="org-doc">"Get the bound to limit the forward search for looking for pairs.</span>
<span class="org-doc">If it returns nil, the original bound passed to the search</span>
<span class="org-doc">function will be considered."</span>
  (<span class="org-keyword">and</span> sp-forward-bound-fn (funcall sp-forward-bound-fn)))
</pre>
</div>
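<p>
Applied to subtitles, that pattern might look like the sketch below. All of the <code>my-sub-</code> names are hypothetical illustrations, not subed's actual API, and the converter assumes a format that stores times as decimal seconds.
</p>
<div class="org-src-container">
<pre class="src src-emacs-lisp">;; Hypothetical names throughout; this is not subed's actual API.
(defvar-local my-sub-timestamp-to-msecs-fn nil
  "Function that converts a timestamp string to milliseconds.
Each format-specific mode is expected to set this buffer-locally.")

(defun my-sub-timestamp-to-msecs (time-string)
  "Convert TIME-STRING with the current buffer's format-specific function.
Return nil if no function has been set."
  (and my-sub-timestamp-to-msecs-fn
       (funcall my-sub-timestamp-to-msecs-fn time-string)))

(defun my-sub-seconds-to-msecs (time-string)
  "Convert TIME-STRING, a decimal number of seconds, to milliseconds."
  (round (* 1000 (string-to-number time-string))))
</pre>
</div>
<p>
A mode for such a format would then run <code>(setq-local my-sub-timestamp-to-msecs-fn #'my-sub-seconds-to-msecs)</code> in its body, and every generic command would go through <code>my-sub-timestamp-to-msecs</code>. With one variable per function, this is the part that gets unwieldy as the number of functions grows.
</p>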
<p>
Since there were so many functions, I figured that might be a little
bit unwieldy. In <a href="https://orgmode.org/worg/dev/org-export-reference.html">Org mode</a>, custom export backends are structs that
have an alist that maps the different types of things to the functions
that will be called, overriding the functions that are defined in the
parent export backend.
</p>
<div class="org-src-container">
<pre class="src src-emacs-lisp">(<span class="org-keyword">cl-defstruct</span> (<span class="org-type">org-export-backend</span> (<span class="org-builtin">:constructor</span> org-export-create-backend)
                                  (<span class="org-builtin">:copier</span> nil))
  name parent transcoders options filters blocks menu)

(<span class="org-keyword">defun</span> <span class="org-function-name">org-export-get-all-transcoders</span> (backend)
  <span class="org-doc">"Return full translation table for BACKEND.</span>
<span class="org-doc">BACKEND is an export back-end, as return by, e.g,,</span>
<span class="org-doc">`</span><span class="org-doc"><span class="org-constant">org-export-create-backend</span></span><span class="org-doc">'. Return value is an alist where</span>
<span class="org-doc">keys are element or object types, as symbols, and values are</span>
<span class="org-doc">transcoders.</span>
<span class="org-doc">Unlike to `</span><span class="org-doc"><span class="org-constant">org-export-backend-transcoders</span></span><span class="org-doc">', this function</span>
<span class="org-doc">also returns transcoders inherited from parent back-ends,</span>
<span class="org-doc">if any."</span>
  (<span class="org-keyword">when</span> (symbolp backend) (<span class="org-keyword">setq</span> backend (org-export-get-backend backend)))
  (<span class="org-keyword">when</span> backend
    (<span class="org-keyword">let</span> ((transcoders (org-export-backend-transcoders backend))
          parent)
      (<span class="org-keyword">while</span> (<span class="org-keyword">setq</span> parent (org-export-backend-parent backend))
        (<span class="org-keyword">setq</span> backend (org-export-get-backend parent))
        (<span class="org-keyword">setq</span> transcoders
              (append transcoders (org-export-backend-transcoders backend))))
      transcoders)))
</pre>
</div>
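<p>
To make the override behaviour concrete, here is a small sketch that derives a backend from Org's HTML exporter and replaces only the transcoder for bold text. The backend name <code>my-html</code> and the function <code>my-html-bold</code> are made up for this example; <code>org-export-define-derived-backend</code> and its <code>:translate-alist</code> keyword come from Org's export library.
</p>
<div class="org-src-container">
<pre class="src src-emacs-lisp">(require 'ox-html)

;; Hypothetical transcoder: the name and the output markup are illustrative.
(defun my-html-bold (_bold contents _info)
  "Transcode bold text by wrapping CONTENTS in a strong element."
  (format "&lt;strong&gt;%s&lt;/strong&gt;" contents))

;; The child backend lists only what it overrides; every other
;; element type falls through to the parent `html' backend.
(org-export-define-derived-backend 'my-html 'html
  :translate-alist '((bold . my-html-bold)))
</pre>
</div>
<p>
Because <code>org-export-get-all-transcoders</code> appends the parent's alist after the child's, looking up <code>bold</code> with <code>assq</code> finds the override first, while a type such as <code>italic</code> still resolves to the HTML backend's function.
</p>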
<p>
The export code looked a little bit complicated, though. I wanted to
see if there was a different way of doing things, and I came across
<code>cl-defmethod</code>. Actually, the first time I tried to implement this, I
was focused on the fact that <code>cl-defmethod</code> could call different
things depending on the class that you give it. So initially I had
created a couple of classes: a <code>subed-backend</code> class, and then
subclasses such as <code>subed-vtt-backend</code>. This allowed me to store the
backend as a buffer-local variable and differentiate based on that.
</p>
<div class="org-src-container">
<pre class="src src-emacs-lisp">(<span class="org-keyword">require</span> '<span class="org-constant">eieio</span>)

(<span class="org-keyword">defclass</span> <span class="org-type">subed-backend</span> ()
  ((regexp-timestamp <span class="org-builtin">:initarg</span> <span class="org-builtin">:regexp-timestamp</span>
                     <span class="org-builtin">:initform</span> <span class="org-string">""</span>
                     <span class="org-builtin">:type</span> string
                     <span class="org-builtin">:custom</span> string
                     <span class="org-builtin">:documentation</span> <span class="org-doc">"Regexp matching a timestamp."</span>)
   (regexp-separator <span class="org-builtin">:initarg</span> <span class="org-builtin">:regexp-separator</span>
                     <span class="org-builtin">:initform</span> <span class="org-string">""</span>
                     <span class="org-builtin">:type</span> string
                     <span class="org-builtin">:custom</span> string
                     <span class="org-builtin">:documentation</span> <span class="org-doc">"Regexp matching the separator between subtitles."</span>))
  <span class="org-doc">"A class for data and functions specific to a subtitle format."</span>)

(<span class="org-keyword">defclass</span> <span class="org-type">subed-vtt-backend</span> (subed-backend) nil
  <span class="org-doc">"A class for WebVTT subtitle files."</span>)

(<span class="org-keyword">cl-defmethod</span> <span class="org-function-name">subed--timestamp-to-msecs</span> ((backend subed-vtt-backend) time-string)
  <span class="org-doc">"Find HH:MM:SS,MS pattern in TIME-STRING and convert it to milliseconds.</span>
<span class="org-doc">Return nil if TIME-STRING doesn't match the pattern.</span>
<span class="org-doc">Use the format-specific function for BACKEND."</span>
  (<span class="org-keyword">save-match-data</span>
    (<span class="org-keyword">when</span> (string-match (<span class="org-keyword">oref</span> backend regexp-timestamp) time-string)
      (<span class="org-keyword">let</span> ((hours (string-to-number (match-string 1 time-string)))
            (mins (string-to-number (match-string 2 time-string)))
            (secs (string-to-number (match-string 3 time-string)))
            (msecs (string-to-number (subed--right-pad (match-string 4 time-string) 3 ?0))))
        (+ (* (truncate hours) 3600000)
           (* (truncate mins) 60000)
           (* (truncate secs) 1000)
           (truncate msecs))))))
</pre>
</div>
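<p>
The buffer-local part of that first attempt could be sketched roughly as follows. The <code>my-</code> class, variable, and function names are placeholders rather than the ones subed used; the example only shows how dispatch follows the class of whatever object the buffer holds.
</p>
<div class="org-src-container">
<pre class="src src-emacs-lisp">(require 'eieio)
(require 'cl-lib)

;; Placeholder classes standing in for format backends.
(defclass my-backend () ()
  "Hypothetical base class for a subtitle format.")
(defclass my-vtt-backend (my-backend) ()
  "Hypothetical backend for WebVTT files.")

(defvar-local my-current-backend nil
  "Backend object describing the current buffer's format.")

(cl-defgeneric my-backend-msecs-separator (backend)
  "Return the string that precedes milliseconds in BACKEND's timestamps.")

(cl-defmethod my-backend-msecs-separator ((_backend my-backend))
  "Default: SubRip-style comma."
  ",")

(cl-defmethod my-backend-msecs-separator ((_backend my-vtt-backend))
  "WebVTT uses a period instead."
  ".")
</pre>
</div>
<p>
A mode would store an instance with <code>(setq-local my-current-backend (my-vtt-backend))</code>, and callers would pass <code>my-current-backend</code> as the first argument of every generic function. Having to thread that object through each call is the overhead that dispatching on the major mode avoids.
</p>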
<p>
Then I found out that <a href="https://stackoverflow.com/questions/60244133/how-to-add-a-new-specializer-to-cl-defmethod-apply-to-multiple-major-modes">you can use <code>major-mode</code> as a context specializer</a>
for <code>cl-defmethod</code>, so you can call different specific functions
depending on the major mode that your buffer is in. It doesn't seem to
be mentioned in the <a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Generic-Functions.html">elisp manual</a>, so at some point I should figure out
how to suggest mentioning it. Anyway, now I have some functions that
get called if the buffer is in <code>subed-vtt-mode</code> and some functions
that get called if the buffer is in <code>subed-srt-mode</code>.
</p>
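<p>
Here is a self-contained sketch of that kind of dispatch, using two throwaway modes. The mode and function names are invented for the example and are not subed's.
</p>
<div class="org-src-container">
<pre class="src src-emacs-lisp">(require 'cl-lib)

;; Throwaway modes standing in for the real format-specific modes.
(define-derived-mode my-srt-mode text-mode "MySRT" "Toy SubRip mode.")
(define-derived-mode my-vtt-mode text-mode "MyVTT" "Toy WebVTT mode.")

(cl-defgeneric my-subtitle-format ()
  "Return a symbol naming the current buffer's subtitle format."
  ;; Default body, used when no mode-specific method applies.
  'unknown)

(cl-defmethod my-subtitle-format (&amp;context (major-mode my-srt-mode))
  "Method chosen when the buffer's major mode is `my-srt-mode'."
  'srt)

(cl-defmethod my-subtitle-format (&amp;context (major-mode my-vtt-mode))
  "Method chosen when the buffer's major mode is `my-vtt-mode'."
  'vtt)
</pre>
</div>
<p>
Calling <code>(my-subtitle-format)</code> in a buffer then picks the method from that buffer's major mode, with no backend object to pass around. The specializer also matches modes derived from the named mode, so a child of <code>my-srt-mode</code> would get the SRT method.
</p>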
<p>
The catch is that <code>cl-defmethod</code> <a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Generic-Functions.html">can't define interactive functions</a>.
So if I'm defining a command, an interactive function that can be
called with M-x, then I will need to have a regular function that
calls the function defined with <code>cl-defmethod</code>. This resulted in a bit
of duplicated code, so I have a macro that defines the method and then
defines the possibly interactive command that calls that method. I
didn't want to think about whether something was interactive or not,
so my macro just always creates those two functions. One is a
<code>cl-defmethod</code> that I can override for a specific major mode, and one
is the function that actually calls it, which may or may not be
interactive. It doesn't handle <code>&amp;rest</code> args, but I don't have any in
<code>subed.el</code> at this time.
</p>
<div class="org-src-container">
<pre class="src src-emacs-lisp">(<span class="org-keyword">defmacro</span> <span class="org-function-name">subed-define-generic-function</span> (name args <span class="org-type">&amp;rest</span> body)
  <span class="org-doc">"Declare an object method and provide the old way of calling it."</span>
  (<span class="org-keyword">declare</span> (indent 2))
  (<span class="org-keyword">let</span> (is-interactive
        doc)
    (<span class="org-keyword">when</span> (stringp (car body))
      (<span class="org-keyword">setq</span> doc (<span class="org-keyword">pop</span> body)))
    (<span class="org-keyword">setq</span> is-interactive (eq (caar body) 'interactive))
    `(<span class="org-keyword">progn</span>
       (<span class="org-keyword">cl-defgeneric</span> ,(intern (concat <span class="org-string">"subed--"</span> (symbol-name name)))
           ,args
         ,doc
         ,@(<span class="org-keyword">if</span> is-interactive
               (cdr body)
             body))
       ,(<span class="org-keyword">if</span> is-interactive
            `(<span class="org-keyword">defun</span> ,(intern (concat <span class="org-string">"subed-"</span> (symbol-name name))) ,args
               ,(concat doc <span class="org-string">"\n\nThis function calls the generic function `"</span>
                        (concat <span class="org-string">"subed--"</span> (symbol-name name)) <span class="org-string">"' for the actual implementation."</span>)
               ,(car body)
               (,(intern (concat <span class="org-string">"subed--"</span> (symbol-name name)))
                ,@(delq nil (mapcar (<span class="org-keyword">lambda</span> (a)
                                      (<span class="org-keyword">unless</span> (string-match <span class="org-string">"^&amp;"</span> (symbol-name a))
                                        a))
                                    args))))
          `(<span class="org-keyword">defalias</span> (<span class="org-keyword">quote</span> ,(intern (concat <span class="org-string">"subed-"</span> (symbol-name name))))
             (<span class="org-keyword">quote</span> ,(intern (concat <span class="org-string">"subed--"</span> (symbol-name name))))
             ,doc)))))
</pre>
</div>
<p>
For example, the function:
</p>
<div class="org-src-container">
<pre class="src src-emacs-lisp">(<span class="org-keyword">subed-define-generic-function</span> timestamp-to-msecs (time-string)
  <span class="org-string">"Find timestamp pattern in TIME-STRING and convert it to milliseconds.</span>
<span class="org-string">Return nil if TIME-STRING doesn't match the pattern."</span>)
</pre>
</div>
<p>
expands to:
</p>
<div class="org-src-container">
<pre class="src src-emacs-lisp">(<span class="org-keyword">progn</span>
  (<span class="org-keyword">cl-defgeneric</span> <span class="org-function-name">subed--timestamp-to-msecs</span>
      (time-string)
    <span class="org-doc">"Find timestamp pattern in TIME-STRING and convert it to milliseconds.</span>
<span class="org-doc">Return nil if TIME-STRING doesn't match the pattern."</span>)
  (<span class="org-keyword">defalias</span> '<span class="org-function-name">subed-timestamp-to-msecs</span> 'subed--timestamp-to-msecs <span class="org-doc">"Find timestamp pattern in TIME-STRING and convert it to milliseconds.</span>
<span class="org-doc">Return nil if TIME-STRING doesn't match the pattern."</span>))
</pre>
</div>
<p>
and the interactive command defined with:
</p>
<div class="org-src-container">
<pre class="src src-emacs-lisp">(<span class="org-keyword">subed-define-generic-function</span> forward-subtitle-end ()
  <span class="org-string">"Move point to end of next subtitle.</span>
<span class="org-string">Return point or nil if there is no next subtitle."</span>
  (<span class="org-keyword">interactive</span>)
  (<span class="org-keyword">when</span> (subed-forward-subtitle-id)
    (subed-jump-to-subtitle-end)))
</pre>
</div>
<p>
expands to:
</p>
<div class="org-src-container">
<pre class="src src-emacs-lisp">(<span class="org-keyword">progn</span>
  (<span class="org-keyword">cl-defgeneric</span> <span class="org-function-name">subed--forward-subtitle-end</span> nil <span class="org-doc">"Move point to end of next subtitle.</span>
<span class="org-doc">Return point or nil if there is no next subtitle."</span>
    (<span class="org-keyword">when</span>
        (subed-forward-subtitle-id)
      (subed-jump-to-subtitle-end)))
  (<span class="org-keyword">defun</span> <span class="org-function-name">subed-forward-subtitle-end</span> nil <span class="org-doc">"Move point to end of next subtitle.</span>
<span class="org-doc">Return point or nil if there is no next subtitle.</span>

<span class="org-doc">This function calls the generic function `</span><span class="org-doc"><span class="org-constant">subed--forward-subtitle-end</span></span><span class="org-doc">' for the actual implementation."</span>
    (<span class="org-keyword">interactive</span>)
    (subed--forward-subtitle-end)))
</pre>
</div>
<p>
Then I can define a specific one with:
</p>
<div class="org-src-container">
<pre class="src src-emacs-lisp">(<span class="org-keyword">cl-defmethod</span> <span class="org-function-name">subed--timestamp-to-msecs</span> (time-string <span class="org-type">&amp;context</span> (major-mode subed-srt-mode))
  <span class="org-doc">"Find HH:MM:SS,MS pattern in TIME-STRING and convert it to milliseconds.</span>
<span class="org-doc">Return nil if TIME-STRING doesn't match the pattern.</span>
<span class="org-doc">Use the format-specific function for MAJOR-MODE."</span>
  (<span class="org-keyword">save-match-data</span>
    (<span class="org-keyword">when</span> (string-match subed--regexp-timestamp time-string)
      (<span class="org-keyword">let</span> ((hours (string-to-number (match-string 1 time-string)))
            (mins (string-to-number (match-string 2 time-string)))
            (secs (string-to-number (match-string 3 time-string)))
            (msecs (string-to-number (subed--right-pad (match-string 4 time-string) 3 ?0))))
        (+ (* (truncate hours) 3600000)
           (* (truncate mins) 60000)
           (* (truncate secs) 1000)
           (truncate msecs))))))
</pre>
</div>
<p>
The upside is that it's easy to either override or extend a function's
behavior. For example, after I sort subtitles, I want to renumber them
if I'm in an SRT buffer because SRT subtitles have numeric IDs. This
doesn't happen in any of the other modes. So I can just define an
<code>:after</code> method that runs once the regular sorting code has finished.
</p>
<div class="org-src-container">
<pre class="src src-emacs-lisp">(<span class="org-keyword">cl-defmethod</span> <span class="org-function-name">subed--sort</span> <span class="org-builtin">:after</span> (<span class="org-type">&amp;context</span> (major-mode subed-srt-mode))
  <span class="org-string">"Renumber after sorting. Format-specific for MAJOR-MODE."</span>
  (subed-srt--regenerate-ids))
</pre>
</div>
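<p>
The ordering can be checked with a toy version that only records which steps ran. Everything named <code>my-</code> here is hypothetical, and the log variable exists purely to make the order visible.
</p>
<div class="org-src-container">
<pre class="src src-emacs-lisp">(require 'cl-lib)

;; Throwaway mode standing in for a format with numeric IDs.
(define-derived-mode my-numbered-mode text-mode "MyNum"
  "Toy mode whose subtitles carry numeric IDs.")

(defvar my-sort-log nil
  "Steps recorded by this demo, newest first.")

(cl-defgeneric my-sort ()
  "Pretend to sort the subtitles in the current buffer."
  (push 'sorted my-sort-log))

;; The :after qualifier runs this once the primary method has
;; finished, and only in buffers whose mode matches.
(cl-defmethod my-sort :after (&amp;context (major-mode my-numbered-mode))
  "Pretend to renumber the subtitles after sorting."
  (push 'renumbered my-sort-log))
</pre>
</div>
<p>
In a <code>my-numbered-mode</code> buffer, calling <code>(my-sort)</code> records <code>sorted</code> and then <code>renumbered</code>; in any other buffer only <code>sorted</code> is recorded.
</p>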
<p>
The downside is that going to the function's definition and stepping
through it is a little more complicated because it's hidden behind
this macro and the <code>cl-defmethod</code> infrastructure. I think that if you
<code>describe-function</code> the right function, the internal version with the
<code>--</code>, then it will list the different implementations of it. I added a
note to the regular function's docstring to make it a little easier.
</p>
<p>
I'm going to give this <a href="https://github.com/sachac/subed/tree/derived-mode">derived-mode</a> branch a try for a little while by
subtitling some more EmacsConf talks before I merge it into the main
branch. This is my first time working with <code>cl-defmethod</code>, and it
looks pretty interesting.
</p>