A Brief Look at Org Babel Switches

Published 2024-03-01, last modified 2024-03-09

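Using the function org-element-src-block-parser as a worked example, this article briefly introduces what switches do in Org-mode code blocks and how to use them.

What are switches?

Switches (see footnote 1) are optional arguments on the #+BEGIN_SRC line. Their exact position is as follows: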

#+BEGIN_SRC <language> <switches> <header arguments>
  <body>
#+END_SRC

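Their purpose is to give finer control over how code is exported (see footnote 2). The Literal Examples section of the Org Manual introduces several switches and their usage; here is a brief summary:

-n
Adds line numbers to the code block on export. The first line is numbered 1 by default; if a number a is given (for example -n 10), the first line is numbered a.
+n
Adds line numbers to the code block on export. Unlike -n, numbering continues from the last line of the previous numbered code block: if that block ended at line 10, the current block starts at line 11. +n also accepts an optional numeric argument, which is added to the last line number of the previous block to give the starting line number (so +n 5 after a block ending at line 10 starts at line 15).
-l
Sets org-coderef-label-format for the current code block.
-r
Removes the reference labels (coderef labels) from the code block on export. When used together with -n, references are attached to the line numbers instead (in HTML, by setting the id attribute of the element that holds the line number).
-i
Leaves the indentation of the code block unchanged on export.
-k
Keeps the reference labels (coderef labels) in the code block on export. Combined with -n -r, the labels stay in the code while links use line numbers.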
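The numbering rules for -n and +n can be sketched as a small function. This is only an illustrative model of the rules summarized above, not Org's implementation; the name my/first-line-number and its arguments are hypothetical.

```elisp
(defun my/first-line-number (switch arg prev-last)
  "Return the number shown on the first exported line of a block.
SWITCH is \"-n\" or \"+n\".  ARG is the optional numeric argument, or
nil when it is omitted.  PREV-LAST is the last line number of the
previous numbered block (only used by \"+n\").  Hypothetical helper."
  (cond
   ;; -n restarts numbering: at ARG when given, otherwise at 1.
   ((equal switch "-n") (or arg 1))
   ;; +n continues numbering: ARG (default 1) is added to PREV-LAST.
   ((equal switch "+n") (+ prev-last (or arg 1)))
   (t (error "Unknown numbering switch: %s" switch))))

(my/first-line-number "-n" nil 10) ; => 1
(my/first-line-number "-n" 24 10)  ; => 24
(my/first-line-number "+n" nil 10) ; => 11
(my/first-line-number "+n" 5 10)   ; => 15
```

Note that +n with no argument and +n 1 give the same result under this model.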

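Using switches

Discussing switches only on paper is dry and makes it hard to see what they actually do, so let's try them out. We take org-element-src-block-parser as the example and look at how switches are parsed:

org-element-src-block-parser consists of three main parts, regexp-match, switches-analysis, and restult-construct (see footnote 3):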

 1: (defun org-element-src-block-parser (limit affiliated)
 2:   "Parse a source block.
 3: 
 4: LIMIT bounds the search.  AFFILIATED is a list of which CAR is
 5: the buffer position at the beginning of the first affiliated
 6: keyword and CDR is a plist of affiliated keywords along with
 7: their value.
 8: 
 9: Return a list whose CAR is `src-block' and CDR is a plist
10: containing `:language', `:switches', `:parameters', `:begin',
11: `:end', `:number-lines', `:retain-labels', `:use-labels',
12: `:label-fmt', `:preserve-indent', `:value', `:post-blank' and
13: `:post-affiliated' keywords.
14: 
15: Assume point is at the beginning of the block."
16:   (let ((case-fold-search t))
17:     (if (not (save-excursion (re-search-forward "^[ \t]*#\\+END_SRC[ \t]*$"
18:                                                 limit t)))
19:         ;; Incomplete block: parse it as a paragraph.
20:         (org-element-paragraph-parser limit affiliated)
21:       (let ((contents-end (match-beginning 0)))
22:         (save-excursion
23:           (let* (
24:                  <<regexp-match>>
25:                  <<switches-analysis>>
26:                  ;; Retrieve code.
27:                  (value (org-unescape-code-in-string
28:                          (buffer-substring-no-properties
29:                           (line-beginning-position 2) contents-end)))
30:                  (pos-before-blank (progn (goto-char contents-end)
31:                                           (forward-line)
32:                                           (point)))
33:                  ;; Get position after ending blank lines.
34:                  (end (progn (skip-chars-forward " \r\t\n" limit)
35:                              (if (eobp) (point) (line-beginning-position)))))
36:             <<restult-construct>>))))))

The implementation of these parts is fairly simple, so we focus on how the switches affect the exported result. Let's go through them one by one:

24: (begin (car affiliated))
25: (post-affiliated (point))
26: ;; Get language as a string.
27: (language
28:  (progn
29:    (looking-at
30:     "^[ \t]*#\\+BEGIN_SRC\
31:   \\(?: +\\(\\S-+\\)\\)?\
32:   \\(\\(?: +\\(?:-\\(?:l \".+\"\\|[ikr]\\)\\|[-+]n\\(?: *[0-9]+\\)?\\)\\)+\\)?\
33:   \\(.*\\)[ \t]*$")
34:    (match-string-no-properties 1)))
35: ;; Get switches.
36: (switches (match-string-no-properties 2)) ;; (get-switches)
37: ;; Get parameters.
38: (parameters (match-string-no-properties 3))

(This code block uses the switch -n 24, so its numbering starts at line 24 and matches the position of the code inside org-element-src-block-parser.)

This code is straightforward: a regexp match yields the block's language, its switches, and its remaining parameters. The part of interest here is the coderef get-switches.
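To see what the three capture groups hold, the regexp from the listing above can be applied to a header line given as a string. This is a sketch for experimenting: the names my/src-block-header-re and my/split-src-header are hypothetical, and the parser itself uses looking-at on the buffer rather than string-match.

```elisp
(defconst my/src-block-header-re
  (concat "^[ \t]*#\\+BEGIN_SRC"
          ;; Group 1: the language.
          "\\(?: +\\(\\S-+\\)\\)?"
          ;; Group 2: any run of -l \"fmt\", -i, -k, -r, -n or +n switches.
          "\\(\\(?: +\\(?:-\\(?:l \".+\"\\|[ikr]\\)\\|[-+]n\\(?: *[0-9]+\\)?\\)\\)+\\)?"
          ;; Group 3: everything else, i.e. the header arguments.
          "\\(.*\\)[ \t]*$")
  "Copy of the header regexp used by `org-element-src-block-parser'.")

(defun my/split-src-header (line)
  "Split LINE into a list (LANGUAGE SWITCHES PARAMETERS).
Return nil when LINE is not a #+BEGIN_SRC line.  Hypothetical helper."
  (let ((case-fold-search t))
    (when (string-match my/src-block-header-re line)
      (list (match-string 1 line)
            (match-string 2 line)
            (match-string 3 line)))))

(my/split-src-header "#+BEGIN_SRC emacs-lisp -n 20 -r :results silent")
;; => ("emacs-lisp" " -n 20 -r" " :results silent")
(my/split-src-header "#+begin_src python")
;; => ("python" nil "")
```

The leading spaces in groups 2 and 3 are expected; the parser trims them later with org-trim when it builds the result.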

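When the mouse pointer hovers over get-switches, the corresponding line in the code block is highlighted as well. This is enabled by (setq org-html-head-include-scripts t), which injects the JavaScript stored in the variable org-html-scripts into the page to produce the highlight (see footnote 4).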

39: ;; Switches analysis.
40: (number-lines
41:  (and switches
42:       (string-match "\\([-+]\\)n\\(?: *\\([0-9]+\\)\\)?\\>"
43:                     switches)
44:       (cons
45:        (if (equal (match-string 1 switches) "-")
46:            'new
47:          'continued)
48:        (if (not (match-end 2)) 0
49:          ;; Subtract 1 to give number of lines before
50:          ;; first line.
51:          (1- (string-to-number (match-string 2 switches)))))))
52: (preserve-indent
53:  (and switches
54:       (string-match "-i\\>" switches)))
55: (label-fmt
56:  (and switches
57:       (string-match "-l +\"\\([^\"\n]+\\)\"" switches)
58:       (match-string 1 switches)))
59: ;; Should labels be retained in (or stripped from)
60: ;; source blocks?
61: (retain-labels
62:  (or (not switches)
63:      (not (string-match "-r\\>" switches))
64:      (and number-lines (string-match "-k\\>" switches))))
65: ;; What should code-references use - labels or
66: ;; line-numbers?
67: (use-labels
68:  (or (not switches)
69:      (and retain-labels
70:           (not (string-match "-k\\>" switches)))))

(This code block uses the switches +n -r -l ";; (ref:%s)".)

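This part parses the switches of the code block, again mostly through regexp matching. Lines 40, 52, and 55 handle the three switches -n (or +n), -i, and -l respectively. Lines 61 and 67 deal with switches used in combination. retain-labels on line 61 says whether the reference labels stay in the code block: they are kept unless -r is present, and with -r they are still kept only when -k is given together with line numbers (-n or +n). use-labels on line 67 decides whether links show the reference label or the line number: labels are used only when they are retained and -k is absent.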
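The analysis step can be tried in isolation by copying its logic into a function that takes the switches string. The sketch below follows the listing above, with two liberties that are my own: the name my/analyze-switches is hypothetical, and boolean results are normalized to t or nil instead of match positions.

```elisp
(defun my/analyze-switches (switches)
  "Return a plist describing SWITCHES, a string or nil.
Standalone copy of the analysis step of the parser, for
experimenting.  Hypothetical helper."
  (let* ((number-lines
          (and switches
               (string-match "\\([-+]\\)n\\(?: *\\([0-9]+\\)\\)?\\>" switches)
               (cons (if (equal (match-string 1 switches) "-") 'new 'continued)
                     ;; Store the number of lines before the first line.
                     (if (match-end 2)
                         (1- (string-to-number (match-string 2 switches)))
                       0))))
         (preserve-indent (and switches (string-match "-i\\>" switches) t))
         (label-fmt (and switches
                         (string-match "-l +\"\\([^\"\n]+\\)\"" switches)
                         (match-string 1 switches)))
         (has-r (and switches (string-match "-r\\>" switches) t))
         (has-k (and switches (string-match "-k\\>" switches) t))
         ;; Labels stay unless -r is given; -k overrides -r only when
         ;; line numbers are requested as well.
         (retain-labels (or (not has-r) (and number-lines has-k)))
         ;; Links use labels only if labels are retained and -k is absent.
         (use-labels (and retain-labels (not has-k))))
    (list :number-lines number-lines
          :preserve-indent preserve-indent
          :label-fmt label-fmt
          :retain-labels retain-labels
          :use-labels use-labels)))

(my/analyze-switches "+n -r -l \";; (ref:%s)\"")
;; => (:number-lines (continued . 0) :preserve-indent nil
;;     :label-fmt ";; (ref:%s)" :retain-labels nil :use-labels nil)
```

Comparing "-r -k" with "-n -r -k" shows the interaction: only the second one retains the labels, because -k needs line numbers to override -r.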

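Back to the switches of this block. The numbering uses +n because the regexp-match and switches-analysis parts are adjacent in org-element-src-block-parser, so the line numbers can simply continue from the previous block. -r removes the reference labels from the code, and links then show line numbers instead. Note that Org-mode does this work at export time; in the Org source file, code references are written the same way regardless of the switches.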
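As a reminder of what a code reference looks like in the Org source, the sketch below builds the label text and the matching link. Both helper names are hypothetical; the default format "(ref:%s)" is the default value of org-coderef-label-format, and [[(name)]] is the link form shown in the Org Manual.

```elisp
(defun my/coderef-label (name &optional label-fmt)
  "Return the text that marks coderef NAME inside a code line.
LABEL-FMT defaults to \"(ref:%s)\", the default value of
`org-coderef-label-format'.  Hypothetical helper."
  (format (or label-fmt "(ref:%s)") name))

(defun my/coderef-link (name)
  "Return the Org link that points at coderef NAME.
Hypothetical helper."
  (format "[[(%s)]]" name))

(my/coderef-label "get-switches")               ; => "(ref:get-switches)"
(my/coderef-label "get-switches" ";; (ref:%s)") ; => ";; (ref:get-switches)"
(my/coderef-link "get-switches")                ; => "[[(get-switches)]]"
```

With -l ";; (ref:%s)", the label sits inside a Lisp comment, so the source stays valid Emacs Lisp even though it carries a reference.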

81: (list 'src-block
82:       (nconc
83:        (list :language language
84:              :switches (and (org-string-nw-p switches)
85:                             (org-trim switches))
86:              :parameters (and (org-string-nw-p parameters)
87:                               (org-trim parameters))
88:              :begin begin
89:              :end end
90:              :number-lines number-lines
91:              :preserve-indent preserve-indent
92:              :retain-labels retain-labels
93:              :use-labels use-labels
94:              :label-fmt label-fmt
95:              :value value
96:              :post-blank (count-lines pos-before-blank end)
97:              :post-affiliated post-affiliated)
98:        (cdr affiliated)))

(This code block uses the switch -n 81.)

Finally, the results are assembled into a list and returned.
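The returned list can be inspected without reading the parser by parsing a small block in a temporary buffer. This is a minimal sketch that assumes Org is installed; my/parse-src-block is a hypothetical helper, while org-element-at-point, org-element-type, and org-element-property are part of the Org Element API.

```elisp
(require 'org)
(require 'org-element)

(defun my/parse-src-block (text)
  "Parse TEXT, an Org source block, and return selected properties.
Hypothetical helper built on `org-element-at-point'."
  (with-temp-buffer
    (insert text)
    (org-mode)
    (goto-char (point-min))
    (let ((element (org-element-at-point)))
      (list :type (org-element-type element)
            :language (org-element-property :language element)
            :switches (org-element-property :switches element)
            :parameters (org-element-property :parameters element)
            :number-lines (org-element-property :number-lines element)
            :retain-labels (org-element-property :retain-labels element)))))

(my/parse-src-block
 "#+BEGIN_SRC emacs-lisp -n 20 -r :results silent\n(message \"hi\")\n#+END_SRC\n")
```

For this input the result should report :switches as "-n 20 -r" and :number-lines as (new . 19), which is the value computed by the analysis step, stored as the number of lines before the first one.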

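Summary

Code blocks are an essential part of technical blog posts and notes, and Org-mode has excellent support for both technical writing and code blocks. This article covers only a tiny fraction of that support; I hope that after reading it you have a better understanding of switches in Org-mode code blocks. Thanks for reading!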

Footnotes:

1

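Switches can also be used with #+BEGIN_EXAMPLE.

2

The original text reads "Switches provide finer control of the code execution, export, and format." I have not figured out how switches affect code execution; I only know that org-babel--expand-body removes coderefs when the code is executed. This may need a closer look at the ob source code.

3

On the original web page, a code block that contains text such as <<xxx>> can be expanded by clicking the expand button on the block, and clicking <<xxx>> itself jumps to its definition.

4

The style for .code-highlighted must be defined in the CSS file.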


Author: Eli Qian Email: eli.q.qian@gmail.com Create Date: 2024-03-01 Last modified: 2024-03-09 Creator: Emacs 29.2 (Org mode 9.6.15)