A brief look at the internal structure of Clojure Zippers

Clojure Zippers is a library for navigating and modifying tree data structures. While refactoring Cryogen, I needed an operation not supported out of the box (the removal of all nodes to the right of the current one) and thus had to learn a bit about the internal structure of zippers. I record it here for posterity.

A zipper - called loc throughout the code - is simply a tuple of [current-node path-and-state-info] with some metadata. You create a zipper/loc using one of the built-in builders such as xml-zip (if inner node = map, children at :content), vector-zip, seq-zip or using zipper providing your own branch?, children, and make-node functions. And you get the tree back via (node (root loc)).

Remember the distinction between a node and a loc, it will be important to know which one you are working with and different functions return either the one or the other. (You can get the node from a loc using (node loc).) Occasionally I will also use path to refer to the second member of loc, i.e. the path-and-state-info above.

Note
Assume (require '[clojure.zip :refer :all]) when reading code snippets here.

In a loc, current-node is the actual node in the original tree (e.g. a map in a xml-zip tree) and path-and-state-info is a map containing among others:

  • :l - a vector of nodes to the left (leftmost = first; conj to add to end)

  • :r - a vector (or, sometimes, seq?) of nodes to the right (closest = first, rightmost = last)

  • :pnodes (available via (path loc) - a seq of nodes leading to this loc from the root

  • :ppath - parent’s path (= (second (up loc)))

  • changed? - whether the node has been changed (e.g. by insert-left) - used during (up), see below

Let’s see how these are used in a few key functions.

Code examples

up

up (used e.g. by root) is crucial because that is the place where nodes are updated to reflect any changes made to the tree - the parent loc and node are replaced using make-node based on the (possibly modified) :l, node, and :r to reflect any changes to its children:

(defn up
  "Returns the loc of the parent of the node at this loc, or nil if at
  the top"
  {:added "1.0"}
  [loc]
    (let [[node {l :l, ppath :ppath, pnodes :pnodes r :r, changed? :changed?, :as path}] loc]
      (when pnodes
        (let [pnode (peek pnodes)]
          (with-meta (if changed?
                       [(make-node loc pnode (concat l (cons node r)))
                        (and ppath (assoc ppath :changed? true))]
                       [pnode ppath])
                     (meta loc))))))

xml-zip

Create a zipper from a tree with map nodes and children being a vector under :content:

(defn xml-zip
  "Returns a zipper for xml elements (as from xml/parse),
  given a root element"
  {:added "1.0"}
  [root]
    (zipper (complement string?) ; branch?
            (comp seq :content)  ; children
            (fn [node children]  ; make-node
              (assoc node :content (and children (apply vector children))))
            root))

Notes

Functions that change the tree operate mostly on the loc and change its :l, :r etc. as appropriate; the change gets reflected to the node only during up - e.g. insert-left/right. Functions that operate on the children such as insert-child change the parent node (and loc) at once.

See also

  • tupelo.forest is a library for searching & manipulating tree-like data structures (examples)

  • nathanmarz/specter - querying and manipulation of complex, nested data structures - including trees - made easy


Tags: clojure library


Copyright © 2020 Jakub Holý
Powered by Cryogen
Theme by KingMob