A brief look at the internal structure of Clojure Zippers
Clojure Zippers is a library for navigating and modifying tree data structures. While refactoring Cryogen, I needed an operation not supported out of the box (the removal of all nodes to the right of the current one) and thus had to learn a bit about the internal structure of zippers. I record it here for posterity.
A zipper - called loc throughout the code - is simply a tuple of [current-node path-and-state-info]
with some metadata. You create a zipper/loc using one of the built-in builders such as xml-zip
(if inner node = map, children at :content
), vector-zip
, seq-zip
or using zipper
providing your own branch?
, children
, and make-node
functions. And you get the tree back via (node (root loc))
.
Remember the distinction between a node and a loc, it will be important to know which one you are working with and different functions return either the one or the other. (You can get the node from a loc using (node loc)
.) Occasionally I will also use path to refer to the second member of loc, i.e. the path-and-state-info
above.
Assume (require '[clojure.zip :refer :all]) when reading code snippets here.
|
In a loc, current-node
is the actual node in the original tree (e.g. a map in a xml-zip tree) and path-and-state-info
is a map containing among others:
:l
- a vector of nodes to the left (leftmost = first; conj to add to end):r
- a vector (or, sometimes, seq?) of nodes to the right (closest = first, rightmost = last):pnodes
(available via(path loc)
- a seq of nodes leading to this loc from the root:ppath
- parent’s path (=(second (up loc))
)changed?
- whether the node has been changed (e.g. byinsert-left
) - used during(up)
, see below
Let’s see how these are used in a few key functions.
Code examples
up
up
(used e.g. by root
) is crucial because that is the place where nodes are updated to reflect any changes made to the tree - the parent loc and node are replaced using make-node
based on the (possibly modified) :l
, node, and :r
to reflect any changes to its children:
(defn up
"Returns the loc of the parent of the node at this loc, or nil if at
the top"
{:added "1.0"}
[loc]
(let [[node {l :l, ppath :ppath, pnodes :pnodes r :r, changed? :changed?, :as path}] loc]
(when pnodes
(let [pnode (peek pnodes)]
(with-meta (if changed?
[(make-node loc pnode (concat l (cons node r)))
(and ppath (assoc ppath :changed? true))]
[pnode ppath])
(meta loc))))))
xml-zip
Create a zipper from a tree with map nodes and children being a vector under :content
:
(defn xml-zip
"Returns a zipper for xml elements (as from xml/parse),
given a root element"
{:added "1.0"}
[root]
(zipper (complement string?) ; branch?
(comp seq :content) ; children
(fn [node children] ; make-node
(assoc node :content (and children (apply vector children))))
root))
Notes
Functions that change the tree operate mostly on the loc and change its :l
, :r
etc. as appropriate; the change gets reflected to the node only during up
- e.g. insert-left/right
. Functions that operate on the children such as insert-child
change the parent node (and loc) at once.
See also
tupelo.forest is a library for searching & manipulating tree-like data structures (examples)
nathanmarz/specter - querying and manipulation of complex, nested data structures - including trees - made easy