Rok Lenarčič

Posted on 20th August 2021

On the Nature of Clojure Protocols

History of Protocols

When Clojure 1.0 was released, multi-methods were presented as a big improvement over the likes of Java’s Interfaces and abstract Classes (and similar constructs in other OOP languages). Multiple dispatch over any value that can be calculated by an arbitrary dispatch function was deemed superior to OOP single dispatch, where the class of the “receiver” is the dispatch value. It was more general, a superset in all regards, since classic single dispatch could be implemented trivially by using #(class %1) as the dispatch function.

The first Clojure books, such as The Joy of Clojure and The Art of Clojure Programming, worked off the 1.0 and 1.1 versions and showcased multi-methods front and center. However, multi-methods soon lost their luster, for several reasons:

  • It turns out that dispatching on the type of an argument is what most people ended up doing anyway. People approximated interfaces by having a multi-method argument be a map with some sort of :type key to dispatch on.
  • Using a dispatch fn made multi-methods much slower than straight Java Interface method invocations. This compounds the previous point: the common case paid for generality it did not use.
  • An Interface can specify multiple methods that all have to be implemented; it’s all or nothing. If you have a Connection interface with open and close, then you can be sure that any implementation has both those methods. You cannot mandate that two or more multi-methods are always implemented for the same dispatch value.
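The last point can be made concrete with a minimal sketch. The names open-conn and close-conn and the :file and :socket dispatch values are hypothetical, chosen only for illustration; the map with a :type key is the interface approximation described in the first bullet.

```clojure
;; Hypothetical connection "interface" approximated with two multi-methods
;; that dispatch on the :type key of a map.
(defmulti open-conn :type)
(defmulti close-conn :type)

;; :file implements both operations.
(defmethod open-conn :file [conn] (assoc conn :open? true))
(defmethod close-conn :file [conn] (assoc conn :open? false))

;; :socket implements only open-conn. Nothing flags the missing close-conn
;; until it is actually called.
(defmethod open-conn :socket [conn] (assoc conn :open? true))

(def socket (open-conn {:type :socket}))

;; Calling the unimplemented multi-method fails only at runtime.
(def close-result
  (try
    (close-conn socket)
    (catch IllegalArgumentException _
      :no-method)))
```

Here close-result ends up as :no-method, because the missing implementation is only discovered when close-conn is invoked.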

Clojure 1.2 introduced Protocols to address this need.

The problem of Protocols

Developers with any experience with languages like Java, C#, C++, Rust or Kotlin will immediately feel a familiarity with the concept. After all, at first glance, a Protocol is just like an Interface in those languages, except that you can extend protocols to types that don’t declare them, such as String and nil, similar to Rust traits.

The problem is that the conceptual similarity to an Interface abstraction makes most developers expect Protocols to behave like a typical Interface. The implementation details of Protocols make them behave differently than expected in a few important cases.

Using Java as an example, if you write x instanceof AnInterface, then AnInterface is a syntactic token that is a type identifier. A type identifier is an immutable thing that references the same abstract type wherever in the application it is used. This is not true for Clojure’s protocols, as we’ll demonstrate.

What happens when you define a Protocol

When you have defprotocol AProtocol in your code, the following will happen:

  • a Var AProtocol is created in the namespace with defonce (so reloading does not replace an existing Var)
  • a Clojure immutable map with protocol information is constructed
  • a Java interface with the Protocol methods and their signatures is dynamically generated and loaded into the classloader
  • the protocol map mentioned above is installed as the root value of the AProtocol Var via alter-var-root

So the following code:

(defprotocol AProtocol
  (afn [this a] "A function"))

produces the Var AProtocol, whose value is:

AProtocol
=>
{:on org.clojars.roklenarcic.example.AProtocol,
 :on-interface org.clojars.roklenarcic.example.AProtocol,
 :sigs {:afn {:name afn, :arglists ([this a]), :doc "A function"}},
 :var #'org.clojars.roklenarcic.example/AProtocol,
 :method-map {:afn :afn},
 :method-builders {#'org.clojars.roklenarcic.example/afn #object[org.cloj...}}

(type AProtocol)
=> clojure.lang.PersistentArrayMap

(type (:on AProtocol))
=> clojure.lang.Symbol
(type (:on-interface AProtocol))
=> java.lang.Class
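The generated interface can be inspected like any other Java class. The following is a small sketch, assuming a hypothetical protocol Greeter with a single method greet; the names shape-of-things such as protocol-iface and method-names are illustrative only.

```clojure
;; Greeter and greet are hypothetical names used only for this sketch.
(defprotocol Greeter
  (greet [this who] "Returns a greeting"))

;; The class stored under :on-interface is a real Java interface.
(def protocol-iface (:on-interface Greeter))

(def iface-is-interface? (.isInterface ^Class protocol-iface))

;; Names of the methods declared on the generated interface.
(def method-names
  (mapv #(.getName ^java.lang.reflect.Method %)
        (.getMethods ^Class protocol-iface)))

;; The protocol function itself is an ordinary Clojure function held in a Var.
(def greet-is-fn? (fn? greet))
```

So a protocol is three separate things at once: a map in a Var, a Java interface, and one Clojure function per method.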

What happens when you extend a Protocol

There are multiple ways of extending a Protocol:

  • implementing the Java Interface generated
  • calling extend (or extend-protocol or extend-type which call extend under the hood)
  • the new (since Clojure 1.10) extend via metadata, where you add protocol method implementations to the metadata of a value

Each of these has some unexpected issues.

Implementing the Java Interface

This happens when you use reify, defrecord or deftype and specify the protocol.

(defrecord ARecord []
  ex/AProtocol
  (afn [this a] nil))

(supers ARecord)
=>
#{clojure.lang.IPersistentMap
  clojure.lang.IHashEq
  clojure.lang.ILookup
  java.io.Serializable
  java.util.Map
  clojure.lang.IPersistentCollection
  clojure.lang.IObj
  clojure.lang.Seqable
  clojure.lang.Counted
  clojure.lang.IKeywordLookup
  java.lang.Iterable
  org.clojars.roklenarcic.example.AProtocol
  clojure.lang.IMeta
  java.lang.Object
  clojure.lang.Associative
  clojure.lang.IRecord}

This allows the fast execution path: when the protocol’s methods are called, they are invoked through Java’s Interface mechanism.
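The same holds for deftype and reify. A minimal sketch, assuming a hypothetical protocol Shape with one method area:

```clojure
;; Shape, area and Square are hypothetical names for illustration.
(defprotocol Shape
  (area [this] "Area of the shape"))

;; deftype and reify both emit classes that implement the generated interface.
(deftype Square [side]
  Shape
  (area [_] (* side side)))

(def unit-shape
  (reify Shape
    (area [_] 1)))

(def shape-iface (:on-interface Shape))

(instance? shape-iface (->Square 2)) ;; => true
(instance? shape-iface unit-shape)   ;; => true
(area (->Square 2))                  ;; => 4
```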

Extend

The extend function and the extend-type and extend-protocol macros work differently. They add the implementing type to the Protocol map under the :impls key. When the protocol’s functions are invoked, :impls is consulted only after the check for an Interface implementation has failed.

This is slower, but it has the benefit that it can be done on types after they’ve been defined, such as String or a record that didn’t initially implement the protocol.

(defprotocol AProtocol (afn [this a] "A function"))

(extend-protocol AProtocol
  String
  (afn [this a] nil))

AProtocol
=>
{:on org.clojars.roklenarcic.example.AProtocol,
 :on-interface org.clojars.roklenarcic.example.AProtocol,
 :sigs {:afn {:name afn, :arglists ([this a]), :doc "A function"}},
 :var #'org.clojars.roklenarcic.example/AProtocol,
 :method-map {:afn :afn},
 :method-builders {#'org.clojars.roklenarcic.example/afn #object[org.clojars.roklenarcic.example$eval1629 "org.cl..."]},
 :impls {java.lang.String {:afn #object[org.clojars.roklenarcic.example$eval... "org.clo..."]}}}
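To contrast this with the Interface route, here is a short sketch that calls extend directly. The protocol Describe and its method describe are hypothetical names.

```clojure
;; Describe and describe are hypothetical names for illustration.
(defprotocol Describe
  (describe [this] "Short description"))

;; extend takes a type, a protocol and a map of method keyword -> function.
(extend String
  Describe
  {:describe (fn [s] (str "string of length " (count s)))})

;; extend-type expands into a call to extend; nil can be extended as well.
(extend-type nil
  Describe
  (describe [_] "nothing"))

(describe "abc") ;; => "string of length 3"
(describe nil)   ;; => "nothing"

;; String did not gain the generated interface; the implementation lives in :impls.
(instance? (:on-interface Describe) "abc") ;; => false
(contains? (:impls Describe) String)       ;; => true
```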

Extend via metadata

First, the protocol needs to be declared to allow such an extension.

(defprotocol AProtocol
  :extend-via-metadata true
  (afn [this a] "A function"))

(afn ^{`afn (fn [this a] nil)} [] 1)
=> nil
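The metadata key is the namespace-qualified symbol of the protocol function, which is why the syntax quote is used. The sketch below does the same thing with with-meta on a map, assuming a hypothetical protocol Component with a method start:

```clojure
;; Component and start are hypothetical names for illustration.
(defprotocol Component
  :extend-via-metadata true
  (start [this] "Starts the component"))

;; `start expands to the fully qualified symbol of the protocol function,
;; which is the key the protocol function looks up in the metadata.
(def component
  (with-meta {:name "db"}
    {`start (fn [this] (assoc this :started? true))}))

(def started (start component))

;; The single metadata key is a namespace-qualified symbol.
(def meta-key (first (keys (meta component))))
```

After evaluation, started is the original map with :started? set to true, and meta-key is a qualified symbol whose name part is "start".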

1. Caveat: extend via metadata doesn’t actually satisfy the protocol

The function (extends? protocol atype) returns true if the Class atype either implements the Protocol’s Interface or is a key in the Protocol’s :impls map.

The function (satisfies? protocol x) returns true if the type of x implements the Protocol’s Interface, or if that type or any of its supertypes is a key in the Protocol’s :impls map.

;; difference between extends? and satisfies?
(extend-protocol AProtocol
  java.util.Date
  (afn [this a] nil))

(satisfies? AProtocol (java.util.Date.))
=> true
(extends? AProtocol java.util.Date)
=> true

;; java.sql.Timestamp is subclass of java.util.Date
(satisfies? AProtocol (java.sql.Timestamp. 0))
=> true
(extends? AProtocol java.sql.Timestamp)
=> false

Note that the metadata extend mechanism is conspicuously absent from both definitions. It uses neither the Interface nor the :impls map.

Indeed, extending via metadata will not satisfy a protocol; it only makes the protocol’s functions work on an object with the correct metadata.

(defprotocol AProtocol
  :extend-via-metadata true
  (afn [this a] "A function"))

(def impl ^{`afn (fn [this a] nil)} [])

;; functions work
(afn impl 1)
=> nil
;; but satisfies? doesn't
(satisfies? AProtocol impl)
=> false
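If you need a predicate that accounts for metadata extension, you have to write one yourself. The following is only a sketch: meta-implements? is a hypothetical helper, not part of Clojure, and it assumes that the protocol map exposes :sigs and :var as shown in the printed maps earlier. MetaProto and mfn are hypothetical names as well.

```clojure
;; MetaProto and mfn are hypothetical names for illustration.
(defprotocol MetaProto
  :extend-via-metadata true
  (mfn [this a] "A function"))

(defn meta-implements?
  "Hypothetical helper: true when the metadata of x supplies an
  implementation for every method of protocol."
  [protocol x]
  (let [proto-ns (-> protocol :var meta :ns ns-name str)
        m        (meta x)]
    (every? (fn [sig]
              ;; Metadata keys are the fully qualified protocol fn symbols.
              (ifn? (get m (symbol proto-ns (str (:name sig))))))
            (vals (:sigs protocol)))))

(def via-meta (with-meta [] {`mfn (fn [this a] nil)}))

(satisfies? MetaProto via-meta)       ;; => false
(meta-implements? MetaProto via-meta) ;; => true
(meta-implements? MetaProto [])       ;; => false
```

A combined check could use (or (satisfies? p x) (meta-implements? p x)), at the cost of walking the metadata on every call.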

2. Caveat: Interaction with namespace reloading

One of the key elements of the REPL-centric workflow common in Clojure is reloading namespaces. When defprotocol is reloaded, a new Java Interface named AProtocol is generated and loaded into the classloader. The effect is that all existing records and reify objects that satisfy the protocol via Interface implementation stop doing so.

(defprotocol AProtocol (afn [this a] "A function"))

(defrecord ARecord []
  AProtocol
  (afn [this a] nil))

(def record (->ARecord))

(satisfies? AProtocol record)
=> true
(instance? org.clojars.roklenarcic.example.AProtocol record)
=> true

;; evaluate defprotocol again, then retry:

(satisfies? AProtocol record)
=> false
(instance? org.clojars.roklenarcic.example.AProtocol record)
=> false

How did an instance of ARecord magically stop being an instance of org.clojars.roklenarcic.example.AProtocol? It didn’t; that is impossible in Java. After evaluating defprotocol a second time there are two Java Interfaces named org.clojars.roklenarcic.example.AProtocol. The instance? call always refers to the latest one, while ARecord still implements the old one.

How to fix this: after reloading the namespace with the defprotocol, also reload the namespace with the defrecord or deftype so that it extends the new protocol Interface. This fixes newly created ARecord objects.

This doesn’t fix any existing ARecord objects in your application vars or captured scopes.

If you satisfy the protocol via extend, you are safe here.
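The two same-named interfaces can be observed directly by holding on to the Class objects. This is a sketch meant for top-level evaluation at a REPL; Reloaded, ping and Pinger are hypothetical names, and the eval call is only a stand-in for re-evaluating the defprotocol form by hand.

```clojure
;; Reloaded, ping and Pinger are hypothetical names for illustration.
(defprotocol Reloaded
  (ping [this]))

(def old-iface (:on-interface Reloaded))

(defrecord Pinger []
  Reloaded
  (ping [_] :pong))

(def pinger (->Pinger))

;; Stand-in for re-evaluating the defprotocol form at the REPL.
(eval '(defprotocol Reloaded
         (ping [this])))

(def new-iface (:on-interface Reloaded))

;; Same name, but two distinct Class objects.
(= (.getName ^Class old-iface) (.getName ^Class new-iface)) ;; => true
(identical? old-iface new-iface)                            ;; => false

;; The existing record instance implements only the old interface.
(instance? old-iface pinger)  ;; => true
(instance? new-iface pinger)  ;; => false
(satisfies? Reloaded pinger)  ;; => false
```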

3. Caveat: Capturing Protocol Var value is problematic

OK, you might be thinking that your production build is safe from such problems because it doesn’t use namespace reloading. Well, there is another class of Protocol errors. It has to do with the Protocol value being an immutable map: closing over a Protocol value prevents any subsequent protocol extensions from being visible through that captured value.

(defprotocol AProtocol (afn [this a] "A function"))

(:impls AProtocol)
=> nil

(def is-aprotocol? (partial satisfies? AProtocol))

(extend-protocol AProtocol
  String
  (afn [this a] nil))

(:impls AProtocol)
=>
{java.lang.String {:afn #object[org.clojar... 0x26aed7d4 "org.clojar..."]}}

(satisfies? AProtocol "X")
=> true
(is-aprotocol? "X")
=> false

As you can see, is-aprotocol? surprisingly returns false. This is because (partial satisfies? AProtocol) closes over the AProtocol value at a time when the :impls key is still absent, so the subsequent extend-protocol to String has no effect on that predicate. To avoid this, use #(satisfies? AProtocol %) instead of partial, which reads the current value of the Var on every call. Types that implement the Protocol via the Interface route are not affected.
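The two predicates can be compared side by side. A minimal sketch, assuming a hypothetical protocol Named with a method nm:

```clojure
;; Named and nm are hypothetical names for illustration.
(defprotocol Named
  (nm [this]))

;; partial evaluates Named once, capturing the current (immutable) protocol map.
(def stale-named? (partial satisfies? Named))

;; The fn literal reads the Var's value each time it is called.
(def fresh-named? #(satisfies? Named %))

;; Any other captured copy of the map goes stale in the same way.
(def captured-map Named)

(extend-protocol Named
  String
  (nm [s] s))

(stale-named? "x")               ;; => false
(fresh-named? "x")               ;; => true
(identical? captured-map Named)  ;; => false, the Var now holds a new map
```

The same applies to protocol values stored in collections, let bindings that outlive an extension, or top-level defs that copy the map.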

Other issues

If you’re still having issues with mysterious exceptions about objects not satisfying protocols you expect them to, then look for a target folder or similar in your project. Target folders are often on the Leiningen classpath for the REPL, and tasks like lein uberjar populate these folders with AOT-compiled class files for the Protocols’ interfaces. When you run the REPL with these classes on the classpath, it can manifest as problems with instanceof not working correctly. Remove the AOT-compiled class files from your classpath (e.g. by running lein clean) and try again; perhaps your problem disappears.
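One way to check where a protocol’s interface came from is to ask the JVM for the class’s code source. This is a sketch under assumptions: class-origin is a hypothetical helper, Probe is a hypothetical protocol, and the expectation that an interface loaded from AOT-compiled files reports a location such as a target directory is an assumption about your build layout rather than something verified here.

```clojure
(defn class-origin
  "Hypothetical helper: returns the URL a class was loaded from, or nil when
  the class has no code source location (for example, a class generated in
  memory at the REPL)."
  [^Class c]
  (when-let [pd (.getProtectionDomain c)]
    (when-let [cs (.getCodeSource pd)]
      (.getLocation cs))))

;; Probe is a hypothetical protocol defined at the REPL for comparison.
(defprotocol Probe
  (probe [this]))

;; An interface generated in memory by defprotocol has no location.
(class-origin (:on-interface Probe)) ;; => nil
```

If calling class-origin on the :on-interface of one of your own protocols returns a file URL instead of nil, the interface was loaded from class files on the classpath, which is the situation described above.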

Hopefully, you’ll find some of these details helpful.