Posted on 13th July 2022

ML libraries for functional and mainstream languages

Before joining Flexiana, I worked as a Clojure developer at a company that does analytics. Many of my colleagues were data scientists whose favorite subject of discussion was artificial intelligence. So, for five years, I listened to them talk about machine learning algorithms, their specific features, and the scenarios they are used for. After the first couple of years, I realized that I had become familiar with some of the terms my colleagues were using. Reading an article or watching a video about data science was not that hard anymore. As a matter of fact, it was interesting.

I was really tempted to see how difficult it would be to understand the machine learning algorithms in depth. Were my studies in mathematics and my experience in software development enough? Or would the internals of the ML algorithms look like a black box? To find the answer to this question, I decided to try implementing these algorithms by looking at books, articles, and videos.

As it turned out, different algorithms require knowledge of different branches of mathematics. K-Nearest-Neighbors and K-Means are the easiest to understand and implement because they are mainly based on the geometric concept of distance. Decision Tree and Naive Bayes depend on statistical measures: entropy and conditional probability, respectively. To understand the Linear Regression algorithm, one needs to be familiar with matrix operations from linear algebra. Neural networks require calculus tools like derivatives, and they were the most difficult to implement.
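To make the distance idea concrete, here is a minimal Python sketch of a K-Nearest-Neighbors classifier. It is only an illustration, not code from either of the projects described below; the name `knn_classifier` and the sample points are made up for this example.

```python
from collections import Counter
from math import dist  # Euclidean distance, available since Python 3.8


def knn_classifier(labelled_points, k):
    """Return a function that classifies a point by majority vote
    among its k nearest labelled neighbours (hypothetical helper)."""
    def classify(point):
        # Sort the known points by their distance to the query point.
        nearest = sorted(labelled_points, key=lambda pair: dist(pair[0], point))[:k]
        # Count the labels of the k closest points and pick the most common.
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]
    return classify


# Made-up sample data: two small clusters labelled "a" and "b".
points = [
    ([1.0, 1.0], "a"),
    ([1.5, 2.0], "a"),
    ([5.0, 5.0], "b"),
    ([6.0, 5.5], "b"),
    ([5.5, 6.5], "b"),
]

classify = knn_classifier(points, 3)
print(classify([1.2, 1.4]))  # a
print(classify([5.4, 5.6]))  # b
```

Everything the algorithm needs is a way to measure distance and a way to count votes, which is why it is a gentle starting point.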

As a result of my attempt, two projects came to life. The first one, Emel, is a simple and functional machine-learning library for the Erlang ecosystem. It does what ML algorithms do: it turns data into functions!

# Elixir example
data = [
 {[1.794638, 15.15426], 0.510998918},
 {[3.220726, 229.6516], 105.6583692},
 {[5.780040, 3480.201], 1776.99}
]
# Turn data into a function
f = :emel@ml@linear_regression.predictor(data)
# Make predictions
f.([3.0, 230.0])
# 106.74114058686602

Emel’s documentation contains a list of the implemented algorithms and examples of their usage.
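The "data into functions" idea is not tied to Erlang or Elixir. As a rough sketch, assuming ordinary least squares solved through the normal equations, the same shape can be written in plain Python. This is not Emel's implementation; `predictor` here is a local, hypothetical helper that only borrows the name.

```python
def predictor(data):
    """Fit ordinary least squares and return the fitted model as a function.

    `data` is a list of (features, target) pairs. Illustrative only.
    """
    # Design matrix rows: a leading 1 for the intercept, then the features.
    rows = [[1.0] + list(xs) for xs, _ in data]
    ys = [y for _, y in data]
    n = len(rows[0])

    # Normal equations (X^T X) w = X^T y, stored as an augmented matrix.
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(n)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]

    weights = [a[i][n] / a[i][i] for i in range(n)]

    # The returned closure is the "function" the data was turned into.
    return lambda xs: weights[0] + sum(w * x for w, x in zip(weights[1:], xs))


data = [
    ([1.794638, 15.15426], 0.510998918),
    ([3.220726, 229.6516], 105.6583692),
    ([5.780040, 3480.201], 1776.99),
]

f = predictor(data)
print(f([3.0, 230.0]))
```

With the same three observations as the Elixir example, the printed value should land close to the 106.74 shown above, up to floating-point differences.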

The second project is Synapses. It is a group of neural-network libraries for functional and mainstream languages. It offers an easy way to create, train, and evaluate artificial neural networks, and to make predictions with them. What makes the Synapses libraries unique is that they can be used in several programming languages in a compatible way. The user can create a neural network in Clojure, train it in Python, evaluate it in C#, and get its predictions in Elixir. Another feature of Synapses is the visualization of networks.

Let’s take a look at a minimal example in Clojure:

;; Load the `net` namespace
(require '[clj-synapses.net :as net])
;; Create a random neural network by providing its layer sizes
(def network
  (net/->net [2 3 1]))
;; Make a prediction
(net/predict network [0.2 0.6])
;; Train a neural network
(net/fit network 0.1 [0.2 0.6] [0.9])
;; The `fit` function returns a new neural network
;; with the weights adjusted to a single observation.
;; In practice, for a neural network to be fully trained,
;; it should be fitted with multiple observations,
;; usually by reducing over a collection.
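The "reducing over a collection" pattern mentioned in the comments can be sketched in Python as well. The model below is a stand-in, a single sigmoid neuron stored as an immutable tuple, and not the Synapses API; `predict` and `fit` are hypothetical local functions that only mimic the shape of the calls above.

```python
from functools import reduce
from math import exp

# Stand-in "network": one sigmoid neuron as (weights, bias).
# This is NOT Synapses; it only imitates the shape of its fit/predict calls.


def predict(net, xs):
    weights, bias = net
    z = sum(w * x for w, x in zip(weights, xs)) + bias
    return 1.0 / (1.0 + exp(-z))


def fit(net, learning_rate, xs, expected):
    """Return a NEW neuron nudged toward a single observation."""
    weights, bias = net
    error = expected - predict(net, xs)
    new_weights = tuple(w + learning_rate * error * x for w, x in zip(weights, xs))
    return (new_weights, bias + learning_rate * error)


# Observations for logical OR: (inputs, expected output).
observations = [
    ([0.0, 0.0], 0.0),
    ([0.0, 1.0], 1.0),
    ([1.0, 0.0], 1.0),
    ([1.0, 1.0], 1.0),
]

initial = ((0.0, 0.0), 0.0)

# Fold `fit` over the observations, repeated for 1000 passes.
trained = reduce(
    lambda net, obs: fit(net, 0.5, obs[0], obs[1]),
    observations * 1000,
    initial,
)

print(predict(trained, [0.0, 0.0]))  # below 0.5
print(predict(trained, [1.0, 1.0]))  # above 0.5
```

Because `fit` returns a new value instead of mutating its argument, `initial` stays untouched and training becomes an ordinary fold.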

To learn how to create, train, evaluate, and draw neural networks in Clojure, and how to make predictions with them, feel free to read the documentation.

Synapses is available in nine programming languages: Clojure, C#, Elixir, F#, Gleam, Java, JavaScript, Python, and Scala. But I didn’t have to implement the library from scratch for all these languages. In many cases, I made use of their interoperability.

If I had to share my experience writing code in the above-mentioned languages, I would divide them into three groups. Clojure and Elixir are ideal for enabling the developer to turn ideas into code. Scala, F#, and Gleam are perfect for preventing errors. Python becomes tolerable with functional libraries.

Dimos Michailidis