Data is almost always the key factor in the technology world. We transform data into information and use that information to create processes that can improve our lives. However, data tends to grow rapidly, so we need to make careful decisions about how it is designed and how its entities relate to one another.
As Flexianers, we understand the implications of good data design in the technology world. Therefore, we are always trying to stay informed about the different approaches we can use for our solutions. Today, we want to present our perspective on the foundations of Graph Databases, specifically focusing on TerminusDB in this instance.
Let’s start with the basics: what is a Graph Database?
A graph database is a type of database that is designed to store and manage data using graph theory principles. Instead of using tables, rows, and columns like traditional relational databases, graph databases use nodes, edges, and properties to represent and store data.
In the graph database world, nodes represent entities or objects, while the edges represent the relationships between these entities.
In this image, we have three types of nodes: customers, orders, and products. The edges represent a customer creating an order and an order containing a specific product.
Using this approach, we can quickly find the products a customer orders most often just by traversing the graph and doing some “counting”, right? If you noticed this, excellent! You have already understood one of the many use cases of graph databases.
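As a rough illustration of that traversal, here is a minimal Clojure sketch that uses plain in-memory data rather than a real graph database. The node ids, the edge labels (:created, :contains) and the helper names are hypothetical and were chosen only for this example.

```clojure
;; Toy graph stored as a vector of labeled edges.
;; All ids and labels are made up for illustration.
(def edges
  [{:from :customer/ana :label :created  :to :order/o1}
   {:from :customer/ana :label :created  :to :order/o2}
   {:from :customer/bob :label :created  :to :order/o3}
   {:from :order/o1     :label :contains :to :product/keyboard}
   {:from :order/o1     :label :contains :to :product/mouse}
   {:from :order/o2     :label :contains :to :product/keyboard}
   {:from :order/o3     :label :contains :to :product/monitor}])

(defn neighbors
  "Returns the nodes reached from `node` by following edges labeled `label`."
  [edges node label]
  (for [{:keys [from to] edge-label :label} edges
        :when (and (= from node) (= edge-label label))]
    to))

(defn product-counts
  "Walks customer -> orders -> products and counts how often each product appears."
  [edges customer]
  (->> (neighbors edges customer :created)
       (mapcat #(neighbors edges % :contains))
       frequencies))

(product-counts edges :customer/ana)
;; => {:product/keyboard 2, :product/mouse 1}
```

The “counting” is just a call to frequencies once the two hops have been followed; a graph DBMS would do the same walk over its own storage and indexes.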
We are already using relational databases, why graphs now?
This question came to mind the first time I read about Graph Databases. The relevant factor here is use cases. While the relational approach is still extensively used and solves many different types of problems, there are difficulties that we have to confront when the complexity of relationships between different tables scales up.
As developers, if we remember how efficient it was to traverse a graph or a tree using the algorithms provided by graph theory, I think we can infer the answer.
Still, we want to present some of the advantages of using Graph Databases:
- Graph Databases excel at representing and dealing with complex relationships between data entities. It is very intuitive to model various types of relationships, such as connections, associations, and networks using this structure. This makes this approach highly suitable for scenarios with rich and interconnected data.
- Making decisions based on relationship patterns is very efficient in the graph world. We will not need to evaluate results from complex joins between different tables; we only need to traverse the graph.
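To make the second point a bit more concrete, the sketch below answers a relationship-pattern question ("which other customers ordered something this customer also ordered?") purely by following edges. It is plain Clojure over in-memory adjacency maps, not TerminusDB or WOQL, and the data and function names are hypothetical.

```clojure
(require '[clojure.set :as set])

;; Hypothetical adjacency maps: customer -> orders and order -> products.
(def customer->orders
  {:ana #{:o1 :o2}
   :bob #{:o3}
   :eve #{:o4}})

(def order->products
  {:o1 #{:keyboard :mouse}
   :o2 #{:keyboard}
   :o3 #{:mouse :monitor}
   :o4 #{:webcam}})

(defn products-of
  "Two hops: customer -> orders -> products."
  [customer]
  (->> (customer->orders customer)
       (map order->products)
       (apply set/union)))

(defn customers-with-shared-products
  "Customers other than `customer` whose products overlap with `customer`'s."
  [customer]
  (let [mine (products-of customer)]
    (set (for [other (keys customer->orders)
               :when (and (not= other customer)
                          (seq (set/intersection mine (products-of other))))]
           other))))

(customers-with-shared-products :ana)
;; => #{:bob}
```

In a relational model the same question would typically join the customer, order, and product tables twice; here each hop is a direct lookup.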
There are more advantages, but they are specific to the particular DBMS, so we prefer to mention the core ones. After this, we would like to talk a bit about our experience with Graph DBMS, using TerminusDB as our main guest today.
TerminusDB
TerminusDB is an open-source database that utilizes a graph-based approach and integrates a data version control system. To interact with the DBMS, you can use WOQL (Web Object Query Language), which is a datalog-like query language. For more detailed information about WOQL, I invite you to read “The Power of Web Object Query Language” at TerminusDB Blog.
In addition to the features commonly associated with graph databases, TerminusDB brings some unique capabilities:
- Version Control System: TerminusDB incorporates version control principles, allowing users to track and manage changes made to data over time and across different domains. It provides versioning capabilities to manage different versions of data and facilitates branching, merging, and conflict resolution.
- Easy Collaboration: TerminusDB simplifies collaboration by enabling users to share and clone databases. This allows multiple users to work together, make changes, and collaborate on the data with minimal risk of compromising data integrity.
- Succinct Data Structures: TerminusDB leverages succinct data structures, which use an amount of memory close to the theoretical lower bound for the data they store. This optimization supports efficient query execution while minimizing resource consumption.
Now that we know a bit more about our guest today, it’s time to start playing!
The easy part! The online dashboard
For this post, we used the online dashboard to play with TerminusDB, so that we can focus on showing you the basic interaction using Clojure and how to model our basic use case in the database. In the second part of this post, we will also share a bonus gist showing other ways to install and get started with TerminusDB.
The first step to start using the dashboard is to open this address, where you will see something like this:
If you don’t have an account, follow the sign-up process; otherwise, go ahead and log in.
In this part, we invite you to play a bit with the interface, but the most relevant parts for this post are:
- The dashboard: Here we can create categories for our schemas. This gives us the opportunity to create a classification layer and group schemas together.
- The profile section: Here we can provide our OpenAI key in order to associate indexes using AI, and we can collect our personal access token to use with the JavaScript/Python clients or via curl.
Collecting our personal access token is vital here because we will use it to access TerminusDB from our Clojure application. After collecting the token, be sure to create an environment variable named “TERMINUSDB_ACCESS_TOKEN” and set it to this token.
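One way to read that variable from Clojure and fail early when it is missing is sketched below. The function name access-token and its optional lookup argument are hypothetical; the argument exists only so the check can be exercised without touching the real environment.

```clojure
(require '[clojure.string :as str])

(defn access-token
  "Returns the TerminusDB token, or throws if it is unset or blank.
  `lookup` defaults to System/getenv; it is a parameter only to ease testing."
  ([] (access-token #(System/getenv %)))
  ([lookup]
   (let [token (lookup "TERMINUSDB_ACCESS_TOKEN")]
     (if (str/blank? token)
       (throw (ex-info "TERMINUSDB_ACCESS_TOKEN is not set" {}))
       token))))

;; A plain map can stand in for the environment while experimenting:
(access-token {"TERMINUSDB_ACCESS_TOKEN" "my-token"})
;; => "my-token"
```

Failing at startup with a clear message is usually easier to debug than an authentication error coming back from the server later.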
The fun part, Clojure (80%) code!
Sadly, while we were writing this post, we had zero success finding a native Clojure client (there are only Python and JavaScript clients).
But don’t worry, Clojure always provides amazing solutions for our problems! So we decided to use libpython-clj for this purpose and, guess what? It worked beautifully! To be honest, we ended up using about 80% Clojure, 15% Python, and 5% WOQL. Anyway, let’s describe the process.
First of all, we need to install the TerminusDB Python client, so the following command is enough (if you don’t have python3 installed, please go and install it):
python3 -m pip install terminusdb-client
After this, we add the libpython-clj dependency (which can be found here) to our project.clj file:
...
:dependencies [[org.clojure/clojure "1.11.1"]
               [clj-http "3.12.3"]
               [metosin/jsonista "0.3.8"]
               [clj-python/libpython-clj "2.025"]]
...
With this addition, we can start interacting with the Python ecosystem. The code to connect to the server and create a database looks like this:
(ns terminusdb-post-sample.core
  (:require [libpython-clj2.require :refer [require-python]] ;; Require libpython-clj deps
            [libpython-clj2.python :refer [py.] :as py]))

(require-python '[terminusdb_client :as terminusdb-client]) ;; Require terminusdb_client module

(def terminusdb_server "https://cloud.terminusdb.com/")
(def team "")

(defn connect-to-terminus [server team]
  ;; First we need to create our WOQLClient object.
  (let [client (terminusdb-client/WOQLClient server)]
    ;; With the client value, we can use the py. macro to call the method
    ;; client.connect
    (py. client connect :team team :use_token true)
    ;; Return the client itself (not the result of connect) so that it can be
    ;; passed to create-db and other operations.
    client))

(defn create-db [client dbid team label description prefixes]
  ;; We receive the client as a parameter. This is done on purpose because we don't want
  ;; to create a connection every time we run a database operation.
  (py. client create_database :dbid dbid
                              :team team
                              :label label
                              :description description
                              :prefixes prefixes
                              :include_schema false))
...
This code is pretty self-explanatory and we added a few comments as a guide to our readers.
After reading and doing some research, we noticed that the Python and JavaScript libraries are basically communicating with the TerminusDB API via HTTP. So we can create something similar in Clojure using, for example, a combination of clj-http and jsonista, with code like this:
(ns terminusdb-post-sample.terminusdb-connector
  (:require [clj-http.client :as client]
            [jsonista.core :as json]))

(def terminusdb-team "")
(def terminusdb-server "https://cloud.terminusdb.com/")
(def terminusdb-api-indicator "api/")
(def terminusdb-db-resource "db")
(def terminusdb-token (System/getenv "TERMINUSDB_ACCESS_TOKEN"))

(defn send-request-to-list-dbs
  []
  (let [headers {"API_TOKEN" terminusdb-token}
        response (client/get (str terminusdb-server
                                  terminusdb-api-indicator
                                  terminusdb-db-resource)
                             {:headers headers})
        body (:body response)]
    ;; Parse the JSON response body into Clojure data.
    (json/read-value body)))

(defn create-database
  [db-data]
  (let [headers {"API_TOKEN" terminusdb-token}
        response (client/post (str terminusdb-server
                                   terminusdb-api-indicator
                                   terminusdb-db-resource
                                   "/"
                                   terminusdb-team
                                   "/"
                                   (:db-id db-data))
                              {:headers headers
                               :content-type :json
                               ;; :db-id and :team-id are not sent in the body.
                               :body (json/write-value-as-string (dissoc db-data :db-id :team-id))})]
    (:body response)))
This code block is more complicated, but it is not rocket science either. To summarize, we are making HTTP calls to the API exposed by TerminusDB using clj-http and encoding a map as a JSON string with jsonista. We could make some adjustments to the code to make it look more elegant, but we preferred to show this in the simplest way.
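One such adjustment, sketched here under the assumption that the endpoints keep the server/api/db/team/db-id shape used above, is to isolate URL building and payload preparation in small pure functions. The names api-url and db-payload, as well as the "my-team" and "my-db" values, are hypothetical and are not part of any TerminusDB client.

```clojure
(require '[clojure.string :as str])

(defn api-url
  "Joins the server root, the \"api\" prefix and any path segments with single slashes."
  [server & segments]
  (str (str/replace server #"/+$" "")
       "/api/"
       (str/join "/" segments)))

(defn db-payload
  "Keeps only the keys that belong in the request body."
  [db-data]
  (dissoc db-data :db-id :team-id))

(api-url "https://cloud.terminusdb.com/" "db")
;; => "https://cloud.terminusdb.com/api/db"

(api-url "https://cloud.terminusdb.com/" "db" "my-team" "my-db")
;; => "https://cloud.terminusdb.com/api/db/my-team/my-db"
```

Because both helpers are pure, they can be checked at the REPL without a token or a network connection, and the HTTP functions shrink to a single clj-http call each.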
Conclusion
In this post, we gave you a short introduction to graph databases and played a bit with a few of the possibilities Clojure offers for connecting to a TerminusDB instance, either by using an existing Python client or by using native Clojure libraries.
In the next part, we will present an interesting use case of graph databases, the rest of the code to interact with TerminusDB, and the complete implementation of our basic terminusdb-connector.
Thank you for reading this post! Be safe!