Let’s talk about graph databases. Some industry watchers claim that they are the fastest-growing type of database. If so, maybe it’s useful to know more about them.
Starting with the basics: What is a graph database, and what is it useful for?
Here’s the short answer. Graph databases store data in vertices and edges rather than in tables, as relational databases do. They are the most efficient way to look for relationships between data items, or for patterns of relationships and interactions among multiple data items. Traditional relational databases shine at queries that look up information about a particular item, or that compute sums or averages over many items of the same type.
Now let’s review what a graph database isn’t. The standard type of database is the relational database, the kind built with database management systems sold by Oracle, IBM, Microsoft and others. You can think of a relational database as a set of tables: rectangular grids of information, each one looking much like a spreadsheet. Each table can have a different number of rows and columns, and can hold different types of information. For example, a snippet of a company’s employee database might hold data like this:
Another table in the same database might hold information about managers:
A graph database, at least conceptually, stores its data in a different structure: a directed graph. Directed graphs are made up of bubbles and arrows, as in this diagram:
The bubbles are called “vertices” and the arrows are called “edges.”
Data items stored in one of the fields of a relational table are, in a graph database, stored in a vertex of the graph. Data descriptors, for example “Department managed” or “Reports to” in the table just above, are stored with edges in the graph. For example, if we took our management table above and represented it in a graph database, it might look like this:
Each data item occurs only once in the graph. There is a unique “Brenda Roberts” vertex, for example. In the type of graph database Cray uses, also called a “semantic” database, each field of the relational database corresponds to a simple subject-verb-object triple in the graph: “Jack Jones” “Reports to” “Brenda Roberts.” I threw in a little additional information that’s not in the relational table (that Brenda Roberts manages the Accounting department) just to show that each vertex may be the “subject” of some triples (“Brenda Roberts” “Department managed” “Accounting”) and the “object” of others (“Jack Jones” “Reports to” “Brenda Roberts”).
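To make the triple idea concrete, here is a minimal Python sketch. It is only an illustration, not how Cray’s or any other real graph database stores data: the `triples` set and the `match` helper are hypothetical names invented for this example, and the two facts are the ones quoted above.

```python
# Each fact is one (subject, verb, object) triple, taken from the example above.
# This in-memory set is an illustrative stand-in, not a real graph database.
triples = {
    ("Jack Jones", "Reports to", "Brenda Roberts"),
    ("Brenda Roberts", "Department managed", "Accounting"),
}


def match(subject=None, verb=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (verb is None or t[1] == verb)
        and (obj is None or t[2] == obj)
    )


# Every distinct name is a single vertex, however many triples mention it.
vertices = {t[0] for t in triples} | {t[2] for t in triples}

# "Brenda Roberts" is the subject of one triple and the object of another.
as_subject = match(subject="Brenda Roberts")
as_object = match(obj="Brenda Roberts")

print(as_subject)
print(as_object)
```

Because the vertices are collected into a set, “Brenda Roberts” appears once even though two triples refer to her, which is the “each data item occurs only once” property described above.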
So the big difference between relational databases and graph databases is how they represent data.
Interestingly, the query languages used for each aren’t all that different. SPARQL, a standard query language for graph databases, looks a lot like SQL, the established standard for relational databases. Now, on to the important question: What is each of them good for?
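For a rough feel of how similar the two query languages look, the sketch below asks the same question, “who reports to Brenda Roberts?”, in both. The table name `management`, its column names, and the `ex:` prefix and `ex:reportsTo` predicate are assumptions made up for this illustration. The SQL runs against an in-memory SQLite database from Python’s standard library; the SPARQL is shown only as text, since running it would need a SPARQL engine that is not assumed here.

```python
import sqlite3

# Relational side: a hypothetical "management" table queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE management (employee TEXT, reports_to TEXT)")
conn.execute("INSERT INTO management VALUES ('Jack Jones', 'Brenda Roberts')")

sql_query = """
SELECT employee
FROM management
WHERE reports_to = 'Brenda Roberts'
"""
sql_rows = conn.execute(sql_query).fetchall()
conn.close()

# Graph side: the same question in SPARQL. Shown as text only; the prefix
# and predicate name are invented, and no SPARQL engine is run here.
sparql_query = """
PREFIX ex: <http://example.org/>
SELECT ?employee
WHERE { ?employee ex:reportsTo "Brenda Roberts" . }
"""

print(sql_rows)
print(sparql_query)
```

Both queries share the SELECT ... WHERE shape. The difference is in the WHERE clause: SQL filters rows of a table, while SPARQL gives a triple pattern for the graph to match.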
© HPC Today 2020 - All rights reserved.