Comparing relational and graph databases
This page explores the conceptual differences between relational and graph database structures and data models. For a comparison between query languages, see Comparing Cypher® with SQL.
Relational database overview
Relational databases store highly structured data in tables with predetermined columns and rows of specific types of information. Because of this rigid organization, relational databases require developers to strictly structure the data used in their applications.
In relational databases, references to other rows and tables are indicated by referring to primary key attributes via foreign key columns.
JOINs are computed at query time by matching the primary and foreign keys of all rows in the connected tables.
These operations are compute-heavy and memory-intensive, and their cost grows steeply as more tables and rows are joined.
When many-to-many relationships occur in the model, you must introduce a JOIN table (or associative entity table) that holds the foreign keys of both participating tables, further increasing the cost of join operations.
The diagram shows the concept of connecting an Employee (from the Employees table) to a Department (in the Departments table) by creating a Dept_Members join table that contains the ID of the employee in one column and the ID of the associated department in another column.
This structure makes understanding the connections cumbersome, because you must know the Employee and Department ID values (performing additional lookups to find them) in order to know which employee connects to which department.
Additionally, these costly JOIN operations are often addressed by denormalizing the data to reduce the number of JOINs necessary, which weakens the data integrity that a normalized relational model provides.
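As a rough illustration of the join-table pattern described above, the following Python sketch uses the standard-library sqlite3 module. The table names Employees, Departments, and Dept_Members follow this page; the column names and every row are assumptions made up for the example, not data from the diagram.

```python
import sqlite3

# In-memory database; column names and all rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE Departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Join (associative) table: one row per employee-department pair.
    CREATE TABLE Dept_Members (
        employee_id   INTEGER NOT NULL REFERENCES Employees(id),
        department_id INTEGER NOT NULL REFERENCES Departments(id),
        PRIMARY KEY (employee_id, department_id)
    );
""")
conn.executemany("INSERT INTO Employees VALUES (?, ?)",
                 [(1, "Ada"), (2, "Grace")])
conn.executemany("INSERT INTO Departments VALUES (?, ?)",
                 [(10, "Engineering"), (20, "Research")])
conn.executemany("INSERT INTO Dept_Members VALUES (?, ?)",
                 [(1, 10), (1, 20), (2, 20)])


def departments_of(employee_name):
    """Resolve an employee's departments with two JOINs through the join table."""
    rows = conn.execute(
        """
        SELECT d.name
        FROM Employees AS e
        JOIN Dept_Members AS m ON m.employee_id = e.id
        JOIN Departments  AS d ON d.id = m.department_id
        WHERE e.name = ?
        ORDER BY d.id
        """,
        (employee_name,),
    ).fetchall()
    return [name for (name,) in rows]


print(departments_of("Ada"))
```

Even for this small many-to-many relationship, the query has to pass through three tables and match keys twice before it can return a department name.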
Graph databases offer other ways to connect data.
Translating relational knowledge to graphs
Unlike in other database management systems, relationships in a graph data model are as important as the data itself. This means you do not need to infer connections between entities using special properties such as foreign keys or out-of-band processing like map-reduce.
By assembling nodes and relationships into connected structures, graph databases enable building models that map closely to a problem domain.
With Cypher’s equivalent of a JOIN operation, the graph database can directly access the connected nodes and eliminate the need for expensive search-and-match computations.
This ability to pre-materialize relationships into the database structure allows Neo4j to provide improved performance compared to databases that compute joins at query time, especially for join-heavy queries.
Data model differences
The data models for relational and graph databases are vastly different, as a result of the structural differences previously described. The graph model needs to consider access requirements, expected queries and performance, as well as business logic.
For example, if you want to know which departments Alice belongs to, this is how a relational database and a graph database structure the same data:
In the relational example, on the left, you need to:
- Search the Employees table (potentially with thousands of rows) to find the user Alice and her ID of 815.
- Search the Dept_Members table to locate all the rows that reference Alice’s ID of 815.
- Once the 3 relevant rows are found, go to the Departments table to look up the actual values of the department IDs (111, 119, 181).
- Only now do you know that Alice is part of the 4Future, P0815, and A42 departments.
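The relational lookup steps above can be sketched in plain Python, with each table held as a list of tuples and each step written as an explicit scan. Alice's ID, the department IDs, and the department names come from the example; Bob, the extra Ops department, and the pairing of each department ID with a name are assumptions added for illustration.

```python
# Each "table" is a list of rows. Bob, department 200, and the ID-to-name
# pairing are assumptions; only Alice's ID and the three IDs/names are given.
employees = [(814, "Bob"), (815, "Alice")]
dept_members = [(814, 119), (815, 111), (815, 119), (815, 181)]
departments = [(111, "4Future"), (119, "P0815"), (181, "A42"), (200, "Ops")]


def departments_for(name):
    # Step 1: scan Employees for the matching name to get the employee ID.
    employee_ids = [emp_id for emp_id, emp_name in employees if emp_name == name]
    # Step 2: scan Dept_Members for the rows that reference that ID.
    dept_ids = [dept_id for emp_id, dept_id in dept_members if emp_id in employee_ids]
    # Step 3: scan Departments to turn the department IDs into names.
    return [dept_name for dept_id, dept_name in departments if dept_id in dept_ids]


print(departments_for("Alice"))
```

Each step is a separate search-and-match pass over a whole table. A real relational engine would use indexes rather than full scans, but the key matching still happens at query time.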
In the graph version, you need to:
- Search for Alice’s Employee node.
- Traverse all of the BELONGS_TO relationships from Alice and find the Department nodes she is connected to.
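The two graph steps can be sketched with a toy in-memory model in which each node stores direct references to the nodes it is related to. The Node class, its relate helper, and the index dictionary are hypothetical names invented for this example; the sketch illustrates the idea of stored relationships and is not a description of how Neo4j stores data internally.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    name: str
    # Outgoing relationships kept on the node: relationship type -> target nodes.
    out: dict = field(default_factory=dict)

    def relate(self, rel_type, target):
        """Store a direct reference to the target node under the relationship type."""
        self.out.setdefault(rel_type, []).append(target)


alice = Node("Employee", "Alice")
for dept_name in ("4Future", "P0815", "A42"):
    alice.relate("BELONGS_TO", Node("Department", dept_name))

# Hypothetical lookup structure for the starting node: (label, name) -> node.
index = {("Employee", "Alice"): alice}


def departments_of(name):
    # Step 1: find the Employee node.
    start = index[("Employee", name)]
    # Step 2: follow the stored BELONGS_TO references; no key matching is needed.
    return [node.name for node in start.out.get("BELONGS_TO", [])]


print(departments_of("Alice"))
```

After the starting node is found, the connected Department nodes are reached by following references that already exist, rather than by searching other tables for matching IDs.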
If you want to learn how to create a data model, follow the Tutorial: Create a graph data model, or see how to adapt an existing project with a relational model to a graph in Modeling: relational to graph.
Data storage and retrieval
SQL is a query language used to query relational databases. Cypher is Neo4j’s declarative query language built on the basic concepts and clauses of SQL, but with additional functionalities that make working with graph databases more efficient.
For example, when writing an SQL statement with a large number of JOINs, you can quickly lose sight of what the query actually does, since there is a lot of technical noise in SQL syntax.
In Cypher, the syntax remains concise and focused on domain components and their connections, thus expressing the pattern to find or create data more visually and clearly.
Other clauses outside of the basic pattern matching still look very similar to SQL, as Cypher was built on the predecessor language’s foundation. You can see the similarities and differences in Comparing Cypher with SQL.
Keep learning
- Import: RDBMS to graph → Learn how to import data from a relational database to a graph.
- Modeling: relational to graph → Find more comparisons between relational and graph data modeling.
- The Definitive Guide to Graph Databases for the RDBMS Developer → Download the free e-book.