So one thing I still don't understand is whether Neo4J a pure graph database is ...

the-alchemist · on Feb 7, 2019

There are pros and cons to going "graph native" versus building on an existing DB.

PROS

You can optimize for exactly the types of queries that you want graph databases to answer: shortest path, path finding, etc. Relational databases / document databases are (generally) very poor at those types of queries because those are not the types of queries people want to run on those databases. In a "graph native" database, everything down to the storage on disk can be optimized to perform graph algorithms.
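To make "shortest path" concrete, here is a minimal sketch in Python. It assumes a toy in-memory adjacency list; the graph data and the `shortest_path` helper are made up for illustration and do not come from Neo4j or any other product. The only point is that each hop is a direct neighbor lookup rather than a join.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search over an adjacency list.

    Returns the list of nodes on one shortest path, or None if the
    goal cannot be reached from the start.
    """
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node we arrived from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor in previous:
                continue  # already visited
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the back-pointers to rebuild the path.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical "follows" graph, stored as node -> outgoing neighbors.
GRAPH = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
    "erin": [],
}

print(shortest_path(GRAPH, "alice", "erin"))
```

A real graph-native engine does far more than this (on-disk layout, indexes, transactions), so treat it as a picture of the access pattern, not of any actual implementation.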

CONS

There are years, sometimes decades, of engineering that go into databases (I'm thinking of PostgreSQL and Cassandra, both of which have graph "layers" available). A lot of the engineering work is non-graph specific: ACID, how to handle transactions, distributed computing, WAL, replication.

Why re-engineer all of those just to perform graph operations more quickly?

Also, I can send you a good paper by the founder of DGraph Labs if you're really curious.

tiuPapa · on Feb 7, 2019

I would love to read the DGraph paper.

mistrial9 · on Feb 6, 2019

Indexing and search specialized for graph operations is a thing. I have no experience with those projects, but I'm familiar with some workarounds in Postgres. Basically, the deeper the graph search, the more performance drops for relational DBs. This is a seriously studied topic, so refer to the research for more details.
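For a rough idea of what a relational workaround can look like, here is a sketch using a recursive CTE. It assumes SQLite through Python's built-in `sqlite3` module rather than Postgres, and the `edges` table, its rows, and the `reachable_within` helper are all hypothetical. Every extra level of depth is another join against the edge table, which is where the cost of deep searches comes from. It is not a benchmark.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [
        ("alice", "bob"),
        ("alice", "carol"),
        ("bob", "dave"),
        ("carol", "dave"),
        ("dave", "erin"),
    ],
)


def reachable_within(conn, start, max_depth):
    """Return (node, hops) pairs reachable from start in at most max_depth hops."""
    query = """
        WITH RECURSIVE walk(node, depth) AS (
            SELECT ?, 0
            UNION
            -- Each recursion step joins the frontier against the edge table.
            SELECT e.dst, w.depth + 1
            FROM walk AS w
            JOIN edges AS e ON e.src = w.node
            WHERE w.depth < ?
        )
        SELECT node, MIN(depth) AS hops
        FROM walk
        GROUP BY node
        ORDER BY hops, node
    """
    return conn.execute(query, (start, max_depth)).fetchall()


print(reachable_within(conn, "alice", 2))
print(reachable_within(conn, "alice", 3))
```

The depth limit also keeps the recursion finite if the graph has cycles, since the walk stops expanding once `max_depth` is reached.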

dwater · on Feb 6, 2019

If you want to do a real deep dive into the architectural differences of graph databases, the book "Designing Data-Intensive Applications" by Martin Kleppmann is a great resource. https://www.oreilly.com/library/view/designing-data-intensiv...

tiuPapa · on Feb 7, 2019

Thanks for that book recommendation.