I am starting a project and I am in the design phase, i.e., I haven't yet decided which db framework I am going to use. I am going to have code that creates a "forest"-like structure: many trees, where each tree is a standard one made of nodes and edges. After the code creates these trees I want to save them in the db (and pull them out again later).

The naive approach to representing the data in the db is a relational db with two tables: nodes and edges. That is, the nodes table will have a node id, node data, etc., and the edges table will be a mapping of node id to node id.
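For concreteness, the two-table layout could look something like the sketch below. It assumes SQLite through Python's `sqlite3` module purely for illustration (no db has been chosen yet), and the names `nodes`, `edges`, `parent_id`, `child_id`, `children` and `subtree` are hypothetical.

```python
import sqlite3

# In-memory database, used only to illustrate the schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (
        node_id INTEGER PRIMARY KEY,
        data    TEXT
    );
    CREATE TABLE edges (
        parent_id INTEGER NOT NULL REFERENCES nodes(node_id),
        child_id  INTEGER NOT NULL REFERENCES nodes(node_id),
        PRIMARY KEY (parent_id, child_id)
    );
""")

# Hypothetical sample tree: 1 -> 2, 1 -> 3, 2 -> 4
conn.executemany(
    "INSERT INTO nodes (node_id, data) VALUES (?, ?)",
    [(1, "root"), (2, "a"), (3, "b"), (4, "c")],
)
conn.executemany(
    "INSERT INTO edges (parent_id, child_id) VALUES (?, ?)",
    [(1, 2), (1, 3), (2, 4)],
)


def children(conn, node_id):
    """Return the ids of the direct children of node_id."""
    rows = conn.execute(
        "SELECT child_id FROM edges WHERE parent_id = ? ORDER BY child_id",
        (node_id,),
    )
    return [row[0] for row in rows]


def subtree(conn, root_id):
    """Return root_id plus all of its descendants, using a recursive CTE."""
    rows = conn.execute(
        """
        WITH RECURSIVE sub(node_id) AS (
            SELECT ?
            UNION ALL
            SELECT e.child_id
            FROM edges AS e
            JOIN sub AS s ON e.parent_id = s.node_id
        )
        SELECT node_id FROM sub ORDER BY node_id
        """,
        (root_id,),
    )
    return [row[0] for row in rows]
```

With this layout, fetching direct children is a single indexed lookup, but pulling out a whole tree needs a recursive query (or one query per level), which is the part I am unsure about for speed.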

Is there a better approach? Or, given the (limited) assumptions I am giving, is this the best approach? What if we add the assumption that the trees are relatively small: is it better to save each whole tree as a blob in the db? Which type of db should I use in that case? Please comment on speed/scalability.


I showed a solution similar to your nodes & edges tables in my answer to the StackOverflow question: What is the most efficient/elegant way to parse a flat table into a tree? I call this solution "Closure Table".

I did a presentation on different methods of storing and using trees in SQL, Models for Hierarchical Data with SQL and PHP. I demonstrated that with the right indexes (depending on the queries you need to run), the Closure Table design can have very good performance, even over large collections of edges (about 500K edges in my demo).
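As a rough illustration of the idea, here is a minimal Closure Table sketch. It assumes SQLite through Python's `sqlite3` module, and the table, column, index and function names (`closure`, `ancestor`, `descendant`, `depth`, `closure_by_descendant`, `add_node`, ...) are illustrative choices rather than the exact schema from the presentation. The table stores one row for every ancestor/descendant pair, including each node paired with itself, so whole subtrees can be read without recursion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (
        node_id INTEGER PRIMARY KEY,
        data    TEXT
    );
    -- One row per (ancestor, descendant) pair; depth 0 is the node itself.
    CREATE TABLE closure (
        ancestor   INTEGER NOT NULL REFERENCES nodes(node_id),
        descendant INTEGER NOT NULL REFERENCES nodes(node_id),
        depth      INTEGER NOT NULL,
        PRIMARY KEY (ancestor, descendant)
    );
    -- Assumed extra index to support "find ancestors of X" queries.
    CREATE INDEX closure_by_descendant ON closure (descendant, ancestor);
""")


def add_node(conn, node_id, data, parent_id=None):
    """Insert a node and link it to every ancestor of its parent."""
    conn.execute("INSERT INTO nodes (node_id, data) VALUES (?, ?)", (node_id, data))
    # Every node is its own ancestor at depth 0.
    conn.execute(
        "INSERT INTO closure (ancestor, descendant, depth) VALUES (?, ?, 0)",
        (node_id, node_id),
    )
    if parent_id is not None:
        # Copy the parent's ancestor rows (including the parent itself),
        # pointing them at the new node one level deeper.
        conn.execute(
            """
            INSERT INTO closure (ancestor, descendant, depth)
            SELECT ancestor, ?, depth + 1
            FROM closure
            WHERE descendant = ?
            """,
            (node_id, parent_id),
        )


def descendants(conn, node_id):
    """All descendants of node_id, found with a single non-recursive query."""
    rows = conn.execute(
        "SELECT descendant FROM closure "
        "WHERE ancestor = ? AND depth > 0 ORDER BY descendant",
        (node_id,),
    )
    return [row[0] for row in rows]


def ancestors(conn, node_id):
    """All ancestors of node_id, nearest first."""
    rows = conn.execute(
        "SELECT ancestor FROM closure "
        "WHERE descendant = ? AND depth > 0 ORDER BY depth",
        (node_id,),
    )
    return [row[0] for row in rows]


# Hypothetical sample tree: 1 -> 2, 1 -> 3, 2 -> 4
add_node(conn, 1, "root")
add_node(conn, 2, "a", parent_id=1)
add_node(conn, 3, "b", parent_id=1)
add_node(conn, 4, "c", parent_id=2)
```

The trade-off in this sketch is more rows and slightly more work on insert in exchange for simple, index-friendly reads; which indexes are worth having depends on the queries you actually run.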

I also covered the design in my book, SQL Antipatterns: Avoiding the Pitfalls of Database Programming.