I have a database table (SQLite) containing items that form a tree hierarchy. Each item has an id field (for itself) and a parentId for its parent. Given an item, I need to retrieve the whole chain from the root down to that item.

Basically, the algorithm in pseudocode looks like:

  1. cursor is item
  2. retrieve parentItem for cursor by parentId
  3. if parentItem isn't rootItem, then cursor = parentItem and goto 2.

So I have to perform one SQL SELECT query for each item in the chain.
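For reference, the loop above could be sketched roughly as follows, using Python's sqlite3 module. The table name `items`, the sample rows, and the helper name `chain_to_root` are assumptions made for illustration; only the `id` and `parentId` columns come from the question, and a NULL `parentId` is assumed to mark the root.

```python
import sqlite3

def chain_to_root(conn, item_id):
    """Walk parent links upward, one SELECT per level; return ids root-first."""
    chain = []
    cursor_id = item_id
    while cursor_id is not None:
        row = conn.execute(
            "SELECT id, parentId FROM items WHERE id = ?", (cursor_id,)
        ).fetchone()
        if row is None:  # dangling parentId: stop instead of looping
            break
        chain.append(row[0])
        cursor_id = row[1]  # assumed: NULL parentId means this is the root
    chain.reverse()
    return chain

# Hypothetical sample data: 1 <- 2 <- 4 <- 5 <- 7
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, parentId INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(1, None), (2, 1), (4, 2), (5, 4), (7, 5)])
chain = chain_to_root(conn, 7)
```

This issues one query per level, which is exactly the cost the question is trying to avoid.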

Is it possible to retrieve the whole chain rootItem -> ... -> item with a single SQL query?

There are plenty of creative ways to organize hierarchical data in a database, but I've consistently found it simplest to bring the data back in non-hierarchical form and then match up parent and child records programmatically.

Total amount of effort: 1 query + 1 programmatic pass through your dataset to build the hierarchy.
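A minimal sketch of that "one query plus one pass" idea, again in Python with sqlite3. The table name `items`, the sample rows, and the function name are hypothetical, and the sketch assumes the parent links contain no cycles.

```python
import sqlite3

def chain_from_all_rows(conn, item_id):
    # One query: load every (id, parentId) pair into a dict.
    parent_of = dict(conn.execute("SELECT id, parentId FROM items"))
    # One in-memory pass: follow parent links up to the root.
    chain = []
    current = item_id
    while current is not None and current in parent_of:
        chain.append(current)
        current = parent_of[current]
    return chain[::-1]  # root first

# Hypothetical sample data: 1 <- 2 <- 4 <- 5 <- 7, plus an unrelated root 3
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, parentId INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(1, None), (2, 1), (3, None), (4, 2), (5, 4), (7, 5)])
chain = chain_from_all_rows(conn, 7)
```

The trade-off is that the whole table is read each time, so this suits small tables or cases where the dict can be cached.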


Alternative approach:

I've used this method in the past with limited success. You can store the path of each item in your tree in a varchar(max) column, as follows:

ID    ParentID    Path
--    --------    ----
1     null        1/
2     1           1/2/
3     null        3/
4     2           1/2/4/
5     4           1/2/4/5/
6     null        6/
7     5           1/2/4/5/7/
9     5           1/2/4/5/9/

From there, getting all the nodes under ID = 5 is very simple:

SELECT *
FROM table
WHERE Path LIKE (SELECT Path FROM table WHERE ID = 5) + '%'
-- + is SQL Server string concatenation; in SQLite use || instead
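Here is a rough SQLite version of the path-column idea, run through Python's sqlite3 module. The table name `nodes` and both function names are made up for this sketch; the rows are the ones from the table above. Besides the subtree query, it shows the reverse lookup that the original question needs (all ancestors of a node), which under this scheme is the same LIKE test with the operands swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (ID INTEGER PRIMARY KEY, ParentID INTEGER, Path TEXT)")
conn.executemany("INSERT INTO nodes VALUES (?, ?, ?)", [
    (1, None, "1/"), (2, 1, "1/2/"), (3, None, "3/"), (4, 2, "1/2/4/"),
    (5, 4, "1/2/4/5/"), (6, None, "6/"), (7, 5, "1/2/4/5/7/"),
    (9, 5, "1/2/4/5/9/"),
])

def subtree_ids(conn, node_id):
    # Rows whose Path starts with the given node's Path (includes the node).
    rows = conn.execute(
        "SELECT ID FROM nodes "
        "WHERE Path LIKE (SELECT Path FROM nodes WHERE ID = ?) || '%' "
        "ORDER BY Path", (node_id,))
    return [r[0] for r in rows]

def ancestor_ids(conn, node_id):
    # Rows whose Path is a prefix of the given node's Path, root first.
    rows = conn.execute(
        "SELECT ID FROM nodes "
        "WHERE (SELECT Path FROM nodes WHERE ID = ?) LIKE Path || '%' "
        "ORDER BY length(Path)", (node_id,))
    return [r[0] for r in rows]

under_five = subtree_ids(conn, 5)
chain_to_seven = ancestor_ids(conn, 7)
```

The trailing slash on every path matters: without it, a path such as 1 would also match 10/... as a prefix.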

Not with ANSI standard SQL, no. Well, that's not strictly true. You can do left outer joins and put in enough of them to cover the likely maximum depth, but unless you constrain the maximum depth and include that many joins, it won't always work.

If your set of rows is sufficiently small (say fewer than 1000), just retrieve them all and then figure it out. In all likelihood it will be faster than single-read traversals.

You could batch the parent traversal. Have a query like:

SELECT t1.id id1, t1.parent parent1,
       t2.id id2, t2.parent parent2,
       t3.id id3, t3.parent parent3,
       t4.id id4, t4.parent parent4,
       t5.id id5, t5.parent parent5
FROM mytable t1
LEFT OUTER JOIN mytable t2 ON t1.parent = t2.id
LEFT OUTER JOIN mytable t3 ON t2.parent = t3.id
LEFT OUTER JOIN mytable t4 ON t3.parent = t4.id
LEFT OUTER JOIN mytable t5 ON t4.parent = t5.id
WHERE t1.id = 1234

and extend it to whatever number of levels you want. If the last retrieved parent is not null, you're not at the top of the tree yet, so run the query again starting from that parent. This way you should hopefully reduce it to 1-2 roundtrips.
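The batching loop might look something like this in Python with sqlite3. To keep the sketch short it joins three levels per roundtrip instead of five; the sample rows and the function name `chain_batched` are assumptions, while `mytable`, `id` and `parent` follow the query above.

```python
import sqlite3

# Three levels per roundtrip (assumed; widen by adding more joins).
BATCH_SQL = """
SELECT t1.id, t1.parent, t2.id, t2.parent, t3.id, t3.parent
FROM mytable t1
LEFT OUTER JOIN mytable t2 ON t1.parent = t2.id
LEFT OUTER JOIN mytable t3 ON t2.parent = t3.id
WHERE t1.id = ?
"""

def chain_batched(conn, item_id):
    """Return (ids root-first, number of queries issued)."""
    chain, start, roundtrips = [], item_id, 0
    while start is not None:
        row = conn.execute(BATCH_SQL, (start,)).fetchone()
        roundtrips += 1
        if row is None:  # unknown id
            break
        ids, parents = row[0::2], row[1::2]
        # Unmatched outer-join levels come back as NULL; skip them.
        chain.extend(i for i in ids if i is not None)
        # A non-NULL last parent means the root has not been reached yet.
        start = parents[-1]
    return chain[::-1], roundtrips

# Hypothetical sample data: 1 <- 2 <- 4 <- 5 <- 7
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, parent INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)",
                 [(1, None), (2, 1), (4, 2), (5, 4), (7, 5)])
chain, roundtrips = chain_batched(conn, 7)
```

With a five-level chain and three levels per batch, this takes two roundtrips instead of five.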

Other than that, you could look at ways of encoding that data in the ID. This isn't recommended, but if you limit, say, each node to having 100 children, you could say that a node with an ID of 10030711 has a path of 10 -> 03 -> 07 -> 11. That of course has other problems (like max ID length), and of course it's hacky.
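Purely to illustrate the encoding, here is a small Python sketch that recovers the ancestor IDs from such an encoded ID without touching the database. The function name and the fixed two-digits-per-level layout are assumptions based on the example above.

```python
def path_from_encoded_id(encoded_id, digits_per_level=2):
    """Return the IDs along the path, root first, for an ID like 10030711."""
    text = str(encoded_id)
    # Left-pad so the length is a multiple of the segment width,
    # e.g. 3030711 is read as "03030711".
    text = "0" * (-len(text) % digits_per_level) + text
    # Every prefix that ends on a segment boundary is an ancestor's ID.
    return [int(text[:end])
            for end in range(digits_per_level, len(text) + 1, digits_per_level)]

ids_on_path = path_from_encoded_id(10030711)
```

Here `ids_on_path` is `[10, 1003, 100307, 10030711]`, so the whole chain could then be fetched with a single `WHERE id IN (...)` query.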

It's also worth noting that there are two basic models for hierarchical data in SQL: adjacency lists and nested sets. Your way (which is pretty common) is an adjacency list. Nested sets wouldn't really help with this situation, though, and they're complicated to do inserts on.

Is it possible to change the table structure? It looks like storing left and right nodes would be easier to work with than storing just a parent, because a single select is then possible. See the following links:

http://www.mail-archive.com/sqlite-users@sqlite.org/msg23867.html

http://weblogs.asp.net/aghausman/archive/2009/03/16/storing-retrieving-hierarchical-data-in-sql-server-database.aspx (this is SQL Server, but it has a diagram that may help.)
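As a rough idea of what the left/right approach could look like in SQLite (via Python's sqlite3), here is a sketch. The table name `tree`, the column names `lft` and `rgt`, and the hand-numbered sample rows are all assumptions for illustration, not taken from the linked pages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tree (id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER)")
# Hypothetical numbering for the chain 1 <- 2 <- 4 <- 5 with leaves 7 and 9
# under 5, plus two separate roots 3 and 6.
conn.executemany("INSERT INTO tree VALUES (?, ?, ?)", [
    (1, 1, 12), (2, 2, 11), (4, 3, 10), (5, 4, 9),
    (7, 5, 6), (9, 7, 8), (3, 13, 14), (6, 15, 16),
])

def ancestors_nested_set(conn, node_id):
    # A row is an ancestor (or the node itself) when its interval
    # encloses the node's interval; ordering by lft gives root first.
    rows = conn.execute(
        "SELECT a.id FROM tree AS a "
        "JOIN tree AS n ON a.lft <= n.lft AND a.rgt >= n.rgt "
        "WHERE n.id = ? ORDER BY a.lft", (node_id,))
    return [r[0] for r in rows]

chain = ancestors_nested_set(conn, 7)
```

The single select works, but every insert has to renumber the left and right values of the rows that follow, which is the cost mentioned earlier in the thread.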