Last Updated:

Work with MySQL - Trees

The need to display data structured in the form of trees arises when writing your own forum or site directory. Ready-made catalogs and forums on the network can be found in plenty, but sometimes someone else's in the finished is not suitable, and it will take much more time for others to redo what is written than to write your own.

It is better to take the generally accepted data structure - the post of the message or the forum column contains the identifier of the parent. To organize the output of the tree, a recursive function is suggested. That's what Phorum [http://phorum.org] did. The include/multi-threads.php file contains a thread function that builds a call for each root message and recursively calls itself to respond to them:

function thread ($seed = 0) {
...

    if(@is_array($messages[$seed]["replies"])) {
        $count = count($messages[$seed]["replies"]);
        for($x = 1;$ x

But calling a recursive function on output raises my doubts: it is irrational to repeat the construction of the message tree on each output. The structure of the tree changes only when messages are added, modified, or deleted. It would be better to call this procedure during such actions, store the structure in a table and not do any calculations when displaying the tree.

To build a tree, it is enough to know the order in which rubrics are displayed and their level in the tree. Let's add two fields with this data to the table: level (TINYINT(4). Is 127 levels enough?) and sortorder (VARCHAR(128)).

All we need to build the tree is the ID of the rubric and its parent. Suppose we have several categories in the catalog with the following content:

---------
id parent
---------
 3      0
 5      0
 7      0
10      3
11      7
12      5
13      3
16     10
21     16
26     11
30      3
47      7
60     10
73     13
75     47
---------

The tree structure we want to look like is:

o- 3
|
+-o- 10
| |
| +-o- 16
| | |
| | +-o- 21
| |
| +-o- 60
|
+-o- 13
| |
| +-o- 73
|
+-o- 30

o- 5
|
+-o- 12

o- 7
|
+-o- 11
| |
| +-o- 26
|
+-o- 47
  |
  +-o- 75

True, this algorithm will allow you to draw a tree, but without branches in the form of lines, as is done in this figure. The tree structure will be drawn using the indents on the left.

Let's return once again to the id-parent table. These are rubrics already sorted according to some criterion by which we want to sort elements of the same level. For example, in descending order of the number of sites. In addition to the id and the parent rubric, we also know the number of each of them in this list. Align these numbers to the desired length by adding zeros to the left. After that, for each rubric, we will make a text string with the numbers of all its parents from the root itself:

------------------------------
id sort parent sortorder level
------------------------------
 3    1      0 01            0
 5    2      0 02            0
 7    3      0 03            0
10    4      3 0104          1
11    5      7 0305          1
12    6      5 0206          1
13    7      3 0107          1
16    8     10 010408        2
21    9     16 01040809      3
26   10     11 030510        2
30   11      3 0111          1
47   12      7 0312          1
60   13     10 010413        2
73   14     13 010714        2
75   15     47 031215        2
------------------------------

When sorting by the sortorder field, we get exactly what we need:

------------------------------
id sort parent sortorder level
------------------------------
 3    1      0 01            0
10    4      3 0104          1
16    8     10 010408        2
21    9     16 01040809      3
60   13     10 010413        2
13    7      3 0107          1
73   14     13 010714        2
30   11      3 0111          1
 5    2      0 02            0
12    6      5 0206          1
 7    3      0 03            0
11    5      7 0305          1
26   10     11 030510        2
47   12      7 0312          1
75   15     47 031215        2
------------------------------

The indentation on the left is given by the level field.

It's also important to note that we don't need to sort anything in the script itself. To form the sortorder and level fields, you need to lock the table from writing (so that the structure of the branches does not change), select the rubric and its parent identifier from the database, sorting according to the desired attribute, and write them into a simple two-dimensional array. Then process the array sequentially from the first to the last level and write the sortorder and level fields to the table.

No recursion is needed to form the sortorder (although it is possible, and probably even faster). It is enough to go through the array with the same cycle. In it, if the rubric is not processed, the sortorder field is formed for it from the sort field and the parent sortorder. If the parent rubric has not yet been processed, the $unprocessed_rows_exist flag is raised and the loop is run again.

mysql_query("LOCK TABLES dir WRITE");

$result = mysql_query("SELECT id, IFNULL(parent,0) as parent \
                           FROM dir ORDER BY sites DESC, title");

while ($row = mysql_fetch_array($result)) {
    $count++;
    $data["parent"][$row["id"]] = $row["parent"];
    $data["sort"][$row["id"]] = $count;
}

reset($data);

$unprocessed_rows_exist = true;
while($unprocessed_rows_exist) {
    $unprocessed_rows_exist = false;
    while (list($i, $v) = each($data["parent"])) {
        if(($data["parent"][$i] == 0 || !isset($data["sort"][$data["parent"][$i]])) &&
          !isset($data["sortorder"][$i])) {
            $data["sortorder"][$i] = str_pad($data["sort"][$i], $max,
            "0", STR_PAD_LEFT);
            $data["level"][$i] = 0;
        }
        elseif(!isset($data["sortorder"][$i]) &&
                isset($data["sortorder"][$data["parent"][$i]])) {
            $data["sortorder"][$i] = $data["sortorder"][$data["parent"][$i]].
              str_pad($data["sort"][$i], $max, "0", STR_PAD_LEFT);
            $data["level"][$i] = $data["level"][$data["parent"][$i]] + 1;
        }
        elseif(!isset($data["sortorder"][$i]) &&
            isset($data["sort"][$data["parent"][$i]])) {
            $unprocessed_rows_exist = true;
        }
    }

reset($data);

I note that this algorithm does not loop in the presence of rows with a broken parent field and does not skip them, but makes them root. The recursive algorithm will simply skip them.

After executing this loop, we have "id => level" and "id => sortorder" arrays. We send only one query to the database, using the internal functions of MySQL ELT and FIND_IN_SET:

mysql_query("UPDATE dir SET sortorder=ELT(FIND_IN_SET(id,'".
  implode(",", array_keys($data["sortorder"])).
  "'),". implode(",", $data["sortorder"]).
  "), level=ELT(FIND_IN_SET(id,'".
  implode(",", array_keys($data["level"])).
  "'),". implode(",", $data["level"]).
  ") WHERE id IN (".
  implode(",", array_keys($data["sortorder"])).
  ")");

Of course, unlike the category tree of the directory, there are a lot of messages in a large forum. It makes no sense to distort them all when adding one new one.

In forums, sorting by the date the message was written is most often used. Therefore, the sortorder field can be safely made from its own and parent timestamp, aligned with the str_pad function [http://www.php.net/manual/en/function.str-pad.php] to 11-digit length.