Creating a Directory Tree with PHP & MySQL (Part 1)

Page last updated on 2011 / 04 / 09

Hierarchy structures are commonplace for websites and relational databases, a popular example would be the classic "web directory", where items are stored in a directory tree and users click into the category structure to find items of interest to them.

A classic problem of directory design is how to store the relationship between categories. Having pairs of id/parentid in a table is a simple solution, and effective enough for directories with a shallow depth, but what about for larger structures, such as the DMOZ directory dump?

Some reading material on SQL trees may be of interest to you if you're considering building a large hierarchy.

On the PHP side of things, Kevin van Zonneveld has created a nice little explodeTree function to represent data in a multi-dimensional array.

To prevent this tutorial becoming huge, I'd recommend reading the first link in particular to get to grips with the whys and why nots of using tree structures in SQL. Some more bedtime reading about nested sets and data hierarchy can be found at code project.

An Example of Tree Structures in Action

The following script will generate a tree structure and store it in MySQL. It's quite lengthy, but it should give you a good visualisation of how to take advantage of tree structures and is easy enough to adapt for your own needs.

This first part of a 2 part post gives you the script enabling you to create and store directory structures. The notes at the foot of the page should guide you through the script in the event you run into problems or are interesting in tinkering with it for your own requirements. The 2nd post gives a basic GUI and admin functions for viewing and altering the tree structure.

If I understand the lingo correctly (it can get a little heavy) this script is a 'preorder tree traversal' which can bring about a new way of looking at and extrapolating your data.

Save and run the following script to generate some example data. It is assumed you have MySQL installed and already have created a database called 'test'. Download this 300K list of categories (1.7MB) as sample data to work with

buildtree.php

  1. <?php
  2. // buildtree.php
  3.  
  4. mysql_connect('localhost','root','root') or die('Cant connect to MySQL');
  5. mysql_select_db('test') or die('Cant connect to MySQL');
  6.  
  7. /*
  8. Using the dmozregional.txt file on innvo.com, otherwise you can pass a different file or even an array
  9. Contains around 314,000 categories
  10. Will take about 40 seconds to process, ensure you have enough memory (around 200MB in this case)
  11. and roughly 10x the size of your input file in general cases
  12. */
  13. $tree = new generate_tree(fopen('dmozregional.txt','r'));
  14. $tree->to_mysql_data();
  15.  
  16. class generate_tree {
  17. var $cats = array();
  18. var $thiscat = array();
  19. var $depths = array();
  20. var $lftrgt = array();
  21. var $depth = 0;
  22. var $inc = 0;
  23. var $catid = 0;
  24. var $fp1,$fp2;
  25. /*
  26. Run generate_tree once if your dataset is small or memory is not an issue.
  27. */
  28. public function generate_tree(&$linearcats)
  29. {
  30. $this->depth = 0;
  31. $this->cats = array();
  32. echo "Step 1: Gathering Data: ".date('H:i:s')."\n";
  33. if(is_array($linearcats)) // Run through array
  34. {
  35. foreach($linearcats as $cat)
  36. {
  37. $this->cats[$cat] = array(); // Adding to 2 dimension list of categories with $cat as the key
  38. array_shift($linearcats);
  39. }
  40. }
  41. elseif(is_resource($linearcats))
  42. {
  43. while(!feof($linearcats))
  44. {
  45. if(!trim($cat = trim(fgets($linearcats))))
  46. continue;
  47. $this->cats[$cat] = array(); // Adding to 2 dimension list of categories with $cat as the key
  48. }
  49. }
  50.  
  51.  
  52. if(!is_resource($this->fp1)) // 1st Pass, open files
  53. {
  54. $this->fp1 = fopen('/tmp/tree.txt','w');
  55. $this->fp2 = fopen('/tmp/top.txt','w');
  56. }
  57. echo "Step 2: Tree Structure: ".date('H:i:s')."\n";
  58. $this->cats = $this->explodeTree($this->cats);
  59. echo "Step 3: Pre-Order Tree Traversal: ".date('H:i:s')."\n";
  60. $this->mptt($this->cats);
  61. }
  62.  
  63. /********************************
  64. Function explodeTree with thanks to Kevin van Zonneveld
  65. info: http://kevin.vanzonneveld.net/techblog/article/convert_anything_to_tree_structures_in_php/
  66. Altered from original version but serves largely the same purpose
  67. Example input data is same/similar as example data in the above URL
  68. ********************************/
  69.  
  70. public function explodeTree($array) {
  71.  
  72. if(!is_array($array) || !count($array))
  73. return array();
  74. $returnArr = array();
  75. foreach ($array as $key => $val)
  76. {
  77. // Get parent parents and the current leaf
  78. $parents = preg_split("'/'",$key,-1,PREG_SPLIT_NO_EMPTY);
  79. $leaf = array_pop($parents);
  80.  
  81. // Build parent structure
  82. // Might be slow for really deep and large structures
  83. $parentArr = &$returnArr;
  84. foreach($parents as $parent)
  85. {
  86. if(!isset($parentArr[$parent]))
  87. $parentArr[$parent] = array();
  88. elseif(!is_array($parentArr[$parent]))
  89. $parentArr[$parent] = array();
  90. $parentArr = &$parentArr[$parent];
  91. }
  92. // Add the final part to the structure
  93. if(empty($parentArr[$leaf]))
  94. $parentArr[$leaf] = $val;
  95. elseif(is_array($parentArr[$leaf]))
  96. $parentArr[$leaf][] = $val;
  97. }
  98. return $returnArr;
  99. }
  100. /********************************
  101. Function mptt (modified pre-order tree traversal)
  102. Used to recursively walk through the array and provide lft and rgt values (and category depth)
  103. ********************************/
  104. public function mptt(&$cats) {
  105. foreach($cats as $catname => $array)
  106. {
  107. $this->depths[$this->depth] = 0;
  108. $this->thiscat[$this->depth] = $catname; // Marking this depth of categories as this category
  109. $imp = implode('/',$this->thiscat); // Full category path
  110. $this->lftrgt[$imp] = array(++$this->inc,++$this->catid); // Assign lft
  111. if(count($array))
  112. {
  113. ++$this->depth; // Deeper
  114. $this->mptt($array); // Reiterate
  115. }
  116. fwrite($this->fp1,$this->lftrgt[$imp][0]."\t".(++$this->inc)."\t".count($this->thiscat)."\t".$this->lftrgt[$imp][1]."\t".strtoupper(md5($imp))."\t".$catname."\n"); // $lft $rgt $depth $id $hash $name
  117. if($this->depth == 0)
  118. fwrite($this->fp2,$this->lftrgt[$imp][1]."\n");
  119. unset($this->lftrgt[$imp]);
  120. }
  121. --$this->depth; // Shallower
  122. array_pop($this->thiscat); // Pop this category depth as we're moving up from here
  123. }
  124.  
  125. /********************************
  126. Function mysql_data
  127. Creates schema, deletes any old data and creates indexes if not already made.
  128. Deletes temporary file data that MySQL uses to load data into tables
  129. ********************************/
  130. public function to_mysql_data() {
  131. echo "Step 4: MySQL Schema: ".date('H:i:s')."\n";
  132. // Creating DB Schema and populating with data. Deleting any old data if present
  133.  
  134. mysql_query('CREATE TABLE IF NOT EXISTS category_tree (
  135. lft mediumint(8) unsigned NOT NULL,
  136. rgt mediumint(8) unsigned NOT NULL,
  137. depth tinyint(3) unsigned NOT NULL,
  138. id mediumint(8) unsigned NOT NULL,
  139. hash binary(16) NOT NULL,
  140. name varbinary(255) NOT NULL)
  141. ENGINE=InnoDB;')
  142. or die(mysql_error());
  143. mysql_query('TRUNCATE TABLE category_tree')
  144. or die(mysql_error());
  145. mysql_query('LOAD DATA INFILE \'/tmp/tree.txt\' INTO TABLE category_tree FIELDS TERMINATED BY \'\t\' (lft,rgt,depth,id,@hash,name) SET hash = UNHEX(@hash)')
  146. or die(mysql_error());
  147. mysql_query('INSERT INTO category_tree (lft,rgt,depth,id) SELECT 0,MAX(rgt)+1,0,0 FROM category_tree') or die(mysql_error());
  148.  
  149. mysql_query('CREATE TABLE IF NOT EXISTS category_top (id MEDIUMINT(8) unsigned NOT NULL,PRIMARY KEY (id)) ENGINE=MyISAM')
  150. or die(mysql_error());
  151. mysql_query('TRUNCATE TABLE category_top')
  152. or die(mysql_error());
  153. mysql_query('LOAD DATA INFILE \'/tmp/top.txt\' INTO TABLE category_top FIELDS TERMINATED BY \'\t\'')
  154. or die(mysql_error());
  155.  
  156. // Delete temporary data
  157. unlink('/tmp/tree.txt');
  158. unlink('/tmp/top.txt');
  159.  
  160. echo "Step 5: MySQL Indexes: ".date('H:i:s')."\n";
  161. // Adding indexes for SELECT speed. Script will die appropriately here if you have already created the indexes
  162. mysql_query('ALTER TABLE category_tree ADD PRIMARY KEY (lft,rgt,depth);') or die('I1 '.mysql_error());
  163. mysql_query('ALTER TABLE category_tree ADD UNIQUE (id);') or die('I2 '.mysql_error());
  164. mysql_query('ALTER TABLE category_tree ADD UNIQUE (hash);') or die('I3 '.mysql_error());
  165. }
  166. }
  167.  
  168. ?>

Script Overview

Part 2 of this tutorial looks at the tree structure in action, and how to perform basic modifications of the tree.


Previous Article
A Simple PHP .htpasswd Manager
Next Article
Creating a Directory Tree with PHP & MySQL (Part 2)




Tweet