Forking with PHP from the Command Line

Page last updated on 2011 / 04 / 09

Forking is an extremely handy feature of PHP on UNIX platforms, available through pcntl_fork(), and it has a huge range of uses. Its functionality can be replicated on Windows, but doing so will take up more memory.

One popular example is HTTP fetching. If you have a queue of 1000 URLs, and each URL takes 3 seconds to fetch, it will take 3000 seconds to fetch all the URLs one after another. Slow or unresponsive servers push your average higher, and URLs later in the queue have to wait for all the slower URLs in front of them to be fetched.

With forking, you can split the workload between instances of the script. In the URL fetching example, you could create 10 forks of the fetching script that fetch 100 URLs each. This should dramatically cut the time it takes to fetch all the URLs, because if one particular URL is slow, your 9 other forked scripts will still be fetching the URLs in their own queues.
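To put rough numbers on that, here is a small back-of-the-envelope sketch. The function name estimated_seconds() is my own, and it assumes every URL takes the same time and that forking itself costs nothing, so treat the result as a best case rather than a measurement.

```php
<?php
// Rough estimate only: assumes every URL takes the same time to fetch
// and ignores the overhead of creating the forks.
function estimated_seconds($urlCount, $secondsPerUrl, $forks)
{
    // Treat "no forking" (0 or 1) as a single worker
    $forks = max(1, (int) $forks);

    // Each fork works through its share of the queue one URL at a time
    $urlsPerFork = (int) ceil($urlCount / $forks);

    return $urlsPerFork * $secondsPerUrl;
}

echo estimated_seconds(1000, 3, 1), "\n";  // 3000
echo estimated_seconds(1000, 3, 10), "\n"; // 300
```

In practice the gain depends on how slow the remote servers are and how many processes your machine can comfortably run at once.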

I have provided skeleton code below to give you an idea of how it can work for you.

One thing to watch for when you are working from a shared list of tasks is forks duplicating each other's work (loosely, a 'race condition'). For instance, if your 1000 URLs were all in a text file and every fork simply read the whole file, all 10 forks would spider all 1000 URLs, doing the job 10 times over. You need some logic to split the workload so that each URL is handled by exactly one fork.

Workarounds for this problem are quite easy. With a text file, for instance, you would want each script instance to grab every 10th line, so the 1st fork would grab the 1st line, the 11th line, the 21st line, and so on.
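As a minimal sketch of the every-10th-line idea: the helper name urls_for_fork() is hypothetical, and it assumes the fork number is zero-based (as $switches['forknumber'] is in the wrapper) and that the lines have already been read into an array.

```php
<?php
// Sketch only: urls_for_fork() is a hypothetical helper, not part of the wrapper.

/**
 * Return the lines that belong to one fork.
 * $forkNumber is zero-based, like $switches['forknumber'];
 * $forkCount is the total number of forks, like $switches['fork'].
 */
function urls_for_fork(array $lines, $forkNumber, $forkCount)
{
    // When the script is not forked the count may be 0, so treat that as 1
    $forkCount = max(1, (int) $forkCount);

    $mine = array();
    foreach (array_values($lines) as $index => $line) {
        // Line 0 goes to fork 0, line 1 to fork 1, and so on, then it wraps around
        if ($index % $forkCount == $forkNumber) {
            $mine[] = $line;
        }
    }
    return $mine;
}

// In a real script the lines might come from file() on your URL list
$lines = array('url-a', 'url-b', 'url-c', 'url-d', 'url-e');
print_r(urls_for_fork($lines, 0, 2)); // url-a, url-c, url-e
print_r(urls_for_fork($lines, 1, 2)); // url-b, url-d
```

Because every line index maps to exactly one remainder, no URL is fetched twice and none is skipped.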

When grabbing the URLs from a MySQL table, add an auto-increment field and use the MOD mathematical function against the increment values, so that each fork selects only the rows whose remainder matches its fork number.
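Here is a rough sketch of what such a query could look like. The table name "urls" and the columns "id" and "url" are made up for illustration, as is the helper name build_fork_query(); substitute your own schema.

```php
<?php
// Sketch only: the table "urls" and its columns "id" and "url" are hypothetical.
function build_fork_query($forkNumber, $forkCount)
{
    // Casting to int keeps the values safe to place directly in the SQL,
    // and a fork count of 0 (not forked) is treated as 1 so every row matches
    $forkCount = max(1, (int) $forkCount);
    $forkNumber = (int) $forkNumber;

    return sprintf('SELECT url FROM urls WHERE MOD(id, %d) = %d', $forkCount, $forkNumber);
}

echo build_fork_query(3, 10), "\n"; // SELECT url FROM urls WHERE MOD(id, 10) = 3
```

Each fork would then run its own query with its own fork number, so the rows are shared out without any fork needing to know what the others are doing.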

<?php
// save as wrapper.php

// Check basics are OK, i.e. a script to execute/fork is provided
isset($_SERVER['argv'][1]) or die("No wrapper vars provided\n");
is_file($_SERVER['argv'][1]) or die("Script could not be found\n");

// Turn "-name value" pairs into $switches for use in the script to execute
$temp_switches = array_slice($_SERVER['argv'], 2);
$switches = array();
$count = count($temp_switches);
for ($i = 0; $i < $count; $i += 2) {
    if (substr($temp_switches[$i], 0, 1) == '-' && isset($temp_switches[$i + 1])) {
        $switches[substr($temp_switches[$i], 1)] = $temp_switches[$i + 1];
    }
}
include_once($_SERVER['argv'][1]);

// If no fork var was passed into the script, no forking will occur
if (!isset($switches['fork'])) {
    $switches['fork'] = 0;
}

// If you want to do some pre-processing before the script gets forked, pass "-pre 1" when executing
if (isset($switches['pre'])) {
    pre_cron_to_execute($switches);
}

if (function_exists('pcntl_fork') && $switches['fork'] > 1) {
    $pids = array();
    for ($i = 0; $i < $switches['fork']; $i++) {
        $switches['forknumber'] = $i;
        $pid = pcntl_fork();
        if ($pid == -1) {
            // The fork failed, so there is no child process to wait for
            echo 'Could not create fork #'.$i."\n";
            continue;
        }
        if ($pid == 0) {
            // Child process: do the work, then exit so it does not continue the loop
            cron_to_execute($switches);
            exit(0);
        }
        // Parent process: remember the child's process ID
        $pids[$i] = $pid;
    }
    // Wait for every child to finish before moving on
    foreach ($pids as $pid) {
        pcntl_waitpid($pid, $status);
    }
} else {
    echo 'You cannot fork in this environment, or you chose not to fork the script. Executing in 5 seconds...'."\n";
    sleep(5);
    $switches['forknumber'] = 0;
    cron_to_execute($switches);
}

// If you want to do some post-processing after the forks have finished, pass "-post 1" when executing
if (isset($switches['post'])) {
    post_cron_to_execute($switches);
}
?>

<?php
// save as script.php

function pre_cron_to_execute()
{
    global $switches;
    echo 'This is pre-processing, it is not forked.'."\n";
    sleep(3);
}

function cron_to_execute()
{
    global $switches;
    echo 'This is fork #'.$switches['forknumber'].' of '.$switches['fork']."\n";
}

function post_cron_to_execute()
{
    global $switches;
    echo 'This is post-processing, it is not forked.'."\n";
}

?>

To use the script, always pass the script to execute as the first argument, followed by "-fork x" where x is the number of times you wish to fork the script. Use "-pre 1" for pre-processing and "-post 1" for post-processing.

Examples:
php wrapper.php script.php -fork 10 // fork the script 10 times
php wrapper.php script.php -fork 5 -pre 1 // fork the script 5 times with pre-processing
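To make it clearer how those command-line switches end up in the script, here is a small standalone sketch of the same parsing the wrapper does inline. The function name parse_switches() and the $example_argv array are my own, used only so the logic can be run without the command line.

```php
<?php
// Sketch only: parse_switches() is a hypothetical name; the wrapper does this inline.
function parse_switches(array $args)
{
    $switches = array();
    $count = count($args);
    // Arguments are read in pairs: a "-name" followed by its value
    for ($i = 0; $i < $count; $i += 2) {
        if (substr($args[$i], 0, 1) == '-' && isset($args[$i + 1])) {
            $switches[substr($args[$i], 1)] = $args[$i + 1];
        }
    }
    return $switches;
}

// Stands in for: php wrapper.php script.php -fork 5 -pre 1
$example_argv = array('wrapper.php', 'script.php', '-fork', '5', '-pre', '1');

// Skip the wrapper name and the script name, then parse the rest
var_export(parse_switches(array_slice($example_argv, 2)));
echo "\n";
```

Note that the values arrive as strings ('5', not 5), and that any switch name works, so you can pass your own "-name value" pairs and read them from $switches inside your functions.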

I use "$switches" as a global variable in each function; at the very least, it tells the function which fork number it is running as.

