I've added a new piece of code to the Queued Jobs module that allows you to quickly add scheduling to any data object.
It's very straightforward to add:
Object::add_extension('MyDataObject', 'ScheduledExecutionExtension');
After much tweaking and stalling, the QueuedJobs module has been released (and submitted to SilverStripe.org; it should be up soon). What does it provide?
The Queued Jobs module provides a framework for SilverStripe developers to define long running processes that should be run as background tasks. This asynchronous processing allows users to continue using the system while long running tasks proceed when time permits. It also lets developers set these processes to be executed in the future.
In essence, the goal is to avoid leaving users with a seemingly 'hanging' connection that may eventually time out when they trigger an action that takes a while to process. Not many actions in SilverStripe do this, but your site might have a particular need for it. Some areas where we're using it
If you want to test it out, download it from GitHub or SilverStripe.org. After extracting it to the queuedjobs folder and running dev/build, you'll need to make sure you have a cron job set up to run the main processor (preferably as the webserver user).
*/1 * * * * php /path/to/silverstripe/sapphire/cli-script.php dev/tasks/ProcessJobQueueTask
*/15 * * * * php /path/to/silverstripe/sapphire/cli-script.php dev/tasks/ProcessJobQueueTask queue=2
See the wiki page for more info.
I recommend you test out the GenerateGoogleSitemapJob to get a feel for what's going on under the covers (and it actually has a functional benefit!). To create the initial instance, go to http://path.to.silverstripe/dev/tasks/CreateDummyJob?name=GenerateGoogleSitemapJob, which will create it (it will recreate itself as it processes). To make things easier, I'll step through the code so you get an idea of what's important when writing your own jobs:
public function __construct() {
    $this->pagesToProcess = DB::query('SELECT ID FROM "SiteTree_Live" WHERE "ShowInSearch"=1')->column();
    $this->currentStep = 0;
    $this->totalSteps = count($this->pagesToProcess);
}
When constructing the job, I get a list of all the Live pages on the site that are set to show in search (these are the only ones that are going to be indexed by Google). We're only interested in the IDs of these pages though, not the actual objects. This is because we're going to store the full list of IDs of the pages we need to process - the $this->pagesToProcess variable here gets serialized and stored in the database between processing events, enabling us to stop and start processing at any time.
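To make the stop-and-start idea concrete, here is a minimal sketch of saving and restoring job state as a string. It does not use the module's own persistence code; the SketchJob class and the saveJobState() and restoreJobState() helpers are hypothetical names made up for illustration, and only PHP's built-in serialize() and unserialize() are real.

```php
<?php
// Hypothetical stand-in for a job: it only holds plain data (IDs and counters).
class SketchJob {
    public $pagesToProcess = array();
    public $currentStep = 0;
    public $totalSteps = 0;

    public function __construct(array $ids) {
        $this->pagesToProcess = $ids;
        $this->totalSteps = count($ids);
    }
}

// Assumed helper (not part of the module): turn the job's state into a string.
function saveJobState(SketchJob $job) {
    return serialize(get_object_vars($job));
}

// Assumed helper (not part of the module): rebuild a job from a stored string.
function restoreJobState($stored) {
    $job = new SketchJob(array());
    foreach (unserialize($stored) as $name => $value) {
        $job->$name = $value;
    }
    return $job;
}

$job = new SketchJob(array(3, 7, 12));
array_shift($job->pagesToProcess);    // pretend one page has been processed
$job->currentStep++;

$stored = saveJobState($job);         // this string is what would be written to the database
$resumed = restoreJobState($stored);  // later, possibly in a different process
```

Because only IDs and counters are stored, the saved state stays small, which is one reason to keep a list of IDs rather than full page objects.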
public function getJobType() {
    if ($this->totalSteps > 100) {
        return QueuedJob::LARGE;
    }
    return QueuedJob::QUEUED;
}
Here we're arbitrarily making the judgement that having more than 100 pages to generate an XML file for is enough for the job to be classified as 'large'. There's no real processing difference for this at the moment; the main reason for doing so is to not clog up one of the queues with a job that will take several minutes to execute.
public function getSignature() {
    return md5(get_class($this));
}
To prevent multiple instances of the same job being added to a queue, each job defines a signature. The base AbstractQueuedJob defines a default that should be good enough for 95% of jobs, but in some cases you want to ensure that a job is the only one of its kind, regardless of parameters.
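The difference between the two kinds of signature can be sketched as follows. This is not the module's code: SketchQueue, ParamJob and SingletonJob are hypothetical classes, and the parameter-based signature in ParamJob is only an assumption about what a sensible default might look like.

```php
<?php
// Hypothetical queue that refuses a job whose signature is already present.
class SketchQueue {
    private $jobs = array();

    public function add($job) {
        $signature = $job->getSignature();
        if (isset($this->jobs[$signature])) {
            return false; // duplicate signature, so the job is not queued again
        }
        $this->jobs[$signature] = $job;
        return true;
    }

    public function count() {
        return count($this->jobs);
    }
}

// Signature built from the class name and the parameters (an assumed default).
class ParamJob {
    public $params;

    public function __construct(array $params) {
        $this->params = $params;
    }

    public function getSignature() {
        return md5(get_class($this) . serialize($this->params));
    }
}

// Signature built from the class name only, as in the sitemap job above.
class SingletonJob extends ParamJob {
    public function getSignature() {
        return md5(get_class($this));
    }
}

$queue = new SketchQueue();
$a = $queue->add(new ParamJob(array('page' => 1)));     // queued
$b = $queue->add(new ParamJob(array('page' => 2)));     // queued: different parameters
$c = $queue->add(new SingletonJob(array('page' => 1))); // queued
$d = $queue->add(new SingletonJob(array('page' => 2))); // rejected: same class, parameters ignored
```

With a class-only signature, a second sitemap job can never sit in the queue alongside the first, no matter how it was constructed.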
public function setup() {
    parent::setup();
    increase_time_limit_to();
    $tmpfile = tempnam(getTempFolder(), 'sitemap');
    if (file_exists($tmpfile)) {
        $this->tempFile = $tmpfile;
    }
}
The setup() method is called just before a job starts for the first time. In this case, we want to make sure that a temporary file (which we're going to build the sitemap.xml file into first) exists for us to work with.
public function prepareForRestart() {
    parent::prepareForRestart();
    // if the file we've been building is missing, let's fix it up
    if (!$this->tempFile || !file_exists($this->tempFile)) {
        $tmpfile = tempnam(getTempFolder(), 'sitemap');
        if (file_exists($tmpfile)) {
            $this->tempFile = $tmpfile;
        }
        // start again from scratch with a fresh list of pages
        $this->currentStep = 0;
        $this->pagesToProcess = DB::query('SELECT ID FROM "SiteTree_Live" WHERE "ShowInSearch"=1')->column();
    }
}
The prepareForRestart() method is executed whenever the job has been paused and then restarted. The pause could have come from a user manually pausing the job, or from an error that caused it to stop. Either way, it gives us a chance to check the state of the job and, if necessary, restart it. We could just as easily flag the job as complete here and not continue, but in this case we're making sure our temporary file still exists, and if it doesn't, creating a new one and starting from scratch.
public function process() {
    $remainingChildren = $this->pagesToProcess;
    // if there's no more, we're done!
    if (!count($remainingChildren)) {
        $this->completeJob();
        $this->isComplete = true;
        return;
    }
    // let's process our first item - note that we take it off the list of things left to do
    $ID = array_shift($remainingChildren);
    // do some processing work that adds content to $tmpfile
    // ... snip ...
    // and now we store the new list of remaining children
    $this->pagesToProcess = $remainingChildren;
    $this->currentStep++;
    if (!count($remainingChildren)) {
        $this->completeJob();
        $this->isComplete = true;
        return;
    }
}
The process() method is where all the actual work for this job happens, but it still needs to do a minimum of things to keep the container happy and in sync. First, it retrieves the list of pages still to be processed and checks whether there's anything left; if not, it marks the job complete. Next, it does the actual work with the next item in the list, then updates $this->pagesToProcess so that the next run moves on to the next item. It updates how many steps have been processed, then does another check to see whether the job has completed.
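As a rough illustration of why process() handles a single item per call, here is a sketch of a runner that calls it repeatedly and can stop partway through. CountingJob and runJob() are hypothetical names, and the step budget is an assumption standing in for whatever limits the real container applies.

```php
<?php
// Hypothetical job following the same pattern: one item per process() call.
class CountingJob {
    public $pagesToProcess;
    public $currentStep = 0;
    public $isComplete = false;
    public $processed = array();

    public function __construct(array $ids) {
        $this->pagesToProcess = $ids;
    }

    public function process() {
        $remaining = $this->pagesToProcess;
        if (!count($remaining)) {
            $this->isComplete = true;
            return;
        }
        $id = array_shift($remaining);
        $this->processed[] = $id;           // stand-in for the real work
        $this->pagesToProcess = $remaining; // remember what is left for the next call
        $this->currentStep++;
        if (!count($remaining)) {
            $this->isComplete = true;
        }
    }
}

// Assumed runner: calls process() until the job is done or the step budget runs out.
function runJob(CountingJob $job, $maxSteps) {
    $calls = 0;
    while (!$job->isComplete && $calls < $maxSteps) {
        $job->process();
        $calls++;
    }
    return $calls;
}

$job = new CountingJob(array(10, 20, 30));
$firstRun = runJob($job, 2);  // budget used up after two pages, job left incomplete
$secondRun = runJob($job, 2); // picks up where the first run stopped
```

Since the remaining list and step counter live on the job itself, the second run needs no knowledge of the first.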
protected function completeJob() {
    // ... snip ...
    if (file_exists($this->tempFile)) {
        unlink($this->tempFile);
    }
    $nextgeneration = new GenerateGoogleSitemapJob();
    singleton('QueuedJobService')->queueJob($nextgeneration, date('Y-m-d H:i:s', time() + self::$regenerate_time));
}
Finally, our completeJob() method actually copies our temp file to the right location, then cleans up the old file. Lastly, this job creates a NEW job and adds it to the queue to be processed at a date in the future; in this case it executes once every day.
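The start date passed to queueJob() is just the current time plus an interval, formatted as a date string. A small sketch of that arithmetic, assuming an interval of 86400 seconds to match "once every day" (the actual value of $regenerate_time is not shown here); nextRunTime() is a hypothetical helper, and the timezone is fixed to UTC only so the example is reproducible:

```php
<?php
// Hypothetical helper: format the start time of the next run,
// $interval seconds after the timestamp $now.
function nextRunTime($now, $interval) {
    return date('Y-m-d H:i:s', $now + $interval);
}

date_default_timezone_set('UTC');            // fixed timezone for a reproducible result
$regenerateTime = 86400;                     // assumed value: one day in seconds
$now = mktime(3, 0, 0, 1, 15, 2011);         // 2011-01-15 03:00:00 UTC
$next = nextRunTime($now, $regenerateTime);  // "2011-01-16 03:00:00"
```

In the job itself, time() takes the place of the fixed $now used in this sketch.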
Okay, that was a whole lot of words, but hopefully it gives an idea of what's involved in writing a queued job, or more to the point, what you don't have to worry about. The framework around this manages everything to do with error handling and reporting, including automatically pausing and restarting jobs and notifying on broken jobs. It manages the persistence of job state so that jobs can be picked up after they've been paused and still continue on. It also manages the scheduling of jobs in the future, so you can use the module almost as a cron replacement.