Magento collections represent a very convenient way of grouping models together. We can grab a group of models from database with little to none sql code thanks to special methods which collection model provides. Collections can get really big (new products, customers, categories, lots of attributes etc.) and become really tricky to work with. Let’s see what can be done to make large collections more manageable.
Let’s say we want to change the first name of every customer on our site to uppercase. One would probably do something like this:
$customers = Mage::getModel('customer/customer')->getCollection()->addAttributeToSelect(array('firstname'), 'inner');
foreach ($customers as $customer) {
$customer->setFirstname(strtoupper($customer->getFirstname()));
$customer->save();
}
The above code works perfectly well for smaller collections, but with large ones the script will eventually run out of memory and produce the following error:
“Fatal error: Allowed memory size of X bytes exhausted (tried to allocate X bytes) in magento_sitelibVarienDataCollection.php on line 550“.
Why is that so?
Well, collection is basically an array of objects, in this case an array of customer objects. The more objects (in this case customers) we have and the more attributes we add to selection, the bigger the resulting array will be, which leads to high memory usage.
Solution to this kind of problem is implemented in Magento in the form of Mage_Core_Model_Resource_Iterator model. It enables us to grab data from database one by one, which is in contrast to loading all results at once like we did in example above. Main part of the iterator is method walk() which needs two mandatory parameters to work, collection query string and callback method(s).
If we look at the method implementation:
public function walk($query, array $callbacks, array $args=array(), $adapter = null)
{
$stmt = $this->_getStatement($query, $adapter);
$args['idx'] = 0;
while ($row = $stmt->fetch()) {
$args['row'] = $row;
foreach ($callbacks as $callback) {
$result = call_user_func($callback, $args);
if (!empty($result)) {
$args = array_merge($args, $result);
}
}
$args['idx']++;
}
return $this;
}
we can see that it takes provided query string, executes it, fetches results one by one and sends them to the callback method. Result which is passed to callback method is placed in an array $args, which holds data for each object in collection. In callback function we can access the data with $args[‘row’].
Here is what our example would look like using iterator:
public function uppercaseAction()
{
// get customer collection
$customers = Mage::getModel('customer/customer')->getCollection()->addAttributeToSelect(array('firstname'), 'inner');
// call iterator walk method with collection query string and callback method as parameters
Mage::getSingleton('core/resource_iterator')->walk($customers->getSelect(), array(array($this, 'customerCallback')));
}
// callback method
public function customerCallback($args)
{
$customer = Mage::getModel('customer/customer’); // get customer model
$customer->setData($args['row']); // map data to customer model
$customer->setFirstname(strtoupper($customer->getFirstname())); // set value of firstname attribute
$customer->getResource()->saveAttribute($customer, 'firstname'); // save only changed attribute instead of whole object
}
Things are fairly simple here. We define collection and pass its sql query to walk method of iterator. Iterator than executes given query and calls customerCallback() method on each row returned from database. In callback method we instantiate customer model, map data from $args[‘row’] to the model and change its firstname property to uppercase.
One thing to take care about is object saving. Since we are saving lots of models we want our script to be as efficient as possible. Using “classic” save on model is pretty heavy operation which takes time and resources. Since we are changing only one attribute it is best to use saveAttribute() method which is much faster.
Thats it, no more “out of memory” error :).
Happy coding!