Enhanced export – collection to a file

Don’t let this tittle mislead you. This post is beginners guide that will explain how Magento’s method getCsvFile() works. We’ll explain some properties and advice you with some tips how to use some of the properties.
Probably you’re familiar with the error: “Fatal error: Allowed memory size of X bytes exhausted (tried to allocate Y bytes)…”. You’ll probably experience it if you don’t care about performance/memory limit sooner or later. If you want to know how to additionally avoid that error message, keep reading.
If you’re relatively new to Magento and if you didn’t already read Part 1., please review it before this post because we will not go here in deep.
In previous post (Part 1.) I wrote:
“In our next step we’ll modify call to $this->_exportIterateCollection(‘_exportCsvItem’, array($io));, but more accurately we’ll create new getCsvFile() method. Let’s call new method getCsvFileEnhanced() just to see how can we extend existing method…”.
Note that we can have 2 different implementation. One is to rewrite (extend) required block and second implementation could be to create model with required methods that doesn’t need to extend anything, just pure Model.
So let’s start with implementation – “rewrite” approach.
1. Create module structure and tell Magento about it
2. Create config.xml with block and controller rewrite
3. Create Block file
4. Create Controller file
5. We’re done!
1. Create module structure and tell Magento about it
Create file app/etc/modules/Inchoo_EnhancedExport.xml with a following content:
<?xml version="1.0"?>
<config>
<modules>
<Inchoo_EnhancedExport>
<active>true</active>
<codePool>local</codePool>
</Inchoo_EnhancedExport>
</modules>
</config>
2. Create config.xml with block and controller rewrite
Create file app/code/local/Inchoo/EnhancedExport/etc/config.xml with a following content:
<?xml version="1.0"?>
<config>
<modules>
<Inchoo_EnhancedExport>
<version>1.0.0.0</version>
</Inchoo_EnhancedExport>
</modules>
<global>
<blocks>
<adminhtml>
<rewrite>
<sales_order_grid>Inchoo_EnhancedExport_Block_Mage_Adminhtml_Sales_Order_Grid</sales_order_grid>
</rewrite>
</adminhtml>
</blocks>
</global>
<admin>
<routers>
<adminhtml>
<args>
<modules>
<Inchoo_EnhancedExport before="Mage_Adminhtml">Inchoo_EnhancedExport_Mage_Adminhtml</Inchoo_EnhancedExport>
</modules>
</args>
</adminhtml>
</routers>
</admin>
</config>
As you can see we are rewriting one block and one controller. When I’m rewriting some core functionality I’m always tring to keep Magento core path inside of my module, as you’ll see in next section.
3. Create Block file
Create file app/code/local/Inchoo/EnhancedExport/Block/Mage/Adminhtml/Sales/Order/Grid.php with a following content:
<?php
class Inchoo_EnhancedExport_Block_Mage_Adminhtml_Sales_Order_Grid extends Mage_Adminhtml_Block_Sales_Order_Grid
{
/**
* Rows per page for import
*
* @var int
*/
protected $_exportPageSize = 500;
public function __construct()
{
parent::__construct();
}
protected function _prepareColumns()
{
$this->addExportType('*/*/exportCsvEnhanced', Mage::helper('sales')->__('CSVe'));
return parent::_prepareColumns();
}
public function getCsvFileEnhanced()
{
$this->_isExport = true;
$this->_prepareGrid();
$io = new Varien_Io_File();
$path = Mage::getBaseDir('var') . DS . 'export' . DS; //best would be to add exported path through config
$name = md5(microtime());
$file = $path . DS . $name . '.csv';
/**
* It is possible that you have name collision (summer/winter time +1/-1)
* Try to create unique name for exported .csv file
*/
while (file_exists($file)) {
sleep(1);
$name = md5(microtime());
$file = $path . DS . $name . '.csv';
}
$io->setAllowCreateFolders(true);
$io->open(array('path' => $path));
$io->streamOpen($file, 'w+');
$io->streamLock(true);
$io->streamWriteCsv($this->_getExportHeaders());
//$this->_exportPageSize = load data from config
$this->_exportIterateCollectionEnhanced('_exportCsvItem', array($io));
if ($this->getCountTotals()) {
$io->streamWriteCsv($this->_getExportTotals());
}
$io->streamUnlock();
$io->streamClose();
return array(
'type' => 'filename',
'value' => $file,
'rm' => false // can delete file after use
);
}
public function _exportIterateCollectionEnhanced($callback, array $args)
{
$originalCollection = $this->getCollection();
$count = null;
$page = 1;
$lPage = null;
$break = false;
$ourLastPage = 10;
while ($break !== true) {
$collection = clone $originalCollection;
$collection->setPageSize($this->_exportPageSize);
$collection->setCurPage($page);
$collection->load(/*true, true*/);
if (is_null($count)) {
$count = $collection->getSize();
$lPage = $collection->getLastPageNumber();
}
if ($lPage == $page || $ourLastPage == $page) {
$break = true;
}
$page ++;
foreach ($collection as $item) {
//$item->setState($item->getState(), 'processing'); $item->save();
call_user_func_array(array($this, $callback), array_merge(array($item), $args));
}
}
/*exit();*/
}
}
Let me first explain meaning of protected $_exportPageSize = 500;. This is similar flag as you can set it in administration when you’re in some grid. You can always set more “Views” per page. Magento because of memory improvement added this flag so it doesn’t need to load entire collection in memory and then work with it (imagine one billion of records,…). If you have a web shop that has lots of orders,… per day and if you decide some day to export entire collection of orders, you could experience memory limit error.
So this is a great flag that Magento core team gave us but that’s not always enough and I’ll tell you soon why.
Next, our method getCsvFileEnhanced() is very similar to core method getCsvFile(). I only added one “WHILE” block while (file_exists($file)) {…, call to $this->_exportIterateCollectionEnhanced instead of $this->_exportIterateCollection and I set ‘rm’ => false // can delete file after use in return array. That’s pretty much all we did here. Now let’s look _exportIterateCollectionEnhanced and then I’ll explain why you should be careful with only _exportPageSize property.
First thing that we added in _exportIterateCollectionEnhanced() method is $ourLastPage = 10; and second thing that we added is in “IF” block if ($lPage == $page || $ourLastPage == $page) {. Now probably you can guess why I added additional “stop” in our collection.
In default Magento, you’ll probably have $_exportPageSize = 1000;. What Magento does from a whole collection is to break a whole collection in smaller chunks (1000 by 1000). So Magento will grab first 1000 records and then it will process them and save those 1000 records in a file. Then it will grab next 1000 records and do the same until process end.
If you don’t trust me ( 🙂 )you can in index.php file enable display_errors(1) and remove comments that are in “/*..*/ (change $collection->load(/*true, true*/); to $collection->load(true, true); and /*exit();*/ to exit(); after you finish with this post, because we didn’t yet override sales_order controller).
Sure, sometimes you’ll set a cron to export your orders. Maybe you don’t want to stop export after 10 “pages” because in next iteration of cron, it might export same records. Or if you want, then you’ll need “remember” page on which Magento stops last time so you can start from there next time,…
Additionally, often you want to change e.g. order status automatically (via cron job) after exporting to some shipping distribution your orders. For that case I left commented line /$item->setState($item->getState(), ‘processing’); $item->save(); so you can use it for your needs… In “5. step” – We’re done! we’ll export our orders and change status to processing.
So let’s create our controller file.
4. Create Controller file
Create file app/code/local/Inchoo/EnhancedExport/controllers/Mage/Adminhtml/Sales/OrderController.php with a following content:
<?php
$defController = Mage::getBaseDir()
. DS . 'app' . DS . 'code' . DS . 'core'
. DS . 'Mage' . DS . 'Adminhtml' . DS . 'controllers'
. DS . 'Sales' . DS . 'OrderController.php';
require_once $defController;
class Inchoo_EnhancedExport_Mage_Adminhtml_Sales_OrderController extends Mage_Adminhtml_Sales_OrderController
{
/**
* Export order grid to CSVi format
*/
public function exportCsvEnhancedAction()
{
$fileName = 'orders-' . gmdate('YmdHis') . '.csv';
$grid = $this->getLayout()->createBlock('adminhtml/sales_order_grid');
$this->_prepareDownloadResponse($fileName, $grid->getCsvFileEnhanced());
}
}
Here you can see that we are calling our newly created method getCsvFileEnhanced(), on our sales_order_grid and we added gmdate as suffix in name of our orders-***.csv file.
5. We’re done!
For end, let’s remove comment from //$item->setState($item->getState(), ‘processing’); $item->save(); – we could load from config our new order status. Because we know that we’ve rewritten our sales_order block, we know that our $item represent order object. Run your export CSVe and you’ll see that all of your orders should have “new” status. Usually you’ll only SELECT e.g. processing orders and mark them as e.g. shipped after you (cron) run some export…
I hope you’ve learned something new and in future you’ll be more careful about memory constraint when exporting collections!
12 comments
Dopo circa 4 ore di prove, ho capito perchè non esportava tutta la ‘collection’;
dopo aver clonato la collection
$collection = clone $originalCollection;
bisogna richiamare il metodo clear()
$collection->clear();
When I exported CSV with above code then one extra blank row at top is displaying and header started from row 2.
Great article, thanks for this. I managed to write my own little module to export a selection of orders that should be shipped and will implement it as a cron-job to export, then my local computer can download the file and send it to my labelprinter to print shipping labels; should be neat if I get it working.
However, the final part of the tutorial left me somewhat confused:
I got that code from this stack
After this the order should be invoiced and automatically set to ‘processed’.
I would like to be able to do that as well but with some variance: 1. store operator chooses when it happens (they often check for new orders after a couple of days), 2. the action exports the orders accompanied with 2 other files (customers.txt, products.txt) to a web directory which then gets downloaded by a processing app that merges the info from the 3 files.
Would be grateful if you would update what you figure out, M. Thanks!
Hi Ivan,
Thanks for this tutorial. I was checking the code but I can’t see any advantage with your method over the standard Magento method. As soon as the amount of rows in your collection gets too big PHP will still throw an out of memory error. Even with $_exportPageSize = 100; and $ourLastPage = 100; (I have a collection of 5.000+ products of which I would like to export a custom stock take).
The problem is that the loop is still within the same request so there’s no advantage in memory usage as far as I understand.
Wouldn’t a batched export be better?
how to sort by specific column?
i need to product review export. how it make?
Is there any way to export cms pages both active and inactive?
And i also need to import them to other server or website.
I found there is another module (comunity) which override Mage_Adminhtml_Sales_OrderController. If I switch off this module your code is working fine.
What are the permissions of the files? Maybe the problem in files permissions…
Same problem as Frank on Magento 1.7.0.2.
Anyone can help?
Hi Ivan,
Thanks a lot for the tutorial. It’s awesome.
But I seem to have a problem with the controller.
I get my new CSV option in the backend, but when I press the export button I get the 404 error.
So I would assume for some reason my controller-code that should extend the original Magento controller is not loaded.
I checked the directory paths many times, checked filename/directory case and everything looks correct.
I don’t see why it would not find the exportCsvEnhancedAction
Any idea why this could be? I use Magento 1.7.0.1