Magento tip: Get a small sized version of a large production database

[Guest post] Sooner or later, every developer has to deal with recurring tasks that come up again and again. The majority of our clients only need minor changes to their store’s appearance or some bug fixing, which usually means a relatively small amount of work. With this type of client, the process of taking over the project needs to be as quick as possible.
NOTE: This is a guest post by our associate and a great Magento developer Tsvetan Stoychev – you can learn more about him and his business over here, and now – enjoy the article!
In our team, I am the one in charge of transferring an existing Magento store to our local development environment and of distributing the project among my colleagues.
Basically, what I need to do is:
- Transfer the store’s source code from the production server to our local environment
- Download any media files and share them with the team members
- Dump the database and share the database file with the team members
- Create a code repository and an issue tracking system.
Today I will explain the problems I encounter while dumping and distributing the database, and what I do to speed up this process.
Normally, the project we take over is a Magento store that has been active for more than a year. In most cases, the database of a store like this exceeds 1 GB. It turns out that the greater part of the data in such a database is not vital for the task at hand (e.g. fixing a display problem in a particular browser). Dumping, downloading and importing a large database is time consuming, and we’d rather not waste a lot of time on short-term projects.
It still makes me smile to remember a colleague of mine waiting 30 minutes to import a 2 GB database on his local machine. That is what prompted me to look for a way to optimize this process.
So, having googled for a while, we came across an article entitled Magento Log & Cache Maintenance Script explaining how to periodically clean up some of the bulky Magento tables without disrupting the store’s functionality.
It occurred to me that a Magento store remains fully functional without the data from the tables listed below, the so-called IGNORE LIST:
- adminnotification_inbox
- aw_core_logger
- dataflow_batch_export
- dataflow_batch_import
- log_customer
- log_quote
- log_summary
- log_summary_type
- log_url
- log_url_info
- log_visitor
- log_visitor_info
- log_visitor_online
- index_event
- report_event
- report_compared_product_index
- report_viewed_product_index
- catalog_compare_item
- catalogindex_aggregation
- catalogindex_aggregation_tag
- catalogindex_aggregation_to_tag
I decided to draw on the log tables clean up script idea, and wrote a script that creates a dump for a small-sized database.
define('DS', DIRECTORY_SEPARATOR);

function _getExtractSchemaStatement($sqlFileName, $db)
{
    $dumpSchema = 'mysqldump' . ' ';
    $dumpSchema .= '--no-data' . ' ';
    $dumpSchema .= '-u ' . $db['user'] . ' ';
    $dumpSchema .= '-p' . $db['pass'] . ' ';
    $dumpSchema .= '-h ' . $db['host'] . ' ';
    $dumpSchema .= $db['name'] . ' > ' . $sqlFileName;

    return $dumpSchema;
}

function _getExtractDataStatement($sqlFileName, $db)
{
    // Tables from the IGNORE LIST - only their structure is dumped, never their data
    $tables = array(
        'adminnotification_inbox',
        'aw_core_logger',
        'dataflow_batch_export',
        'dataflow_batch_import',
        'log_customer',
        'log_quote',
        'log_summary',
        'log_summary_type',
        'log_url',
        'log_url_info',
        'log_visitor',
        'log_visitor_info',
        'log_visitor_online',
        'index_event',
        'report_event',
        'report_viewed_product_index',
        'report_compared_product_index',
        'catalog_compare_item',
        'catalogindex_aggregation',
        'catalogindex_aggregation_tag',
        'catalogindex_aggregation_to_tag'
    );

    $ignoreTables = ' ';
    foreach ($tables as $table) {
        $ignoreTables .= '--ignore-table=' . $db['name'] . '.' . $db['pref'] . $table . ' ';
    }

    $dumpData = 'mysqldump' . ' ';
    $dumpData .= $ignoreTables;
    $dumpData .= '-u ' . $db['user'] . ' ';
    $dumpData .= '-p' . $db['pass'] . ' ';
    $dumpData .= '-h ' . $db['host'] . ' ';
    $dumpData .= $db['name'] . ' >> ' . $sqlFileName;

    return $dumpData;
}

function export_tiny()
{
    // Read the DB credentials from Magento's app/etc/local.xml
    $configPath = '.' . DS . 'app' . DS . 'etc' . DS . 'local.xml';
    $xml = simplexml_load_file($configPath, NULL, LIBXML_NOCDATA);

    $db['host'] = $xml->global->resources->default_setup->connection->host;
    $db['name'] = $xml->global->resources->default_setup->connection->dbname;
    $db['user'] = $xml->global->resources->default_setup->connection->username;
    $db['pass'] = $xml->global->resources->default_setup->connection->password;
    $db['pref'] = $xml->global->resources->db->table_prefix;

    $sqlFileName = 'var' . DS . $db['name'] . '-' . date('j-m-y-h-i-s') . '.sql';

    // Extract the DB schema (structure of all tables, no data)
    $dumpSchema = _getExtractSchemaStatement($sqlFileName, $db);
    exec($dumpSchema);

    // Extract the DB data, skipping the tables from the IGNORE LIST
    $dumpData = _getExtractDataStatement($sqlFileName, $db);
    exec($dumpData);
}

export_tiny();
You can download the script from: https://gist.github.com/ceckoslab/4495889
In practice, the script creates a dump that has the same structure as the original database, but without the data of the tables from the IGNORE LIST.
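For illustration only, assuming a database named mymagentodb, user magento, password SECRET, host localhost and no table prefix (all hypothetical values), the two commands the script builds and runs would look roughly like this (the second command actually gets one --ignore-table option for every table in the IGNORE LIST; most are omitted here for brevity):
mysqldump --no-data -u magento -pSECRET -h localhost mymagentodb > var/mymagentodb-7-09-13-12-39-42.sql
mysqldump --ignore-table=mymagentodb.adminnotification_inbox --ignore-table=mymagentodb.log_visitor -u magento -pSECRET -h localhost mymagentodb >> var/mymagentodb-7-09-13-12-39-42.sql
The first command writes only the table structure to the file, and the second appends the data of everything except the ignored tables.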
How to use it:
1. Copy the script to the Magento root directory.
2.1. If you have SSH access, execute: php tinydump.php
2.2. If you do not have SSH access, you can try running the script from your browser: http://mymagentostore.com/tinydump.php
3. After the script finishes, inside the var directory of your Magento installation you will find an SQL file whose name will look like this: {DB NAME}-{date('j-m-y-h-i-s')}.sql. Example: var/mymagentodb-7-09-13-12-39-42.sql
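Importing the generated file on a local machine is then a plain MySQL import. A minimal example, assuming a local database named mymagentodb_dev has already been created and the local MySQL user is root (both hypothetical values):
mysql -u root -p mymagentodb_dev < var/mymagentodb-7-09-13-12-39-42.sql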
In some cases, a 2 GB database was reduced to 170 MB and the total time needed to export, download the sql file and re-import it to our local machines was significantly reduced.
Exceptions:
In some cases, the hosting company has restricted execution of the system() and exec() PHP functions, or has prohibited the execution of the mysqldump command. The script I have written uses exec() and mysqldump, and may therefore fail if these restrictions are in place.
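If you are not sure whether exec() is usable on a particular server, a quick check like the following (a minimal sketch, not part of the original script) can be placed at the top of tinydump.php before the dump is attempted:
// Minimal sketch: stop early if exec() has been disabled on this server.
$disabled = array_map('trim', explode(',', (string) ini_get('disable_functions')));
if (!function_exists('exec') || in_array('exec', $disabled, true)) {
    die('exec() is not available on this server - use the phpMyAdmin approach described below.');
}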
However, you can use phpMyAdmin instead and export the database in the following two steps:
1. First, export only the structure of the database, selecting the following options:
- Tables ( Select All )
- Disable foreign key checks
- Dump table ( structure )
2. Then, export only the data of the database, selecting the following options:
- Tables ( Select all tables except those from THE IGNORE LIST )
- Disable foreign key checks
- Dump table ( data )
The result will be two files: the first contains the structure of the database, and the second contains the data of the tables that hold the essential information.
In order to set up the local installation, we first need to import the structure file, and then the data file.
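A minimal example of that import order, assuming the two exported files were saved as structure.sql and data.sql and the local database is named mymagentodb_dev (all hypothetical names):
mysql -u root -p mymagentodb_dev < structure.sql
mysql -u root -p mymagentodb_dev < data.sql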
I hope you find this article useful 🙂
10 comments
Hi Tsvetan,
Any suggestion for M2? I am having around 3GB of M2 DB.
Thanks
thanks mate!
Wow, this is a really great post. With all the examples I am going to apply it in my project too... keep sharing it here.
Script not working with Magento 1.7
Nice instructions for the manual export/import; thanks to this article I could import a 5 GB database which got reduced to a 500 MB one.
I get the following error:
Warning: simplexml_load_file(): I/O warning : failed to load external entity “./app/etc/local.xml” in /home/bluebal/tinydump.php on line 60
sh: var/-2-07-14-10-44-18.sql: No such file or directory
Mag 1.7
Amazing script. In my case I was able to shrink down the database from 1GB to 250 MB.
Hmm… I was here looking for a solution to my problem. But until now I still don’t know how to fix it. I have a magento database with around 50MB of data. But when I create a dump file, I can only dump around 900KB. Which is really weird.
I also have another table with 600MB of data, but I can dump all of it. This is just in the same server using magento. The db type of both is InnoDB.
I assume that there seems to be a problem with my other Magento database, but I know it’s not corrupted. I still don’t know how to export it completely. I’m now having a problem exporting it and importing it to another database, because I can only export less than 1MB of that 50MB database.
I’ve been browsing inchoo for hours now but I still can’t find a solution to my problem. Not even Google can help me. I hope somebody here will reply and have encountered this problem before. 🙁
T______T
That's cool, but not the best solution )))
1. You wrote here about some DB tables that are not from the Magento core (aw_core_logger - I guess that's some logging extension from the aheadWorks team), so how can we know that another project doesn't have other 3rd-party extensions with big tables whose data we don't need?
2. In some projects / extensions we may actually need the log / report table data
So I mean that different tasks can bring many different situations, and the best way I see here is:
1. Export only the needed data, as you explained here
2. Use some additional software to import only the needed data (not advertising - for example ___sypex.net___ / en / products/dumper/downloads/ ) - I think it's much more comfortable
And one more thing - it's really not best practice, but for little bug fixes, like fixing a display problem in a particular browser, it's really difficult and time consuming to create a local repo / copy. In such situations I try to fix it on the fly with plugins like Firebug / developer tools, or do it on a test page by viewing the buggy block there, or finally do that work at night, when there are not so many customers online )))
This is actually a very rare situation for me, because most of my clients already have a development copy / environment for such things.
So I wish everyone customers with a pre-configured development environment )))
Happy New Year
You missed this line in the dumpData variable:
you can just use n98-magerun https://github.com/netz98/n98-magerun