Thou Shalt Not Do Inserts in a Foreach … Unless You Know The Trick
TL;DR;
Don’t do it. Just, simply, never write a database query inside a loop. Ever. Put in a little bit of effort and write code that will insert or read all data in one big batch, or at least group big chunks of data in the smallest possible number of queries. That’s it. Move on. These aren’t the droids you’re looking for.
But I’ve been blessed with the virtue of laziness!
Ah. Ok then, read on.
Being blessed with the same virtue myself, I’ve taken a shortcut here or there too, when i was sure no one knew it’s in a non-performance-critical piece of code.
But, that really clashed with my other virtues of impatience and hubris. What if that foreach gets a lot of data, and queries take forever, strangling the server? What if someone saw that code and got sick? Those questions kept me awake in the night and woke me, sweaty, in the wee hours. I could not look at my image in the mirror without a sense of shame.
Things could not go on like that any more.
So I learned of this little trick.
The Trick
Take a good, hard look at this piece of code:
$model = Mage::getModel('your_model/dbtable');
$data = [/* Array with lots of data */];
foreach ($data as $row){
$model->setData($row)->save();
}
We have all seen it numerous times. Let’s admit it, we wrote code like this at some (bad) point in our life. We’ll call this one “stupid foreach”. What makes it so slow? Transactions and disk writes. For each iteration, database will make the lock, write data to it, unlock it, make a network request, etc.… And that takes time.
That’s why you really should do this instead, somewhere in your model resource:
$this->_getWriteAdapter()->insertMultiple($this->getMainTable(), $data);
That writes the data in 1 transaction, no matter how big it is. And it’s even less code than the stupid foreach! Yet, stupid foreach creeps into code again and again.
Enter the trick:
$model = Mage::getModel('your_model/dbtable');
$resource = $model->getResource();
$data = [/* Array with lots of data */];
$resource->beginTransaction();
foreach ($data as $row){
$model->setData($row)->save();
}
$resource->commit();
Yep, it still looks stupid, it’s the most code of the three, but this also gets the data into that database in one commit. And, it still works if you have a really messy foreach with database reads and writes combined.
Talk is cheap, show me the numbers
Here’s what the speed tests say (time in seconds):
100 iterations | 1000 iterations | 10000 iterations | |
Insert multiple | 0.017 | 0.079 | 0.379 |
Foreach in a transaction | 0.074 | 0.501 | 4.701 |
Stupid foreach | 1.097 | 11.729 | 117.410 |
What’s the point?
What if you have a huge CSV that needs to be imported, and it just doesn’t fit in memory in one big batch? Reading and inserting line by line would work. What do you do when you come across badly written foreach inserts, and don’t have time or energy to refactor that? Or whatever. You know, stuff in the real world. Daily survival in backend development trenches.
The answer is simple. Yes, it’s always possible to do it properly and you really should, but if you can’t for whatever reason, at least do the trick. It’s just two lines of code around the foreach, and the performance is an order of magnitude better than the stupid foreach. Yeah, it’s still an order of magnitude worse than doing it properly, but it’s a horrible world out there.