One of the differences between Magento eCommerce platform and lets say WordPress, when looked from developer point of view, is “avoid raw queries” approach. For a web application, Magento is massive system. His database, although not so massive but surely breathtaking with around 220 tables forces you to use “eye candy” EAV model to do even simple things.
For instance, if I were to tell you that I want you to retrieve 5 simple products from database that have price in the range of 200-600USD you would most likely spent 1-4 hours trying to work out the query. In the end you would be lucky if you even get the query that wont break on next Magento upgrade. Let me just remind you on some of the things you need to be careful in example above.
Your query would have to take in consideration the at least the following states: is my product visibility such that it can be seen in Catalog and search result, is my product enabled or disabled, is is out of stock, which website and store view has it been assigned, is it assigned to only one category and that category is disabled by some chance, does it have special price assigned, does it have promotion rules assigned, does it have content (descriptions) for multilingual site assigned.
All of those and more are questions that one has to take into consideration when doing raw database queries. It makes no sense to create query that fetches product and all its appropriate info (price, quantity, description…) if that product is disabled or assigned to category that is disabled. For instance, if you need “featured” product functionality and you manually create raw query that extracts one “featured” product to lets say home page. There you create nice block holding all the necessary info of the product with a link for lets say “Add to cart” or “View product”. Clicking one of those two link would give you most likely 404 if the product is disable or any other condition that enables the product to be shown is disabled.
So basically, raw SQL queries in Magento are a “no no” for most of the time. Too much work and you never know what they will change in future upgrades. Which brings me to this new trend of theirs which I like to call “Escape from EAV”.
EAV, also know as Entity Attribute Value was suppose to bee (or is) the next big thing in OOP. Anyhow, Magento and his way of EAV have two huge downfalls called LACK OF DOCUMENTATION and WHAT WILL THEY CHANGE IN NEXT UPGRADE. The other day I was working on a Magento shop that had 22 000 products. Almost all of them were simple products, just about 600 were grouped. The store was initially installed on Magento 1.2 version. Site was pretty much 95% complete from both graphic and development perspective. Anyhow, Magento version 1.3.1 was released and on client demand we decide to to upgrade. Important thing to keep in mind is that in version 1.3 Magento introduce “change in philosophy” concerning the EAV model.
Altough EAV sounds great (and really is great for most of the time), it can really slow things down with all those JOINS executed on database. So the Magento team decided to do a little flat tabling. Basically we now have massive data duplication in MySQL where data is taken from various tables and copied int one, the flat table. Flat tables were introduced for both Products and categories, in regards to various website and store views.
Basically Magento has the power to “on the fly” create tables and do the “magical” copying of data from various tables to the flat tables. I assume they were looking for a faster way to “assemble” Product objects in Magento which in turn should boost the speed of collection object i grew so found about. I can live with duplicated data in database, I mean I am not the one writing them down. But let me get back to real world scenario.
As I mentioned, store I have been working on had more than 20 000 products. Due to limitations of both MySQL and PHP failed to do Rebuilt that you can find in System > Configuration > Cache management. This failure caused corrupted data in database. This manifested with “empty” attributes (attributes not showing up in Layered navigation). I resolved the issue by “re saving” all of the 20 000 product with custom script I wrote. After that i noticed the data in flat tables got rebuild as well. This of course took several hours. Magento EAV model seems OK on paper, but with store that hold’s large number of products it simply does not work, at least not in way in which their collection objects are built. Changes introduced in form of flat tables in versions 1.3 should, to some extent, improve work with stores that have large numbers of products. However this improvement seems to come in form of “copy all data from scattered tables to single flat table”.
Solutions like these are nightmare when it comes to upgrading the store that already has large number of products in. System is expected to handle part of the job transparently but it failed, leaving the client with corrupted database.
So, a word of advice for upgrading an existing Magento store: NEVER do it on live site. Let professional developer transfer live site and database to his dev machine, or make a copy on some sub folder on live site. Magento’s extreme out of the box feature rich capabilities, open source philosophy and good marketing have made it extremely popular but be careful, free is such a a loose term.
I am really anxious to see the future path of EAV vs Flat model in Magento.
PS. This topic was discussed on two more following articles from my colleagues. Tomislav Bilic firstly introduced Magento MySQL database diagram back in 2008, while Mladen Lotar expanded this intro and provided more details at his Magento’s database layout and it’s EAV structure article at 2010.