Handling intensive Magento tasks in the background, simulating AJAX via CRON

If you are faint of heart, please skip this article. Before I present the idea, please note that it is just an idea. So what’s it about? In everyday Magento development there are numerous tasks where we need to import data into Magento or programmatically create entries such as products, orders, customers, etc. At larger scale, importing or creating 50 000 products can turn out to be a real challenge. Usually we (the developers) strive for the simplest possible solution, which in this case would be a simple foreach loop with our product-creation code running in each iteration. Technically this works, but only with extreme PHP/environment settings, such as very high memory and execution time limits.

The second thing that might come to mind is the AJAX approach, like the one the default Magento install uses when you start a product import. This solution is better than the previous one because the work is split across many small requests, so the system does not run out of resources. It has one major drawback as well: it works only while your browser stays open.

The third approach, which might be interesting to some, is something of a combination of the first two. I call it “simulating AJAX via CRON”.

Let’s take a practical example of a client’s request.

Client: I have an old web shop system that I would like to switch to Magento. There are plenty of orders in that system that are both “finished” and “active” and I would like to preserve (switch) these orders to Magento with appropriate statuses.

You: Sure. For starters, can you give me a rough number of how many orders we are talking about?

Client: Roughly there are somewhere around 40 000 – 44 000 orders. Most of them are “finished”, however around 2 000 are active. Regardless of that, I would like them all to be “transferred” to Magento so I can have a reference to them all if needed later. My current web shop system stores all order data in one flat table. Is this doable?

You: Yes, it’s doable. However, this is a somewhat complex process, so the solution won’t be quick. Roughly described, we need to transfer/create full customer data in Magento, then programmatically create orders with properly assigned customers, etc. We cannot simply write to the database directly, as the more valid approach is to use the available Magento models/classes. With this many orders to handle, we will need to AJAX-ify the process or come up with an “incremental” approach that does not clog or break the system.

Client: I cannot say I understand it all, but I have no doubt you will come up with the best solution. Let me know if you need my input on anything else. Also, when can I expect the first results?

You: …

After the discussion with the client, you realize that this is a perfect example of a task that is both complex and time-consuming, similar to the default Magento product importer. The only real difference here is that you do not want your client to stare at a browser window showing an “AJAX-ed import process” while your custom-coded script imports/creates roughly 44 000 orders and their customers.

So, why not apply the “simulating AJAX via CRON” approach?! The basic idea is that you simply move all the code/functionality that was to be called via AJAX into CRON-executed code, then have your CRON trigger every minute.

To make this work you need to:

  • Add an additional column to the client’s order table called “myscript_is_processed”. This column holds NULL for non-processed orders or “1” for processed ones.
  • Write your custom script so that, once triggered via CRON, it grabs a small random set of entries from the client’s order table that the “myscript_is_processed” column does not mark as processed. During this step you can do all sorts of logging for later review/tracing in case something goes wrong.
  • Parse the orders from the client’s table, handle their creation on the Magento side, then set the proper flag in the “myscript_is_processed” column of the client’s table.
  • The next time CRON runs your code, it will look for another small random set of non-processed orders in the client’s order table by checking the “myscript_is_processed” column.
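
The steps above can be sketched roughly as follows. This is only an illustration of the flag-column pattern, not the actual Magento code: it uses Python with an in-memory SQLite database instead of PHP and MySQL so that it runs on its own. The table name `client_orders`, the `customer_email` column, the batch size and the `create_magento_order` placeholder are all assumptions made for the example; only the “myscript_is_processed” column comes from the description above.

```python
import sqlite3

BATCH_SIZE = 3  # hypothetical value; small enough for one cron run to finish quickly


def create_magento_order(row):
    """Placeholder (hypothetical) for the real work done through Magento models."""
    return "imported order %s" % row[0]


def process_batch(conn, batch_size=BATCH_SIZE):
    """One cron run: import a small random set of non-processed orders."""
    rows = conn.execute(
        "SELECT id, customer_email FROM client_orders "
        "WHERE myscript_is_processed IS NULL "
        "ORDER BY RANDOM() LIMIT ?",
        (batch_size,),
    ).fetchall()
    for row in rows:
        create_magento_order(row)
        conn.execute(
            "UPDATE client_orders SET myscript_is_processed = 1 WHERE id = ?",
            (row[0],),
        )
        conn.commit()  # commit per order so an interrupted run keeps its progress
    return len(rows)


def make_demo_db(n_orders=10):
    """Build a stand-in for the client's flat order table (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE client_orders ("
        "id INTEGER PRIMARY KEY, customer_email TEXT, "
        "myscript_is_processed INTEGER)"  # NULL = not processed, 1 = processed
    )
    conn.executemany(
        "INSERT INTO client_orders (id, customer_email) VALUES (?, ?)",
        [(i, "customer%d@example.com" % i) for i in range(1, n_orders + 1)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    runs = 0
    # Each loop iteration stands in for one cron trigger.
    while process_batch(conn) > 0:
        runs += 1
    print("cron runs needed:", runs)
```

Because the flag is written right after each order is created, a run that dies halfway simply leaves the remaining rows as NULL, and the next CRON trigger picks them up.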

This way, you do not need to keep any browser open during the process.

This approach is not ideal, as it has its drawbacks. However, it might give you some clue on how to handle certain issues/tasks a bit differently than you are used to.
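
One possible drawback, offered here as an assumption since none are listed above: with CRON firing every minute, a slow batch may still be running when the next trigger arrives, and two runs could then pick up the same orders. A minimal sketch of a lock-file guard is shown below, again in Python for illustration only. The lock path and the function names are hypothetical.

```python
import os
import tempfile

# Hypothetical lock location; any path writable by the cron user would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "myscript_import.lock")


def try_acquire_lock(path=LOCK_PATH):
    """Return True if this run got the lock, False if another run holds it."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.write(fd, str(os.getpid()).encode())  # record the owner for debugging
    os.close(fd)
    return True


def release_lock(path=LOCK_PATH):
    """Remove the lock file so the next cron run can proceed."""
    os.remove(path)


def run_once(job, path=LOCK_PATH):
    """Run job() unless a previous run is still busy; report whether it ran."""
    if not try_acquire_lock(path):
        return False
    try:
        job()
    finally:
        release_lock(path)
    return True
```

A run that finds the lock taken just exits and leaves the work to a later trigger. Note that a killed process leaves a stale lock file behind, so a real setup would also need some way to detect and clear it.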

Cheers.


2 comments

  1. We have had some large imports like this to handle in Magento and are about to embark on a product loader for a large webstore.

    The solution we have is a cron-executed version of the Magento Dataflow, and it is indeed a much faster and slicker solution. Implemented in full, it can create status updates and notifications that are visible in the Magento backend, and we are now bringing this into a full internal module for this sort of processing.

    Being able to use a more direct approach that skips the AJAX calls will save us a lot of man-hours.

    At the time of writing, the module handles stock quantities only, as a test, and runs through 20,000 records in just under 2 minutes (still using the Magento core to do this).

  2. Magento DataFlow is not as good as it should be, and I understand why customers get frustrated waiting while Magento imports data.

    Sometimes it is easier to write your own profiles that import data into an internal database structure and then convert it to the EAV (Catalog, Customer) or flat (Orders) structure via SQL queries. That way it won’t take longer than 5 minutes to import 44 000 items. It is a well-known practice to minimize data manipulation in PHP, since MySQL handles large amounts of data far faster. But that is my opinion, and which kind of import to use is up to you, guys.
