Magento SEO: How to handle problems caused by layered navigation?


Layered navigation, a feature available in Magento without any extensions, is commonly used by merchants around the world. It is also one of the most painful Magento features for SEOs. It creates a large number of terrible URLs (depending on the number of filters and products, often tens of thousands) with duplicate or near-duplicate content and identical page titles and descriptions.

I made this video to show you what your options are and what, in my experience, is the best possible solution for handling Magento layered navigation indexation issues. I hope it helps:

After you have watched the video, you know what to do. How do you know if you did it correctly? Log in to your Google Webmaster Tools, click on Health -> Fetch as Googlebot and check whether the layered navigation shows up in the fetched page.
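
That check can also be scripted against the HTML that the fetch returns. A minimal sketch, assuming the theme wraps the layered navigation in an element with the class block-layered-nav; the function name and that default class are illustrative assumptions, so adjust them to your own markup:

```javascript
// Returns true when the fetched HTML still contains the layered navigation block.
// The default class name is an assumption; pass the theme's own class if it differs.
function hasLayeredNav(html, className = 'block-layered-nav') {
  const classAttr = /class\s*=\s*["']([^"']*)["']/gi;
  let match;
  // Walk over every class attribute and compare its individual class names.
  while ((match = classAttr.exec(html)) !== null) {
    if (match[1].split(/\s+/).includes(className)) {
      return true;
    }
  }
  return false;
}

// Usage: paste the source shown by "Fetch as Googlebot" into a string and test it.
console.log(hasLayeredNav('<div class="block block-layered-nav"></div>'));
console.log(hasLayeredNav('<div class="block block-cart"></div>'));
```

If the function returns true for the page as Googlebot sees it, the filter links are still exposed to the crawler.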

I hope I helped. Does anyone have a different experience or advice on this issue?

If you need any help, we can do a Magento Website Assessment for your site.


Author

Toni Anicic

E-commerce Consultant

SEO. Professional gaming. Home-brewed beer. Billiard.

Discussion 50 Comments

  1. Could you not add rel=canonical on all the filtered pages back to the main category?

    I know that Amazon also has quite a complex cloaking solution to filterable navigation – I believe someone on SEOmoz did a write up about it a while ago. http://www.seomoz.org/ugc/dealing-with-faceted-navigation-a-case-study

    Cheers!

  2. Toni Anicic

    Ed,

    What works for Amazon is usually not the best solution for a mid-size online merchant. Amazon has an amazing amount of link juice to distribute, and they can afford to have somewhat different navigation than the rest of us and still rank well.

    Regarding rel=canonical, short answer is no.

    Long answer: rel=canonical can point to the same content, differently organized, but not to different or “thinner” content. Think of it as a tool to show that the https version of a URL is actually a duplicate of the http version, or that a URL with some ?ref=abcd parameter that doesn’t really change the content should be treated as the version without the parameter. But as soon as the content changes, canonical is not applicable.
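
The distinction above can be sketched in a few lines. This is only an illustration, assuming a hypothetical list of parameters that never change the page content (TRACKING_PARAMS) and a hypothetical preferred scheme; filter parameters are deliberately left untouched because they do change the content:

```javascript
// Hypothetical list of parameters that do not change what the page shows.
const TRACKING_PARAMS = ['ref', 'utm_source', 'utm_medium', 'utm_campaign'];

// Builds the URL a rel=canonical tag would point to: the http and https
// versions collapse to one scheme, tracking parameters are dropped, and
// everything else (including layered navigation filters) is kept as is.
function canonicalUrl(rawUrl, preferredProtocol = 'https:') {
  const url = new URL(rawUrl);
  url.protocol = preferredProtocol;
  for (const name of TRACKING_PARAMS) {
    url.searchParams.delete(name);
  }
  url.hash = '';
  return url.toString();
}

console.log(canonicalUrl('http://example.com/shoes.html?ref=abcd'));
console.log(canonicalUrl('http://example.com/shoes.html?color=1&ref=abcd'));
```

The first call collapses to the plain category URL. The second keeps ?color=1, so under this reading a filtered page is its own page rather than a duplicate of the category.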

  3. Very true but it certainly is one approach to look at that seems to be working for them. Perhaps it would make for a great module :)

    I would argue that rel=canonical would be appropriate here, you have the “same” content spread over various pages which all in effect relate to your main category. Without a better solution, at least this method would remove the duplication issue and not lose you that value juice.

  4. Dan

    Thanks for the great explanation. I’ve been suffering from duplicate content (at least in part due to layered nav) and it’s great to know that you guys have found a working solution. One suggestion though – maybe get a wireless microphone for the next video? Sound quality was pretty poor.

  5. I’m with Ed here on the canonical tag.

    It seems to me that layered navigation is to categories what css is to content. It feels very much like a way for the user to interact with the same content in a different way (without restructuring the same content each time) but we still want Google to index the category just once, regardless of the various view opportunities.

    Unless I’ve missed something I’ll later come to regret, doesn’t canonical allow us to do exactly that?

  6. Toni Anicic

    Michael,

    You usually have both normal AND layered navigation. So normal navigation will stay and be indexed; root categories and subcategories will be indexed. In layered navigation those are parameters (attributes) which filter the data, not categories. You don’t want them indexed; it would be a mess of duplicate content, identical titles and descriptions.

    Canonical, as I explained earlier, is only meant to point the same or very similar content to the original source. It’s not applicable to layered navigation filter pages. You can try it, but Google will usually ignore it.

  7. Toni Anicic

    BTW. I just had a quick e-mail chat about this with Rand Fishkin (SEOmoz CEO) and he agrees with my approach on handling this issue.

  8. Ah I see your point on canonical and there’s a nice rundown from Google which supports that in regards to its intended use: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139394

    What I would say is that whilst it may not be entirely correct to use, canonical seems to have worked for us in these scenarios, so perhaps it’s a little hit and miss. Either way it’s nice to have the current options explored.

  9. Steve

    This is by far the best discussion on the web on this topic & Magento. /hatTip

  10. Emanuel

    So, any nice guidance on how to achieve this using the cookie trick?

  11. Patrick

    This doesn’t really help: three and a half minutes are spent ruling out the other possibilities, and then the actual solution is rushed through in 30 seconds. Can anyone explain how exactly to do it?

  12. Toni Anicic

    Patrick,

    I thought most people would like to know WHY we suggest this solution and not some of the other options, and from the first comments I can see that was the right approach to presenting the solution. You should always know why you are doing something the way you’re doing it, not just blindly follow advice you read somewhere on the internet. Especially when it comes to SEO. I’m sorry I wasted your 3 and a half minutes by explaining how and why we did it the way we did in a completely free article and advice that I filmed and wrote for you and your business to benefit from, and that you chose to read and watch for free. I’m such a terrible person.

    Emanuel,

    If/when developers find some time, they’ll make a guide on how to achieve this from a technical perspective. For now, you can do the same thing my developers did when I told them what needs to be done: Google how to detect if cookies are enabled.

  13. Thanks for your reply, Toni!

    I will spend some time later trying to find a solution and I will post it here if/when I manage to get the job done. :)

  14. Interesting article on a big issue with SEO and duplicate content. I look forward to seeing a working solution.

    In my opinion the combination of noindex, follow and GWT parameters is the best solution. 1.) If you love your shop and your domain … on-page optimization should be white, not black or grey. 2.) With noindex, follow you direct the power to the products and you get more links to each product. PageRank is a relic from better times ;-)

  16. Ben

    Hi Toni,

    The SEO Layered Navigation plugin claims to hide layered navigation html from the page source in case you need to prevent indexing by search engine spiders.

    Layered navigation html is encoded with php and then decoded with javascript so search engine bots don’t see any links, links are only available for users with enabled javascript.

    What is your opinion on this solution?

  17. Toni Anicic

    Hi Ben,

    I saw there’s a SEO Layered Navigation extension, however, I never tested it, since our usual experience with Magento extensions is that they are not coded very well except for a very few extension providers we trust.

  18. Ben

    Hi Toni,

    Thanks for your reply. Yes, there are a few extensions that do similar things. I’m trying to decide between one that enables SEO friendly URLs for attributes (rather than ?manufacturer=2&color=1) or the GoMage advanced navigation that lets the “?manufacturer=2&color=1” part be hidden completely. Ideally trying to eliminate the duplicate content from the layered nav. Any thoughts?

  19. Toni Anicic

    @Ben,

    If I had to go with an extension, I think I’d choose the GoMage solution.

  20. Johno

    So link juice still flows to pages on a site that have meta noindex on them or are blocked by robots.txt?

    I thought the link juice would be blocked out??

    Are you sure about this?

  21. Johno

    Nofollow on the links plus meta=noindex would work :)

  22. Toni Anicic

    @Johno,

    Yes link juice flows through the noindex URLs.

    No, nofollow on internal links is an even worse idea than noindex, follow, and has been since 2009. It’s a common misconception; read this: http://inchoo.net/ecommerce/why-relnofollow-in-ecommerce-menus-is-a-bad-idea/

  23. Hi Toni,

    If you had to pick between the GoMage extension detailed above and this – http://amasty.com/improved-navigation.html – which would you recommend?

    Cheers,

    Steve

  24. Toni Anicic

    @Steve,

    It looks good in features, but I’d give it to my developers to inspect the code before putting it on a live site.

  25. Ming

    @Toni,

    Interesting article / video. How would you handle layered navigation pages already indexed?

    By implementing a hidden layered navigation, past pages would still be in the index?

  26. Toni Anicic

    @Ming, yea… they will. Since there will be no more links towards them, at least no internal ones, you can remove the URLs from the index through Google Webmaster Tools. That will take a lot of time, but I don’t see a much better solution.

  27. Ming

    @Toni,

    Thanks for the reply, from what I understand in your video:
    1) nofollow is terrible (this I understand, pretty common knowledge).
    2) noindex (wouldn’t this be good to use to remove URLs from the index rather than GWT?) – I understand that while you can remove the URLs from the index, the link juice will still flow through, but of course, it’s a bit wasted?
    3) robots.txt (I never see the point of using this, it’s easier to maintain via a meta tag).
    4) GWT parameters (agreed, we use this and it does pretty much nothing, stuff gets indexed and it doesn’t help with duplicate content).

    I like your solution of hiding the layered navigation from google to prevent link crawling…

    But wouldn’t a more comprehensive solution be:
    1) Hide the layered navigation (i.e. your solution)
    2) Use noindex on layered navigation pages to remove already indexed pages from google
    3) Use rel canonical on layered navigation pages to tell google which page is the original (or do you think using this with noindex is pointless? noindex + canonical, or one of them?)
    4) Use GWT parameters anyways?

    Would this be worse than your solution? From what I can work out, essentially avoid nofollow and robots and have the hidden layered navigation and this is the gist of it?

    Thanks Toni!

  28. Toni Anicic

    Hi Ming,

    Noindex is a funny thing, it actually doesn’t mean “You can’t index this”, it means “You can’t show this in search results”. Robots.txt disallow means “You can’t index this” but it doesn’t mean “You can’t show it in the search results”.

    So, noindex shouldn’t really “de-index” already indexed pages. But, since we hid the layered navigation, there are no more internal links to it so no link juice is passed to them anymore.

    The only remaining problem are those already indexed links since they can show up in the search results, but when you think about it, it’s not really an issue anymore, they can remain in index.

  29. Ming

    @Toni,

    Ahh very important distinction, I was trying to work out the difference between noindex and robots.

    Makes sense: essentially, hiding the links will stop link juice passing, and the old stuff in the index is not really an issue since the main pages should be ranked higher in any case because of the change.

    Is there a GWT bulk removal upload, btw?

  30. Toni Anicic

    From what I know, you can only remove URLs one by one :(

  31. Toni Anicic

    Hey everyone, there is a new video in the series of these Magento SEO video tutorials, if you are interested: http://inchoo.net/online-marketing/magento-seo-check-your-extensions/

  32. Hi Toni,

    thanks for the video. I am with you that what you are describing (good old PageRank sculpting, basically) might be the best solution to control link juice and distribute it amongst categories.

    But I am also with Ed and Michael that the canonical* (even though that’s not 100% what it is intended for) might lead to similar effects.

    *I am not sure, though (does anybody know the answer?), how much link juice, if any, is passed through outgoing links on filter pages that have a canonical back to the original unfiltered page.

    Just applying the noindex would push more link juice down to the products, as Peer suggested. The downside: if there are a lot of filters available, it would distribute the link juice almost equally amongst all products. And it would weaken the category pages a lot!

    So I would agree with Toni that the best solution probably is PR sculpting, ideally in combination with smart sorting so that your top products get displayed on top and get the most link juice. If you have many products in a category you would have to deal with pagination (here noindex might make sense to help get all your products indexed).

    Toni: What do you think of using filter pages as landing pages instead, making the Ajax crawlable? I assume that all filter combinations have enough search volume, so this would make sense…

  33. By the way, the difference between robots.txt and noindex is a bit different than you described above:

    “Noindex is a funny thing, it actually doesn’t mean “You can’t index this”, it means “You can’t show this in search results”. Robots.txt disallow means “You can’t index this” but it doesn’t mean “You can’t show it in the search results”.”

    Noindex => You can read this but you can not index it

    Robots.txt => You can’t read this but you can index it

  34. Toni Anicic

    Hi Raoul,

    Regarding robots, yea, depends how you define “indexing” but if you think about it, reading something as a robot is indexing, it just depends in which index you store it :)

    Filter pages in Magento’s layered navigation are really not optimized for SEO, you have no control over title, meta data, URLs are not SEO friendly… so using them as landing pages would require heavy modifications.

  35. Assuming that you would do these modifications where would you see the downside?

  36. Toni Anicic

    Depends on how much link juice you have to work with in the first place. Layered navigation could create literally millions of URLs, depending on the amount of attributes and products. You might not wanna disperse all of your link juice on a really big amount of URLs unless you’re going for a very very long tail. With such a set-up you can totally forget about highly competitive broad match short keywords.
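
To get a feel for how quickly the URL count grows, here is a rough back-of-the-envelope sketch. It assumes each attribute is either not applied or set to exactly one of its options (no multi-select), and the attribute counts and the number of categories are made-up illustrative values:

```javascript
// Counts the distinct filter combinations one category can expose, assuming
// every attribute is either unset or set to exactly one of its options.
function countFilterUrls(optionCounts) {
  let combinations = 1;
  for (const options of optionCounts) {
    combinations *= options + 1; // +1 for "filter not applied"
  }
  return combinations - 1; // exclude the unfiltered category page itself
}

// Hypothetical category: 20 manufacturers, 12 colors, 10 sizes, 8 price ranges.
const perCategory = countFilterUrls([20, 12, 10, 8]);
console.log(perCategory);       // 27026
console.log(perCategory * 200); // 5405200 across 200 such categories
```

Sort order, direction, page size and view mode parameters multiply these numbers further, so the estimate is on the low side.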

  37. Yes I agree with that! Thanks a lot for your input, really good discussion going on here…

  38. This sounds like very sensible advice and I achieved it by putting a check in /templates/catalog/layer/filter.phtml.

    This is the function I’m using to check and it seems to work just fine, having looked at a few pages as Googlebot.

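    function IsGooglebot() {
        // Check whether the user agent claims to be Googlebot.
        // stripos() replaces eregi(), which was removed in PHP 7.
        $agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
        if (stripos($agent, 'Googlebot') === false) {
            return false; // Not Googlebot, take some action if needed
        }
        $ip = $_SERVER['REMOTE_ADDR'];
        // Reverse lookup: server name, e.g. crawl-66-249-66-1.googlebot.com
        $name = gethostbyaddr($ip);
        // Check that the host name contains "googlebot"
        if ($name === false || stripos($name, 'googlebot') === false) {
            return false; // Pretender, take some action if needed
        }
        // Forward lookup: list of IPs registered for that host name
        $hosts = gethostbynamel($name);
        if ($hosts === false) {
            return false; // Lookup failed, treat as pretender
        }
        foreach ($hosts as $host) {
            if ($host == $ip) {
                return true;
            }
        }
        return false; // Pretender, take some action if needed
    }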
    
  39. Toni,
    Thanks for the interesting article and for explaining it. Unfortunately, you don’t provide a clear step-by-step guide to solve the problem.
    How do we create an Ajax box?
    Can we just turn layered navigation off? How is that done: just by using the anchor option in the categories?

    Apart from that, you mentioned in the comments that you only trust a few extension providers. Who are these, and why do you trust them?
    Also, what do you think of the Nitrogento and Lightspeed extensions?

  40. I added the GoMage Advanced Nav (ver3.2) for Magento and then checked the ‘Fetch as Google’ code to see if Ajax was hiding the sort-by filters… I’m not a programmer, so I am not sure whether it has worked.

    It looks as though it is written differently, and I can clearly see it is using the plugin by the term ‘ajax’, but shouldn’t this be invisible now?

    Sort By

    Position

    Name

    I can see the ‘loading please wait’ appearing when you select from drop down filters, pagination links etc…. So again I know the ajax plugin is installed and in use. I just thought option values above wouldn’t be there anymore? or is it a case of Google can no longer follow them?

    Sorry, I’m new to this, and thanks.

    Michael

  41. Michael, I was also trying the GoMage Advanced Nav extension. It does not use the agent detection method described here. According to GoMage, that method will be released in a future revision. Instead they currently use a “rel=nofollow” method and a robots.txt method to exclude the ajax code from indexing. They charged me a $50 modification fee. I think Sinfill Thrill is on the right path.

  42. Matthijs

    Hi Toni,

    Thanks for this nice video. You point out the duplicate content issue from a “layered navigation” point of view. For my site I have set URL parameters for these layered navigation filters (color, brand, etc.) to “No URLs” in GWT. Google tends to follow this instruction pretty good.

    My problem is more related to the sorter options on the category pages (mode, limit, order and direction). I have also set URL Parameters for these filters in GWT, but Google does not obey them… :-(

    You don’t mention these filters in your video. What would be a good solution to prevent google from following these sorter-links and indexing these filter pages?

    Thanks!

    Matthijs

  43. Cheng

    I hope you don’t mind a little bit of promotion for our module. It will help to improve the URLs of your layered navigation.

    The Urls are created with the code of the filtered attribute and its values.

    http://www.helloostore.com/helloo-filterrewrites.html

  44. Hi, very nice article. But is there maybe someone who has actually implemented the best solution mentioned in the video?

  45. I got an idea to wrap the layered nav in such code, maybe this will help someone?

    <div id="filters-no-follow"></div>
    <?php
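    function prepare_for_echo($string) {
    // Collapse whitespace so the markup fits on one line of a JavaScript string literal
    $no_br = trim(preg_replace('/\s+/', ' ', $string));
    // Escape backslashes first, then single quotes, so the literal stays valid
    $no_slashes = str_replace(array('\\', '\''), array('\\\\', '\\\''), $no_br);
    return $no_slashes;
    }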
    ?>
    <script>
    function please_enable_cookies() {
    var f = document.getElementById('filters-no-follow');
    f.innerHTML = '<div class="no-cookies-error">Enable cookies to choose filters.</div>';
    }
    function please_load_filters() {
    var f = document.getElementById('filters-no-follow');
    f.innerHTML = '<?php if ( !empty($filtersHtml) || !empty($stateHtml) ): ?>'
    + '\n<div class="block block-layered-nav">'
    + '\n    <div class="block-title">'
    + '\n        <strong><span><?php echo prepare_for_echo($this->__('Shop By')); ?></span></strong>'
    + '\n    </div>'
    + '\n    <div class="block-content">'
    + '\n        <?php echo prepare_for_echo($this->getStateHtml()); ?>'
    + '\n        <?php if ($this->canShowOptions()): ?>'
    + '\n        <p class="block-subtitle"><?php echo prepare_for_echo($this->__('Shopping Options')); ?></p>'
    + '\n        <dl id="narrow-by-list">'
    + '\n            <?php echo prepare_for_echo($filtersHtml); ?>'
    + '\n        </dl>'
    + '\n        <?php endif; ?>'
    + '\n    </div>'
    + '\n</div>'
    + '\n<?php endif; ?>';
    }
    function are_cookies_enabled()
    {
        var cookieEnabled = (navigator.cookieEnabled) ? true : false;
        if (typeof navigator.cookieEnabled == "undefined" && !cookieEnabled)
        {
            document.cookie="testcookie";
            cookieEnabled = (document.cookie.indexOf("testcookie") != -1) ? true : false;
        }
        return (cookieEnabled);
    }
    if(are_cookies_enabled()) {
    please_load_filters();
    } else {
    please_enable_cookies();
    }
    </script>
  46. Hi,

    We have GoMage ‘Advanced Navigation’ which uses AJAX to filter layered navigation but I just noticed the following when looking at the page code:

    Boots

    Does it mean they just use rel=”nofollow” to stop bots from indexing, which is a ‘no go’ as you all say here, or am I missing something?
    Please guys, could anyone help?

  47. sorry, the code got deleted:
    <a href="http://www.shiptonandheneage.co.uk/mens-shoes.html" onclick="GomageNavigation.click(this); return false;" rel="nofollow" data-url="http://www.shiptonandheneage.co.uk/mens-shoes.html" data-param="?cat=22&dir=asc&limit=28&mode=list&order=price" data-ajax="1">Boots</a>

  48. Manu

    About the canonical: you can use it if you point every category page to its own canonical page. For example, catergorieabc?length=124p=2 (p=2 being the second page of the category) points to rel=canonical catergorieabc?p=2

  49. Pere

    What about using noindex for the pages that have some parameter, and nofollow to all the links of the Layered navigation?
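
The noindex half of this idea is easy to express as a rule. The sketch below only illustrates the decision, not how Magento would output the tag; FILTER_PARAMS is a hypothetical list built from parameter names mentioned in this thread, and a real store would list its own attribute codes. The nofollow half is left out, since it was argued against earlier in the discussion.

```javascript
// Hypothetical filter attribute codes; a real store would list its own.
const FILTER_PARAMS = ['manufacturer', 'color', 'price', 'cat'];

// Returns the robots meta value for a URL: filtered pages are kept out of
// the results while their links are still followed; clean pages stay indexable.
function robotsMetaFor(rawUrl) {
  const params = new URL(rawUrl).searchParams;
  const filtered = FILTER_PARAMS.some((name) => params.has(name));
  return filtered ? 'noindex, follow' : 'index, follow';
}

console.log(robotsMetaFor('http://example.com/mens-shoes.html'));
console.log(robotsMetaFor('http://example.com/mens-shoes.html?manufacturer=2&color=1'));
```

The returned value is what would go into the content attribute of the robots meta tag on each page.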

  50. KG

    Tell me please someone… is the following a free solution for this? I have no budget for a fancy extension…

    1. Go to:

    PathToThemeTemplateFiles/priceslider/slider_layered_nav.phtml

    Add to line 1:

    <?php $ua = strtolower(isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : ''); // lower-cased user agent, empty if the header is missing ?>
    <?php if (strstr($ua, "googlebot") || strstr($ua, "bingbot") || strstr($ua, "slurp") || strstr($ua, "msn")): ?>
    <?php else: ?>
    

    2. then add to the last line:

    <?php endif; ?>
    
