Wednesday, 17 September 2014

Faceted search, SEO and user experience: how to and why?

SEO of faceted search
Some ecommerce sites with only a few product categories and a few thousand products manage to generate thousands upon thousands of useless URLs through product search, product filter and product option pages. Sad, but true. We can't pretend this problem doesn't exist: leaving such URLs unhandled has a heavy negative SEO impact. There are only a few ways of dealing with such URLs:
  • to get rid of them completely,
  • to turn some of the useless URLs into useful ones, and
  • to reduce the negative SEO impact of the remaining useless URLs.
There is no magic method: no single SEO technique does the trick alone. What works is a combination of SEO techniques, which I collect in this article.

How to reduce the number of useless product search and option URLs and turn them into useful ones

In short, through categorization. In detail: run keyword research based on your products (with the AdWords keyword tool, or simply use search suggestions). Then create product categories based on the most promising keywords. If you don't like the word categorization, you can call the procedure I recommend search faceting, or creating dedicated landing pages for popular search facets. The meaning is always the same: you create links to search queries, anchored with popular keywords. This is, by the way, the difference between search facets and search filters:
  • facets are links to keyword-anchored search queries, which should preferably be indexed,
  • filters (product filters and product option filters) are links created through filter checkboxes and dropdowns; they appear in great numbers and should preferably be deindexed, because they cause duplicate content and useless crawler load.
By creating and narrowing search facets, creating permalinks to the best search result pages, and making product search and option pages bookmarkable, you reach several goals at once:
  • get rid of some of the useless URLs that the product filter would otherwise create,
  • improve the site's UX,
  • raise the conversion rate, and
  • enhance your product search.
When selecting keywords for categorization, keep in mind that the whole URL should be no longer than 115 characters!
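The length rule above can be checked automatically while facet URLs are generated. The following is only a minimal sketch: the function names `slugify` and `facet_url` and the URL layout (keyword slug appended to a category path) are assumptions made for illustration, not part of any particular shop system.

```python
import re

MAX_URL_LENGTH = 115  # limit recommended in the text above


def slugify(keyword: str) -> str:
    """Lowercase the keyword and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", keyword.lower()))


def facet_url(base_url: str, keyword: str) -> str:
    """Build a static facet URL (hypothetical layout) and reject overlong ones."""
    url = f"{base_url.rstrip('/')}/{slugify(keyword)}/"
    if len(url) > MAX_URL_LENGTH:
        raise ValueError(
            f"facet URL has {len(url)} characters, limit is {MAX_URL_LENGTH}"
        )
    return url
```

Calling `facet_url("http://domain.tld/shoes", "Red Running Shoes")` yields a keyword-anchored static URL, while a keyword that would push the URL past the limit raises an error instead of silently creating an overlong link.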

How to get rid of useless URLs completely

Somebody might say: hey, if they are useless and are better hidden from indexing, why not get rid of them completely? Indeed, it isn't very tricky. To get rid of the product search URLs that would normally need to be hidden from indexing, you have to rethink the product search engine and move the search from GET queries to POST. A product search without useless URLs is a form whose button submits a POST request. After the query is submitted, the search results are written into the session and the user is redirected to the result page. This way the URL of the search result page lives exactly as long as the user session and does not get indexed. This approach saves you the headache of keeping product search URLs out of the index, and all seems to be OK... But one little, or not so little, issue remains. If a user bookmarks a search result page or, where a product configurator is at work, a configured product, revisiting that bookmark will fail because the session is no longer valid.
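The mechanism and its bookmark problem can be illustrated with a toy model. This is a sketch under simplifying assumptions: the class `SessionSearch`, the path `/search/results` and the in-memory dictionary standing in for a real session store are all hypothetical.

```python
class SessionSearch:
    """Toy model of a POST-based search whose query lives only in the session."""

    RESULT_URL = "/search/results"  # hypothetical, identical for every query

    def __init__(self):
        self.sessions = {}  # session id -> last submitted search query

    def post_search(self, session_id: str, query: str) -> str:
        """Handle the POST: store the query, then redirect to a query-free URL."""
        self.sessions[session_id] = query
        return self.RESULT_URL

    def get_results(self, session_id: str):
        """Handle the GET after the redirect; returns None once the session is gone."""
        query = self.sessions.get(session_id)
        if query is None:
            return None
        return f"results for {query!r}"

    def expire(self, session_id: str) -> None:
        """Simulate the end of the user session."""
        self.sessions.pop(session_id, None)
```

As long as the session exists, `get_results` returns the stored search. After `expire`, the same URL returns nothing, which is exactly what happens to a bookmark that is revisited later.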

No bookmark → no revisit → no conversion!
Each time somebody clicks on a no longer valid bookmark, a pigeon dies!
In the end, each ecommerce SEO has to decide individually, based on the data of their own e-shop, whether to use this solution: some e-shops convert through bookmarks, others don't.

How to build a product search, filter or configurator correctly without being harmed by the negative SEO impact of useless URLs

If we can't get rid of these URLs completely, we must learn how to minimize their negative SEO impact: CPU capacity stolen by useless crawler workload, duplicate content, split link authority, wasted crawl budget, etc.

One method to deindex them all?

As I mentioned at the beginning of the article, none of the commonly suggested deindexing methods does the trick alone. These are the reasons:
  • noindex and canonical: URLs carrying noindex and canonical are crawled nevertheless,
  • nofollow: URLs linked with nofollow don't inherit PR inside the site. Besides, a single incoming link is enough to get a URL into the index and keep it crawled, against the original intention,
  • JavaScript-based product filter: Google understands JavaScript. Some weeks after you mask product search URLs with JavaScript, they will be crawled and appear in the index again.
While there is no panacea against the mass of product search and option URLs, there is one combination of methods that I strongly recommend:

Combined workaround to turn off the negative SEO impact of useless URLs

Actions and examples for product search URLs:
  • Facet search (category) URLs: should be static, canonical, indexable, dofollow, allowed for robots
  • Filter / options URLs: should contain a canonical tag pointing to URLs like the above (without the query string) and be non-indexable, nofollow, disallowed for robots, and excluded from indexing in Google Webmaster Tools by excluding URLs that contain a query string.
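The two rules above can be sketched as a small decision function. It assumes, as the list suggests, that any URL with a query string is a filter / options URL and any URL without one is a facet URL; the function name `seo_directives` and the shape of the returned dictionary are illustrative only.

```python
from urllib.parse import urlsplit, urlunsplit


def seo_directives(url: str) -> dict:
    """Return the canonical target and robots meta value for a shop URL."""
    parts = urlsplit(url)
    # The canonical target is always the URL without query string and fragment.
    canonical = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    if parts.query:
        # Filter / options URL: point to the facet URL and keep it out of the index.
        return {"type": "filter", "canonical": canonical, "robots": "noindex, nofollow"}
    # Facet (category) URL: static, self-canonical and indexable.
    return {"type": "facet", "canonical": canonical, "robots": "index, follow"}
```

A template could use the returned values to fill the canonical link and the robots meta tag of each page, so that the decision lives in one place.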

How to utilize subdomains to optimize the handling of product search and filter URLs

Try to move all URLs generated by product search or filter to a subdomain:
  • create a subdomain http://search.domain.tld
  • create a list of your search options
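  • add the following rule to your .htaccess (in the second RewriteCond, list your search options as they appear in your URLs; note that the pattern ^$ matches only the site root, so adjust it if your search lives under another path):
    RewriteEngine On
    # Act only on requests to the main host
    RewriteCond %{HTTP_HOST} =domain.tld [NC]
    # Query string must contain one of your search options
    RewriteCond %{QUERY_STRING} (sort|color|list|your|options)
    RewriteRule ^$ http://search.domain.tld/?%{QUERY_STRING} [R=301,L]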
  • In Google Webmaster Tools, set the crawl rate for the subdomain to the lowest value and for the main domain to the highest.
Results: all product search or filter URLs whose query string contains one of the listed options are redirected from the main domain to the subdomain. This way the mass of these URLs is moved to another host and no longer disturbs the crawling of the main site. As a positive side effect, the outsourced URLs no longer eat up the crawl budget of the main host.
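Before deploying such a rule, it can help to check which URLs it would actually move. The sketch below mimics the two conditions and the root-only pattern of the rule in plain Python; it is an approximation for testing an option list, not a replacement for testing on the server, and the function name `redirect_target` is an assumption.

```python
import re
from urllib.parse import urlsplit

# Same alternation as in the second RewriteCond (unanchored, case-sensitive).
SEARCH_OPTIONS = re.compile(r"(sort|color|list|your|options)")


def redirect_target(url, main_host="domain.tld", search_host="search.domain.tld"):
    """Return the redirect target on the search subdomain, or None if no rule applies."""
    parts = urlsplit(url)
    # urlsplit lowercases the hostname, which mirrors the [NC] flag.
    on_main_host = (parts.hostname or "") == main_host
    # The pattern ^$ matches only requests for the site root.
    is_root = parts.path in ("", "/")
    if on_main_host and is_root and SEARCH_OPTIONS.search(parts.query):
        return f"http://{search_host}/?{parts.query}"
    return None
```

Because the option pattern is unanchored, a parameter such as `resort` would also match `sort`. If that is not wanted, the option names need to be anchored in both the Python check and the RewriteCond.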
With .htaccess you can also add all the necessary robots and canonicalization rules to your product search URLs. See: .htaccess rules for SEO.

Conclusion / tl;dr

It isn't a good idea to get rid of product search, filter or option URLs completely, because they can then no longer be bookmarked. The only correct way is to handle facet URLs and filter / option URLs separately:
  • Facet URLs become static and indexable,
  • Filter URLs are kept away from any contact with crawlers through robots.txt rules, WMT parameter settings and canonicalization.