Expedia SEO Fail
Posted by Amrit Gill | Filed under SEO
Checking out the SERPs is so 2008, but boy, I can’t stop looking even though Jill Whalen has told me to stop! [http://searchengineland.com/5-reasons-why-rankings-are-a-poor-measure-of-success-13258] However, it’s always good to check out the landscape, so to speak, and keep an eye on the ole competitors.
The usual suspects were pretty much where they should be but I noticed something odd by the 3rd query. I was seeing Expedia being listed as a URL only within the SERP, so I checked a few more queries and couldn’t believe what I was seeing.
As the picture below shows, the Expedia listing for Flights to Birmingham in google.co.uk shows a URL and nothing else! This could only mean one thing: someone at Expedia thought it would be good to dick about with the robots.txt for shits n giggles.
So a quick glance at their robots.txt shows that they have disallowed /index/. Now, I can understand why a company would wish to do this: normally it’s because they’re migrating to a new IA, or business goals have changed and a large amount of the content has become void.
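To see what that one line does in practice, here is a minimal sketch using Python’s standard-library robots.txt parser. The rule text is a reconstruction of the relevant line only (the `User-agent: *` grouping is an assumption, not a copy of Expedia’s real file), and `is_crawlable` is a helper name made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the rule discussed above, not the full real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index/
"""


def is_crawlable(url, user_agent="Googlebot"):
    """Return True if a rule-abiding crawler may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Everything under /index/ is off limits; the homepage is not.
flights_ok = is_crawlable("http://www.expedia.co.uk/index/flights.aspx")
homepage_ok = is_crawlable("http://www.expedia.co.uk/")
print(flights_ok, homepage_ok)
```

The rule is a plain prefix match on the path, so every URL under /index/ is caught by it, however deep.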
The majority of the SERP listings are being redirected to the expedia.co.uk homepage, so you’d think all is well and good in this successful navigation migration? Guess again: this is perhaps one of the biggest cases of SEO suicide I’ve ever come across!
Why, you gasp? Well, within expedia.co.uk/index, Yahoo Site Explorer indexed around 36,700 pages, Google indexed 7,400 pages and Bing has 9,060 pages indexed. Over time, these pages would have gained equity and authority from backlinks. Would you advise them to simply 301 redirect them all to the homepage, or what?
When I first discovered the issue, Google and Bing had thousands of pages indexed, but they have slowly been excluding them from the index as per the robots.txt. Luckily enough, we can still see the remnants of the SEO suicide by using search operators within the search engines and Yahoo Site Explorer.
Out of the big three, Google is adhering to the robots.txt the most, currently showing 6,600 indexed pages, with Bing showing 8,950 and Yahoo 36,600. Yes, this post has been going on for about three weeks!
Of course, the SERPs for our query (site:www.expedia.co.uk/index/) will rank the most prominent pages towards the top and Google still has the main category pages indexed as shown in the image below.
The table below shows basic information on the category pages listed within google.co.uk. As you can see, flights.aspx and hotels.aspx still carry PageRank (even after the Jan 2010 Google Toolbar update) and have plenty of links pointing to them, albeit internal, but links nonetheless!
| URL | PageRank | Inbound Links (External) | Redirect |
| /index/flights.aspx | PR3 | 526k (5) | Default.aspx |
| /index/hotels.aspx | PR3 | 526k (9) | Live |
| /index/holidays.aspx | N/A | 526k (5) | Default.aspx |
| /index/attractions.aspx | N/A | 526k (0) | Default.aspx |
SEOmoz provides a rather nifty tool that picks out the top pages within a domain based on various metrics such as mozRank and mozTrust. The following table highlights three pages that are within the top 100 pages of expedia.co.uk.
| URL | PR | Inbound Links (Ext) | Redirect |
| /index/latvia/1/riga-hotels.aspx (9th) | PR0 | 74 (72) | Correct |
| /index/italy/1/rome-hotels.aspx (12th) | N/A | 493 (73) | Correct |
| /index/turkey/1/turkey-holidays.aspx (28th) | PR2 | 74 (74) | Default.aspx |
Here, Riga Hotels, Rome Hotels and Turkey Holidays are ranked highly within the tool, all three have inbound links, and the Turkey Holidays page still holds PR! The Turkey Holidays page redirects straight to the default homepage, but the hotel-based URLs are being redirected to the newer versions, potentially passing juice along to them.
Or are they?
In order to pass on the equity cleanly, expedia.co.uk will need to remove Disallow: /index/ from the robots.txt so that the search engines can pass over the equity gained. A crawler that is blocked from /index/ never requests those URLs, so it never sees the 301s sitting behind them. So even though Expedia have executed the 301s to an extent for the hotel-based URLs, the equity isn’t being passed on.
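The clash between the disallow and the 301s can be sketched as a toy model of a polite crawler. It is an illustration under stated assumptions, not how any search engine is actually implemented: `SERVER_301S`, the placeholder target path and `redirect_seen_by_crawler` are all hypothetical names, and no network requests are made.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical redirect table standing in for the server's 301 rules.
# The target path is a placeholder, not Expedia's real new URL.
SERVER_301S = {
    "/index/italy/1/rome-hotels.aspx": "/placeholder/rome-hotels",
}

# Assumed robots.txt contents before and after the fix.
BLOCKED = ["User-agent: *", "Disallow: /index/"]
UNBLOCKED = ["User-agent: *", "Disallow:"]


def redirect_seen_by_crawler(path, robots_lines, user_agent="Googlebot"):
    """Return the 301 target a rule-abiding crawler would discover, or None."""
    robots = RobotFileParser()
    robots.parse(robots_lines)
    if not robots.can_fetch(user_agent, path):
        # The crawler never requests the URL, so it never sees the 301.
        return None
    return SERVER_301S.get(path)


old_url = "/index/italy/1/rome-hotels.aspx"
while_blocked = redirect_seen_by_crawler(old_url, BLOCKED)
after_fix = redirect_seen_by_crawler(old_url, UNBLOCKED)
print(while_blocked, after_fix)
```

The redirect exists on the server in both cases. The only thing that changes is whether the crawler is allowed to ask for the old URL in the first place.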
So what’s the best practice for mass 301 redirects, you ask?
Plan, plan and then plan again! Map out the pages you wish to remove, categorise them and pair each one with a relevant page or category under the new URL structure. However, if there is no relevant page within the new structure, then redirect the page to a relevant hub page or the homepage.
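As a rough sketch of that planning step, the mapping can be drawn up as data before a single redirect goes live. Everything here is hypothetical: the new-structure paths are placeholders, and `category_of` assumes the category is the last word of the old filename (as in riga-hotels.aspx), which may not hold for every URL.

```python
HOMEPAGE = "/"


def category_of(path):
    """Guess a category from the old filename, e.g. 'riga-hotels.aspx' -> 'hotels'."""
    filename = path.rsplit("/", 1)[-1]
    stem = filename.rsplit(".", 1)[0]
    return stem.rsplit("-", 1)[-1]


def plan_redirects(old_paths, page_map, hub_map):
    """Pick a 301 target per old URL: exact page, then category hub, then homepage."""
    plan = {}
    for path in old_paths:
        if path in page_map:
            plan[path] = page_map[path]
        else:
            plan[path] = hub_map.get(category_of(path), HOMEPAGE)
    return plan


# Old URLs taken from the tables above; all targets are made-up placeholders.
old_paths = [
    "/index/latvia/1/riga-hotels.aspx",
    "/index/turkey/1/turkey-holidays.aspx",
    "/index/attractions.aspx",
]
page_map = {"/index/latvia/1/riga-hotels.aspx": "/placeholder/riga-hotels"}
hub_map = {"holidays": "/placeholder/holidays"}

plan = plan_redirects(old_paths, page_map, hub_map)
for old, new in plan.items():
    print(old, "->", new)
```

Keeping the plan as a plain mapping makes it easy to review how many URLs end up on the homepage fallback before anything is deployed.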
There has been talk that you can be penalised somewhat by the ole Googlemeister for doing a mass 301 redirect, but this only occurs when you’re redirecting from multiple domains. [http://www.webmasterworld.com/google/3714055.htm]
Another thing to remember: if any of the pages / categories are under some sort of filter or penalty, this could potentially be passed on too. [http://www.webmasterworld.com/google/3662077.htm] With Google now admitting that page speed is treated as a ranking factor, @neyne made a valid point recently that you’ll have to be considerate of 301 redirect chains and whether they impact page load times.
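One way to keep an eye on redirect chains is to audit the redirect map itself before deploying it. The sketch below is an assumption-laden illustration: the paths are invented, and `resolve_chain` and `flatten` are hypothetical helper names. It follows each chain to its end and rewrites every old URL to point straight at the final destination, so a visitor or crawler only ever takes one hop.

```python
def resolve_chain(start, redirects, max_hops=10):
    """Follow redirects from start and return every URL visited, in order."""
    chain = [start]
    while chain[-1] in redirects:
        nxt = redirects[chain[-1]]
        if nxt in chain:
            raise ValueError(f"redirect loop at {nxt}")
        chain.append(nxt)
        if len(chain) - 1 > max_hops:
            raise ValueError(f"more than {max_hops} hops from {start}")
    return chain


def flatten(redirects):
    """Point every source directly at its final destination."""
    return {src: resolve_chain(src, redirects)[-1] for src in redirects}


# Invented example: two hops to reach the final page.
chained = {
    "/index/old.aspx": "/interim.aspx",
    "/interim.aspx": "/final.aspx",
}

hops = len(resolve_chain("/index/old.aspx", chained)) - 1
flat = flatten(chained)
print(hops, flat)
```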
*** Update ***
I started this blog post a long, long time ago. I think I first discovered the problem in Nov 09! As we all know, I’m not a regular blogger, so things have now changed somewhat. Expedia have now amended their robots.txt and removed the disallows, so equity will now be passing through, but only if Google still holds equity for the old pages.
If you do site:www.expedia.co.uk/index/ in google.co.uk you’ll notice that it only returns a single result, hotels.aspx (only because they didn’t 301 redirect it). Now, I’m unsure, and you’d have to test this, Expedia, but there are a few old wives’ tales out there that say the Googlemeister remembers pages even if they’re not listed within the index, so it would be interesting to see if the hotel-based keywords saw an uplift when Expedia made the robots.txt changes.
3 Responses to “Expedia SEO Fail”
-
lewney Says:
April 21st, 2010 at 3:13 pm
Hey Amrit,
Glad to see you doing well now, got into SEO eh? No more webdesign ? ;)
lewney
-
Amrit Gill Says:
April 21st, 2010 at 3:29 pm
Hey Andrew!
Yep – got burnt and needed to secure myself a better future! Cannot give enough props for the inspiration / motivation into getting into the field. And we both know that i owe you big… We should talk :)
Amrit
-
Jon Willis Says:
September 13th, 2010 at 6:38 am
What probably happened is they 301’d everything just fine but later added that line to robots.txt not realizing it would block Google’s access to the 301’s. Google then cannot follow the 301’s because the pages which serve them are blocked. Remove the line and the 301’s work again. Easy mistake to make really.

