Page Segmentation and SEO

I know it is somewhat Le Grande Coq of me to crown myself the winner of Tour De SEO already, but there is a reason for this. I am going to divulge my plan of attack to you all and, of course, to my partners in crime. I will be covering hot topics in individual posts such as:

  • Page Segmentation
  • WordPress SEO
  • WordPress Duplicate Content
  • Trends
  • Spreading the Love
  • Twittering

WordPress is a fantastic platform and happens to be my pick of the simple, free CMSs out there. The front end is template based, and boy, how important these templates have become. The search engine bots are simple folk, but the algorithm that dissects the data and makes the assumptions is where the magic happens. The latest trick to be pulled out of the hat with the rabbits is Page Segmentation (PS).

Page Segmentation

Page Segmentation is an Information Retrieval (IR) technique that splits the content of a page into blocks and assigns a signal-to-noise factor to each block. Luckily enough, our WordPress templates are naturally split into segments that humans and robots alike can see, either in the code or in the design. PS comes in various forms and we’re unsure which search engines (SEs) use which, but it would be wise to keep PS in mind as the basis for all new designs and when further optimising our templates.

So How Does Page Segmentation Work?

Chances are there will be a specific bot that processes each of your templates and assigns every block a value governed by the SE’s own PS algorithm. This is then fed back to the regular SE bots, flagging which blocks are low value (advertisements, spammy content) and which are high value (unique content, related posts and navigation). This is also how the search engines can figure out whether a block is advertising or where your site-wide navigation sits. Depending on the quality of the content within a block, the algorithm assigns an importance value; the higher the importance, the more often the block will be crawled by the normal bots.

So what do I need to know about Page Segmentation?

Well, there are four types: Fixed Length Page Segmentation (FixedPS), DOM Based Page Segmentation (DomPS), Vision Based Page Segmentation (VIPS) and finally Combined Page Segmentation (CombPS).

FixedPS

FixedPS strips all the semantic code out of the page and is left with the raw content, which it then cuts into fixed-length windows of text. So you guys can throw away what we have learnt from SEOMoz’s recently updated Search Engine Ranking Factors. If it is looking at our raw copy itself, we have to make sure that our content is not only bloody fantastic but also gives the robots a scent of subtle and natural signals. Content is king.
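
To make the idea a bit more concrete, here is a rough sketch of what a fixed-length approach could look like. It is only my own illustration, not how any search engine actually does it: the names TextExtractor and fixed_ps are made up, the window is counted in words, and the default of 200 words is an arbitrary assumption.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible words, ignoring tags and script/style bodies.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.words = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.split())


def fixed_ps(html, window=200):
    """Strip the markup, then cut the raw copy into windows of `window` words.

    The window size is an assumed value, not a known search engine setting.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    words = parser.words
    return [" ".join(words[i:i + window]) for i in range(0, len(words), window)]


if __name__ == "__main__":
    page = "<h1>Page Segmentation</h1><p>Content is king baby</p>"
    for passage in fixed_ps(page, window=3):
        print(passage)
```

Notice that the heading and the paragraph end up blended into the same window once the tags are gone, which is exactly why the copy itself has to carry the signals.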

DomPS

DomPS is based entirely on the W3C’s DOM structure. It splits up the content by each element within the DOM, so <title>, <h1>, <p> and so forth, and analyses each block for signals. So make sure that your template meets the relevant W3C standards. If your design is based on tables, I would think again!
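
As a rough sketch of the one-block-per-element idea, the snippet below walks the markup and emits a block for each element it treats as a container. Everything here is an assumption for illustration: the BLOCK_TAGS set, the DomBlocks class and the dom_ps function are my own names, the tag list is arbitrary, and the code assumes tidy markup where every block element has a closing tag.

```python
from html.parser import HTMLParser

# Assumed set of elements that each become their own block.
BLOCK_TAGS = {"title", "h1", "h2", "h3", "p", "li", "td"}


class DomBlocks(HTMLParser):
    """Collects one (tag, text) block per element listed in BLOCK_TAGS."""

    def __init__(self):
        super().__init__()
        self.blocks = []   # finished blocks, in document order of closing tag
        self._stack = []   # open block elements as (tag, [text parts])

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._stack.append((tag, []))

    def handle_endtag(self, tag):
        # Only close the innermost open block; inline tags such as <b> are ignored.
        if self._stack and self._stack[-1][0] == tag:
            name, parts = self._stack.pop()
            text = " ".join("".join(parts).split())  # normalise whitespace
            if text:
                self.blocks.append((name, text))

    def handle_data(self, data):
        # Text outside any block element is dropped in this sketch.
        if self._stack:
            self._stack[-1][1].append(data)


def dom_ps(html):
    parser = DomBlocks()
    parser.feed(html)
    parser.close()
    return parser.blocks


if __name__ == "__main__":
    page = (
        "<html><head><title>My Blog</title></head><body>"
        "<h1>Page Segmentation</h1><p>Content is <b>king</b>.</p>"
        "</body></html>"
    )
    for tag, text in dom_ps(page):
        print(tag, "->", text)
```

Sloppy markup with unclosed elements would leave blocks dangling in this sketch, which is a decent reminder of why valid, well-structured templates matter.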

VIPS

VIPS splits up your website by visual cues, so you have to make sure your segments are clearly defined, both visually and in the code. For IR, this method is far superior to DomPS because it allows for groups of elements in a block, rather than one block per element. To benefit, you really have to bring your usability A game here: think about your UX patterns and flow. Is it consistent?

CombPS

CombPS is a combination of VIPS and FixedPS. It exists simply because the previous two methodologies can create blocks that are too short to evaluate; research has shown that 40% of the blocks produced contain fewer than 10 words. So it first processes the page using VIPS and then applies a FixedPS window on top of the larger blocks, so the search engines can pull through decent-sized blocks to digest. So ladies and gentlemen, we not only have to produce brilliant content, we also need to provide clear visual cues.
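
Here is a minimal sketch of the second half of that process, the fixed window laid over the larger blocks. The visual step itself is not implemented: the list of block strings is assumed to have come out of a VIPS-style pass already. The function name comb_ps and the window of 200 words are my own assumptions.

```python
def comb_ps(visual_blocks, window=200):
    """Apply a fixed-length word window on top of visually derived blocks.

    `visual_blocks` is assumed to be a list of text strings, one per visual
    block. Blocks at or under `window` words pass through whole; larger
    blocks are cut into consecutive windows.
    """
    passages = []
    for block in visual_blocks:
        words = block.split()
        if not words:
            continue  # skip empty blocks
        if len(words) <= window:
            passages.append(" ".join(words))
            continue
        for i in range(0, len(words), window):
            passages.append(" ".join(words[i:i + window]))
    return passages


if __name__ == "__main__":
    blocks = ["Home About Contact", "one two three four five six seven"]
    for passage in comb_ps(blocks, window=3):
        print(passage)
```

The short navigation block survives as one unit, while the long content block is broken into evenly sized chunks.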

So all in all, the technologies the search engines are using to retrieve the information from our sites look for the very signals that poor ole Vanessa Fox gets drilled about by Oilman and the infamous Greg Boser! Yeah, links make up 80% of our SEO strategies, but as you well know, we need to get our sites performing so we’re not wasting all the equity we’re pulling in.

My take away from all this:

  1. Make sure content is your number one priority. Give off the right signals within your content, so read this article on content optimisation and read why content is king, baby.
  2. We know that the search engines like to use the DOM and could run into trouble if your site’s layout is table based. Read up on valid XHTML and CSS; while you’re there, you might want to read up on the W3C WCAG 1.0 / 2.0.
  3. VIPS is important not only to robots but to humans too. We look for distinction to help direct us, we love patterns, and we like to learn our way around a site intuitively in minutes. So check out UX patterns and Smashing Magazine, a fantastic blog which covers design and usability.

So… if you’re struggling to topple your compadres at the top, you may want to rethink your SEO strategy somewhat and start looking at a few of the theories that are floating around here on the Internet.

So wish me luck guys!

Further reading: Block-Based Web Search & Page Segmentation.
