We have discussed the Google Penguin algorithm update and how to recover from it if you were hit. One major feature of the update targets unnatural link building, which may result in a “link-warning letter” from Google and in lost rankings if the site owner does not resolve those issues.
Another type of penalty a website may incur comes in the form of a spam action for duplicate content, whether it results from intentional spamming (“black hat”) or from legitimate practices (“white hat”). Google employs a filter that scans websites for duplicate content, that is, any content that appears on more than one URL. Common black hat examples are copying an entire website or re-publishing material taken from other sites.
However, the filter has also affected white hat websites, especially news outlets and similarly fast-paced online environments such as forums or social media channels. They produce similar or identical content, often temporarily, for legitimate reasons, for example by constantly updating real-time news or comment feeds revolving around breaking stories.
When that happens, PageRank is divided across those multiple pages, which has an immediate negative effect on your overall ranking.
How can you avoid these short-term duplicate content issues? The solution is simple. According to Matt Cutts’s recent Webmaster Help video, rel="canonical" may be used for pages that are similar, but not identical, wherever white hat, short-term duplicate content occurs. This appears to be a change in direction from past Google policy.
How does it work?
By placing a piece of code on each page containing duplicate content, all versions of your page remain online, but search engines are told to index only the preferred (“home”) URL for that particular content, and any link value is passed through to that page as well. This piece of code is called the rel="canonical" attribute, or “canonical tag”.
The canonical tag ensures that future search traffic to the site (and any link value) points to the most relevant page instead of being divided across (almost) identical pages that emerged while the content was still hot. For more detailed information, especially on when NOT to use it, see Brad Miller’s article on Search Engine Watch.
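To make the canonical tag concrete, here is a minimal sketch in Python. It builds the link element that would go in the head section of each duplicate page, then reads it back from a page to check where it points. The URL and the helper names (canonical_link_tag, find_canonical) are made up for illustration; they are not part of any Google tool, and your CMS may well add the tag for you.

```python
from html import escape
from html.parser import HTMLParser


def canonical_link_tag(preferred_url: str) -> str:
    """Build the <link> element to place in the <head> of each duplicate page."""
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


class CanonicalFinder(HTMLParser):
    """Remember the href of the first rel="canonical" link seen in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html_text: str):
    """Return the canonical URL declared in html_text, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


# Hypothetical example: a live-updates page pointing at the main story URL.
preferred = "https://example.com/news/breaking-story"
duplicate_page = (
    "<html><head><title>Live updates</title>"
    + canonical_link_tag(preferred)
    + "</head><body>...</body></html>"
)

print(canonical_link_tag(preferred))
print(find_canonical(duplicate_page))
```

Every near-duplicate version of the story would carry the same tag, each pointing at the one URL you want indexed. The small parser is just a convenient way to spot-check that your pages really declare the URL you intended.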