This can be an unintentional issue, caused by site architecture and the like, or something the webmaster is well aware of, such as syndicated content and content scraped or illegally copied from other websites.
Google even has a tool in its Google Webmaster Central application that lets webmasters specify which of the two possible versions of the URL is the primary one: the one with or the one without "www.". The webmaster does not have to change any site code for that. He should do so anyway, though, because other search engines do not provide a mechanism like Google's.
In February 2009, Google, Yahoo and Microsoft introduced a new HTML attribute value (rel="canonical" on the link element) that webmasters can use to reduce content duplication caused by multiple URLs that point to the same page on their website. It is called the "canonical tag".
See the details about the tag at my page:
- Duplicate Content and Near Duplicate Content - Canonical URLs, Content Theft (Scraping).
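To illustrate what the canonical tag looks like in a page's head section, here is a minimal Python sketch that builds one for a given URL. The function name `canonical_link_tag`, the example.com address, and the choice to drop the query string and fragment are assumptions made for illustration; on a real site you must decide which URL parts actually change the content.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(url: str) -> str:
    """Build a <link rel="canonical"> element for the given URL.

    Assumption for this sketch: the query string and the fragment do not
    change the page content, so both are dropped from the canonical URL.
    """
    parts = urlsplit(url)
    # Host names are case-insensitive, so normalize the host to lower case.
    clean_url = urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path or "/", "", "")
    )
    return f'<link rel="canonical" href="{escape(clean_url, quote=True)}" />'


# Hypothetical URL with a session parameter that creates a duplicate.
print(canonical_link_tag("http://www.Example.com/page.html?sessionid=123"))
# prints: <link rel="canonical" href="http://www.example.com/page.html" />
```

Every duplicate version of the page would carry the same tag, so that search engines that support it can treat the URL in the href as the primary one.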
The solution is to specify one of the two versions as the "primary one" and to 301 redirect requests for the other version to it. You can accomplish this with code within the website application, with special ISAPI filters such as Helicon's "ISAPI Rewrite" on Microsoft IIS web servers, or by specifying URL rewrites (mod_rewrite) in the ".htaccess" file on Apache web servers.
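The "code within the website application" option could look roughly like the following Python sketch. It only shows the decision logic: given a requested URL, it returns the 301 status and the target location if the request came in on the non-primary host. The names `PRIMARY_HOST`, `ALTERNATE_HOSTS` and `canonical_redirect`, as well as the example.com hosts, are assumptions for illustration and not part of any particular server or framework.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the "www." version was chosen as the primary one.
PRIMARY_HOST = "www.example.com"
# Assumption: every other host name under which the same site answers.
ALTERNATE_HOSTS = {"example.com"}


def canonical_redirect(url: str):
    """Return (301, location) if the URL uses a non-primary host, else None."""
    parts = urlsplit(url)
    if parts.netloc.lower() in ALTERNATE_HOSTS:
        # Keep scheme, path, query and fragment; swap only the host.
        location = urlunsplit(
            (parts.scheme, PRIMARY_HOST, parts.path or "/",
             parts.query, parts.fragment)
        )
        return 301, location
    return None


print(canonical_redirect("http://example.com/page.html?id=5"))
# prints: (301, 'http://www.example.com/page.html?id=5')
print(canonical_redirect("http://www.example.com/page.html?id=5"))
# prints: None
```

The application would send the returned status code together with a Location header carrying the returned URL. The redirect must be a permanent one (301) so that search engines learn that the primary version is the only one to keep.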
A bigger concern is pages with the same content that are reachable under more than one URL because of the site's architecture. There are various reasons why duplicate URLs for the same content come about. If you suspect duplicate content on your own website, I recommend consulting an SEO firm for an evaluation of your site.
Scraper sites are sites that are thrown together as quickly and with as much automation as possible, either to rank well directly or to get users to click on contextual ads such as Google AdSense and so generate revenue. The chances of a click are high, because the ads are often the only text that makes some sense compared to the gibberish produced by the scraper.
Another goal of a scraper site can be to indirectly boost the ranking of a more hidden site. The scraper site simply links to that other website from multiple pages.
Those sites are a bad experience for the user in most cases, and search engines try to get rid of them in their index as well as they can. Because of this struggle, legitimate webmasters become victims of the circumstances more often than they should, which increases fear and mistrust between search engines and webmasters.
You can download the details of the federal Digital Millennium Copyright Act (DMCA) at the following URL at Loc.gov: http://www.loc.gov/copyright/legislation/dmca.pdf
©2006-2021 Carsten Cumbrowski
Replication of this Content in full or in part without written permission by the author is prohibited.