Another step towards quality
Google (GOOG) recently announced a major improvement to search result quality aimed specifically at webspam. This is great news for businesses that follow ethical online marketing practices. Webspam makes the results that searchers receive LESS relevant and often pushes down sites that offer interesting and unique content.
The goal of a typical spam website is to generate ad revenue (ironically, mostly from Google AdSense) with as little effort as possible. Creating useful information is time-consuming and/or costly, so spammers often take shortcuts like autoblogging. The concept of an autoblog is to have a piece of software scour the web for interesting, useful information and republish it. Some autoblogs then "spin" the content, which essentially changes the word order and substitutes some words with synonyms. To search engine spiders, it appears to be new, unique content when in reality it was just stolen and scrambled by a machine. The real quality issues start when autoblogs begin scraping (aka stealing content from) other autoblogs. The result is a bunch of unreadable text surrounded by ads for tacky, spammy products like acai berry autoship offers. YUCK.
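To make the "spinning" idea concrete, here is a minimal sketch in Python. It is only an illustration: the `SYNONYMS` table and the `spin` and `fingerprint` helpers are invented for this example and are not taken from any real autoblog tool. It covers only synonym substitution, not the word reordering mentioned above, and shows why a naive exact-match check sees the spun text as brand new.

```python
import hashlib

# Hypothetical synonym table, invented purely for this illustration.
SYNONYMS = {
    "quick": "fast",
    "useful": "helpful",
    "information": "content",
    "creates": "produces",
}

def spin(text: str) -> str:
    """Swap each word that has a listed synonym; leave other words untouched."""
    return " ".join(SYNONYMS.get(word, word) for word in text.split())

def fingerprint(text: str) -> str:
    """Exact-match fingerprint: changing any character changes the digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

original = "a quick writer creates useful information"
spun = spin(original)

print(spun)                                         # a fast writer produces helpful content
print(fingerprint(original) == fingerprint(spun))   # False
```

The spun sentence says the same thing, yet a byte-for-byte comparison treats it as unrelated text, which is presumably what makes this trick attractive to spammers.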
This wouldn't really be an issue if nobody ever visited these useless sites, but the webmasters who own these spam sites tend to use search engine optimization techniques that make the websites appear on page 1 of Google and Bing search results. People find these junk sites instead of websites that offer legitimate information, and the webspam webmaster earns ad revenue while searchers and legitimate website owners get frustrated. I remember getting a spam email about a piece of software called "WPMAGE"; it offered a way to set up thousands of spam sites with the click of a button. From their site:
- 3 common mistakes to AVOID when selecting your domain so your site doesn't get de-indexed
- How to have unique content on your site (WITHOUT doing any writing)
- 3 "unusual" ways to create your keyword list that guarantees engaging traffic
- How to monetize your site... the fast and easy way
- 3 "affiliate laws" you must follow to avoid your site being black-listed by the search engines
This software promised that it would "literally print money from the search engines" for its owners, all for a measly $2000 (if I remember correctly). Sadly for the people who made their living by using machines to spam machines, the recent change implemented by Google has made it much less likely that these sites will rank well in search engines.
I'm constantly intrigued by the ingenuity of search engineers and their ability to systematically root out webspam efforts, and this one must have been a special challenge. "Unspinning" content must be incredibly hard to do algorithmically, and could be fraught with "false positives"; for example, much of journalism is repeated information about the same event said in slightly different ways. Perhaps this quality improvement just found patterns among major spam blogs and eliminated or penalized sites that match those attributes with high confidence. We'll never really know, but we can rest assured that spammers won't just give up and get real jobs. I can't wait to see the machinations they come up with!
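For the curious, here is a rough sketch of what a very naive "unspinning" check could look like. To be clear, this is not Google's method (which isn't public); it is a simple word-set overlap (Jaccard similarity) with a made-up synonym table. The `CANONICAL` table and the `normalize` and `jaccard` names are all assumptions invented for this illustration.

```python
# Hypothetical table mapping each synonym variant to one canonical word.
CANONICAL = {
    "fast": "quick",
    "helpful": "useful",
    "content": "information",
    "produces": "creates",
}

def normalize(text: str) -> frozenset:
    """Lowercase, split on whitespace, and map synonyms to a canonical word.
    Using a set ignores word order, so shuffling words does not hide a copy."""
    return frozenset(CANONICAL.get(w, w) for w in text.lower().split())

def jaccard(a: str, b: str) -> float:
    """Fraction of distinct normalized words that the two texts share."""
    sa, sb = normalize(a), normalize(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

original = "a quick writer creates useful information"
spun = "helpful content a fast writer produces"
unrelated = "the recipe needs two cups of flour"

print(jaccard(original, spun))       # 1.0
print(jaccard(original, unrelated))  # 0.0

# Two honest reports of the same event also overlap heavily,
# which is exactly the false-positive risk described above.
report_a = "mayor opens new bridge downtown today"
report_b = "new downtown bridge opened by mayor today"
print(jaccard(report_a, report_b))
```

Even this toy version shows the tension: the spun copy scores as identical, but two legitimate news reports about the same event score high too, so a real system would need far more signals than word overlap alone.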