Scraper Sites Save Lost Content
November 3rd, 2009 Ryan Jones
One of the caveats of blogging on a third-party site is that you have no control over if and when that site shuts down or goes away. I shudder to think of all the content that the web lost when Geocities shut down last week, or how the web’s link graph was suddenly altered with the disappearance of the tr.im URL shortening service. These things may seem minor, but at the scale of the web they can have a significant effect.
These shutdowns have changed the way I write on the web. I now almost always double-post, or at least save a local copy of every post I write. I’ve even started using my own URL shortening service, Tiny.tw, so that even if I ever have to take it offline, I can still control where all the links go.
But what about content that you didn’t back up? Is it gone?
One of the first things I tell parents to teach their kids about the internet is this: once it’s out there, it never really goes away. Even if you upload something to just one site, that’s not going to stop somebody else from using it.
Case in point: scraper sites.
I spent hours last night looking for an old post that I wrote on Shoutwire almost two years ago, but I couldn’t find it. Then, after some creative Google searches, I managed to locate the post on a made-for-AdSense scraper site. Somebody had stripped out my name but taken my content. Normally I’d despise such a thing, but in this case it saved my ass and I was able to dig up the post.
If anything, it got me thinking about permanence, the internet, and what happens when sites you rely on disappear. It also made me realize that spam sites might be a bit useful after all.
Oh, if you’re wondering what the post was, I re-posted it on my blog here: 10 Endangered Ideas
1 Comment
1. Alexander | November 9th, 2009 at 12:55 pm
This is why I don’t like URL shorteners: http://bit.ly/iP7DB