Sean Conner:
I gave up on dealing with link rot years ago. If I come across an old post
with non-functioning links, I may just find a new resource, link to The
Wayback Machine, or (if some spammer is trying to get me to fix a broken
link by linking to their black-hat-SEO-laden spam farm) remove the link
outright. I don’t think it’s worth the time to fix old links, given that
the average lifespan of a website is two years and trying to automate the
detection of link rot is a fool’s errand (a page that goes 404 or does not
respond is easy; now handle the case where a new company is running the
site and all the old links still go to a page, but not the page that you
linked to). I’m also beginning to think it’s not worth linking at all, but
old habits die hard.
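Sean’s point about automation can be made concrete: a checker that looks only at status codes and timeouts is trivial to write, but it cannot catch the resold-domain case, where every old URL cheerfully returns 200. A minimal sketch of that naive checker (the function name is illustrative, not from anyone’s actual setup):

```python
from urllib.request import urlopen
from urllib.error import URLError, HTTPError


def looks_dead(url: str, timeout: float = 10.0) -> bool:
    """True only for the obvious failures: HTTP errors or no response at all."""
    try:
        with urlopen(url, timeout=timeout):
            # Got a 2xx/3xx response -- but the page behind it may still
            # be the wrong one (new owner, parked domain, etc.).
            return False
    except HTTPError:
        return True  # 404, 410, 500, ...
    except (URLError, OSError):
        return True  # DNS failure, refused connection, timeout


if __name__ == "__main__":
    # The .invalid TLD is reserved and never resolves, so this is "dead".
    print(looks_dead("http://example.invalid/"))
```

Everything past the `return False` line is the easy half of the problem; the hard half is exactly what this function can’t see.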
After about nine years of writing, I’ve concluded something similar: my
existing reactive approach isn’t going to scale, either with expanding
content or over time. Fixing individual links is fine if you have only a
few or aren’t going to be around for long, but as you approach tens of
thousands of links over decades, the dead links pile up.
So the solution I’m going to implement soon is to take a tool like
ArchiveBox or SinglePage and host my own copies of (most) external links,
so they’re cached shortly after linking and can’t break. The bandwidth
and disk space will be somewhat expensive, but it’ll save me and my
readers a ton in the long run.
Always glad to hear your perspective on this. I am working on my own archival
system for href.cool - I’ve already lost thewoodcutter.com and humdrum.life.
(Been going for one year.)
My approach right now is to verify all of the links weekly by comparing last
week’s title tag on each page to this week’s. I’ve had to tweak this a bit -
for PDF links or for pages with dynamically generated titles, I can opt to use
the ‘content-length’ header or a meta tag instead. (Of course, the old title
can be spoofed, so I’m going to improve my algorithm over time.)
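The weekly check could be sketched roughly like this - all names here are illustrative, and this isn’t my actual implementation, just the shape of the idea (extract the `<title>` from HTML responses, fall back to `Content-Length` for PDFs and the like):

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collects the text inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html: str) -> str:
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def link_changed(url: str, last_title: str,
                 last_length: Optional[str] = None) -> bool:
    """Return True if the page no longer matches last week's snapshot."""
    with urlopen(url, timeout=10) as resp:
        if "html" in resp.headers.get("Content-Type", ""):
            body = resp.read().decode("utf-8", "replace")
            return extract_title(body) != last_title
        # Non-HTML (e.g. PDF): fall back to comparing Content-Length.
        return resp.headers.get("Content-Length") != last_length
```

A spoofed title still passes this check, which is why it’s only a first line of defense.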
I wish I had a way of seeding sites with you. I imagine we have some crossover
in interests and would also love to contribute bandwidth - as would some of your
other readers, I’m sure.
Reply: Nine Years of Link Rot