Futurama   Planet Express Employee Lounge
The Futurama Message Board

Author Topic: Peelified and Internet Archive Wayback Machine  (Read 668 times)
jabalong

Bending Unit
***
« on: 02-01-2011 04:03 »

Hi guys,

Amid the recent website downtime and loss of posts, I went over to the Internet Archive Wayback Machine to see what archives they have of old Peelified pages.

For those who don't know, the Wayback Machine is a long-running non-profit (I believe) effort to keep an archive of the internet as it was over time - or at least snapshots of it at different points.

Searching for Peelified.com on the main page only turns up snapshots of the site up until 2008:

http://web.archive.org/web/*/http://peelified.com

Now they have a beta version that apparently has later archives, so I plugged Peelified in there:

http://liveweb.waybackmachine.org/http://www.peelified.com/index.php

Sadly it came up with this message:

Quote
Page cannot be crawled or displayed due to robots.txt.

See www.peelified.com robots.txt page. Learn more about robots.txt.

I don't know much about this, but does this mean there's a coding/compliance issue on Peelified's side? Maybe when the website problems are sorted out and in the interest of posterity, someone might look into how to make Peelified compliant with the Wayback Machine? Seems a shame not to have it archived there.
Gopher

Fallback Guy
Space Pope
****
« Reply #1 on: 02-01-2011 04:34 »
« Last Edit on: 02-01-2011 04:43 »

If marc explicitly setting it not to be archived is being non-compliant, then yes, peel is non-compliant. It's possible this is a side-effect of some more general anti-spam-bot settings, but I doubt it.  Certainly PEEL allows search engine bots.

:edit: Hmm, after looking at robots.txt, it's not at all clear to me why the Wayback Machine would not be archiving PEEL.

On a tangential note, [-mArc-], do you have moral objections to turnitinbot, or do you just resent its bandwidth abuse?

:edit2: oh, it's specifically the main index page that's blocked from robots. huh. Wouldn't that prevent search engines from being able to index the forum at all? Odd.
[-mArc-]

Administrator
Liquid Emperor
**
« Reply #2 on: 02-01-2011 08:17 »

I generally disallow www.peelified.com/index.php but not www.peelified.com/ . The reason is that I don't want bots to follow the dynamic links that go through index.php (topic view, searching, registering, etc.), only the static ones.
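To illustrate the rule described above, here is a rough sketch in Python using the standard library's urllib.robotparser. The robots.txt text below is an assumption reconstructed from this thread (the real file is not quoted here), and "ExampleBot" and the helper name `allowed` are made up for the example. It only shows how a prefix rule on /index.php treats the site root differently from the dynamic URLs.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: a hypothetical robots.txt reconstructed from the description
# in this thread; the actual file on peelified.com may differ.
ASSUMED_ROBOTS_TXT = """\
User-agent: TurnitinBot
Disallow: /

User-agent: *
Disallow: /index.php
"""


def allowed(agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the assumed rules."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(ASSUMED_ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    urls = [
        "http://www.peelified.com/",
        "http://www.peelified.com/index.php",
        "http://www.peelified.com/index.php?topic=1",
    ]
    for url in urls:
        # "ExampleBot" is a placeholder agent that falls under "User-agent: *".
        print(url, allowed("ExampleBot", url))
```

Under these assumed rules the site root is allowed while anything starting with /index.php is refused, which would be consistent with the index.php address tried earlier in the thread being rejected.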

The Turnitinbot block dates from 2002 or so. It was eating up bandwidth back then without offering anything to us.

I'm not sure why the archive thinks that it cannot index. The only thing I don't want indexed really is the offtopic section.
totalnerduk

DOOP Ubersecretary
**
« Reply #3 on: 02-01-2011 16:42 »

Which, funnily enough, is the part that people would probably be looking for if they're searching for PEEL specifically, rather than just a Futurama site. :P
Gopher

Fallback Guy
Space Pope
****
« Reply #4 on: 02-01-2011 17:36 »

Also the part that would take up the largest slice of PEEL's bandwidth to index.
[-mArc-]

Administrator
Liquid Emperor
**
« Reply #5 on: 02-01-2011 18:06 »

Anyway. I saw what the archive's problem is and did something that may or may not fix it for the future.
jabalong

Bending Unit
***
« Reply #6 on: 02-02-2011 13:46 »

Cool, with any luck someone 1,000 years from now in New New York is browsing this page on the archive and thanking you. And if not, well it was worth a try. Thanks.

;)
Legal Notice & Disclaimer: "Futurama" TM and copyright FOX, its related entities and the Curiosity Company. All rights reserved. Any reproduction, duplication or distribution of these materials in any form is expressly prohibited. As a fan site, this Futurama forum, its operators, and any content on the site relating to "Futurama" are not explicitly authorized by Fox or the Curiosity Company.