BWE Archive
I did some BWE/MSN archive testing last night with a program called HTTrack. It took about 5 minutes to build up ~15 megabytes of files, and it looks pretty thorough in vacuuming everything up. Unfortunately, I'm not sure how usable the result will be once I complete the "backup". It will be an exact copy of the entire site, so I'm not sure what the legal situation would be if we just re-posted it. We'll probably have to do some additional processing after the archive is created to strip the content out of the MSN layout and into something more defensible. Not impossible, but not trivial either.
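For what it's worth, the "strip the content out of the MSN layout" step could look roughly like the sketch below. It is only an illustration: the class name "post-body" is a made-up placeholder, since I haven't mapped the real MSN Groups markup, and the sample page in the usage note is invented.

```python
from html.parser import HTMLParser


class PostExtractor(HTMLParser):
    """Collect the text inside every <div class="post-body"> element.

    "post-body" is a hypothetical placeholder, not the real MSN markup.
    """

    def __init__(self, target_class="post-body"):
        super().__init__()
        self.target_class = target_class
        self.depth = 0        # > 0 while we are inside a target div
        self.posts = []       # one cleaned-up string per post found
        self._chunks = []     # text fragments of the post being read

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1   # a nested div inside the post body
        elif self.target_class in (dict(attrs).get("class") or "").split():
            self.depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                # Collapse runs of whitespace left over from the layout.
                self.posts.append(" ".join("".join(self._chunks).split()))

    def handle_data(self, data):
        if self.depth:
            self._chunks.append(data)


def extract_posts(html):
    """Return the text of each post body found in one saved page."""
    parser = PostExtractor()
    parser.feed(html)
    parser.close()
    return parser.posts
```

Running `extract_posts` over each page HTTrack saved would keep only the post text and drop the navigation, ads and other layout around it. Author names and dates would need similar handling once the real markup is known.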
- JimHow
- Posts: 20672
- Joined: Thu Nov 20, 2008 10:49 pm
- Location: Lewiston, Maine, United States
- Contact:
Re: BWE Archive
I think we should put you in charge of that project, Michael. I'll do whatever you recommend. You can work with Phil, Tom, Billo, whomever you think can help. Jim
Re: BWE Archive
Was it Billo who set up the BWE search? His program does pretty much exactly what we want to do now. I would think he could change it so that it indexes entire posts and allows searching/viewing without linking back to the original post.
Is he still in touch, Jim?
If we can get an archive of the site in a database, I could set it up here so we can use it.
One other avenue is a program I've been working on in my spare time which indexes websites. It's still in its early stages but could be modified to index BWE with a bit of work.
- JimHow
- Posts: 20672
- Joined: Thu Nov 20, 2008 10:49 pm
- Location: Lewiston, Maine, United States
- Contact:
Re: BWE Archive
Yes Billo is still a lurker on BWE.
He posted a message a few weeks back giving his email and asking that you contact him, Phil. I'll try to find that message. It was in the thread I started announcing that MSN Groups was closing.
Re: BWE Archive
I've just checked my emails: I sent you one in December asking for his email address. Did you get that?
- JimHow
- Posts: 20672
- Joined: Thu Nov 20, 2008 10:49 pm
- Location: Lewiston, Maine, United States
- Contact:
Re: BWE Archive
I found the message where he asked me to ask you to contact him by email, but he didn't leave his email. His most recent screenname was agentbillo, but he no longer seems to have a profile.
Re: BWE Archive
I've just found his website from the BWE search and contacted him via that, I'll let you know what he says.
Re: BWE Archive
I'm not sure what format Billo has the archive in, but there's a fair chance it'll be usable. I worked some more with the archive program this weekend and I'm not sure it's going to do the job. It grabs too many of the wrong files and takes a long time. A while back I wrote my own Java-based thread crawler to find TN threads, so I'll see if I can use that to make an archive. The advantage there would be that I can do the archiving and the processing into a new format all at once.
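To give an idea of what "archiving and processing all at once" means, here is a rough sketch of the crawl loop. It is in Python rather than Java to keep it short, and it is not my actual crawler. The `fetch`, `is_thread` and `process` callables are hypothetical hooks: one downloads a page, one decides whether a URL is a thread worth keeping, and one converts and stores it.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Gather the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_url, fetch, is_thread, process, limit=1000):
    """Breadth-first crawl that converts thread pages as it finds them.

    fetch(url) returns the page HTML, is_thread(url) says whether the
    page should be archived, and process(url, html) stores it in the
    new format. All three are supplied by the caller (assumed hooks).
    Returns the number of pages fetched.
    """
    seen = {start_url}
    queue = deque([start_url])
    visited = 0
    while queue and visited < limit:
        url = queue.popleft()
        html = fetch(url)
        visited += 1
        if is_thread(url):
            process(url, html)          # archive and convert in one pass
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            target = urljoin(url, href)
            # Stay under the start URL and never queue a page twice.
            if target.startswith(start_url) and target not in seen:
                seen.add(target)
                queue.append(target)
    return visited
```

The `is_thread` filter is the part that avoids picking up "the wrong files": pages are still followed for their links, but only thread pages are handed to `process`. A real run would also want polite delays and error handling around `fetch`, which are left out here.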
Re: BWE Archive
I've contacted Bill and I think he's going to change his BWE search site so it returns full posts instead of links to MSN. He also should be sending me the "directory tree" so I can access the raw data to see if I can do anything with it. I say "think" because I've not heard back from him since last night, but it sounded like he'd be OK to do that.
I'd hold off on trying anything for now, Michael, while I see what comes of this. Thanks.
Re: BWE Archive
Ignore me if I'm not making sense here - you've probably forgotten more about web stuff than I will ever know.
Won't the full posts disappear with MSN communities?
I kind of like the abbreviated results - makes it easy to search through a long list without slogging through a bunch of posts.
Re: BWE Archive
The posts have been copied by Bill's system, so they'll survive the demise of MSN Groups. It's just that his site currently returns search results that link to the original post, so I'm asking him to change it to show the entire post. That way it won't be necessary to go to MSN to view it (which will of course be impossible soon).
Re: BWE Archive
Aha! Where's the light bulb smilie?
- JimHow
- Posts: 20672
- Joined: Thu Nov 20, 2008 10:49 pm
- Location: Lewiston, Maine, United States
- Contact:
Re: BWE Archive
That's obviously going to be awesome if we can save the nearly 400,000 messages in the BWE community currently on MSN!
- Joe Belmaati
- Posts: 7
- Joined: Mon Jan 26, 2009 9:28 pm
- Contact:
Re: BWE Archive
@Tech guys: You definitely want to tap into Billo's mirror of BWE. He has already written a program that dumps the data into phpBB format. You can then use the search function right here in the forum software, and it will allow you to search up, down, left, right, in circular motion and... well, you get the picture. I commend the tech guys on going with a phpBB installation. It's amazing what can be had for nothing.
Re: BWE Archive
My fingers are crossed that Billo's work has everything backed up already. However, since we'll lose the opportunity sometime in February, I'll continue my efforts as a backup plan.
Re: BWE Archive
Joe Belmaati wrote: I commend the tech guys on going with a phpBB installation. It's amazing what can be had for nothing.
I've been more impressed with phpBB than any other piece of open-source software I've used. Really powerful, usable and with lovely well-written and well-commented code.