I’ve just spent a nice Sunday evening looking through old messages in this forum. It struck me that the posts are generally of very high quality, so I wonder if they are saved in any way? I can see that there are about 2,000 messages saved, going back to February 2002, where they suddenly stop.
It would be a massive loss for coming rose breeders if all the knowledge and answers were lost. I have also noticed that searches of the messages do not return results more than a year or two back?
Just a thought
It’s a good point. I’ve seen forums wiped by hacks. However, I’ve never seen it done to a horticultural forum.
This link will take you from February 2003 to March 2001. Ignore the posts at the top.
I wonder if it would be possible to put all of these discussions on a CD with a search function (including the older posts on rosemania).
If anyone has suggestions on how it might be possible to make a “backup” CD of messages on a forum, I’d like to hear of it. I’m now the moderator of a list that had its storage space abruptly reduced, and many of the older messages were lost. Several people downloaded individual messages, but no “backup” CD was ever produced. Software to automate the process would be a blessing.
This forum is written in PHP, uses a MySQL database, and runs on an Apache HTTP Server. The database and PHP code are backed up frequently. It wouldn’t be difficult to put the database and PHP code on a CD, but using it at home would require you to install and run Apache, PHP, and MySQL on your home PC. Various groups have been working on making those installations easier, but I don’t know of a really easy way to do it yet. If anyone knows of one, please let me know.
Jim, the RHA site is in basically the same situation as RC. We back up RC the same way, and searching the data itself without the forum interface pieces, for lack of a better way of putting it, just really can’t be done at this point.
There is a way to do it, at least here on the RHA - I use a free offline browsing program called HTTrack Website Copier. I just gave it a test spin, tweaking the preferences to omit pictures and to leave out the email-to pages (with a -www.rosehybridizers.org/forum/mail.php* filter string), and it took just over 16 minutes to download over DSL, taking up about 30 MB on the hard drive.
Stefan, that looks like a cool option. How does it run?
Basically, you just tell the program what page to start on (like http://www.rosehybridizers.org/index.php) and then tell it what you don’t want it to download (like .jpg and .gif images), then let it rip. It runs through the site like a bot, grabbing everything it encounters through the links within the site.
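The link-following behavior described above can be sketched in Python - purely a hypothetical illustration of how such a bot decides what to mirror, not HTTrack's actual code. Given one page's HTML, it collects the same-site links to visit next, skipping images and excluded paths like the mail.php pages:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkCollector(HTMLParser):
    """Collect href targets from <a> tags, as a site-mirroring bot would."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def links_to_follow(page_url, page_html,
                    skip_exts=(".jpg", ".gif"),
                    skip_prefixes=("mail.php",)):
    """Resolve the links on a page and keep only same-site pages worth mirroring."""
    site = urlparse(page_url).netloc
    parser = LinkCollector()
    parser.feed(page_html)
    keep = []
    for href in parser.links:
        url = urljoin(page_url, href)            # resolve relative links
        parts = urlparse(url)
        if parts.netloc != site:                 # stay inside the site
            continue
        if parts.path.lower().endswith(skip_exts):   # omit pictures
            continue
        name = parts.path.rsplit("/", 1)[-1]
        if name.startswith(skip_prefixes):       # honor exclusion filters
            continue
        keep.append(url)
    return keep
```

A real mirroring tool then fetches each kept URL, saves it to disk, and repeats the same step on the new pages until nothing unvisited remains.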
All the pages are stored as html documents on your computer - the software puts everything neatly into folders according to the address of the page you started with (so in the directory I asked to have it create this copy, I have a “www.rosehybridizers.org” subfolder, with a “forum” subfolder, in which all the files sit). You can use the internal browsing links just as you normally would to jump to the next page, read posts, etc. The program creates a “front-end” index.html file, but you can bypass that and go straight to the real one.
Unfortunately, you can’t use the “search” function to actually search the stored data - it will go live and search the website online - but you can search the files for keywords using whatever file-management tools are on your computer (for instance, Windows Explorer can do this). Also, the file names aren’t such that you could deduce which numbered thread will be displayed when you click on one - although it’s possible to bookmark threads in your browser once you’ve gotten there, or create some other kind of reference key if you want to jump ahead in the results.

It would be nice to have server software and be able to search the messages conventionally, but for anyone looking simply to preserve the basic data, structure, and look of the web site, this is a pretty quick and easy option. Trying to save the pictures could make the process much more cumbersome and take up huge amounts of space, and from what I could tell, Photobucket detects bots and may block attempts to download pictures stored on its site; I would recommend not saving pictures at all.
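The file-manager keyword search mentioned above can also be done with a short script. This is a sketch that assumes the mirrored pages sit as .html files under one root folder, as in the HTTrack layout described earlier:

```python
import os

def search_mirror(root, keyword):
    """Return paths of saved .html files whose contents mention the keyword."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith((".html", ".htm")):
                continue
            path = os.path.join(dirpath, name)
            # Case-insensitive match; ignore stray encoding problems in old pages.
            with open(path, encoding="utf-8", errors="ignore") as fh:
                if keyword.lower() in fh.read().lower():
                    matches.append(path)
    return matches
```

For example, search_mirror("www.rosehybridizers.org", "rootstock") would list every saved page that mentions rootstocks, which you could then open in a browser.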
Given the database, and assuming it isn’t overly complex, I could certainly dump the contents onto a CD or DVD as standard HTML. You would not need a web server then.
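A dump like that could be a small script that reads each thread from the database and writes a plain HTML page per topic. Here is a rough sketch using Python's built-in sqlite3 as a stand-in for MySQL; the forum's real schema is unknown, so the messages table and its columns here are hypothetical:

```python
import html
import os
import sqlite3

def dump_forum_to_html(db_path, out_dir):
    """Write one static HTML page per topic; readable without any web server."""
    os.makedirs(out_dir, exist_ok=True)
    conn = sqlite3.connect(db_path)
    topics = conn.execute("SELECT DISTINCT topic_id FROM messages")
    for (topic_id,) in topics.fetchall():
        rows = conn.execute(
            "SELECT author, posted, body FROM messages "
            "WHERE topic_id = ? ORDER BY posted", (topic_id,))
        parts = ["<html><body>"]
        for author, posted, body in rows:
            # Escape message text so stray < and > don't break the page.
            parts.append("<p><b>%s</b> (%s)<br>%s</p>"
                         % (html.escape(author), html.escape(posted),
                            html.escape(body)))
        parts.append("</body></html>")
        with open(os.path.join(out_dir, "topic%d.html" % topic_id), "w") as fh:
            fh.write("\n".join(parts))
    conn.close()
```

The resulting folder of topic pages could be burned straight to CD, and an index page linking to each topic could be generated the same way.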