Rosetta servers down?

Jeffers
Active UBT Contributor 15+ yrs
Posts: 1627
Joined: Mon Jul 24, 2006 1:00 am
Location: Halifax, West Yorks.

Post by Jeffers »

I've noticed that the Rosetta servers look to be down. Completed tasks are failing to upload and I get a "site not available" message for boinc.bakerlab.org/rosetta.

Anyone know what's happening there?
Rotwang1985
Posts: 117
Joined: Sat Dec 19, 2009 12:00 am

Post by Rotwang1985 »

No idea what's happening, mate. Rosetta have a Twitter account now but don't use it for service announcements, just publicity-type stuff: http://twitter.com/rosettaathome

The whole boinc.bakerlab site seems to be down.

It's annoying cos it's the only project I run, so I'm just wasting CPU cycles at the minute lol
Rotwang1985
Posts: 117
Joined: Sat Dec 19, 2009 12:00 am

Post by Rotwang1985 »

Looks like she's up and running as of sometime after midnight:
Thu  9 Dec 23:54:59 2010 | rosetta@home | Message from rosetta@home: Project is temporarily shut down for maintenance
Fri 10 Dec 00:55:01 2010 | rosetta@home | Sending scheduler request: To report completed tasks.
Fri 10 Dec 00:55:01 2010 | rosetta@home | Reporting 1 completed tasks, requesting new tasks for CPU
Fri 10 Dec 00:55:04 2010 | rosetta@home | Scheduler request completed: got 1 new tasks
Fri 10 Dec 00:55:06 2010 | rosetta@home | Started download of input_mem_prog_run05_centroid_round01_E_subrun_000002_yfsong.zip
Explanation from Rosetta@Home website:
News  
Dec 9, 2010
Today's heroes are Keith and Darwin, our systems administrators and hardware architects. Yesterday, our main filesystem crashed hard. There were warning lights flashing behind every disk on the SAN and it looked pretty grim. Thankfully, Keith and Darwin were able to pinpoint the problem to two redundant laser modules for the fiber optic loops (it was amazing and unlucky that both failed). The laser modules have since been replaced and the filesystem is back up. We'll be starting up the project again shortly. Thanks for your patience.
Check the website for more info: http://boinc.bakerlab.org/rosetta/
Jeffers
Active UBT Contributor 15+ yrs
Posts: 1627
Joined: Mon Jul 24, 2006 1:00 am
Location: Halifax, West Yorks.

Post by Jeffers »

Yep, I've been able to upload the outstanding completed tasks. No new tasks yet, though....
matt40k
Posts: 10
Joined: Wed Dec 29, 2010 12:00 am

Post by matt40k »

Is it down again?
Rotwang1985
Posts: 117
Joined: Sat Dec 19, 2009 12:00 am

Post by Rotwang1985 »

Annoyingly it's down at the minute, and I'm almost out of work ...

I've started using this site to answer that age-old question of 'Is it just me?'

http://www.isup.me
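For anyone who would rather script the check, here is a minimal Python sketch. The function name `site_is_up` and the "anything below HTTP 500 counts as up" rule are my own assumptions, not anything Rosetta or isup.me publishes. Note that it only tests from your own machine, so unlike an outside checker it cannot tell you whether the problem is at your end.

```python
import urllib.error
import urllib.request


def site_is_up(url, fetch=urllib.request.urlopen, timeout=10):
    """Return True if the server answers sensibly, False if it cannot be reached.

    `fetch` is a hypothetical hook (defaulting to urlopen) so the check can be
    exercised without touching the network.
    """
    try:
        with fetch(url, timeout=timeout) as response:
            # Assumption: any status below 500 means the server is alive.
            return response.status < 500
    except urllib.error.HTTPError as err:
        # The server replied with an error page; 5xx suggests it is struggling.
        return err.code < 500
    except OSError:
        # Covers URLError, DNS failures, refused connections and timeouts.
        return False
```

Calling `site_is_up("http://boinc.bakerlab.org/rosetta/")` from a Python prompt should then give a quick True/False before you start poking at BOINC Manager.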
Rotwang1985
Posts: 117
Joined: Sat Dec 19, 2009 12:00 am

Post by Rotwang1985 »

Managed to report tasks and download new work.
matt40k
Posts: 10
Joined: Wed Dec 29, 2010 12:00 am

Post by matt40k »

Looks back up, no post about why it went down.

My stats haven't updated publicly. I've got over 60k on the Rosetta website, but only 40k publicly. Maybe they're having a laugh with me ;)
Rotwang1985
Posts: 117
Joined: Sat Dec 19, 2009 12:00 am

Post by Rotwang1985 »

It seems there are some problems with the various servers/systems that administer the Rosetta project.

http://boinc.bakerlab.org/rosetta/rah_status.php

I'll be honest, I love the Rosetta project and its goals, but every time there's a US school/public holiday it goes offline.

I imagine something along the lines of a gerbil-powered server that doesn't get fed when no one is there and dies, screwing the whole system, only to be replaced by another gerbil :S

As for someone telling the users there is a problem with the equipment, they never do until afterwards: "Sorry for outage, RAID Backup Widget server PSU failed" etc. They have a Twitter feed but only use it for publicity and good news. It would be nice if they had engineers just posting a quick note somewhere, or maybe they just want us to check the server status link above.

Keep Crunching !!!
Jeffers
Active UBT Contributor 15+ yrs
Posts: 1627
Joined: Mon Jul 24, 2006 1:00 am
Location: Halifax, West Yorks.

Post by Jeffers »

They look to be offline again at the moment. I've got completed WUs waiting to report but no current work from them.
primalsole
Posts: 70
Joined: Fri Dec 17, 2010 12:00 am

Post by primalsole »

Yes, same here. I hope they hurry up and get it sorted as they have been fairly unreliable over the last couple of weeks.
matt40k
Posts: 10
Joined: Wed Dec 29, 2010 12:00 am

Post by matt40k »

Looks like the kit also needs replacing. My work's PE2850s are over 4/5 years old and have been replaced, and the PE2650s they use must be WAAY out of date, unless they didn't keep that webpage updated?

Shame it's down, yet again. Nice project, just needs some more love from them.
matt40k
Posts: 10
Joined: Wed Dec 29, 2010 12:00 am

Post by matt40k »

Update from R@H:

The project's fileserver has crashed. We're working to get things back online as soon as possible. Thanks for your patience. -KEL 01/06/2011

(So that's 06/01/2011 in UK day/month format)
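If the US-style dates on the project notices keep tripping you up, the swap can be sketched in a few lines of Python. The helper name `us_to_uk` is made up for illustration, and it assumes the date is written month/day/four-digit-year as in the notice above.

```python
from datetime import datetime


def us_to_uk(date_text):
    """Reformat a US month/day/year date string as UK day/month/year."""
    # Parsing first (rather than just swapping fields) rejects impossible dates.
    parsed = datetime.strptime(date_text, "%m/%d/%Y")
    return parsed.strftime("%d/%m/%Y")


print(us_to_uk("01/06/2011"))  # 06/01/2011
```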
Rotwang1985
Posts: 117
Joined: Sat Dec 19, 2009 12:00 am

Post by Rotwang1985 »

I've shut down my headless Ubuntu cruncher until they get sorted, to save a few leccy pennies.

If the fileserver is being rebuilt from a backup, it might be done in the morning, or someone might have to run to [insert US PC World equivalent] to get some kit.
matt40k
Posts: 10
Joined: Wed Dec 29, 2010 12:00 am

Post by matt40k »

It's back :)
Rotwang1985
Posts: 117
Joined: Sat Dec 19, 2009 12:00 am

Post by Rotwang1985 »

My communications to the project, via BOINC Manager, are still being deferred by 30 mins ... :(

But the website is working and things would appear to be getting better.
http://boinc.bakerlab.org/rosetta/rah_status.php
mik9dt
Posts: 3
Joined: Sat Jan 08, 2011 12:00 am

Number 5 is alive!!!

Post by mik9dt »

or rather Rosetta is a full strip of green at
http://boinc.bakerlab.org/rosetta/rah_status.php

Good news... :D
UBT - Rick Horn
Posts: 17206
Joined: Sat May 06, 2006 1:00 am

Post by UBT - Rick Horn »

New work has arrived, but all WUs say "download failed".  :(
UBT - Rick Horn
Posts: 17206
Joined: Sat May 06, 2006 1:00 am

Post by UBT - Rick Horn »

New WUs are downloading and running normally.  :D
Rotwang1985
Posts: 117
Joined: Sat Dec 19, 2009 12:00 am

Post by Rotwang1985 »

I'm still struggling, Rick. WUs are ready to report but retrying in 10 hrs ... grrr
UBT - Rick Horn
Posts: 17206
Joined: Sat May 06, 2006 1:00 am

Post by UBT - Rick Horn »

Rotwang1985 wrote: I'm still struggling, Rick. WUs are ready to report but retrying in 10 hrs ... grrr
At least things are improving slowly. I'm sure they will sort things out soon, we hope!  :roll:
Jeffers
Active UBT Contributor 15+ yrs
Posts: 1627
Joined: Mon Jul 24, 2006 1:00 am
Location: Halifax, West Yorks.

Post by Jeffers »

Still not working for me either.....
Rotwang1985
Posts: 117
Joined: Sat Dec 19, 2009 12:00 am

Post by Rotwang1985 »

From the Rosetta homepage:
Jan 7, 2011
Well, our luck ran out. The SAN controller that has been causing so much trouble in the last few months finally tipped over in a rather destructive fashion, corrupting the binary tree on which the filesystem is based. We're trying to rebuild the thing but the sheer number of files in the filesystem (> 10M files) makes this process very, very slow. We're bringing the project up from a recent backup (12/09/10) but the backup wasn't a perfect replica of the environment, so we're having to scramble to get all the parts working together again. We only need a few more weeks and then our new, next-generation SAN will be ready to be put into place... I just thought the old one would last a few more weeks. I apologize for the hassle and appreciate your patience as we get things online again... KEL 01/07/11
http://boinc.bakerlab.org/rosetta/
UBT - Rick Horn
Posts: 17206
Joined: Sat May 06, 2006 1:00 am

Post by UBT - Rick Horn »

Just started running Rosetta after a break of nearly 5 years.
Compared to Rosetta, WCG is positively bountiful.  :roll:

Edit: Aborted the rest of them.
matt40k
Posts: 10
Joined: Wed Dec 29, 2010 12:00 am

Post by matt40k »

Still got a nice backlog of files to be uploaded, not uploading, no new data either, not good :(

EDIT:
http://boinc.bakerlab.org/rosetta/forum ... 5028#62879

Result = me gutted.
UBT - Rick Horn
Posts: 17206
Joined: Sat May 06, 2006 1:00 am

Post by UBT - Rick Horn »

I had 3 WUs stuck as "uploading" for at least 4 days. I was tempted to delete them, but didn't.
This morning, they uploaded themselves, so don't be too disheartened, all may well be OK.
Jeffers
Active UBT Contributor 15+ yrs
Posts: 1627
Joined: Mon Jul 24, 2006 1:00 am
Location: Halifax, West Yorks.

Post by Jeffers »

They seem to be getting there, if rather slowly!
The completed WUs I had waiting have gone, but I've not had any new work from them.....
matt40k
Posts: 10
Joined: Wed Dec 29, 2010 12:00 am

Post by matt40k »

Seems to have uploaded the backlog, new work coming down.