Why are CASP9 tasks erroring out 50% of the time?

Message boards : Number crunching : Why are CASP9 tasks erroring out 50% of the time?

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 72650 - Posted: 3 Apr 2012, 19:50:57 UTC

I shut down my overclocking program and am running a plain system now. I have other projects running, but why would they interfere with this task? They have their own cores. Right now Rosie is using 2 cores and the other projects are using the other 2 cores on my CPU.

It is weird that my wingman can run the tasks that bombed on my system. One wingman has an Intel Core i3 and I have the Intel Core 2 series. He runs Win7 and I run XP.
So why does it bomb on my system, yet he runs it OK?
ID: 72650 · Rating: 0
dcdc
Joined: 3 Nov 05
Posts: 1832
Credit: 119,655,464
RAC: 11,085
Message 72652 - Posted: 3 Apr 2012, 20:11:48 UTC - in response to Message 72650.  

Hi Greg

What graphics driver version are you running?
ID: 72652 · Rating: 0
Greg_BE
Message 72655 - Posted: 3 Apr 2012, 23:29:27 UTC - in response to Message 72652.  

Just the onboard Intel G35 chipset, which is up to date.
How would graphics have anything to do with a work unit crashing while crunching on a CPU? I do not have the screen saver enabled on my system. And some of these crashes happened while the system was not in use and I was asleep.
ID: 72655 · Rating: 0
dcdc
Message 72658 - Posted: 4 Apr 2012, 10:16:46 UTC - in response to Message 72655.  

The GPU driver version seems to affect whether Rosetta is able to complete tasks successfully: https://boinc.bakerlab.org/rosetta/forum_thread.php?id=5914

ID: 72658 · Rating: 0
Greg_BE
Message 72665 - Posted: 5 Apr 2012, 0:14:18 UTC - in response to Message 72658.  

Not using the GPU. Only the standard onboard video driver and a quad-core CPU.
ID: 72665 · Rating: 0
dcdc
Message 72670 - Posted: 5 Apr 2012, 8:20:59 UTC - in response to Message 72665.  

From having skim-read that thread a few times, I don't think it matters whether you're using the GPU (on other BOINC projects) or not. It sounds like there's a relationship between the video driver, BOINC and Rosetta, although I've only seen mention of Nvidia and AMD GPUs.

I don't seem to be having problems on this machine and have BOINC set to 'Suspend GPU' so you could give that a go...
ID: 72670 · Rating: 0
Greg_BE
Message 72676 - Posted: 5 Apr 2012, 18:55:04 UTC - in response to Message 72670.  

Anything to do with the GPU is shut off.
Maybe the new version of Rosie will help with this problem.
ID: 72676 · Rating: 0
Greg_BE
Message 72678 - Posted: 5 Apr 2012, 18:57:48 UTC - in response to Message 72676.  

Caught Roco while he was reading the 3.24 thread

My understanding is that some of the runs with the very new hybridize protocol (which are mainly being sent out as CASP9 and rb_ runs) hit an edge case that results in numerical instability and range errors in calculations for a substantial fraction of workunits for particular protein systems. The crashes were happening somewhat randomly, so it makes sense that the next person on the same workunit could complete it fine.
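
To make the "somewhat random" part concrete, here is a rough toy sketch in Python. It is not Rosetta code and says nothing about the actual hybridize bug. The names run_task and failure_rate, the step count and the spread value are all made up for illustration, and it assumes each run draws its own random numbers. Each run evaluates an exponential of a random value, and a run fails only when one of its draws happens to land outside the range a double can hold.

```python
import math
import random


def run_task(seed, steps=75, spread=300.0):
    """Toy stand-in for one stochastic run of a workunit (hypothetical, not Rosetta code).

    Returns True if every step completes, False if a step hits a range error.
    """
    rng = random.Random(seed)  # each run gets its own random sequence
    for _ in range(steps):
        exponent = rng.gauss(0.0, 1.0) * spread
        try:
            math.exp(exponent)  # raises OverflowError when the result exceeds a double
        except OverflowError:
            return False  # the "math range error" that would end the run
    return True


def failure_rate(seeds):
    """Fraction of runs in `seeds` that hit the range error."""
    results = [run_task(seed) for seed in seeds]
    return results.count(False) / len(results)


if __name__ == "__main__":
    print(f"failure rate over 200 runs: {failure_rate(range(200)):.0%}")
```

With these made-up parameters roughly half of the runs should fail, and which ones fail depends only on the random draws, not on the machine. Under that assumption, one host can error out on a workunit while the wingman finishes it.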

The issue should hopefully be fixed in the new version of Rosetta@home we are currently testing on Ralph@home.
ID: 72678 · Rating: 0

©2024 University of Washington
https://www.bakerlab.org