Current issues with 7+ boinc client


Mad_Max

Joined: 31 Dec 09
Posts: 209
Credit: 25,910,661
RAC: 11,581
Message 74824 - Posted: 2 Jan 2013, 14:42:45 UTC

2 mikey
Hmm. It seems both computers run just fine now: more than 50 successful WUs each and no errors.
Did these computers also have a 100% error rate before?
mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,156,146
RAC: 3,952
Message 74825 - Posted: 2 Jan 2013, 15:15:51 UTC - in response to Message 74824.  

> 2 mikey
> Hmm. It seems both computers run just fine now: more than 50 successful WUs each and no errors.
> Did these computers also have a 100% error rate before?


Yes YOU ARE CORRECT, something is different but I do not know what, but it IS working now!! I DO have a pc with an Nvidia card and in a week or so will bring it over for a test. That should confirm or refute your idea that it is an Nvidia gpu card that is causing the problems. So far you have been 100% correct!!
Chilean
Joined: 16 Oct 05
Posts: 711
Credit: 26,694,507
RAC: 0
Message 74840 - Posted: 4 Jan 2013, 23:07:25 UTC

Any news on this bug? Or do you still have to strip your PC of its GPU in order to crunch here?
Mad_Max

Joined: 31 Dec 09
Posts: 209
Credit: 25,910,661
RAC: 11,581
Message 74843 - Posted: 5 Jan 2013, 0:53:30 UTC
Last modified: 5 Jan 2013, 0:54:07 UTC

2 Chilean

Yes, new info: reverting to older GPU drivers may help too.
Read https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6163 for more details.
Sid Celery

Joined: 11 Feb 08
Posts: 2122
Credit: 41,183,435
RAC: 10,025
Message 76438 - Posted: 17 Feb 2014, 6:43:15 UTC
Last modified: 17 Feb 2014, 6:49:32 UTC

Not sure where to post this, but I hope you can help.

I just upgraded Boinc from v7.2.33 to v7.2.39 and just as it was installing a task was completing and managed to get itself stuck at the uploading stage, as shown in this murky image

Stuck Task

I've tried updating and even aborting it, but it doesn't seem to want to move on, so that instead of 8 tasks I'm only running 7.

[Edit: Oops, I am running 8 tasks, but the stuck task is still stuck]

Anyone have an idea how to fix this?
Sid Celery

Joined: 11 Feb 08
Posts: 2122
Credit: 41,183,435
RAC: 10,025
Message 76449 - Posted: 19 Feb 2014, 2:42:29 UTC

Me again...

Now I have another issue. I haven't been able to download from Rosetta for 39 hours due to "some task is suspended via Manager" as shown here.

19/02/2014 01:01:39 | rosetta@home | update requested by user
19/02/2014 01:01:41 | rosetta@home | Sending scheduler request: Requested by user.
19/02/2014 01:01:41 | rosetta@home | Not requesting tasks: some task is suspended via Manager
19/02/2014 01:10:39 | rosetta@home | update requested by user
19/02/2014 01:10:41 | rosetta@home | Sending scheduler request: Requested by user.
19/02/2014 01:10:41 | rosetta@home | Not requesting tasks: some task is suspended via Manager
19/02/2014 02:21:36 | rosetta@home | update requested by user
19/02/2014 02:21:38 | rosetta@home | Sending scheduler request: Requested by user.
19/02/2014 02:21:38 | rosetta@home | Not requesting tasks: some task is suspended via Manager
19/02/2014 02:21:40 | rosetta@home | Scheduler request completed


Could this be some corruption connected with the task that's still stuck at the uploading stage or something else? For what it's worth, I'm still able to download tasks from WCG - 29 have just come down to fill my buffer.

Any ideas at all?

Note also, this is only happening on my desktop machine, not my laptop running under the same name.
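
For anyone who wants to check a saved copy of the event log for the same symptom, here is a minimal Python sketch. It assumes the lines follow the "date time | project | message" layout shown in the excerpt above; the function name and the sample text are made up for illustration and are not part of BOINC.

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpt above:
#   "19/02/2014 01:01:41 | rosetta@home | Not requesting tasks: <reason>"
LINE_RE = re.compile(r"^(?P<when>\S+ \S+) \| (?P<project>[^|]*?) \| (?P<msg>.*)$")
PREFIX = "Not requesting tasks: "


def blocked_fetch_reasons(log_text):
    """Count the reasons given for not requesting work, keyed by (project, reason)."""
    reasons = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank lines and anything not in the assumed layout
        msg = match.group("msg").strip()
        if msg.startswith(PREFIX):
            key = (match.group("project").strip(), msg[len(PREFIX):])
            reasons[key] += 1
    return reasons


# Sample lines copied from the log excerpt in this post.
SAMPLE = """\
19/02/2014 02:21:36 | rosetta@home | update requested by user
19/02/2014 02:21:38 | rosetta@home | Sending scheduler request: Requested by user.
19/02/2014 02:21:38 | rosetta@home | Not requesting tasks: some task is suspended via Manager
19/02/2014 02:21:40 | rosetta@home | Scheduler request completed
"""

if __name__ == "__main__":
    for (project, reason), count in blocked_fetch_reasons(SAMPLE).items():
        print(f"{project}: {reason} (x{count})")
```

Grouping by reason makes it easier to see whether one project is blocked for the same reason on every update, as rosetta@home is here, or whether the reason changes over time.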
Snags

Joined: 22 Feb 07
Posts: 198
Credit: 2,888,320
RAC: 0
Message 76452 - Posted: 19 Feb 2014, 13:19:51 UTC - in response to Message 76438.  

> Not sure where to post this, but I hope you can help.
>
> I just upgraded Boinc from v7.2.33 to v7.2.39 and just as it was installing a task was completing and managed to get itself stuck at the uploading stage, as shown in this murky image
>
> Stuck Task
>
> I've tried updating and even aborting it, but it doesn't seem to want to move on, so that instead of 8 tasks I'm only running 7.
>
> [Edit: Oops, I am running 8 tasks, but the stuck task is still stuck]
>
> Anyone have an idea how to fix this?


Hi Sid, is the task still showing up in the transfers tab? When you tried aborting it, was that from the task tab or the transfers tab?

As for the "some task is suspended via Manager" message I assume you double checked the resume/suspend button is showing as "suspend" for all the rosetta tasks and, after that, shut down and restarted BOINC to see if that would reset any errant instructions.

I do have some vague memory of having to hunt down an orphaned task in a similar situation but I think that was a case of BOINC hanging on to a task it had in fact uploaded. It involved editing the state file though so hopefully your task won't require that.

Best, Snags
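
Before anyone edits the state file by hand, a read-only look may be enough to find the task that BOINC thinks is suspended. The sketch below is only that, a sketch: it assumes the state file (client_state.xml in the BOINC data directory) parses as XML, that tasks appear as `<result>` elements with a `<name>` child, and that a suspended task carries a `<suspended_via_gui/>` flag. Those element names are assumptions to check against your own file, and the sample data is invented.

```python
import xml.etree.ElementTree as ET


def suspended_results(state_xml):
    """Return the names of <result> entries that carry a <suspended_via_gui/> flag.

    The element names are assumptions about the state file layout.
    """
    root = ET.fromstring(state_xml)
    names = []
    for result in root.iter("result"):
        if result.find("suspended_via_gui") is not None:
            names.append(result.findtext("name", default="(unnamed)"))
    return names


# Hypothetical, heavily trimmed stand-in for a real state file.
SAMPLE_STATE = """\
<client_state>
  <result>
    <name>example_task_a_0</name>
  </result>
  <result>
    <name>example_task_b_0</name>
    <suspended_via_gui/>
  </result>
</client_state>
"""

if __name__ == "__main__":
    print(suspended_results(SAMPLE_STATE))
```

To try it on a real file, read a copy of the state file into a string and pass that in; working on a copy avoids touching anything the client is using.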
Sid Celery

Joined: 11 Feb 08
Posts: 2122
Credit: 41,183,435
RAC: 10,025
Message 76453 - Posted: 19 Feb 2014, 17:30:48 UTC - in response to Message 76452.  

>> Not sure where to post this, but I hope you can help.
>>
>> I just upgraded Boinc from v7.2.33 to v7.2.39 and just as it was installing a task was completing and managed to get itself stuck at the uploading stage, as shown in this murky image
>>
>> Stuck Task
>>
>> I've tried updating and even aborting it, but it doesn't seem to want to move on, so that instead of 8 tasks I'm only running 7.
>>
>> [Edit: Oops, I am running 8 tasks, but the stuck task is still stuck]
>>
>> Anyone have an idea how to fix this?
>
> Hi Sid, is the task still showing up in the transfers tab? When you tried aborting it, was that from the task tab or the transfers tab?

In the Tasks tab. I never spotted anything under the Transfers tab as I minimised Boinc Manager when I did the update. I don't explicitly know if it went through or not. It just appeared as uploading in the Tasks tab when I opened 7.2.39.

> As for the "some task is suspended via Manager" message I assume you double checked the resume/suspend button is showing as "suspend" for all the rosetta tasks and, after that, shut down and restarted BOINC to see if that would reset any errant instructions.

Yes, and I rebooted the computer for good measure. Still stuck :(

> I do have some vague memory of having to hunt down an orphaned task in a similar situation but I think that was a case of BOINC hanging on to a task it had in fact uploaded. It involved editing the state file though so hopefully your task won't require that.

I suspect I'm in that territory now. I'm just in the process of clearing down my remaining Rosetta tasks before resetting the project. If that doesn't work I'll come back and ask how to do what you've suggested. Probably from Sunday when I return home as I'm away from tonight until then.

Thanks for the suggestions.
Sid Celery

Joined: 11 Feb 08
Posts: 2122
Credit: 41,183,435
RAC: 10,025
Message 76454 - Posted: 19 Feb 2014, 19:06:07 UTC
Last modified: 19 Feb 2014, 19:24:38 UTC

Ok, I think resetting the project has done the trick - the task stuck at uploading disappeared. No Rosetta tasks came down yet, but from the look of this they should appear shortly:

19/02/2014 18:53:51 | rosetta@home | Computation for task cendecoy_2NS0A_1_abinitio_SAVE_ALL_OUT_140821_282_0 finished
19/02/2014 18:53:53 | rosetta@home | Started upload of cendecoy_2NS0A_1_abinitio_SAVE_ALL_OUT_140821_282_0_0
19/02/2014 18:53:58 | rosetta@home | Finished upload of cendecoy_2NS0A_1_abinitio_SAVE_ALL_OUT_140821_282_0_0
19/02/2014 18:54:00 | rosetta@home | Sending scheduler request: To report completed tasks.
19/02/2014 18:54:00 | rosetta@home | Reporting 1 completed tasks
19/02/2014 18:54:00 | rosetta@home | Not requesting tasks: some task is suspended via Manager
19/02/2014 18:54:03 | rosetta@home | Scheduler request completed

19/02/2014 18:58:12 | rosetta@home | Resetting project
19/02/2014 18:58:17 | rosetta@home | Master file download succeeded
19/02/2014 18:58:22 | rosetta@home | Sending scheduler request: To fetch work.
19/02/2014 18:58:22 | rosetta@home | Requesting new tasks for CPU and NVIDIA
19/02/2014 18:58:24 | rosetta@home | Scheduler request completed: got 0 new tasks
19/02/2014 18:58:24 | rosetta@home | No work sent

No mention of "Not requesting tasks: some task is suspended via Manager" after the reset

Edit: And confirmed. Panic over
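
For reference, a project reset can also be requested from a command line with boinccmd, the command-line tool that ships with the BOINC client. The minimal Python sketch below only builds and prints the command; it does not run it. The project URL is an assumption (use the URL your own client shows for the project), and the helper name is made up. Keep in mind that a reset discards tasks in progress, which is why the remaining tasks were cleared down first.

```python
import shlex

# Project operations this sketch allows; boinccmd accepts these after --project URL.
PROJECT_OPS = {"reset", "update", "suspend", "resume"}


def project_command(project_url, operation, boinccmd="boinccmd"):
    """Build (but do not run) a boinccmd project command as an argument list."""
    if operation not in PROJECT_OPS:
        raise ValueError(f"unsupported operation: {operation!r}")
    return [boinccmd, "--project", project_url, operation]


if __name__ == "__main__":
    # Assumed project URL; check the one your client is actually attached to.
    cmd = project_command("https://boinc.bakerlab.org/rosetta/", "reset")
    print(shlex.join(cmd))
```

Returning a list rather than a single string means the command can be handed to `subprocess.run` unchanged once you are happy with it.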
sgaboinc

Joined: 2 Apr 14
Posts: 282
Credit: 208,966
RAC: 0
Message 76858 - Posted: 21 Jun 2014, 16:44:39 UTC - in response to Message 74024.  

i'm using boinc-client 7.0.36 x86_64 on opensuse linux,
running R@H on an intel i7 4771 cpu (no opencl).
i'd just like to say that R@H jobs seem to run pretty well on recent intel haswell cpus. R@H is quite heavy on resources: i've seen about 3 gigs of disk usage for 8 concurrent jobs (it gets cleaned up afterwards) and about 400 megs of virtual mem per job. it seems to be somewhat network-bandwidth heavy as well; the downloads are perhaps 2-10 megs per job. not sure about the uploads though.
i'd think these resources are reasonable given that each job is, after all, solving a large and complex problem. these are not simple 'easy' molecules.

all in all it's still a worthwhile endeavor to participate.
given the resource requirements, if you are keen to run R@H, equip the PC with lots of *RAM* (i've installed some 16 gigs), have adequate spare disk space, e.g. allow some 3-5 gigs for R@H (this is less of a problem these days given that average low-cost disks easily come in terabyte sizes), and a *fast* cpu :o :p :D
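
as a rough back-of-the-envelope check, the figures above can be scaled to other job counts. the python sketch below assumes usage grows linearly with the number of concurrent jobs, which is an assumption and not something measured; the constants are just the observations quoted in this post and the function name is made up.

```python
# Figures quoted in this post (rough observations, not official requirements).
DISK_GB_FOR_8_JOBS = 3.0     # about 3 gigs of disk for 8 concurrent jobs
VIRT_MEM_MB_PER_JOB = 400    # about 400 megs of virtual memory per job


def estimate_resources(jobs):
    """Return (disk_gb, virtual_mem_mb) for `jobs` concurrent jobs.

    Assumes simple linear scaling from the 8-job observation.
    """
    disk_gb = DISK_GB_FOR_8_JOBS / 8 * jobs
    virtual_mem_mb = VIRT_MEM_MB_PER_JOB * jobs
    return disk_gb, virtual_mem_mb


if __name__ == "__main__":
    for jobs in (4, 8):
        disk_gb, mem_mb = estimate_resources(jobs)
        print(f"{jobs} jobs: ~{disk_gb:.1f} GB disk, ~{mem_mb} MB virtual memory")
```

for 8 jobs this gives about 3200 megs of virtual memory, which fits comfortably in 16 gigs of RAM.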

i tend to run it during the 'lull' periods, e.g. at night, so you may like to invest in a 'quiet' cpu cooling solution. i used one of those aftermarket 'tower' cpu coolers from coolermaster, e.g. http://www.coolermaster.com/cooling/hyper-series/hyper-212-evo/. there are also 'water cooling' solutions, which i find rather expensive and which i suspect need more maintenance, but the 'water coolers' could help in space-constrained cases.

--------------
i'm not familiar with opencl setups, however. i once tried litecoin mining and noted errors when the graphics cards were heavily loaded, which caused various mining threads to fail. hence opencl users on 7.2 and higher clients may like to check whether heavy gpu load is a reason for unstable jobs.

cpu-based runs for R@H are apparently pretty stable: the jobs downloaded mostly run end to end with no errors and are submitted.

just 2 cents comments

©2024 University of Washington
https://www.bakerlab.org