Errors while computing

Message boards : Number crunching : Errors while computing

Stevie G

Joined: 15 Dec 18
Posts: 107
Credit: 837,101
RAC: 1,615
Message 90017 - Posted: 17 Dec 2018, 21:54:58 UTC
Last modified: 17 Dec 2018, 21:56:27 UTC

I recently signed up for Rosetta. My computer did a few quick tasks, and then four tasks ended marked "Error while computing." What would be the cause of these errors?

In addition to Rosetta, this computer is running SETI@Home and Asteroids.

The erroneous tasks are copied below:
Task | Work unit | Computer | Sent | Time reported or deadline | Status | Run time (sec) | CPU time (sec) | Credit | Application
1048115564 | 944100364 | 3551508 | 17 Dec 2018, 18:43:57 UTC | 17 Dec 2018, 19:42:19 UTC | Error while computing | 0.00 | 0.00 | --- | Rosetta Mini v3.78 windows_intelx86
1048115417 | 944100228 | 3551508 | 17 Dec 2018, 18:43:57 UTC | 17 Dec 2018, 19:42:19 UTC | Error while computing | 0.00 | 0.00 | --- | Rosetta Mini v3.78 windows_intelx86
1048029437 | 944021824 | 3551508 | 17 Dec 2018, 8:36:38 UTC | 17 Dec 2018, 8:41:53 UTC | Error while computing | 0.00 | 0.00 | --- | Rosetta Mini v3.78 windows_intelx86
1047957945 | 943956068 | 3551508 | 16 Dec 2018, 23:37:38 UTC | 17 Dec 2018, 8:36:38 UTC | Error while computing | 11,400.13 | 10,784.86 | --- | Rosetta Mini v3.78 windows_intelx86

I had an episode last year with S@H where there were several hundred "errors while computing" and I'd prefer not to repeat that experience. It was a waste of my time and my computer's time.

I am not particularly computer literate. I just allow my machine to crunch numbers. So I appreciate your forbearance in dealing with my stupidity.

Thanks for any insight you can provide.

Stevie G
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 90018 - Posted: 18 Dec 2018, 2:57:19 UTC - in response to Message 90017.  

It could just be bad work units. Check later to see if they complete on other machines. I wouldn't get too worried yet.
Stevie G

Joined: 15 Dec 18
Posts: 107
Credit: 837,101
RAC: 1,615
Message 90019 - Posted: 18 Dec 2018, 5:20:58 UTC - in response to Message 90018.  

Jim1348 wrote: "It could just be bad work units. Check later to see if they complete on other machines. I wouldn't get too worried yet."



Thank you.

Since I'm the newbie around here, I'll just watch and learn.

Stevie G.
Stevie G

Joined: 15 Dec 18
Posts: 107
Credit: 837,101
RAC: 1,615
Message 90020 - Posted: 18 Dec 2018, 7:30:39 UTC - in response to Message 90019.  

So five errors while computing in two days. Is that a frequent occurrence? Lots of errors?

Stevie G.
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 90021 - Posted: 18 Dec 2018, 12:39:01 UTC - in response to Message 90020.  
Last modified: 18 Dec 2018, 12:49:50 UTC

Stevie G wrote: "So five errors while computing in two days. Is that a frequent occurrence? Lots of errors?"

No, but I did see a bad one a couple of days ago. It failed for me on Linux, and it completed OK on Windows. Sometimes it is the other way around.
The bad ones often come in batches. If you get a bad batch, you could see several. You will know shortly. Good luck.

Are you overclocking your machine? That is a frequent source of errors.
Also: exclude the BOINC Data folder from your anti-virus scans. Anti-virus programs can cause problems.
Stevie G

Joined: 15 Dec 18
Posts: 107
Credit: 837,101
RAC: 1,615
Message 90023 - Posted: 18 Dec 2018, 13:37:05 UTC - in response to Message 90021.  

Jim: You wrote, "Are you overclocking your machine? That is a frequent source of errors.
Also: Exclude the BOINC Data folder from your anti-virus. They can cause problems."

Thanks for the reply.

No, not overclocking.

BOINC Data is already in my Malwarebytes exclusion file. I think that was one of the things that caused so many errors in SETI@Home last year, before I made that exclusion. BOINC2 and BOINC are also in there. So I dunno.

But I shouldn't get ahead of myself. I'll do as you advise and bide my time.

Stevie G
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 90024 - Posted: 18 Dec 2018, 13:56:32 UTC - in response to Message 90023.  

The only other thing I can think of is memory. Rosetta takes more memory than most projects. The current work units typically run about 300 to 400 MB each, but I have one at 885 MB. In general, you should have 1 GB of free memory (that is, beyond what you need for the OS and any other applications) for each work unit.

(In fact, they sometimes go overboard and use up to 2 GB, but they are not supposed to now.)
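To make the rule of thumb above concrete, here is a minimal Python sketch of the arithmetic. The function name and the idea of passing in free memory by hand are assumptions for illustration only; nothing here is part of BOINC or Rosetta, and the 1 GB figure is just the guideline from this post.

```python
def max_rosetta_tasks(free_memory_gb: float, per_task_gb: float = 1.0) -> int:
    """Estimate how many work units fit in the free memory.

    free_memory_gb: memory left over after the OS and other applications.
    per_task_gb: memory budget per work unit (1 GB guideline from the post).
    """
    if per_task_gb <= 0:
        raise ValueError("per_task_gb must be positive")
    # Whole work units only; never report a negative count.
    return max(0, int(free_memory_gb // per_task_gb))


if __name__ == "__main__":
    # Hypothetical machine with 3.5 GB free after the OS and other programs.
    print(max_rosetta_tasks(3.5))
```

If the occasional 2 GB work unit is a concern, calling it with `per_task_gb=2.0` gives a more conservative count.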
floyd

Joined: 26 Jun 14
Posts: 23
Credit: 10,268,639
RAC: 0
Message 90033 - Posted: 19 Dec 2018, 17:41:25 UTC

(Ignoring your parallel thread on the same topic)

Almost all of those errors are about missing or invalid files, specifically exe and zip files. But those files can't really be (permanently) missing or invalid, or you'd see more failing tasks. Something is probably modifying those files, moving or deleting them, or making them inaccessible. That smells like a virus scanner. Yes, you wrote that the BOINC data directory is supposed to be off limits, but have you verified that?
Also, you mentioned "BOINC Data", "BOINC2" and "BOINC". You don't run multiple instances of BOINC on the same data directory, do you?
In any case you should closely examine your log files, both BOINC's and the virus scanner's.
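For anyone who would rather not read a long log by eye, the following is a rough Python sketch of one way to pull out the lines worth a closer look. The keyword list is a guess at what might matter, not a list of actual BOINC or virus scanner messages, and the function name is made up for this example.

```python
from typing import Iterable, List, Sequence

# Assumed keywords; adjust them to whatever your own logs actually say.
SUSPECT_WORDS = ("error", "missing", "not found", "denied")


def suspicious_lines(lines: Iterable[str],
                     words: Sequence[str] = SUSPECT_WORDS) -> List[str]:
    """Return the log lines that contain any of the given keywords."""
    hits = []
    for line in lines:
        lowered = line.lower()  # case-insensitive match
        if any(word in lowered for word in words):
            hits.append(line.rstrip("\n"))
    return hits
```

To try it on a real file, open the log yourself and pass the file object in, for example `with open(path, errors="replace") as f: print(suspicious_lines(f))`, where `path` is wherever your BOINC or scanner log happens to live.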




©2024 University of Washington
https://www.bakerlab.org