I'm getting lots of errors with Rosetta v4.07

Simplex0 Joined: 13 Jun 18 Posts: 14 Credit: 1,714,717 RAC: 0
Message 89262 - Posted: 12 Jul 2018, 16:26:25 UTC Last modified: 12 Jul 2018, 17:01:55 UTC
I think I will abort all v4.07 tasks from now on. This is how the stderr log looks:
Stderr log
<core_client_version>7.10.2</core_client_version>
<![CDATA[
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.07_windows_intelx86.exe @T1000.flags -in:file:boinc_wu_zip T1000.zip -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 1202697
Starting watchdog...
Watchdog active.
BOINC:: CPU time: 22193.3s, 14400s + 7200s[2018- 7-12 18:11:32:] :: BOINC
WARNING! cannot get file size for default.out.gz: could not open file.
Output exists: default.out.gz Size: -1
InternalDecoyCount: 0 (GZ)
-----
0
-----
Stream information inconsistent.
Writing W_0000001
======================================================
DONE :: 1 starting structures 22194.4 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
18:11:32 (10080): called boinc_finish(0)
</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>T1000_full_aivan_SAVE_ALL_OUT_03_09_677708_5924_0_r2041150700_0</file_name>
<error_code>-240 (stat() failed)</error_code>
</file_xfer_error>
</message>
]]>
It seems to be only this type of workunit:
Name T1000_full_aivan_SAVE_ALL_OUT_03_09_677708_5951_0
Name T1000_full_aivan_SAVE_ALL_OUT_03_09_677708_5959_0
Name T1000_full_aivan_SAVE_ALL_OUT_03_09_677708_5971_0
Name T1000_full_aivan_SAVE_ALL_OUT_03_09_677708_5972_0
Name T1000_full_aivan_SAVE_ALL_OUT_03_09_677708_5991_0
Name T1000_full_aivan_SAVE_ALL_OUT_03_09_677708_4411_0
Name T1000_full_aivan_SAVE_ALL_OUT_03_09_677708_4297_0
Name T1000_full_aivan_SAVE_ALL_OUT_03_09_677708_4122_0
Name T1000_full_aivan_SAVE_ALL_OUT_03_09_677708_4081_0
Name T1000_full_aivan_SAVE_ALL_OUT_03_09_677708_3741_0
Name T1000_full_aivan_SAVE_ALL_OUT_03_09_677708_3374_0
Name T1000_full_aivan_SAVE_ALL_OUT_03_09_677708_2579_0
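For anyone who wants to check saved stderr output for the same signature, here is a minimal Python sketch. It looks only for strings that appear in the log above (an output file reported with `Size: -1` and the `-240 (stat() failed)` upload error). The function name `find_upload_failure` and the idea of scanning the text with regular expressions are assumptions made for illustration; this is not a BOINC or Rosetta tool.

```python
import re

# Both patterns are copied from the stderr text quoted in this thread.
SIZE_RE = re.compile(r"Output exists: (\S+) Size: (-?\d+)")
XFER_RE = re.compile(
    r"<file_name>([^<]+)</file_name>\s*"
    r"<error_code>(-?\d+) \(([^<]*)\)</error_code>"
)


def find_upload_failure(stderr_text):
    """Return details of the failure signature, or None if it is absent."""
    size = SIZE_RE.search(stderr_text)
    xfer = XFER_RE.search(stderr_text)
    # A negative size is how the log reports an output file it could not open.
    missing_output = size is not None and int(size.group(2)) < 0
    if not missing_output and xfer is None:
        return None
    return {
        "missing_output": size.group(1) if missing_output else None,
        "upload_name": xfer.group(1) if xfer else None,
        "error_code": int(xfer.group(2)) if xfer else None,
        "error_text": xfer.group(3) if xfer else None,
    }


# Excerpt of the log from the first post, used as sample input.
SAMPLE = (
    "Output exists: default.out.gz Size: -1 "
    "<file_name>T1000_full_aivan_SAVE_ALL_OUT_03_09_677708_5924_0_r2041150700_0"
    "</file_name> "
    "<error_code>-240 (stat() failed)</error_code>"
)

if __name__ == "__main__":
    print(find_upload_failure(SAMPLE))
```

The check is deliberately narrow: a log that reports a positive size and no transfer error returns `None`, so it flags only the pattern reported here.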

Simplex0 Joined: 13 Jun 18 Posts: 14 Credit: 1,714,717 RAC: 0
Message 89277 - Posted: 14 Jul 2018, 12:38:58 UTC
Once again I got aivan tasks that have been running for 2.5 hours and are estimated to run for 3 - 4 hours more, even though my Rosetta setting for Target CPU run time is 2 hours. Should I abort them? I have aborted all other aivan tasks, as my experience is that they run for a long time and all end up with an error.

Simplex0 Joined: 13 Jun 18 Posts: 14 Credit: 1,714,717 RAC: 0
Message 89279 - Posted: 14 Jul 2018, 16:41:42 UTC
Yup! Same error as always: 4 work units and a total of 20 hours of wasted computing. Luckily I aborted all the other aivan workunits before they started running and wasted even more resources.
Stderr log
<core_client_version>7.10.2</core_client_version>
<![CDATA[
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.07_windows_intelx86.exe @T1000.3.flags -in:file:boinc_wu_zip T1000.3.zip -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 3683498
Starting watchdog...
Watchdog active.
BOINC:: CPU time: 22129s, 14400s + 7200s[2018- 7-14 18:20:52:] :: BOINC
WARNING! cannot get file size for default.out.gz: could not open file.
Output exists: default.out.gz Size: -1
InternalDecoyCount: 0 (GZ)
-----
0
-----
Stream information inconsistent.
Writing W_0000001
======================================================
DONE :: 1 starting structures 22129 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
18:20:52 (10344): called boinc_finish(0)
</stderr_txt>
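To put the run times in perspective, the `CPU time` line in the log can be converted to hours. The sketch below parses only the numbers the log prints. Treating the `14400s + 7200s` sum as an allowance that the task exceeded is an assumption made here, not something the log states, and `cpu_time_summary` is a hypothetical helper name.

```python
import re

# Matches e.g. "CPU time: 22129s, 14400s + 7200s" as printed in the log.
CPU_RE = re.compile(r"CPU time: ([\d.]+)s, (\d+)s \+ (\d+)s")


def cpu_time_summary(log_text):
    """Convert the logged CPU time and the two-part sum to hours."""
    m = CPU_RE.search(log_text)
    if m is None:
        return None
    used = float(m.group(1))
    # Assumption: the two terms add up to a limit on CPU time.
    allowance = int(m.group(2)) + int(m.group(3))
    return {
        "used_hours": used / 3600,
        "allowance_hours": allowance / 3600,
        "over_allowance": used > allowance,
    }


# The CPU time line from the log in the post above.
LINE = "BOINC:: CPU time: 22129s, 14400s + 7200s[2018- 7-14 18:20:52:] :: BOINC"

if __name__ == "__main__":
    print(cpu_time_summary(LINE))
```

For this line the task used roughly 6.15 hours of CPU time against a sum of 6 hours (4 h + 2 h), which is well beyond the 2-hour Target CPU run time mentioned earlier in the thread.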

[VENETO] boboviz Joined: 1 Dec 05 Posts: 2129 Credit: 12,458,800 RAC: 2,329
Message 89281 - Posted: 14 Jul 2018, 20:32:57 UTC - in response to Message 89279.
> WARNING! cannot get file size for default.out.gz: could not open file.
> Output exists: default.out.gz Size: -1
> InternalDecoyCount: 0 (GZ)
> -----
> 0
> -----
> Stream information inconsistent.
> Writing W_0000001
> ======================================================
> DONE :: 1 starting structures 22129 cpu seconds
> This process generated 1 decoys from 1 attempts
> ======================================================
> 18:20:52 (10344): called boinc_finish(0)
> </stderr_txt>
+1 Same error on all my T1000_aivan

Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0
Message 89295 - Posted: 16 Jul 2018, 0:09:53 UTC
I have no details on the specific WUs or issues they are having. But I wanted everyone to know that BOINC Manager's "estimated runtime" is really based on history, not the present. So, regardless of the name or likely success of current WUs, if BOINC Manager has a recent history with WUs taking 3 or 4 hours longer than the runtime preference, it will "estimate" future WUs will take 3 to 4 hours longer as well. The likelihood of the current WUs running long is not related to the estimated runtime of the BOINC Manager. If the name of the current tasks has the same prefix as those that you had trouble with, that would be a better indicator for you.
Rosetta Moderator: Mod.Sense
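The suggestion to go by task name rather than by estimated runtime can be sketched in a few lines of Python. The names reported in this thread start with both `T1000_full_` and `T1000_full3_`, so this sketch matches on the shared `_aivan_` substring instead of a strict prefix; that choice, the helper name `split_by_marker`, and the third task name are illustrative assumptions.

```python
def split_by_marker(task_names, markers=("_aivan_",)):
    """Split task names into (suspect, others) by substring match."""
    suspect, others = [], []
    for name in task_names:
        if any(marker in name for marker in markers):
            suspect.append(name)
        else:
            others.append(name)
    return suspect, others


TASKS = [
    "T1000_full_aivan_SAVE_ALL_OUT_03_09_677708_5951_0",  # from this thread
    "T1000_full3_aivan_SAVE_ALL_OUT_03_09_677955_5874",   # from this thread
    "some_other_batch_0001_0",  # hypothetical name, for contrast only
]

if __name__ == "__main__":
    suspect, others = split_by_marker(TASKS)
    print("suspect:", suspect)
    print("others:", others)
```

This only sorts names into two lists. Whether to abort the suspect tasks is still a judgment call for the person running them.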

[VENETO] boboviz Joined: 1 Dec 05 Posts: 2129 Credit: 12,458,800 RAC: 2,329
Message 89298 - Posted: 16 Jul 2018, 8:38:36 UTC - in response to Message 89295.
> I have no details on the specific WUs or issues they are having. But I wanted everyone to know that BOINC Manager's "estimated runtime" is really based on history, not the present. So, regardless of the name or likely success of current WUs, if BOINC Manager has a recent history with WUs taking 3 or 4 hours longer than the runtime preference, it will "estimate" future WUs will take 3 to 4 hours longer as well.
For me the problem is not the runtime of the WUs (I know the decoy's question), but the validation error.

rjs5 Joined: 22 Nov 10 Posts: 274 Credit: 23,730,845 RAC: 0
Message 89301 - Posted: 16 Jul 2018, 18:13:18 UTC - in response to Message 89298.
> > I have no details on the specific WUs or issues they are having. But I wanted everyone to know that BOINC Manager's "estimated runtime" is really based on history, not the present. So, regardless of the name or likely success of current WUs, if BOINC Manager has a recent history with WUs taking 3 or 4 hours longer than the runtime preference, it will "estimate" future WUs will take 3 to 4 hours longer as well.
> For me the problem is not the runtime of the WUs (I know the decoy's question), but the validation error.
I found 1 "Invalid" run for which you were granted 587.11 credits. Were there others that were a problem for you? The 587 credits seem similar to the other valid jobs.
name T1000_full3_aivan_SAVE_ALL_OUT_03_09_677955_5874
application Rosetta
created 14 Jul 2018, 8:17:20 UTC
canonical result 1015258491
granted credit 587.11
https://boinc.bakerlab.org/workunit.php?wuid=914821393

Simplex0 Joined: 13 Jun 18 Posts: 14 Credit: 1,714,717 RAC: 0
Message 89302 - Posted: 16 Jul 2018, 21:57:03 UTC - in response to Message 89301.
> > > I have no details on the specific WUs or issues they are having. But I wanted everyone to know that BOINC Manager's "estimated runtime" is really based on history, not the present. So, regardless of the name or likely success of current WUs, if BOINC Manager has a recent history with WUs taking 3 or 4 hours longer than the runtime preference, it will "estimate" future WUs will take 3 to 4 hours longer as well.
> > For me the problem is not the runtime of the WUs (I know the decoy's question), but the validation error.
> I found 1 "Invalid" run for which you were granted 587.11 credits. Were there others that were a problem for you? The 587 credits seem similar to the other valid jobs.
> name T1000_full3_aivan_SAVE_ALL_OUT_03_09_677955_5874
> application Rosetta
> created 14 Jul 2018, 8:17:20 UTC
> canonical result 1015258491
> granted credit 587.11
> https://boinc.bakerlab.org/workunit.php?wuid=914821393
The work units that were marked as 'Invalid' are in my first post in this thread, and I had 5 - 6 more of the same later. Why you can't find them I have no idea. Ask the staff, maybe they can help you. The latest work units of this kind are here:
https://boinc.bakerlab.org/result.php?resultid=1015234868
https://boinc.bakerlab.org/result.php?resultid=1015234886
https://boinc.bakerlab.org/result.php?resultid=1015234755
https://boinc.bakerlab.org/result.php?resultid=1015234804
They were the first units I ran in both my first and second attempts to crunch a batch of maybe 100 work units, but because the first 4 I ran all ended up as 'Invalid' I aborted all the others. I have not received any more of these "aivan" workunits lately and I hope I won't. The credit is totally irrelevant in this case. The problem, in my opinion, is that resources are wasted when hours of crunching end up with a result that is invalid. Luckily I spotted them early and wasted only 40 hours instead of 500 hours.

Simplex0 Joined: 13 Jun 18 Posts: 14 Credit: 1,714,717 RAC: 0
Message 89303 - Posted: 16 Jul 2018, 22:09:25 UTC - in response to Message 89295. Last modified: 16 Jul 2018, 22:11:32 UTC
> I have no details on the specific WUs or issues they are having. But I wanted everyone to know that BOINC Manager's "estimated runtime" is really based on history, not the present. So, regardless of the name or likely success of current WUs, if BOINC Manager has a recent history with WUs taking 3 or 4 hours longer than the runtime preference, it will "estimate" future WUs will take 3 to 4 hours longer as well. The likelihood of the current WUs running long is not related to the estimated runtime of the BOINC Manager. If the name of the current tasks has the same prefix as those that you had trouble with, that would be a better indicator for you.
The main issue here is not the runtime or the credit. It is that a lot of your crunchers' resources and YOUR resources are wasted when many hours of crunching end up with a result that is invalid.

Simplex0 Joined: 13 Jun 18 Posts: 14 Credit: 1,714,717 RAC: 0
Message 89304 - Posted: 17 Jul 2018, 4:41:03 UTC - in response to Message 89295.
> I have no details on the specific WUs or issues they are having. But I wanted everyone to know that BOINC Manager's "estimated runtime" is really based on history, not the present. So, regardless of the name or likely success of current WUs, if BOINC Manager has a recent history with WUs taking 3 or 4 hours longer than the runtime preference, it will "estimate" future WUs will take 3 to 4 hours longer as well. The likelihood of the current WUs running long is not related to the estimated runtime of the BOINC Manager. If the name of the current tasks has the same prefix as those that you had trouble with, that would be a better indicator for you.
I have now checked more than 1000 workunits that have finished successfully, and only 1 of them took 4 hours, while ALL of the invalid aivan workunits took more than 6 hours to finish. Anyway, it seems that I do not get any more of this kind of workunit, so hopefully the problem has already been spotted and taken care of by the staff.

[VENETO] boboviz Joined: 1 Dec 05 Posts: 2129 Credit: 12,458,800 RAC: 2,329
Message 89306 - Posted: 17 Jul 2018, 6:25:22 UTC - in response to Message 89301.
> name T1000_full3_aivan_SAVE_ALL_OUT_03_09_677955_5874
> application Rosetta
> created 14 Jul 2018, 8:17:20 UTC
> canonical result 1015258491
> granted credit 587.11
> https://boinc.bakerlab.org/workunit.php?wuid=914821393
https://boinc.bakerlab.org/result.php?resultid=1015258491
I've got 0 credits for this WU. But this is not a problem. I killed the other "_aivan_" tasks.

rjs5 Joined: 22 Nov 10 Posts: 274 Credit: 23,730,845 RAC: 0
Message 89313 - Posted: 17 Jul 2018, 17:15:18 UTC - in response to Message 89306.
> > name T1000_full3_aivan_SAVE_ALL_OUT_03_09_677955_5874
> > application Rosetta
> > created 14 Jul 2018, 8:17:20 UTC
> > canonical result 1015258491
> > granted credit 587.11
> > https://boinc.bakerlab.org/workunit.php?wuid=914821393
> https://boinc.bakerlab.org/result.php?resultid=1015258491
> I've got 0 credits for this WU. But this is not a problem. I killed the other "_aivan_" tasks.
I thought the "granted credits" were issued manually by the project staff on those WUs that run a long time, had a problem caused by Rosetta or the researcher, and did not issue credits.
Hmmm. I only do the work for the credits. I think I will have enough to retire soon. 8-)
It is good to report bad WUs like the aivan jobs so Rosetta can clean out the pipeline, inform the researcher of his mistake AND then fix the bug in Rosetta. Researchers should not be able to set problem controls. The problem should be filtered by the software before the problem starts work.

[VENETO] boboviz Joined: 1 Dec 05 Posts: 2129 Credit: 12,458,800 RAC: 2,329
Message 89314 - Posted: 18 Jul 2018, 5:42:18 UTC - in response to Message 89313.
> Hmmm. I only do the work for the credits. I think I will have enough to retire soon. 8-)
I hope you will not stop crunching on R@H and writing on the forum. You would miss us

Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0
Message 89319 - Posted: 18 Jul 2018, 15:47:47 UTC - in response to Message 89313.
> I thought the "granted credits" were issued manually by the project staff on those WUs that run a long time, had a problem caused by Rosetta or the researcher, and did not issue credits.
A program runs daily that grants credit to these tasks. The credit reflects the value to the Project Team of understanding what is not working, so things can improve.
Rosetta Moderator: Mod.Sense