Message boards : Number crunching : Aborted work and a lot of wasted time - again
Author | Message |
---|---|
walli Joined: 4 Nov 12 Posts: 5 Credit: 14,692,692 RAC: 0 |
Hi guys, I just noticed that hundreds of my work units were aborted by the server yesterday, and I'm not only talking about tasks that had not started yet, but also tasks that had already run for hours. Altogether I lost about 32.6 days (2816696.11 seconds) of work/runtime! I understand if you don't need the results anymore (they were *all* "aborted by project - no longer usable"), but then please at least grant some credit as compensation in the future... Thanks, walli |
bartonius Joined: 4 Apr 20 Posts: 1 Credit: 70,262 RAC: 0 |
It was the same for me, although I had far fewer aborted WUs. We donate our CPU time, and effectively some money through our electricity bills, and then the server just aborts the work and doesn't even award credit for the work done. I have stopped computing for Rosetta now; that's not how you should treat your users. |
Admin Project administrator Joined: 1 Jul 05 Posts: 4805 Credit: 0 RAC: 0 |
We used to grant claimed credit to all canceled, past deadline, and invalid results and I see no reason why we shouldn't continue doing this. I think this task was lost after our last server update. I just restarted it and it will run every 6 hours. |
walli Joined: 4 Nov 12 Posts: 5 Credit: 14,692,692 RAC: 0 |
Hi guys, "We used to grant claimed credit to all canceled" Nope, see above or the other big thread where this topic is currently under discussion: people got no credit for the tasks aborted server-side... https://boinc.bakerlab.org/rosetta/forum_thread.php?id=13852 "past deadline" Yes and no. I had a bunch of tasks about 2-3 weeks ago ("rb*" or "*robetta*") which got no credit at all as soon as they were past the deadline, even by a single second. There was no other information in the stderr log that points to any cause other than the deadline. "and invalid results" I'm pretty sure that this is not the case, but I have no work units/tasks to look at right now to confirm it, because results are purged very quickly. "I think this task was lost after our last server update." Which "task"? Do you mean a setting, or something like a cron job for this credit problem...? Thanks for your reply and best regards, walli |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1699 Credit: 18,186,917 RAC: 24,275 |
"Hi guys," Double-check what was posted there: "We used to grant claimed credit to all canceled" Grant, Darwin NT |
walli Joined: 4 Nov 12 Posts: 5 Credit: 14,692,692 RAC: 0 |
My bad, apologies. :) |
JohnDK Joined: 6 Apr 20 Posts: 33 Credit: 2,390,240 RAC: 0 |
Very annoying, 4 WUs cancelled, all running over 50.000 secs. Is it me or what? https://boinc.bakerlab.org/rosetta/results.php?hostid=4063805&offset=0&show_names=0&state=6&appid= |
Admin Project administrator Joined: 1 Jul 05 Posts: 4805 Credit: 0 RAC: 0 |
Please let us know if the claimed credit is not granted within a 6 hour time period. I added the credit granting task as a cron job that runs every 6 hours. I have confirmed that it is working. |
Sid Celery Joined: 11 Feb 08 Posts: 2130 Credit: 41,424,155 RAC: 16,102 |
"Very annoying, 4 WUs cancelled, all running over 50.000 secs. Is it me or what?" Not you - it's the size of the upload file again. And none of your runtimes were unreasonably long either. <error_code>-131 (file size too big)</error_code> |
Sid Celery Joined: 11 Feb 08 Posts: 2130 Credit: 41,424,155 RAC: 16,102 |
"Very annoying, 4 WUs cancelled, all running over 50.000 secs. Is it me or what?" Credited, I now notice. Sounds like that clean-up job caught up with it. Good news. |
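
As a quick sanity check on the lost-runtime figure in the first post, the conversion from seconds to days can be sketched in Python. The helper name `seconds_to_days` is only illustrative and is not part of BOINC or Rosetta@home; the input value is the seconds figure quoted in that post.

```python
# Convert a runtime given in seconds into days.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in one day


def seconds_to_days(seconds: float) -> float:
    """Return the duration in days for a duration given in seconds."""
    return seconds / SECONDS_PER_DAY


# Seconds figure quoted in the first post of this thread.
lost_seconds = 2816696.11
print(f"{seconds_to_days(lost_seconds):.2f} days")  # prints "32.60 days"
```

With hundreds of tasks that each ran for hours, a total on the order of a month of runtime is plausible.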
©2024 University of Washington
https://www.bakerlab.org