Message boards : Number crunching : Initial Estimated completion time problem.
Author | Message |
---|---|
Grant (SSSF) Joined: 28 Mar 20 Posts: 1681 Credit: 17,854,150 RAC: 22,647 |
After running Rosetta for over a month, with a couple of Application updates during that time, my Estimated completion time for newly downloaded work was generally within 15 minutes of my Target CPU Runtime (default, 8hrs) with an Estimated Remaining time for unstarted work of 7hrs 45min. With the new application (and so no processing history), the initial Estimated Remaining time is 4hr 30min - just over half of the actual Target CPU Runtime. This is going to result in a huge number of Tasks missing their deadlines & being re-issued, and many of them will be re-issued to other hosts that will also miss their deadlines, so those Tasks will just go to waste. The ideal fix would be for the Estimated completion time to always be the same as the Host's Target CPU Runtime. If Tasks finish sooner, or run until the Watchdog timer ends them - not a problem. There is almost no chance of the Host missing deadlines* because the Estimated completion time will always start at the most frequent actual CPU Runtime, namely its Target CPU Runtime. The next best option would be for the initial Estimated completion time for a new Application or new Host to be double the present initial Estimated completion time. People will still get work, but there will be no chance of Tasks going to waste due to multiple missed deadlines caused by unrealistically short Estimated completion times. * Hosts with huge cache settings & large, variable differences between CPU time & Runtime will always tend to have issues with deadlines. Grant Darwin NT |
yoerik Joined: 24 Mar 20 Posts: 128 Credit: 169,525 RAC: 0 |
"After running Rosetta for over a month, with a couple of Application updates during that time, my Estimated completion time for newly downloaded work was generally within 15 minutes of my Target CPU Runtime (default, 8hrs) with an Estimated Remaining time for unstarted work of 7hrs 45min." Estimated CPU Time isn't always accurate - how long are they actually running? (Try running one as a test, if you haven't run them.) |
Tomcat雄猫 Joined: 20 Dec 14 Posts: 180 Credit: 5,386,173 RAC: 0 |
"After running Rosetta for over a month, with a couple of Application updates during that time, my Estimated completion time for newly downloaded work was generally within 15 minutes of my Target CPU Runtime (default, 8hrs) with an Estimated Remaining time for unstarted work of 7hrs 45min." It's worse if you have a really long target runtime. The initial estimated remaining time seems to always be 4.5h. With a really long target runtime, it'll also take quite a while for the estimated remaining time to recalibrate itself. The fix you're suggesting seems really nice, I hope it's possible to implement. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1681 Credit: 17,854,150 RAC: 22,647 |
"Estimated CPU Time isn't always accurate - how long are they actually running? (Try running one as a test, if you haven't run them.)" Ah, from the first sentence of my initial post: "After running Rosetta for over a month". Yes, some finish earlier. Some finish later. Some finish a lot later. But around 90% would be finished within 5-10min of the Target CPU Runtime. Grant Darwin NT |
yoerik Joined: 24 Mar 20 Posts: 128 Credit: 169,525 RAC: 0 |
"Estimated CPU Time isn't always accurate - how long are they actually running? (Try running one as a test, if you haven't run them.)" "Ah, from my first sentence in my initial post-" I'm referring to the WUs that BOINC Manager is saying will finish in 4 hours 30 mins - not overall. I hope that clarifies what I'm referring to. What is the difference between that estimated runtime and the actual runtime? |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1681 Credit: 17,854,150 RAC: 22,647 |
"What is the difference between that estimated runtime and the actual runtime?" I mentioned that also: 15min, give or take 5-10min. "my Target CPU Runtime (default, 8hrs) with an Estimated Remaining time for unstarted work of 7hrs 45min." And to cover the difference between CPU time & Runtime, e.g. r4k_8381_fold_SAVE_ALL_OUT_922685_1105_0: Run time 7 hours 56 min 33 sec, CPU time 7 hours 56 min 1 sec. Grant Darwin NT |
CIA Joined: 3 May 07 Posts: 100 Credit: 21,059,812 RAC: 0 |
Do you people not have constant internet connections? Why are you all running massive caches? Just set it to .5 and be done with it. There's no reason to have more than 2x your core count in WU cache waiting to be crunched. |
Tomcat雄猫 Joined: 20 Dec 14 Posts: 180 Credit: 5,386,173 RAC: 0 |
"Do you people not have constant internet connections? Why are you all running massive caches? Just set it to .5 and be done with it. There's no reason to have more than 2x your core count in WU cache waiting to be crunched." I've tried setting the cache to 0.5; it doesn't work if your target runtime is 36 hours and the initial estimated completion time for new apps is 4.5 hours. Every time a new version comes out, all the calibration of the estimated completion time gets reset, BOINC believes it takes 4.5 hours to complete a task, your computer gets flooded with tasks, and you are forced to abort tasks and/or select a saner target runtime. Setting the cache to 0.1 is the only solution for me. If having a reasonable cache size were a good enough fix, this wouldn't be a problem at all. |
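A rough sketch of the arithmetic behind that flood, for anyone who wants to plug in their own numbers. This is not BOINC's actual work-fetch logic, just a simplified model: it assumes the client asks for enough tasks to fill `cache_days` of work per core going by the estimate, and never fewer than one task per core. The function names and the 4-core host are made up for illustration.

```python
import math

def tasks_fetched(cache_days, cores, estimate_hours):
    """Simplified model: tasks needed to fill the cache, judged by the estimate."""
    buffer_hours = cache_days * 24 * cores
    # Assumption: the client keeps at least one task per core.
    return max(cores, math.ceil(buffer_hours / estimate_hours))

def real_hours_per_core(n_tasks, cores, target_runtime_hours):
    """Hours the busiest core needs if every task runs for the full target runtime."""
    return math.ceil(n_tasks / cores) * target_runtime_hours

cores = 4             # hypothetical host
target_runtime = 36   # hours, as in the post above

# Estimate stuck at 4.5 h versus an estimate equal to the target runtime.
for estimate in (4.5, target_runtime):
    n = tasks_fetched(0.5, cores, estimate)
    hours = real_hours_per_core(n, cores, target_runtime)
    print(f"estimate {estimate} h: {n} tasks fetched, {hours} h of real work per core")
```

In this model the only thing that changes between the two runs is the estimate, which is why an initial estimate equal to the Target CPU Runtime keeps the queue sane without touching the cache setting.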
Grant (SSSF) Joined: 28 Mar 20 Posts: 1681 Credit: 17,854,150 RAC: 22,647 |
Or better yet, they fix it so the Estimated completion times equal the Target CPU Runtime. Or at the very least make the initial Estimate higher than the default Target CPU Runtime, so that it gradually reduces downwards towards the actual time instead of having to increase from an extremely low initial Estimate. Grant Darwin NT |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1681 Credit: 17,854,150 RAC: 22,647 |
And to make things even worse - as my existing Tasks have completed, while most are done within a few minutes of the Target CPU time, others do finish sooner. And this results in the Estimated times lowering - by about 2 minutes. However, for the Tasks yet to be processed by the new application, the Estimated times have dropped from 4hrs 30min down to 52min 30sec. This just shouldn't occur. Grant Darwin NT |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1681 Credit: 17,854,150 RAC: 22,647 |
"However for the Tasks yet to be processed by the new application the Estimated times have dropped from 4hrs 30min down to 52min 30sec." Now up to 1hr 10min. Grant Darwin NT |
Sid Celery Joined: 11 Feb 08 Posts: 2125 Credit: 41,228,659 RAC: 10,982 |
"Or better yet, they fix it so the Estimated completion times equal the Target CPU Runtime" What madness is this?! Where does this get set? A BOINC setting or a project setting? Hard-coded or user-definable somewhere? It's amazing how long BOINC has existed and something as basic as this causes a riot with every program update. And by amazing, of course I mean pathetic. |
CIA Joined: 3 May 07 Posts: 100 Credit: 21,059,812 RAC: 0 |
"Do you people not have constant internet connections? Why are you all running massive caches? Just set it to .5 and be done with it. There's no reason to have more than 2x your core count in WU cache waiting to be crunched." Then set the cache to 0. One in, one out. Over time the completion time will correct and you can add a cache back if you want. |
Tomcat雄猫 Joined: 20 Dec 14 Posts: 180 Credit: 5,386,173 RAC: 0 |
"Do you people not have constant internet connections? Why are you all running massive caches? Just set it to .5 and be done with it. There's no reason to have more than 2x your core count in WU cache waiting to be crunched." Yup, my cache was set to 0.1 + 0 and I still managed to almost get swamped on Ralph (24 hour target run-time, 3 threads set for Ralph, and it downloaded 9 tasks; there is a high chance 3 tasks will barely miss the deadline if I don't intervene). That happened because the estimated run-times were 47:39 (HOW?!). It's not much better on Rosetta (I think it was 57 minutes?). Having a 0 cache seems to be the only band-aid solution at this point... |
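To make the Ralph numbers concrete, here is a minimal sketch of when 9 tasks finish on 3 threads if each really runs for the 24 hour target. The 72 hour deadline and the 1 hour of slack (for downtime, uploads and so on) are assumptions chosen for illustration; the post does not state the actual deadline. Function names are made up as well.

```python
def finish_times(n_tasks, threads, runtime_hours):
    """Finish time (hours from now) of each task when threads run tasks back to back."""
    return [((i // threads) + 1) * runtime_hours for i in range(n_tasks)]

def late_tasks(n_tasks, threads, runtime_hours, deadline_hours, slack_hours=0.0):
    """Finish times of tasks that overshoot the deadline once slack is added."""
    return [t for t in finish_times(n_tasks, threads, runtime_hours)
            if t + slack_hours > deadline_hours]

DEADLINE = 72  # hours; assumed, not taken from the thread

# What actually happens: 24 h per task, 3 threads, 9 tasks.
print(finish_times(9, 3, 24))
print(len(late_tasks(9, 3, 24, DEADLINE, slack_hours=1)), "tasks at risk")

# What the client expects with a 47:39 estimate (about 0.79 h per task).
print(max(finish_times(9, 3, 47.65 / 60)), "h until the queue is empty, per the estimate")
```

Under these assumptions the last batch of three tasks lands right on the deadline, so any interruption at all pushes them over, while the estimate suggests the whole queue clears in a couple of hours.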
Grant (SSSF) Joined: 28 Mar 20 Posts: 1681 Credit: 17,854,150 RAC: 22,647 |
"Or better yet, they fix it so the Estimated completion times equal the Target CPU Runtime" It is set by the BOINC server software, based on data supplied by the project - which in this case is Rosetta. Grant Darwin NT |
©2024 University of Washington
https://www.bakerlab.org