Message boards : Number crunching : Short Deadlines
Author | Message |
---|---|
James W Joined: 25 Nov 12 Posts: 130 Credit: 1,766,254 RAC: 0 |
Is there a reason the WU starting with rb_05_26 (rb_month_day) have such short turnaround times (2-3 days)? I have to keep an eye on this so as to not overwork my system. |
dcdc Joined: 3 Nov 05 Posts: 1831 Credit: 119,554,486 RAC: 7,436 |
James W wrote: "Is there a reason the WU starting with rb_05_26 (rb_month_day) have such short turnaround times (2-3 days)? I have to keep an eye on this so as to not overwork my system." CASP has started, so they're pumping jobs through as quickly as possible. Quoting from https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6822: "We are still in the server stage of CASP and so jobs are run through the robetta platform which gives the jobs a rb_* prefix. The CASP jobs have been running with this prefix." D |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
...the good news is that it is not a large number of tasks pulling the short deadlines. Otherwise, systems with large work caches can make another request for work and get a pile that all insert themselves in front of the others, and start running at "high priority". Seems to go OK so long as there are only a small fraction with short deadlines, and the cache is not more than a few days. Rosetta Moderator: Mod.Sense |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Each CASP target released over the coming months will have a very quick (3 days I think it is) deadline for submissions from "server" predictions, and then a longer period for "human" predictions. Some human predictions are aided by server predictions too. Rosetta Moderator: Mod.Sense |
Timo Joined: 9 Jan 12 Posts: 185 Credit: 45,649,459 RAC: 0 |
I don't mind short deadlines. In fact, I totally can't justify making a researcher wait many many days to start getting an answer to a query. How the hell can someone be expected to iterate effectively when the answers come back at such a slow pace?! I work with databases and crunching of large datasets at my job, and I get mildly annoyed when my queries at work take more than a couple of hours because it makes iterating through questions very painful. I surely don't want to be the person making a query take a whole week. XD |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
Timo wrote: "I don't mind short deadlines. In fact, I totally can't justify making a researcher wait many many days to start getting an answer to a query. How the hell can someone be expected to iterate effectively when the answers come back at such a slow pace?!" The champion is Climateprediction.net. They give you a year (no kidding). I, among others, have suggested that is too long. But they are resistant to change. |
Sid Celery Joined: 11 Feb 08 Posts: 2122 Credit: 41,184,189 RAC: 10,001 |
Timo wrote: "I don't mind short deadlines. In fact, I totally can't justify making a researcher wait many many days to start getting an answer to a query. How the hell can someone be expected to iterate effectively when the answers come back at such a slow pace?!" They're planning for 2 days, so 2 days is what they set. If they wanted a result in 1 day, that's what they'd set. It's only a problem if they fail to get results back by the deadline. They need to bear in mind that task runs default to 6 hours too. Those of us who adjust our task buffer should keep less than 2 days in hand during CASP so BOINC doesn't mess up on scheduling. For example, to account for runtime variation, I've dropped mine to a 1.5 day buffer. |
dcdc Joined: 3 Nov 05 Posts: 1831 Credit: 119,554,486 RAC: 7,436 |
Is there any benefit to using the "report results immediately" setting in cc_config.xml given the desire for a quick turnaround? |
Link Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
dcdc wrote: "Is there any benefit to using the "report results immediately" setting in cc_config.xml given the desire for a quick turnaround?" No, this only adds unnecessary load to the servers. |
Sid Celery Joined: 11 Feb 08 Posts: 2122 Credit: 41,184,189 RAC: 10,001 |
Sid Celery wrote: "For those of us that adjust our task buffer, we should ensure we keep less than 2 days in hand during CASP so Boinc doesn't mess up on scheduling. For example, to account for runtime variation, I've dropped mine to a 1.5 day buffer." On this subject, I returned from my usual 3-4 days away to discover I'd had one of those "Rosetta Mini for Android is not available for your type of computer" messages, which results in a 24hr delay before the next update attempt. I got home just in time to force a manual update and beat the task deadlines by a couple of hours, when the next scheduled update was still several hours away. 20 tasks got uploaded. What was decided on this? Is it a BOINC scheduling issue or something from the Rosetta servers that can be corrected? This is more of a concern for people who don't monitor task progress. Where the default buffer is just 0.25 days, there's a risk of running taskless for 18 hours, unless another project is available. Either way, it's not good for Rosetta or CASP. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Sid Celery wrote: "What was decided on this? Is it a Boinc scheduling issue or something from the Rosetta servers that can be corrected? This is something that's more of a concern for people who don't monitor task progress. Where the default buffer is just 0.25 days there's a risk of running taskless for 18 hours - unless another project is available. Either way, it's not good for Rosetta or CASP." The servers are so busy delivering work that there are brief periods when they have no work units ready to send. These gaps close quickly, as soon as the processes that prepare work catch up, so whether a given request gets work is a race. If your machine happens to hit a few of these gaps in a row, the BOINC Manager backoff doubles each time and quickly reaches 24hrs. Rosetta Moderator: Mod.Sense |
Link Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
Mod.Sense wrote: "If your machine happens to catch a few in a row when no prepared work is ready, the BOINC Manager backoffs double and quickly reach 24hrs." This is not the "normal" BOINC backoff; it's 24 hours after the first failed request. It would be good if people getting this could post their sched_reply_boinc.bakerlab.org_rosetta.xml (maybe without <email_hash> and <cross_project_id>). The most interesting field to start with would be <request_delay>. |
Sid Celery Joined: 11 Feb 08 Posts: 2122 Credit: 41,184,189 RAC: 10,001 |
Mod.Sense wrote: "If your machine happens to catch a few in a row when no prepared work is ready, the BOINC Manager backoffs double and quickly reach 24hrs." This is my complete file, minus the two fields you mentioned. Request delay shows as 242.4, which looks to be the 4 minutes or so that shows whenever I normally return results. I agree, the 24hr back-off does appear to come after the first failed request. <scheduler_reply> |
Link Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
@Sid Celery: is that the reply that caused the 24 hour delay? Probably not, as far as I can see... because that's the one that would be interesting, not any other one (yes, I should have written that more clearly). |
Sid Celery Joined: 11 Feb 08 Posts: 2122 Credit: 41,184,189 RAC: 10,001 |
Link wrote: "@Sid Celery: is that the reply that caused the 24 hour delay? Probably not, as far as I can see... because that's the one that would be interesting, not any other one (yes, I should have written that more clearly)." Oh. So does it change for each upload? If so, I'll try to seek it out next time it happens (if it happens again). I may still be misunderstanding your question - I'm not good at this. Fwiw I'm adding one or two cores to processing tasks on my smartphone just to use up a few extra of those available tasks (as if it'll make any noticeable difference) |
Sid Celery Joined: 11 Feb 08 Posts: 2122 Credit: 41,184,189 RAC: 10,001 |
Link wrote: "@Sid Celery: is that the reply that caused the 24 hour delay? Probably not, as far as I can see... because that's the one that would be interesting, not any other one (yes, I should have written that more clearly)." Ok, it just happened again, and as you suspected, request_delay now shows 86400 = 24hrs: <scheduler_version>605</scheduler_version> |
Link Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
Well... then the project admins now know what needs to be fixed. |
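
For anyone who wants to check their own scheduler reply the way Link suggested, here is a minimal Python sketch that pulls `<request_delay>` out of the reply text and converts it to minutes or hours. The sample XML is a hypothetical, trimmed-down reply: only the values 605, 86400 and 242.4 come from the posts above, and a real sched_reply file contains many more fields. The function names are made up for illustration, and the sketch assumes the delay is stored as a plain number of seconds.

```python
import re

# Hypothetical, trimmed-down reply. Only the two values shown here
# (scheduler_version 605, request_delay 86400) were quoted in the thread.
SAMPLE_REPLY = """<scheduler_reply>
<scheduler_version>605</scheduler_version>
<request_delay>86400</request_delay>
</scheduler_reply>
"""


def request_delay_seconds(reply_text):
    """Return the <request_delay> value in seconds, or None if it is absent."""
    match = re.search(r"<request_delay>\s*([0-9.]+)\s*</request_delay>", reply_text)
    return float(match.group(1)) if match else None


def describe_delay(seconds):
    """Format a delay given in seconds as minutes or hours for easier reading."""
    if seconds >= 3600:
        return f"{seconds / 3600:.1f} h"
    return f"{seconds / 60:.1f} min"


if __name__ == "__main__":
    delay = request_delay_seconds(SAMPLE_REPLY)
    print(describe_delay(delay))   # the 24 hour case reported by Sid Celery
    print(describe_delay(242.4))   # the usual delay he saw after returning results
```

To use it on a real file, read the contents of sched_reply_boinc.bakerlab.org_rosetta.xml from the BOINC data directory into a string and pass that to `request_delay_seconds`. A regular expression is used instead of a full XML parser so that the sketch still works if the file is not strictly well-formed.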
©2024 University of Washington
https://www.bakerlab.org