Message boards : Number crunching : Unfinished work units
Author | Message |
---|---|
Reaper13 Joined: 10 Nov 05 Posts: 3 Credit: 74,001 RAC: 0 |
Why would I have something on the order of 25 work units that are in some % of completion? I have 4 that are running as "high priority" but a ton of unfinished ones? Why does Rosetta not go back and finish the other ones that were started? |
Joined: 3 Nov 05 Posts: 1833 Credit: 120,343,184 RAC: 28,545 |
Reaper13 wrote: *Why would I have something on the order of 25 work units that are in some % of completion? I have 4 that are running as "high priority" but a ton of unfinished ones? Why does Rosetta not go back and finish the other ones that were started?* To be pedantic, it's BOINC that controls which task is worked on; Rosetta just does what it's told. My initial suggestion was going to be that you were low on RAM, so the tasks might be waiting for memory, but I see that's not the case as you have 8GB for 4 cores, unless you're either running something particularly memory-intensive or you've adjusted the BOINC memory allowance right down? If it's not to do with memory then I have no idea. Have you adjusted the BOINC 'store enough work for # days' setting recently? It might have started downloading more urgent tasks when part-way through existing ones, but to get 25 partially complete is impressive... |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Right, memory is my first thought as well. Specifically, the amount of memory BOINC is allowed to use. When R@h tasks run, they grow in memory usage as a given model progresses. Then when the model completes, memory usage drops a bit and will build again as the next model progresses. If BOINC detects that more memory is being used than your preferences allow, it suspends the task to reduce the memory used by BOINC. Then it starts another task, hoping it might use less memory than the last one, and for a short time, it does use less and runs. But often, it will grow to use a similar amount of memory later in the model, and depending upon where the other 3 active tasks are in their runs, that may then again exceed the memory preference. So why not go back to the first task now rather than starting a third? ...well, I think BOINC already knows how much memory the first task is going to need (the same amount it was using when it was suspended). And generally, that amount of memory would also exceed what BOINC is trying to live within. So, it knows the first task won't fit, and the second task grew too large... "hey what about this third task here?" ...and the cycle continues. This can lead to sizable swap file space being used. Hopefully you are retaining suspended tasks "in memory" (virtual memory, i.e. the swap file), see preferences. But in the end, it should eventually figure things out and get those run. You may notice periods of time when only 3 CPUs are active so that it can live within the memory preference, and yet meet the deadlines of existing tasks before requesting more. If the current behavior bothers you, or perhaps the large swap file is undesirable, I'd suggest either allowing BOINC a higher percentage of system memory, or limiting BOINC to using 3 CPUs. Another idea would be to add another project, at something above 25% resource share, that uses significantly less memory to run. WCG often has projects that have small memory footprints.
That way one CPU will tend to be running the low-memory project while the other three still have enough memory to happily run R@h tasks. Rosetta Moderator: Mod.Sense |
Joined: 3 Nov 05 Posts: 1833 Credit: 120,343,184 RAC: 28,545 |
Or add even more RAM! :D |
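
For anyone who wants to see the pile-up Mod.Sense describes in numbers, here is a rough toy model in Python. It is only a sketch under loud assumptions: it is not BOINC's actual scheduler, the names (`Task`, `simulate`) are made up for illustration, and the memory figures (tasks starting at 100 MB, growing 50 MB per tick to a 500 MB peak) are invented rather than measured from R@h. It also never resumes a suspended task, so it only models the phase where partially complete tasks accumulate.

```python
from dataclasses import dataclass


@dataclass
class Task:
    mem_mb: int          # memory the task is using right now (hypothetical figure)
    ticks_done: int = 0  # progress made so far


def simulate(budget_mb, n_cpus=4, ticks=50, start_mb=100,
             grow_mb=50, peak_mb=500, ticks_to_finish=40):
    """Toy model: return (tasks suspended part-way, tasks finished)."""
    if budget_mb < n_cpus * start_mb:
        raise ValueError("budget cannot hold even freshly started tasks")
    running = [Task(start_mb) for _ in range(n_cpus)]
    suspended = []
    finished = 0
    for _ in range(ticks):
        # Running tasks make progress and grow toward their peak footprint.
        for t in running:
            t.ticks_done += 1
            t.mem_mb = min(peak_mb, t.mem_mb + grow_mb)
        # Retire finished tasks and replace each with a fresh one.
        finished += sum(t.ticks_done >= ticks_to_finish for t in running)
        running = [t if t.ticks_done < ticks_to_finish else Task(start_mb)
                   for t in running]
        # Over the allowance: suspend the largest task, start a small new one.
        while sum(t.mem_mb for t in running) > budget_mb:
            biggest = max(running, key=lambda t: t.mem_mb)
            running.remove(biggest)
            suspended.append(biggest)
            running.append(Task(start_mb))
    return len(suspended), finished


if __name__ == "__main__":
    print("4 CPUs, 2000 MB allowed:", simulate(2000))
    print("4 CPUs, 1600 MB allowed:", simulate(1600))
    print("3 CPUs, 1600 MB allowed:", simulate(1600, n_cpus=3))
```

With these made-up numbers, four tasks at peak need 2000 MB, so a 2000 MB allowance never suspends anything, while a 1600 MB allowance starts suspending tasks as soon as the four running ones grow past it. Dropping to 3 CPUs at the same 1600 MB allowance keeps the peak at 1500 MB and avoids suspensions, which mirrors the two suggestions above: raise the memory percentage or reduce the CPU count.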
©2025 University of Washington
https://www.bakerlab.org