Message boards : Number crunching : No work
Author | Message |
---|---|
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
I've said this before for other reasons, but we make our machines available for whatever the project needs. We don't pay for tasks so we can't demand them. 24/7/365 availability of tasks has never been guaranteed. If the project doesn't utilise that resource, that's up to them. If any of us want to have our machines utilised 24/7 we're at liberty to hold a backup project. Exactly so. You can of course never precisely match the supply of work units with the requests by the crunchers for them, and it is not the duty of the scientists to provide us a pastime. The scientists do what they have to do. We are here to assist them. |
niswes Joined: 21 Jun 09 Posts: 2 Credit: 5,056,097 RAC: 454 |
statement from rosetta staff would be nice |
JohnH Joined: 25 Mar 13 Posts: 43 Credit: 2,319,355 RAC: 0 |
Well said. They always seem conspicuous by their absence from these boards. |
JohnH Joined: 25 Mar 13 Posts: 43 Credit: 2,319,355 RAC: 0 |
At the risk of appearing stupid - who can tell me the difference between these status elements? Computing status, Work: "Tasks ready to send: 18125". Tasks by application (Application: Unsent): Rosetta: 1; Rosetta Mini: 0 |
[VENETO] boboviz Joined: 1 Dec 05 Posts: 1994 Credit: 9,573,283 RAC: 7,160 |
statement from rosetta staff would be nice +1 |
Warped Joined: 15 Jan 06 Posts: 48 Credit: 1,788,185 RAC: 0 |
statement from rosetta staff would be nice +2 |
VO Joined: 4 Nov 05 Posts: 7 Credit: 3,250,754 RAC: 0 |
Linux only, I think. |
shanen Joined: 16 Apr 14 Posts: 195 Credit: 12,662,308 RAC: 0 |
Back again, apparently affecting all types of machines. The server status page shows very few unsent units (with the requisite scrolling). I still think I saw sufficient evidence the other day to suggest there was something different going on among the different OS/browser combinations. #1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech) |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
I run the 24-hour work units, and haven't gotten any work for a bit longer than that. So I am all on my backup project, GPUGrid - Quantum Chemistry, which is relatively new, but Linux only, and runs multi-core. |
[VENETO] boboviz Joined: 1 Dec 05 Posts: 1994 Credit: 9,573,283 RAC: 7,160 |
Sooner or later the queue will restart. I hope. |
[VENETO] boboviz Joined: 1 Dec 05 Posts: 1994 Credit: 9,573,283 RAC: 7,160 |
I've said this before for other reasons, but we make our machines available for whatever the project needs. We don't pay for tasks so we can't demand them. 24/7/365 availability of tasks has never been guaranteed. If the project doesn't utilise that resource, that's up to them. I agree. But if admins write two lines to explain the situation (for example: "hey, guys, we have problems with scheduler")... |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
I agree. But if admins write two lines to explain the situation (for example: "hey, guys, we have problems with scheduler")... That would be useful for our planning purposes. A temporary glitch is different than a long-term shortage, and we could make arrangements accordingly. |
JohnH Joined: 25 Mar 13 Posts: 43 Credit: 2,319,355 RAC: 0 |
I agree. But if admins write two lines to explain the situation (for example: "hey, guys, we have problems with scheduler")... True dat |
JohnH Joined: 25 Mar 13 Posts: 43 Credit: 2,319,355 RAC: 0 |
Looks like we're back running ... wonder how long until next "blockage" |
Sid Celery Joined: 11 Feb 08 Posts: 2122 Credit: 41,184,189 RAC: 10,001 |
I've said this before for other reasons, but we make our machines available for whatever the project needs. We don't pay for tasks so we can't demand them. 24/7/365 availability of tasks has never been guaranteed. If the project doesn't utilise that resource, that's up to them. I'm trying to think what the next few words would be after "..." and I can only really come up with "it wouldn't make the slightest difference to anything" My current issue is now to manage down the tasks from my back-up project to make space for Rosetta tasks again |
shanen Joined: 16 Apr 14 Posts: 195 Credit: 12,662,308 RAC: 0 |
Just stopped by to see if there was any explanation of the recent outages or for the increasing problem with "computation errors" that terminate long-running tasks... Used to be the computation errors usually happened within a few minutes of starting, but I just saw another as the task approached 8 hours. As usual, I was unable to find much substantive information in these forums, but perhaps that is mostly a visibility-and-search problem for the information that might exist somewhere on the website. Perhaps I have actually come to prefer the "We don't care, so you shouldn't worry either" attitude of this project? It would be nice to know if I get any credit at all for 8 hours of computation that ends with a "computation error" and it would be nice to know if the computation errors were related to particular hardware or OSes, but if they don't care, why should I? I guess from a BOINC-level perspective the solution is to run several projects. I've actually run a number of them over the years, but most of them were more or less problematic, so that approach doesn't much appeal to me. #1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech) |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
Perhaps I have actually come to prefer the "We don't care, so you shouldn't worry either" attitude of this project? It would be nice to know if I get any credit at all for 8 hours of computation that ends with a "computation error" and it would be nice to know if the computation errors were related to particular hardware or OSes, but if they don't care, why should I? Let's just say that they don't find communicating with users to be an efficient use of their time. They might be right. If you want trouble-free, there is really only World Community Grid. I run a lot of others too of course, but set my expectations accordingly. |
Sid Celery Joined: 11 Feb 08 Posts: 2122 Credit: 41,184,189 RAC: 10,001 |
Perhaps I have actually come to prefer the "We don't care, so you shouldn't worry either" attitude of this project? It would be nice to know if I get any credit at all for 8 hours of computation that ends with a "computation error" and it would be nice to know if the computation errors were related to particular hardware or OSes, but if they don't care, why should I? I would've said the same, except in the recent period where I've run a lot of WCG tasks I've come up with 6 errors, all from one sub-project (MIP), however all of which were validly completed by the user who was reissued with them. This on my new Intel i3-8350K desktop and not at all on my AMD FX8370 which itself has occasional issues with Rosetta 4.07 tasks (but not mini Rosetta 3.78 tasks). However, both are overclocked so maybe those particular tasks are making specific individual demands that find the cracks on the outer extremes of my machines or during crashes or power losses etc. I have a flaky laptop that has occasional errors too, but my non-overclocked, non-flaky devices produce none. That's a pretty big clue as to where my issues originate and explains why I don't begin by blaming something else for my own self-inflicted problems. As such, demanding to find a cause at the project end seems to be a futile exercise, when it's just as likely (if not more so) that it's caused at the user end. So then it's just as legitimate a question for shanen to ask himself what's happening at his end that might explain his computation errors. Do those machines survive a stress test, for example? That would be my first port of call before repeatedly blaming somewhere else. |
shanen Joined: 16 Apr 14 Posts: 195 Credit: 12,662,308 RAC: 0 |
WCG is one of the projects I ran pretty heavily. I've concluded that I feel less forgiving towards them because IBM is (or was?) supporting the umbrella of WCG for other projects. One of the many problems that drove me away from WCG was confusing inconsistencies and problems among the projects, perhaps like the next poster noted. Having said that, I actually stopped by today to warn people about the DRH project, and yet as I type this one I see another computation error from a d9244 project... At least it was an early failure. However I think the DRH warning calls for a fresh thread. #1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech) |
Sid Celery Joined: 11 Feb 08 Posts: 2122 Credit: 41,184,189 RAC: 10,001 |
WCG is one of the projects I ran pretty heavily. I've concluded that I feel less forgiving towards them because IBM is (or was?) supporting the umbrella of WCG for other projects. One of the many problems that drove me away from WCG was confusing inconsistencies and problems among the projects, perhaps like the next poster noted. What were the results of the stress tests you ran on your own machines? What did you run? For how long? Were there no errors at all? Or there were errors? How have you gone about fixing or mitigating your issues? Or you haven't run stress tests and you're blaming everyone and everything else first? Because out of the 4 million people running BOINC projects, no-one else is highlighting the issues that you are at this time. Just trying to help. |
©2024 University of Washington
https://www.bakerlab.org