Message boards : Number crunching : Internet traffic and necessary data
Carlos_Pfitzner · Joined: 22 Dec 05 · Posts: 71 · Credit: 138,867 · RAC: 0
> or do as SETI Beta are doing, and re-write the app so that it uses a basic method by default, but uses SSE when it detects that the processor is capable of handling that instruction set. Best of both worlds then, because I'm sure you'd get a lot of people complaining that Rosetta no longer runs on their older computers.

Here is some simple code that could be added to Rosetta to verify which instruction sets each CPU supports, so you can be sure the app won't crash when executing SSE. You will need to adapt it, e.g.:

    if the CPU is SSE-capable  -> use the sse_crunch routine
    else                       -> use the non_sse_crunch routine

and you may also need separate sse_crunch and sse2_crunch routines. That way, no PC should get an app crash :-)

    //
    // chkcpu.c
    //
    // Check cpu extensions for Intel compatible cpu
    //
    // Tetsuji "Maverick" Rai

    #include <stdio.h>

    int main(void)
    {
        unsigned long _ecx, _edx, _init_flags, _mod_flags;

        /* Toggle bit 21 (ID) of EFLAGS; if the change sticks, CPUID is available. */
        __asm__ ("pushf; pop %%eax; mov %%eax, %0;"
                 "xor $0x200000, %%eax; push %%eax; popf;"
                 "pushf; pop %%ebx; mov %%ebx, %1;"
                 : "=m"(_init_flags), "=m"(_mod_flags)
                 :
                 : "eax", "ebx", "cc");

        printf("init flags = %08lx  modified flags = %08lx\n", _init_flags, _mod_flags);

        if (!((_init_flags ^ _mod_flags) & 0x200000)) {
            printf("cpuid isn't available\n");
            return 1;
        }
        printf("\nok, cpuid is available\n");

        /* CPUID leaf 1: feature flags come back in ECX and EDX. */
        __asm__ ("xor %%eax,%%eax; inc %%eax; cpuid; mov %%ecx,%0; mov %%edx,%1;"
                 : "=m"(_ecx), "=m"(_edx)
                 :
                 : "eax", "ebx", "ecx", "edx");

        if (_edx & 0x8000)     printf("cmov : Yes\n"); else printf("cmov : No\n");
        if (_edx & 0x02000000) printf("sse  : Yes\n"); else printf("sse  : No\n");
        if (_edx & 0x04000000) printf("sse2 : Yes\n"); else printf("sse2 : No\n");
        if (_ecx & 0x1)        printf("sse3 : Yes\n"); else printf("sse3 : No\n");

        return 0;
    }
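If anyone wants to try it: the inline assembly assumes a 32-bit build (EFLAGS is popped into a 32-bit register), so on a 64-bit machine it most likely needs to be compiled as a 32-bit binary. Something like this should work with GCC, assuming the 32-bit support libraries are installed:

    gcc -m32 -o chkcpu chkcpu.c
    ./chkcpu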
Lee Carre · Joined: 6 Oct 05 · Posts: 96 · Credit: 79,331 · RAC: 0
great stuff, perhaps suggest it on the BOINC dev mailing list; if it achieves consistently greater compression ratios, it'll help everyone :)

different compression methods will only be adopted if they work across all platforms; if they don't, then they're not appropriate for BOINC use

Well, that's why I suggested bzip2, as it's an open-source, plug-in replacement for gzip. Works on all platforms.

Agreed, but as you correctly said: more processing power, NOT necessarily more HOSTS. Have a look at CPU stats.

true, but if you compare processing rate (using something like TeraFLOPS) against the number of hosts, you'll get a positive correlation (more hosts = more processing)

Personally, I'd be happy with offering a beta SSE-enabled Rosetta executable, as an optional install, like many people install an optimised BOINC app.

now that's an idea, but obviously to get the most benefit for the cost you might as well deploy an app that will do it automatically; that's the best cost:benefit ratio. As a half-way measure, yes, a separate app would probably help, but you'd need quite a lot of people using the optimised version to notice an improvement.
Lee Carre · Joined: 6 Oct 05 · Posts: 96 · Credit: 79,331 · RAC: 0
> A simple code to add into Rosetta, to verify which instruction set each CPU can run

I'm no programmer, so forgive the newbie question. I understand the instruction-set selection method, but how hard would it be to have different versions of the routines in the same app? Would there be a lot of work involved, or is it quite simple?
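One common answer, shown as a minimal sketch rather than Rosetta's actual code: keep both routines in the same binary and set a function pointer once at startup, reusing the CPUID test from chkcpu.c above. The crunch_basic and crunch_sse functions below are hypothetical stand-ins for the real scoring routines.

    #include <stdio.h>

    static void crunch_basic(void) { printf("using the portable routine\n"); }
    static void crunch_sse(void)   { printf("using the SSE routine\n"); }

    /* Same CPUID leaf-1 query as chkcpu.c: SSE support is EDX bit 25. */
    static int cpu_has_sse(void)
    {
        unsigned int edx = 0;
        __asm__ ("xor %%eax,%%eax; inc %%eax; cpuid; mov %%edx,%0;"
                 : "=m"(edx)
                 :
                 : "eax", "ebx", "ecx", "edx");
        return (edx & 0x02000000) != 0;
    }

    int main(void)
    {
        /* Decide once at startup; everything else calls through the pointer. */
        void (*crunch)(void) = cpu_has_sse() ? crunch_sse : crunch_basic;
        crunch();
        return 0;
    }

The per-call cost is a single indirect call, so the portable build pays essentially nothing for carrying the SSE path around.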
dcdc · Joined: 3 Nov 05 · Posts: 1832 · Credit: 119,675,695 · RAC: 11,002
Can anyone tell me roughly how much bandwidth Rosetta will use, post installation, on a Sempron 2600+ machine that is on ~3 hrs a day, running XP? I can probably add this machine, but it's on capped broadband so the bandwidth is all-important.

Cheers, Danny
SwZ · Joined: 1 Jan 06 · Posts: 37 · Credit: 169,775 · RAC: 0
> Can anyone tell me roughly how much bandwidth Rosetta will use, post installation, on a Sempron 2600+ machine that is on ~3 hrs a day, running XP? I can probably add this machine, but it's on capped broadband so the bandwidth is all-important.

About 10 MB per day.
dcdc · Joined: 3 Nov 05 · Posts: 1832 · Credit: 119,675,695 · RAC: 11,002
> Can anyone tell me roughly how much bandwidth Rosetta will use, post installation, on a Sempron 2600+ machine that is on ~3 hrs a day, running XP? I can probably add this machine, but it's on capped broadband so the bandwidth is all-important.

Cheers. Unfortunately, I think it'd need to be less than 100 MB/month to be viable on that machine.
BennyRop · Joined: 17 Dec 05 · Posts: 555 · Credit: 140,800 · RAC: 0
Keep an eye on the boards, as later this week we'll be given a new app that lets us select how many hours of work to download for each project. So, for a 24-hour run (on always-on systems), we can supposedly download 24 hours of work in one go, instead of eight 3-hour tasks or forty-eight 30-minute tasks. If this gives the option of up to a week's worth of work per download, then it'll be very easy to get the bandwidth usage below your target.

With better compression added to the client, it should also be possible to cut bandwidth to between a half and a third of the current project downloads, so there's still room for improvement. But just switching to 24 hours of work per download (the default will be 8) on always-on 24/7 machines reduces the download traffic to 1/8th or 1/48th of what it is now (depending on the type of tasks being handed out at the time), which will be a tremendous reduction for those with usage caps.
dcdc · Joined: 3 Nov 05 · Posts: 1832 · Credit: 119,675,695 · RAC: 11,002
> Keep an eye on the boards, as later this week we'll be given a new app that lets us select how many hours of work to download for each project.

Yeah, we've ordered the parts for the machine, so it'll be easiest to install it when I build it, but I can ask him to install at a later date if the bandwidth requirements can be controlled to a level that suits him.
Astro · Joined: 2 Oct 05 · Posts: 987 · Credit: 500,253 · RAC: 0
File compression may soon be offered through BOINC. See the email below from Dr. Anderson:

From: David Anderson
To: boinc_projects, boinc_dev

Libcurl has the ability to handle HTTP replies that are compressed using the 'deflate' and 'gzip' encoding types. Previously the BOINC client didn't enable this feature, but starting with the next version of the client (5.4) it does. This means that BOINC projects will be able to reduce network bandwidth to data servers (and possibly server disk space) by using HTTP compression, without mucking around with applications. This is described here: http://boinc.berkeley.edu/files.php#compression

-- David

Interesting.
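For anyone curious what "enabling this feature" amounts to on the client side, here is a minimal sketch of the libcurl call involved; it is not the BOINC client's actual code, and the URL is a placeholder. In libcurl of that era the option was named CURLOPT_ENCODING (newer releases call it CURLOPT_ACCEPT_ENCODING).

    #include <stdio.h>
    #include <curl/curl.h>

    int main(void)
    {
        curl_global_init(CURL_GLOBAL_DEFAULT);
        CURL *curl = curl_easy_init();
        if (!curl) return 1;

        /* Placeholder URL -- not a real Rosetta download server. */
        curl_easy_setopt(curl, CURLOPT_URL, "http://example.org/some_workunit_file");

        /* Ask the server for gzip/deflate-encoded replies and let libcurl
           decompress them transparently on the way in. */
        curl_easy_setopt(curl, CURLOPT_ENCODING, "gzip, deflate");

        CURLcode res = curl_easy_perform(curl);   /* body goes to stdout by default */
        if (res != CURLE_OK)
            fprintf(stderr, "transfer failed: %s\n", curl_easy_strerror(res));

        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return 0;
    }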
Hoelder1in · Joined: 30 Sep 05 · Posts: 169 · Credit: 3,915,947 · RAC: 0
> Libcurl has the ability to handle HTTP replies that are compressed...

I am not familiar with 'deflate', but since the Rosetta files are already gzipped, gzipping them a second time wouldn't have any additional benefit. ;-)
Astro · Joined: 2 Oct 05 · Posts: 987 · Credit: 500,253 · RAC: 0
Uh oh, this just in from Bruce Allen at Einstein:

From: Bruce Allen <xxxxxx@gravity.phys.uwm.edu>
To: David, boinc_projects, boinc_dev

David, some projects (including E@H) are already sending/returning files which are 'zipped'. We need to make sure that the cgi file_upload_handler program does not automatically uncompress files unless this has been requested specifically by the project.

Cheers,

Then later, Dr. Anderson came out with:

Subject: [boinc_alpha] compression bug in 5.3.21
From: David Anderson
To: boinc_alpha

We quickly found that the support for gzip compression breaks Einstein@home and CPDN, which do their own compression. We're fixing this and it will be in 5.3.22.

-- David

The point here is that if Rosetta uses this compression, users shouldn't jump to the latest development client until the testers have worked out the bugs. I am a BOINC alpha tester and will soon find out if this is a problem. LOL, I still have three 4.81s before I get on with the 4.82s.
BennyRop · Joined: 17 Dec 05 · Posts: 555 · Credit: 140,800 · RAC: 0
Ask them what level of compression they're using for these transfers, since gzip lets you specify a range of compression levels from fast to best. For large files, one hopes they're using the highest compression possible (see the `--fast'/`--best' options described at http://www.math.utah.edu/docs/info/gzip_4.html#SEC7). For that matter, is Rosetta using -9/--best with the zlib compression it currently uses?
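The trade-off is easy to see with a small test program. This is only an illustrative sketch using zlib's compress2() (build with -lz); the sample buffer is made-up repetitive data, not a real Rosetta work-unit file.

    #include <stdio.h>
    #include <zlib.h>

    int main(void)
    {
        static unsigned char src[64 * 1024];
        static unsigned char dst[80 * 1024];        /* > compressBound(sizeof src) */
        const int levels[] = { Z_BEST_SPEED, Z_BEST_COMPRESSION };   /* -1 vs -9 */

        /* Fill the input with something compressible. */
        for (unsigned i = 0; i < sizeof src; i++)
            src[i] = (unsigned char)"ROSETTA@HOME "[i % 13];

        for (int k = 0; k < 2; k++) {
            uLongf dlen = sizeof dst;
            if (compress2(dst, &dlen, src, sizeof src, levels[k]) != Z_OK) {
                fprintf(stderr, "compress2 failed at level %d\n", levels[k]);
                return 1;
            }
            printf("level %d: %lu -> %lu bytes\n",
                   levels[k], (unsigned long)sizeof src, (unsigned long)dlen);
        }
        return 0;
    }

On highly redundant data the gap between level 1 and level 9 can be substantial; on files that are already compressed it is negligible, which is why re-gzipping, as noted above, buys nothing.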
alvin · Joined: 19 Jul 15 · Posts: 5 · Credit: 6,550,555 · RAC: 0
I'm currently running this project and it's all fine except for one thing: the amount of downloaded data. Here is my monthly report:

    address        download          upload            total
    bakerlab.org   24.0 GB (5.4 %)   6.00 GB (6.7 %)   30.0 GB (5.6 %)

It's strange, as I have the opposite issue with other projects: their download:upload ratio is skewed the other way, 1:5 or more. The issue is the amount of traffic. Could I ask you to pack results on the client side if possible? Could compressing data be an option in the settings? I suppose years ago no one cared about these amounts, but why is the imbalance between incoming and outgoing data so huge? Anyway, I think some action, either on the project side or on the BOINC side as a whole, could be taken to restore the balance and minimise traffic.
Sid Celery · Joined: 11 Feb 08 · Posts: 2125 · Credit: 41,249,734 · RAC: 8,235
> I'm currently running this project and it's all fine except for one thing

I think all tasks are already packed.

I notice you keep a 7-day buffer, which is much bigger than necessary. I get away with just 2 days quite comfortably; it's rare to need anything more. But your biggest problem is that you use only a 1-hour runtime, on each of your 32 PCs! 500 or more tasks each makes 16,000! There's your problem!

First, cut your buffer down to 2 days in BOINC under Computing Preferences, or whatever you're comfortable with. Leave it for 5 days to let your buffers run down, then go online and increase your runtime for each machine. But do this slowly, otherwise tasks will miss their deadline. So just increase from 1 hour to 2 hours at first, and leave it a few days again before increasing to 4 hours.

Reducing your buffers will mean you'll only have uploads for 5 days, no downloads at all. Doubling your runtime to 2 hrs will halve your previous volume of downloads while (I think) not changing the upload size (or increasing it only very slightly). Doubling to 4 hrs will halve downloads again. It's up to you whether you increase to 6-hr runtimes, which is the default. If you do, your downloads will reduce again in proportion.
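If anyone prefers to set the buffer in a file rather than through the BOINC Manager, a global_prefs_override.xml in the BOINC data directory along these lines should do the same job. This is only a sketch; the 2-day/0-day split is just the example from the post above, and exact tag support can vary between client versions:

    <global_preferences>
       <work_buf_min_days>2.0</work_buf_min_days>
       <work_buf_additional_days>0.0</work_buf_additional_days>
    </global_preferences>

The client picks it up after "Read local prefs file" in the Manager (or boinccmd --read_global_prefs_override), with no need to touch the website preferences.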
alvin · Joined: 19 Jul 15 · Posts: 5 · Credit: 6,550,555 · RAC: 0
Thanks, man. So is the CPU running time equivalent to the portion of a task executed? Does it slice tasks into portions, so that when I have more runtime it just crunches one task, or portions of it, for longer? Am I correct?

------

Those 7 or 10 days for tasks appeared after my fight to get tasks for projects when they claim "not a priority project" etc. Let's see.
Timo · Joined: 9 Jan 12 · Posts: 185 · Credit: 45,649,459 · RAC: 0
Basically, to properly query the energy space of a structure, many decoys of that structure need to be simulated; a longer runtime means the app will simulate more decoys before reporting work to, or pestering for work from, the server. It's:

a) more efficient on your end, as there's less time spent doing disk I/O switching between models;
b) more efficient for the project servers, as they can bulk-load results and create bigger batches of work faster;
c) going to use less bandwidth, as more resources are shared between decoy runs because the target models don't change as frequently.
Mod.Sense (Volunteer moderator) · Joined: 22 Aug 06 · Posts: 4018 · Credit: 0 · RAC: 0
The bandwidth will not vary one-to-one with runtime, because sometimes you get several tasks that use the same underlying database. In that sense, having a large number of tasks improves your odds of already having a similar task on deck somewhere. But I agree with the suggestions (and actually just responded with the same in reply to IMs from Costa).

With that many machines, a night-and-day difference in download bandwidth can be achieved using a caching proxy server. This gives the same effect as described above, where already having another task from the same batch of work avoids a large database download, but now leverages that across all of the hosts using the proxy, rather than just within a single host. The project also changed application versions recently, so without a caching proxy, each host had to download its own copy of the new executables and libraries.

Rosetta Moderator: Mod.Sense
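For anyone wanting to try that, a caching proxy for a BOINC farm is usually just a stock Squid install. The snippet below is only a sketch: the directive names are standard Squid, but the cache size, path and subnet are placeholders to adapt to your own LAN. Each BOINC client is then pointed at the proxy through its HTTP proxy settings.

    # squid.conf (minimal sketch)
    http_port 3128
    cache_dir ufs /var/spool/squid 20000 16 256   # ~20 GB of on-disk cache
    maximum_object_size 512 MB                    # large enough for app + database downloads
    acl lan src 192.168.0.0/24                    # your LAN here
    http_access allow lan
    http_access deny all
    # keep large project downloads cached for a long time
    refresh_pattern -i \.(zip|gz|exe)$ 10080 90% 43200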
Sid Celery · Joined: 11 Feb 08 · Posts: 2125 · Credit: 41,249,734 · RAC: 8,235
> Thanks, man

What the others said is right. I looked at one of your tasks on one machine and it reported that it completed 5 "decoys" in 1 hour. If you increased to 2 hours it would run 10 and you'd get double the credit. I think the maximum allowed in a task is 99. The number of "decoys" varies a lot, but obviously the default 6-hour runs manage it fine.

> Those 7 or 10 days for tasks appeared after my fight to get tasks for projects when they claim "not a priority project" etc.

I think you run a lot of projects. When you do, BOINC goes a bit weird. Increasing the number of days makes things worse, so I read, so cutting down to 2 days (or less) will help. I think the default is actually 0.25 days, so you could cut it down even more if you like. If tasks ever dry up on Rosetta (it happens only once every 6 months or so) you have plenty of other projects' tasks to take up the slack.

It's all a learning curve. No harm done. Thanks for committing so many machines to Rosetta!
alvin · Joined: 19 Jul 15 · Posts: 5 · Credit: 6,550,555 · RAC: 0
I started with LHC, but then it ran out of tasks, so I had to have something else, and then it all built up. Also, some GPU-based projects' tasks take longer than 5 days on their own, so there's a huge mess and mix in that. Finally, BOINC really plays up with certain combinations of projects, not getting tasks even when they're available, for whatever reason (Berkeley support wasn't helpful with that). My goal is to have the crunchers performing instead of idling, so let's see.