Bandwidth Usage / Compression

Message boards : Number crunching : Bandwidth Usage / Compression

Profile Timo
Joined: 9 Jan 12
Posts: 185
Credit: 45,649,459
RAC: 0
Message 77635 - Posted: 9 Nov 2014, 21:15:48 UTC
Last modified: 9 Nov 2014, 21:18:26 UTC

Disclaimer: as someone in software development, I understand that any change to the codebase represents a big lift and commitment of effort. I simply wanted to point out some potentially low-hanging fruit for reducing the network load of the Rosetta@Home cluster.

For fun, I decided to test out the effects of re-compressing some of the larger files that Rosetta transmits across the net.

The minirosetta database file that is sent to new clients is 212MB (at least, that was the size of the version my newest PC received when it attached to the project), but it could be shrunk by at least another 5MB simply by raising the GZIP library's setting to its highest DEFLATE compression level. Alternatively, switching to the open source LZMA2 (7z) library could save at least 27MB of bandwidth.

[Chart: compressed size of the minirosetta database file under each compression setting. Smaller is better.]
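For anyone who wants to try a comparison like this on their own files, here is a minimal Python sketch using only the standard library. It is not the tooling behind the numbers in this post. Assumptions: gzip level 6 (the zlib default) stands in for the project's current setting, which I don't actually know; the helper name `compare_sizes` and the sample data are made up for illustration; and Python's `lzma` module writes the .xz container (which uses the LZMA2 filter) rather than a .7z archive, so sizes will differ slightly from 7-Zip output.

```python
import gzip
import lzma


def compare_sizes(data: bytes) -> dict:
    """Return the size in bytes of `data` raw and under each compressor."""
    return {
        "raw": len(data),
        # Level 6 is the zlib default; ASSUMED stand-in for the current setting.
        "gzip level 6": len(gzip.compress(data, compresslevel=6)),
        # Level 9 is the highest DEFLATE level gzip offers.
        "gzip level 9": len(gzip.compress(data, compresslevel=9)),
        # The default .xz format produced by lzma.compress uses LZMA2.
        "xz (LZMA2), default preset": len(lzma.compress(data)),
    }


if __name__ == "__main__":
    # Hypothetical sample: repetitive text, NOT a real Rosetta file.
    sample = b"ATOM      1  N   MET A   1      27.340  24.430   2.614\n" * 2000
    for name, size in compare_sizes(sample).items():
        print(f"{name:>28}: {size:>8} bytes")
```

To test a real file, read it with `open(path, "rb").read()` and pass the bytes in. Results depend heavily on the input, so the ratios from a toy sample say nothing about the actual database file.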



Some of the smaller job files also compress MUCH more efficiently using LZMA2 (for example this one file would go from the current 12MB to just over 2MB):

[Chart: compressed size of the sample job file under each compression setting. Smaller is better.]



After noticing that having R@H running increased my monthly bandwidth usage significantly, I did some digging and found that the GZIP settings employed by R@H leave a lot of bandwidth savings on the table. Slight adjustments, or even a switch to LZMA2, could mean major bandwidth savings at the scale of the current BOINC cluster, and fewer issues handling peak load like the one a couple of months ago when Charity Engine suddenly added an armada of new clients.
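To put the figures from this post in proportion, here is a rough back-of-the-envelope calculation in Python. The file sizes come from the numbers above, treating "at least" and "just over" as exact. The client count is purely hypothetical, since I have no idea how many new hosts actually attached, and `percent_saved` is just an illustrative helper.

```python
def percent_saved(original_mb: float, compressed_mb: float) -> float:
    """Percentage of the original download avoided by the smaller file."""
    return 100.0 * (original_mb - compressed_mb) / original_mb


# Sizes in MB, taken from the figures quoted in this post.
DATABASE_MB = 212
DATABASE_GZIP_MAX_MB = DATABASE_MB - 5    # "at least another 5MB"
DATABASE_LZMA2_MB = DATABASE_MB - 27      # "at least 27MB"
JOB_FILE_MB = 12
JOB_FILE_LZMA2_MB = 2                     # "just over 2MB", rounded down

# HYPOTHETICAL number of newly attached clients, for illustration only.
NEW_CLIENTS = 10_000

if __name__ == "__main__":
    print(f"database, gzip max: {percent_saved(DATABASE_MB, DATABASE_GZIP_MAX_MB):.1f}% saved")
    print(f"database, LZMA2:    {percent_saved(DATABASE_MB, DATABASE_LZMA2_MB):.1f}% saved")
    print(f"job file, LZMA2:    {percent_saved(JOB_FILE_MB, JOB_FILE_LZMA2_MB):.1f}% saved")
    saved_gb = (DATABASE_MB - DATABASE_LZMA2_MB) * NEW_CLIENTS / 1000
    print(f"LZMA2 database savings across {NEW_CLIENTS} new clients: {saved_gb:.0f} GB")
```

The per-file percentages are small for the database and large for the job file, so the job files may well matter more in practice, depending on how often each kind of file is actually downloaded.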
Profile David E K
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 77637 - Posted: 10 Nov 2014, 17:30:08 UTC

Thanks for the quick tests! I'll look into this for the next application update. It may take some time because our Rosetta source has gone through some significant refactoring recently, so I'm expecting a lot of platform-dependent debugging and testing, etc., for the next app update.
Profile [VENETO] boboviz

Joined: 1 Dec 05
Posts: 1994
Credit: 9,573,506
RAC: 7,165
Message 77638 - Posted: 10 Nov 2014, 22:25:59 UTC - in response to Message 77637.  

It may take some time because our rosetta source has gone through some significant refactoring recently so I'm expecting a lot of platform dependent debugging and testing etc...


Recently? The current app (3.52) was released 28/05....

Profile Timo
Joined: 9 Jan 12
Posts: 185
Credit: 45,649,459
RAC: 0
Message 77642 - Posted: 11 Nov 2014, 15:06:51 UTC - in response to Message 77638.  


Recently? The current app (3.52) was released 28/05....


Why so impatient? In the scientific app development world that IS incredibly recent. Also, the fact that the last public release was in May says nothing about other branches of the app, like beta versions. The code base of any project is a constantly changing beast and we likely only see the post-test/debugged versions (hopefully =P). I'll be thrilled if this suggestion ever gets implemented, even if it's not for many months.
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 77644 - Posted: 12 Nov 2014, 0:44:11 UTC - in response to Message 77638.  

It may take some time because our rosetta source has gone through some significant refactoring recently so I'm expecting a lot of platform dependent debugging and testing etc...


Recently? The current app (3.52) was released 28/05....


I believe DK was talking about recent changes that need to be incorporated and rolled into the R@h working code level.
Rosetta Moderator: Mod.Sense




©2024 University of Washington
https://www.bakerlab.org