Some minirosetta 3.65 perf data

Message boards : Number crunching : Some minirosetta 3.65 perf data

rjs5

Joined: 22 Nov 10
Posts: 273
Credit: 23,054,272
RAC: 8,196
Message 78913 - Posted: 15 Oct 2015, 2:29:05 UTC

The 64-bit Linux app is dynamically linked, which is the first time I have seen that. Dynamic linking causes the libglut-type library problems, BUT it also allows the Linux system to select optimized libraries. On my Broadwell system, Rosetta picks up the AVX-optimized versions of the math library (libm.so) functions.
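To make the "system selects the optimized library" point concrete, here is a minimal sketch of runtime dispatch: one entry point that picks an implementation based on what the running CPU reports. This is not glibc's or Rosetta's code. The names log_generic, log_avx_tuned, select_log and dispatched_log are hypothetical, and both "variants" just forward to std::log so the example stays portable. It assumes GCC or Clang on x86 for the CPU check and falls back to the generic path elsewhere.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-ins for two builds of the same routine. In a real
// library the variants would be separately tuned implementations; here both
// simply forward to std::log.
double log_generic(double x) { return std::log(x); }
double log_avx_tuned(double x) { return std::log(x); }

using LogFn = double (*)(double);

// Pick an implementation based on what the running CPU reports.
LogFn select_log() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx")) return log_avx_tuned;
#endif
    return log_generic;
}

// Single entry point: the choice is made once, on the first call.
double dispatched_log(double x) {
    static const LogFn impl = select_log();
    assert(x > 0.0);
    return impl(x);
}
```

A statically linked binary is stuck with whatever variant was linked in at build time, which is why the dynamic link matters here.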

Rosetta is still scalar code, so Rosetta is (by necessity or by oversight) using only 1/2 of the SSE registers.
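As an illustration of scalar versus packed use of an SSE register, here is a small sketch assuming double-precision data. It is not taken from Rosetta; add_scalar and add_packed are hypothetical names. The packed path uses SSE2 intrinsics when the compiler defines __SSE2__ and otherwise falls back to the plain loop.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Scalar version: one double per add. When the compiler does not vectorize
// this loop, each add uses only the low 64 bits of a 128-bit XMM register.
void add_scalar(const double* a, const double* b, double* out, std::size_t n) {
    assert(n == 0 || (a && b && out));
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

// Packed version: two doubles per add, filling the whole 128-bit register.
void add_packed(const double* a, const double* b, double* out, std::size_t n) {
    assert(n == 0 || (a && b && out));
    std::size_t i = 0;
#if defined(__SSE2__)
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);   // unaligned load of two doubles
        __m128d vb = _mm_loadu_pd(b + i);
        _mm_storeu_pd(out + i, _mm_add_pd(va, vb));
    }
#endif
    for (; i < n; ++i) out[i] = a[i] + b[i];  // tail, or everything without SSE2
}
```

An optimizing compiler may turn the first loop into the second on its own, so the disassembly is what tells you which one you actually got.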

The main binary is still built with the standard scripts that "strip" symbols out, so it is tough to drill down into the code hot spots.



2.73% libc-2.21.so [.] _int_malloc
2.40% libc-2.21.so [.] free
1.86% libc-2.21.so [.] malloc_consolidate
1.83% libc-2.21.so [.] malloc
1.56% libstdc++.so.6.0.21 [.] std::_Rb_tree_increment
1.07% minirosetta_3.65_x86_64-pc-linux-gnu [.] 0x00000000036cca02
0.97% minirosetta_3.65_x86_64-pc-linux-gnu [.] 0x000000000019df08
0.97% minirosetta_3.65_x86_64-pc-linux-gnu [.] 0x0000000002b435e2
0.88% libm-2.21.so [.] __ieee754_log_avx
0.87% minirosetta_3.65_x86_64-pc-linux-gnu [.] 0x000000000000feb2
0.86% minirosetta_3.65_x86_64-pc-linux-gnu [.] 0x00000000036ce047
0.83% minirosetta_3.65_x86_64-pc-linux-gnu [.] 0x0000000002b58450
0.77% minirosetta_3.65_x86_64-pc-linux-gnu [.] 0x000000000312f1b9
0.76% minirosetta_3.65_x86_64-pc-linux-gnu [.] 0x0000000002bf055d
0.73% minirosetta_3.65_x86_64-pc-linux-gnu [.] 0x0000000002bf0231
0.68% minirosetta_3.65_x86_64-pc-linux-gnu [.] 0x0000000002bf022d
0.65% minirosetta_3.65_x86_64-pc-linux-gnu [.] 0x0000000002b5845e
0.61% minirosetta_3.65_x86_64-pc-linux-gnu [.] 0x0000000002bf0e66
0.61% minirosetta_3.65_x86_64-pc-linux-gnu [.] 0x0000000002b55f05
0.60% libm-2.21.so [.] __ieee754_exp_avx
0.59% libm-2.21.so [.] __sin_avx
0.58% minirosetta_3.65_x86_64-pc-linux-gnu [.] 0x0000000002b56004
0.56% minirosetta_3.65_x86_64-pc-linux-gnu [.] 0x000000000312b489
0.56% minirosetta_3.65_x86_64-pc-linux-gnu [.] 0x000000000367b144
Profile dcdc

Joined: 3 Nov 05
Posts: 1831
Credit: 119,627,225
RAC: 11,586
Message 78916 - Posted: 15 Oct 2015, 20:40:57 UTC

Do you think that is likely to be a low hanging fruit for getting some easy and reliable speed gains?
rjs5

Joined: 22 Nov 10
Posts: 273
Credit: 23,054,272
RAC: 8,196
Message 78919 - Posted: 16 Oct 2015, 3:58:44 UTC - in response to Message 78916.  



I suspect so. I am glad to see the dynamic linking with libm and don't view the missing GL library as a big problem. I suspect that looking at why the code does not vectorize is the change that could make the most impact and still stay within the SSEx envelope.

I have never seen "malloc_consolidate" come to the top of a profile. From what I have read so far, it seems to be trying to combine adjacent memory blocks that were freed. At first glance, it appears that the code was designed to avoid static buffers and to malloc/free instead, trying to keep the total amount of memory used down. It seems like an attempt at garbage collection.
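A sketch of the trade-off, assuming a hot routine that needs a scratch buffer on every call. This is illustrative only and not Rosetta's code; sum_squares_fresh and sum_squares_reused are hypothetical names. The first version goes through the allocator on each call, the second keeps a caller-owned buffer alive so that repeated calls of the same size never reach malloc/free.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Allocates and frees a scratch buffer on every call: lower peak memory,
// but each call passes through the allocator twice.
double sum_squares_fresh(const std::vector<double>& in) {
    std::vector<double> scratch(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) scratch[i] = in[i] * in[i];
    double total = 0.0;
    for (double s : scratch) total += s;
    return total;
}

// Reuses a caller-owned buffer: resize() reallocates only when the existing
// capacity is too small, so steady-state calls do no allocation.
double sum_squares_reused(const std::vector<double>& in,
                          std::vector<double>& scratch) {
    scratch.resize(in.size());
    assert(scratch.size() == in.size());
    for (std::size_t i = 0; i < in.size(); ++i) scratch[i] = in[i] * in[i];
    double total = 0.0;
    for (double s : scratch) total += s;
    return total;
}
```

The reused buffer costs memory that stays allocated between calls, which is exactly what the malloc/free design appears to be avoiding.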

1. Any recovery of time from those functions will allow the others to work. If you remove 2%, the remaining 98% now has 100% of the time. I would suspect ~5% if the memory management could be thought through.

2. If you free and then allocate memory, there is a good chance that the physical memory locations will not be resident in that CPU's cache. You also generate a string of read/write misses.
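The arithmetic in point 1 can be written down as a tiny helper, assuming the removed time disappears entirely and nothing else changes. speedup_after_removing is a hypothetical name. For reference, the four allocator entries in the profile above (_int_malloc, free, malloc_consolidate, malloc) add up to 8.82% of samples, which is an upper bound on what eliminating them could recover in that run.

```cpp
#include <cassert>

// Speedup factor when a fraction of total run time is eliminated:
// new_time = (1 - f) * old_time, so speedup = 1 / (1 - f).
double speedup_after_removing(double fraction_removed) {
    assert(fraction_removed >= 0.0 && fraction_removed < 1.0);
    return 1.0 / (1.0 - fraction_removed);
}
```

With these assumptions, removing 2% gives a factor of about 1.02 and removing 5% gives about 1.05, so the payoff is real but modest unless the cache effects in point 2 turn out to be significant.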

There does not seem to be any hot spot that would make much difference .... but it is tough to know when much of the binary has symbols stripped. The only reason I can see libc is because I loaded the debug info.

Profile [VENETO] boboviz

Joined: 1 Dec 05
Posts: 1994
Credit: 9,623,704
RAC: 9,591
Message 78921 - Posted: 16 Oct 2015, 10:01:07 UTC - in response to Message 78913.  

Rosetta is still scalar code, so Rosetta is (by necessity or by oversight) using only 1/2 of the SSE registers.
The main binary is still built with the standard scripts that "strip" symbols out, so it is tough to drill down into the code hot spots.


David says:
I've been too busy to look into optimizations. We do have one volunteer helping us out however.

Are you the volunteer?

rjs5

Joined: 22 Nov 10
Posts: 273
Credit: 23,054,272
RAC: 8,196
Message 78923 - Posted: 16 Oct 2015, 13:11:02 UTC - in response to Message 78921.  



I have not been contacted, but I did volunteer on the board and in a private message. I answer questions and am happy to consult that way too. I am still poking around looking for a project to adopt me. 8-)

The execution profile is "flat" (not much execution time in one function) which means there is probably nothing trivial to "fix" but it is hard to know without source OR a binary with symbols/debug records.


Profile [VENETO] boboviz

Joined: 1 Dec 05
Posts: 1994
Credit: 9,623,704
RAC: 9,591
Message 78947 - Posted: 19 Oct 2015, 13:51:57 UTC - in response to Message 78923.  



Uh, there are a lot of projects out there that need a good developer...

rjs5

Joined: 22 Nov 10
Posts: 273
Credit: 23,054,272
RAC: 8,196
Message 78952 - Posted: 20 Oct 2015, 12:15:24 UTC - in response to Message 78947.  



A "lot of projects out there NEED a good developer".
Far fewer projects "WANT a good developer" to look over their shoulder. 8-) That is just human nature.

I am more a prototype/breadboard/algorithm engineer. I encourage developers who get my algorithms/code to use it as a model and rewrite it in their own program vernacular. It is somewhat embarrassing that most just leave the code untouched. Ugh! My junk gets documented forever.

A current example project of mine would be converting AVX2 algorithms to use AVX512 instructions. For Rosetta, the first thing I would probably examine would be the reasons that stop the code from vectorizing (probably a 30%-40% difference).
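For readers wondering what "reasons that stop the code from vectorizing" can look like, here is a small sketch of three common loop shapes. These are generic examples and not from the Rosetta source, which I have not seen; scale, prefix_sum and dot are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Independent iterations: each element is computed on its own, which is the
// shape compilers can usually vectorize.
void scale(std::vector<double>& v, double k) {
    for (double& x : v) x *= k;
}

// Loop-carried dependency: iteration i needs the result of iteration i-1,
// so a straightforward lane-by-lane vectorization is not possible.
void prefix_sum(std::vector<double>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) v[i] += v[i - 1];
}

// Floating-point reduction: vectorizing it reorders the additions, which can
// change rounding, so compilers normally leave it scalar unless told that
// reassociation is acceptable.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double total = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) total += a[i] * b[i];
    return total;
}
```

With source access, GCC's -fopt-info-vec-missed or Clang's -Rpass-missed=loop-vectorize will report which loops were skipped and why, which is where I would start.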

I think I am going to build a VirtualBox VM, use that isolated environment, and start with the BOINC client/manager source to understand the underlying system.

I would only work under the supervision/direction of a project developer AND only for as long as they have interest.



Profile dcdc

Joined: 3 Nov 05
Posts: 1831
Credit: 119,627,225
RAC: 11,586
Message 78954 - Posted: 20 Oct 2015, 20:15:06 UTC - in response to Message 78952.  




Hopefully someone will be in touch!
Profile [VENETO] boboviz

Joined: 1 Dec 05
Posts: 1994
Credit: 9,623,704
RAC: 9,591
Message 78960 - Posted: 21 Oct 2015, 9:28:45 UTC - in response to Message 78952.  

A "lot of projects out there NEED a good developer".
Far fewer projects "WANT a good developer" to look over their shoulder. 8-) That is just human nature.


I'm not so pessimistic.
A lot of projects publish their source code for free: Denis, Poem, CSG....
Profile Chilean
Joined: 16 Oct 05
Posts: 711
Credit: 26,694,507
RAC: 0
Message 78963 - Posted: 21 Oct 2015, 16:33:23 UTC - in response to Message 78960.  



Rosetta is a bit different. Their code is free-to-use-sort-of, depending on whether you are using it for profit or for academic purposes.
rjs5 should be able to look at the code freely if he asked, since he isn't going for profit.
Profile [VENETO] boboviz

Joined: 1 Dec 05
Posts: 1994
Credit: 9,623,704
RAC: 9,591
Message 78970 - Posted: 22 Oct 2015, 8:19:07 UTC - in response to Message 78963.  



I know the "policy" about Rosetta software.
I was just pointing out some projects with open software licences.
I don't know whether Rosetta's admins want to be helped, or how, or how much.

Profile Chilean
Joined: 16 Oct 05
Posts: 711
Credit: 26,694,507
RAC: 0
Message 78972 - Posted: 22 Oct 2015, 20:32:20 UTC - in response to Message 78970.  



Yeah... that's the thing. In addition, R@H seems to be an "extension" to the real Rosetta code (Rosetta Commons). At least now we have a 64-bit Linux binary.




©2024 University of Washington
https://www.bakerlab.org