
Re: [patches] Possible PowerPC LIBC optimization



Mark Mitchell wrote:
> Steven Munroe wrote:
>
>   
>> The other end of the spectrum is to optimize the entire library for a
>> specific platform and use the dynamic linker dl_procinfo to select from
>> multiple cpu-tuned libraries.
>> <http://sources.redhat.com/ml/libc-alpha/2006-01/msg00094.html>.
>>     
>
> Is it practical to have an intermediate state where most of the library
> is generic, but high-performance routines (e.g., strcpy) are in a
> separate libcpu.so that's chosen dynamically, either by the dynamic
> linker, or by the system administrator setting symlinks at system
> installation time?  Would having strcpy in a separate libcpu.so impact
> performance negatively because there are now more dynamic libraries in
> play?
>
> I'm just trying to think about how to eliminate the cost (in terms of
> build time, but, more importantly, validation) that comes with
> multilibs.  I'm not at all confident that I'm going in a useful
> direction; just poking about to see if there's anything down this path. :-)
>   
Seems like the test/verification problem is the same regardless. In
each case the instructions executed are different depending on the chip
you are running on.

Also, the libcpu.so idea adds a level of PLT call stubs that would not be
there for internal libc usage. This can swamp any gain from the
optimization. (A PLT call stub is a dependent instruction sequence on
powerpc.)

Also, in some cases the libc functions have internal _libc_* symbols in
addition to the POSIX-defined symbol. In those cases the override in
libcpu.so would not be effective, and libc would continue to use the
unoptimized version of the function internally. This is also an issue
with libm.so.
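
A minimal sketch of why such an override is missed, assuming GCC or
clang on an ELF target. All names here (demo_strlen,
demo_internal_strlen, demo_count_two) are hypothetical and are not
glibc symbols. The library's own callers bind to a hidden alias of the
public function, so replacing the public symbol from another shared
object does not change what those internal callers run.

```c
#include <stddef.h>

/* Public entry point: the name another shared object could override. */
size_t demo_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Hidden alias for the same code (hypothetical name). References to it
   are resolved inside this object at link time, with no PLT stub. */
extern __typeof__(demo_strlen) demo_internal_strlen
    __attribute__((alias("demo_strlen"), visibility("hidden")));

/* An internal caller: it uses the hidden alias, so an override of the
   public demo_strlen elsewhere would not be picked up here. */
size_t demo_count_two(const char *a, const char *b)
{
    return demo_internal_strlen(a) + demo_internal_strlen(b);
}
```

Built into a shared library, demo_count_two keeps calling the original
code even if an application or a second library supplies its own
demo_strlen.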

These are all factors that must be considered, and the relative costs
will vary from platform to platform (chip to chip). It also has to be
considered in the larger context: the glibc code base is shared by many
different platforms. I personally have not committed optimizations for
powerpc32 to glibc cvs because I know that while they would help POWER
servers, they would hurt performance on the powerpc chips used in
desktop and embedded applications (10 pipelines vs 3). The powerpc-cpu
add-on and the cpu-tuned libraries (found by a search based on
AT_PLATFORM) were the only way I could think of to get the performance
I knew was there without harming the larger community.
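
For illustration only, here is a small sketch of an AT_PLATFORM-keyed
lookup. It is not the dynamic linker's actual code. It assumes Linux
with a glibc new enough to provide getauxval() in <sys/auxv.h>, which
is newer than this thread. The helper names, the "generic" fallback and
the "<base>/<platform>/<lib>" layout are assumptions made for the
example.

```c
#include <stdio.h>
#include <sys/auxv.h>   /* getauxval(), AT_PLATFORM */

/* Return the platform string the kernel passed in the auxiliary vector,
   or "generic" (a made-up fallback name) when the entry is absent. */
static const char *platform_name(void)
{
    unsigned long p = getauxval(AT_PLATFORM);
    return p ? (const char *)p : "generic";
}

/* Build "<base>/<platform>/<lib>" into out.
   Returns 0 on success, -1 if the result did not fit in n bytes. */
static int tuned_lib_path(char *out, size_t n, const char *base,
                          const char *platform, const char *lib)
{
    int len = snprintf(out, n, "%s/%s/%s", base, platform, lib);
    return (len < 0 || (size_t)len >= n) ? -1 : 0;
}
```

A caller would pass platform_name() as the platform argument and try
the resulting path first, falling back to the generic library in the
base directory when no tuned copy exists.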