Re: [patches] powerpc 8xx dcbz problem
- To: Nathan Sidwell <nathan@xxxxxxxxxxxxxxxx>
- Subject: Re: [patches] powerpc 8xx dcbz problem
- From: Steven Munroe <munroesj@xxxxxxxxxx>
- Date: Mon, 04 Jun 2007 13:04:45 -0500
Nathan Sidwell wrote:
> Steven Munroe wrote:
>
>> add the --with-cpu support as a way to manage this. You all might be
>> very annoyed with me, if I had 970/power4/power5/power5+/power6/power6x
>> specific code cluttering up performance sensitive common code!
>
> This patch is not cluttering up performance sensitive common code.
> memset already has a check for whether the cache line size is zero.
> Please don't construct straw men :)
>
Memset is performance sensitive code and the dynamic __cache_line_size
check is slowing down 970/power4/power5/power5+/power6/power6x. When the
processor can retire (up to) 5 instructions per cycle, the dependent
sequence to address the GOT and check the __cache_line_size is very
noticeable.
But I wrote the original __cache_line_size patch and negotiated its
acceptance into glibc to support the larger community. The cpu specific
optimizations that I need have to be handled differently to avoid
negatively impacting the larger community. Thus the --with-cpu mechanism.
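
To illustrate the check under discussion, here is a minimal C sketch (not
glibc's actual PowerPC assembly) of a memset that consults a run-time cache
line size and skips the block-zeroing path when that value is zero. The names
cache_line_size_sketch, zero_line_sketch and memset_sketch are hypothetical,
and the helper only stands in for a dcbz instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for __cache_line_size; 0 means "unknown, avoid dcbz". */
size_t cache_line_size_sketch = 0;

/* Hypothetical stand-in for dcbz: zero one whole, aligned cache line. */
void zero_line_sketch(unsigned char *p, size_t line)
{
    for (size_t i = 0; i < line; i++)
        p[i] = 0;
}

void *memset_sketch(void *dst, int c, size_t n)
{
    unsigned char *p = dst;
    size_t line = cache_line_size_sketch;   /* the dynamic load and test */

    /* Use the line-zeroing path only for zero fills with a known line size. */
    if (c == 0 && line != 0 && n >= 2 * line) {
        /* Byte stores until p reaches a cache line boundary. */
        while ((uintptr_t) p % line != 0) {
            *p++ = 0;
            n--;
        }
        /* Clear whole lines at a time. */
        while (n >= line) {
            zero_line_sketch(p, line);
            p += line;
            n -= line;
        }
    }

    /* Tail bytes, and the whole buffer when the fast path was skipped. */
    while (n-- > 0)
        *p++ = (unsigned char) c;
    return dst;
}
```

With the size left at zero, every call takes the plain store loop, which is
the behaviour a zeroed __cache_line_size is meant to select.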
> There is no hard and fast boundary where --with-cpu is the appropriate
> thing and where a dynamic check is the appropriate thing. For
> instance, it would be technically feasible to hard code the cache-line
> size from a --with-cpu value, but that's not the approach that has
> been taken. The cache_line size is being set dynamically, and for
> 8xx CPUs it's being set to the wrong value (due to a bug in the 8xx).
> IMHO, given that the line-size check is dynamic, the right thing to do
> is dynamically check for 8xx.
>
This is chip specific and 32-bit specific so does not belong in the
trunk. I think this should be a hard and fast rule. If you insist on the
dynamic approach then copy libc-start.c to
ports/sysdeps/unix/sysv/linux/powerpc/powerpc32/ and add the 8xx
specific hack there to zero the __cache_line_size.
No other changes are necessary.
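
A rough C sketch of what such a startup hack might look like follows. It
assumes the startup code picks the data cache block size out of the auxiliary
vector, and that some 8xx test is available. The struct, the function name and
the cpu_is_8xx flag are all hypothetical; how the 8xx is actually detected is
not shown here.

```c
#include <stddef.h>

/* Local copies of the auxv tags so the sketch is self-contained;
   19 matches AT_DCACHEBSIZE in Linux <elf.h>. */
#define SKETCH_AT_NULL         0
#define SKETCH_AT_DCACHEBSIZE 19

/* Hypothetical simplified auxiliary vector entry. */
struct auxv_entry_sketch {
    unsigned long type;
    unsigned long val;
};

/* Returns the value the startup code would store in __cache_line_size.
   cpu_is_8xx is an assumed, externally supplied flag. */
int cache_line_size_from_auxv_sketch(const struct auxv_entry_sketch *av,
                                     int cpu_is_8xx)
{
    int size = 0;   /* 0 means "unknown": memset must not use dcbz */

    for (; av->type != SKETCH_AT_NULL; ++av)
        if (av->type == SKETCH_AT_DCACHEBSIZE)
            size = (int) av->val;

    /* The 8xx-specific hack: discard the reported value. */
    if (cpu_is_8xx)
        size = 0;

    return size;
}
```

Keeping the override in the powerpc32 startup file means memset itself needs
no 8xx knowledge; it only ever sees a zero line size.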
> That said, if you still disagree, I'll find another solution.
>
> nathan
>