
[patches] Re: Possible PowerPC LIBC optimization

To: Mark Hatle <mark.hatle@xxxxxxxxxxxxx>
Subject: [patches] Re: Possible PowerPC LIBC optimization
From: "Conn Clark" <clark@xxxxxxxxxx>
Date: Tue, 15 May 2007 12:55:59 -0700

There are some further optimizations I'm playing with. I have added one more dcbt to the public_free function. I am also playing around with reorganizing the mp_ data structure to take into account cache filling.

Currently I am having trouble benchmarking my changes by doing a glibc build. I cannot get consistent results right now. I know that disk access is affecting the results.

I have written the following program to flush the L1 data cache of any glibc data and call malloc and free with a repeatable random sequence of sizes. It is biased towards smaller memory allocations.

Please suggest any improvements to my test.


#include <stdio.h>
#include <stdlib.h>

/* for flushing mpc750 32k datacache */

int data_cache_flush_array[8193];


/*
 * Read and/or write to each cache line in the array.
 * Call in alternating directions to keep as much data
 * in the cache between malloc/free calls via the LRU
 * discard policy.
 * The stride of 8 ints is 32 bytes, one cache line on the 750.
 */
int thrash_the_cache(int up_down)
{
  int i;

  if (up_down) {
    i = 0;
    do {
      data_cache_flush_array[i*8] = data_cache_flush_array[(i+1)*8];
    } while (i++ < 1022);
  } else {
    i = 1023;
    do {
      data_cache_flush_array[(i+1)*8] = data_cache_flush_array[i*8];
    } while (i-- > 0);
  }
  return i;
}

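int main(void)
{
  int x, y, z;
  char *mem;
  int x1, y1, z1;
  char *mem1;
  int x2, y2, z2;
  char *mem2;
  int x3, y3, z3;
  char *mem3;

  /* initialize the flush array */
  z = 8192;
  do {
    data_cache_flush_array[z] = z;
  } while (z--);

  /* fixed seed, so the sequence of sizes repeats on every run */
  srandom(7);

  /* upper bounds on the allocation size at each nesting level */
  x = 65535;
  x1 = 512;
  x2 = 4096;
  x3 = 256;

  /*
   * Iteration counts. Each inner counter is reset before its loop is
   * entered; otherwise it would already be 0 on the second pass and
   * count down through the negative numbers. With these values the
   * innermost body runs 100 * 1000 * 1000 * 100 = 10^10 times, so the
   * counts may need to be reduced for a practical run.
   */
  y = 100;
  do {
    thrash_the_cache(1);
    z = random() % x;
    mem = malloc(z);
    if (mem && z) mem[z-1] = 0x3f;

    y1 = 1000;
    do {
      thrash_the_cache(0);
      z1 = random() % x1;
      mem1 = malloc(z1);
      if (mem1 && z1) mem1[z1-1] = 0x3f;

      y2 = 1000;
      do {
        thrash_the_cache(1);
        z2 = random() % x2;   /* was "% x", which left x2 unused */
        mem2 = malloc(z2);
        if (mem2 && z2) mem2[z2-1] = 0x3f;

        y3 = 100;
        do {
          thrash_the_cache(0);
          z3 = random() % x3;
          mem3 = malloc(z3);
          if (mem3 && z3) mem3[z3-1] = 0x3f;
          thrash_the_cache(1);
          free(mem3);
        } while (--y3);

        thrash_the_cache(0);
        free(mem2);
      } while (--y2);

      thrash_the_cache(1);
      free(mem1);
    } while (--y1);

    thrash_the_cache(0);
    free(mem);
  } while (--y);

  return 0;
}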
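
The program above does not measure anything itself, so the run has to be timed from outside. One possible improvement, sketched here only as a suggestion, is to time the malloc/free pairs inside the process with a monotonic clock, which keeps program start-up and disk activity out of the number. The helper names now_ns and time_malloc_free are made up for this sketch, and the loop is deliberately simpler than the nested test above.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600   /* for random(), srandom() and clock_gettime() */
#endif
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: monotonic time in nanoseconds.
   On older glibc releases clock_gettime() needs -lrt at link time. */
static long long now_ns(void)
{
  struct timespec ts;
  int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

  assert(rc == 0);
  (void)rc;
  return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical helper: run `iterations` malloc/free pairs with sizes in
   1..max_size drawn from a fixed-seed sequence, and return the elapsed
   time in nanoseconds. */
long long time_malloc_free(unsigned seed, int iterations, int max_size)
{
  long long start, end;
  int i;

  srandom(seed);
  start = now_ns();
  for (i = 0; i < iterations; i++) {
    size_t size = (size_t)(random() % max_size) + 1;
    char *mem = malloc(size);

    if (mem != NULL)
      mem[size - 1] = 0x3f;   /* touch the block, as the test above does */
    free(mem);
  }
  end = now_ns();
  return end - start;
}
```

Calling time_malloc_free(7, 100000, 256) several times and comparing the smallest result before and after a glibc change is one way to reduce run-to-run noise; the cache thrashing calls could be added inside the loop if cold-cache behaviour is what is being measured.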

Mark Hatle writes:

In the PowerPC community Conn Clark has been doing some interesting
optimization work in glibc.  Much of it, however, doesn't seem to be
acceptable to the mainline glibc due to being very processor and
architecture specific.

The following information describes a simple change that made a large
performance improvement on the PPC 750 processor, and that is believed
to make similar improvements on other PowerPC processors that have the
dcbt instruction.

From Conn Clark:

To see where I made the changes just search for "dcbt". The first two
dcbt's in the functions _int_malloc and _int_free are the ones that
make the biggest difference. The rest seem to help but they fall
within the noise margin of my test (a compile of glibc).
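
The patch itself is not reproduced in this message, so the following is not Conn's code. It is only a minimal sketch of the general idea: issue a cache-touch hint for data the allocator is about to read, some instructions before the read happens. It uses GCC's __builtin_prefetch, which GCC normally turns into a dcbt on PowerPC for a read hint. The free_node type and take_first_fit function are invented for the illustration and are much simpler than glibc's real chunk handling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-list node; glibc's malloc_chunk is more involved. */
struct free_node {
  struct free_node *next;
  size_t size;
};

/* Unlink and return the first node with size >= want, or NULL if none fits.
   While one node is being examined, the next one is hinted into the cache. */
struct free_node *take_first_fit(struct free_node **head, size_t want)
{
  struct free_node **link = head;

  assert(head != NULL);
  while (*link != NULL) {
    struct free_node *node = *link;

    if (node->next != NULL)
      __builtin_prefetch(node->next, 0, 3);   /* read hint, keep in cache */
    if (node->size >= want) {
      *link = node->next;   /* unlink the node from the list */
      node->next = NULL;
      return node;
    }
    link = &node->next;
  }
  return NULL;
}
```

The hint does not change what the function returns; it only affects timing, which is why its benefit has to be measured on the target processor.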


Attached is the patch that Clark sent me against glibc-2.5.  I don't
think it is directly applicable as stated to glibc; however, the idea
behind it appears to be sound.

There are points in malloc/free where preloading the cache (at least
on PPC) makes sense.  So adding hooks in these locations may allow us to
configure in processor-specific items that could dramatically improve
the performance on various processors.
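
One way such a hook could look, purely as a sketch: a macro that expands to a dcbt on PowerPC and to nothing elsewhere, so generic code can call it unconditionally. MALLOC_PREFETCH is a hypothetical name, not an existing glibc macro, and the arena_stats structure and note_free function are stand-ins invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-architecture hook (not part of glibc). */
#if defined(__powerpc__) || defined(__ppc__)
# define MALLOC_PREFETCH(p) __asm__ __volatile__ ("dcbt 0,%0" : : "r" (p))
#else
# define MALLOC_PREFETCH(p) ((void) (p))   /* no-op on other processors */
#endif

/* Hypothetical bookkeeping structure standing in for allocator state. */
struct arena_stats {
  size_t frees;
  size_t bytes_freed;
};

/* Record one free. In real code the hint would be issued as early as
   possible, with unrelated work between it and the first access. */
void note_free(struct arena_stats *stats, size_t size)
{
  assert(stats != NULL);
  MALLOC_PREFETCH(stats);
  stats->frees += 1;
  stats->bytes_freed += size;
}
```

Because the macro has no effect on the result, an architecture that gains nothing from the hint can leave it defined as a no-op.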

--Mark



Conn
---------------------------------------

Conn Clark

Electronic Systems Technology
415 N. Quay Street Building B1     (509)-735-9092 ext 117

Kennewick, WA. 99336

Observation: In formal computer science advances are made
by standing on the shoulders of giants. Linux has proved
that if there are enough of you, you can advance just as
far by standing on each others toes.
