[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[patches] fix ev500v2 build failure
- To: patches@xxxxxxxxxx
- Subject: [patches] fix ev500v2 build failure
- From: Nathan Froyd <froydnj@xxxxxxxxxxxxxxxx>
- Date: Mon, 12 Mar 2007 07:22:52 -0700
The attached patch fixes a build failure with recent versions of GCC
when compiling eglibc for the PPC e500v2. The failure mode is that when
compiling:
/* Increment the counter. */
atomic_increment (&fromp->here->count);
from elf/dl-profile.c, which expands to:
__asm__ __volatile__("1: lwarx %0,0,%2\n"
" addi %0,%0,1\n"
" stwcx. %0,0,%2\n"
" bne- 1b"
:"=&b" __val, "=m" *&fromp->here->count
:"b" &fromp->here->count, "m" *&fromp->here->count
:"memory", "cr0");
GCC does not consider the output expression '*&fromp->here->count' to be
properly aligned (the e500 is a STRICT_ALIGNMENT target) and generates a
byte-wise load for the expression. This places it in a register, rather
than in memory, as the constraint requires, which then causes an ICE.
'fromp' is a 'struct here_fromstruct':
struct here_fromstruct
{
struct here_cg_arc_record volatile *here;
uint16_t link;
};
'struct here_cg_arc_record' is defined as follows:
struct here_cg_arc_record
{
uintptr_t from_pc;
uintptr_t self_pc;
uint32_t count;
} __attribute__ ((packed));
The fix is to declare the 'count' member as being properly aligned
despite the 'packed' attribute declared on the containing structure:
struct here_cg_arc_record
{
uintptr_t from_pc;
uintptr_t self_pc;
uint32_t count __attribute__((aligned(__alignof__(uint32_t))));
} __attribute__ ((packed));
Otherwise GCC believes that 'count' could fall on a non-aligned address
and therefore exhibit the failure described above. However, we need to
show that 'count' will always fall on a 'uint32_t' boundary.
Working backwards, the expected size of the data structure 'struct
here_cg_arc_record' resides in is:
expected_size = (sizeof (struct gmon_hdr)
+ 4 + sizeof (struct gmon_hist_hdr) + kcountsize
+ 4 + 4 + fromssize * sizeof (struct here_cg_arc_record));
Everything in this expression is a multiple of four, except for
'kcountsize'. If it weren't for the 'kcountsize' in there, then the
members of 'struct here_cg_arc_record' would always be at least
four-byte aligned and we would have no issues. The question, then, is
whether 'kcountsize' is a multiple of four.
Looking at the source code, 'kcountsize' is determined by the following
bits:
lowpc = ROUNDDOWN (mapstart + GL(dl_profile_map)->l_addr,
HISTFRACTION * sizeof (HISTCOUNTER));
highpc = ROUNDUP (mapend + GL(dl_profile_map)->l_addr,
HISTFRACTION * sizeof (HISTCOUNTER));
textsize = highpc - lowpc;
kcountsize = textsize / HISTFRACTION;
'mapstart' and 'mapend' are page-aligned values, so they will be
multiples of four in any reasonable operating environment. HISTFRACTION
is 2 and HISTCOUNTER is 'unsigned short', so 'HISTFRACTION * sizeof
(HISTCOUNTER)' will be 4. So 'lowpc' and 'highpc' will be four-byte
aligned, as will 'textsize'.
The problem, then, is that 'textsize' is divided by 2, giving
'kcountsize' only two-byte alignment. However, if we assume that
'GL(dl_profile_map)->l_addr' is *also* page-aligned and that
page-alignment implies at least eight-byte alignment, then 'lowpc' and
'highpc' also have at least eight-byte alignment. It follows, then,
that kcountsize has at least four-byte alignment. Which means that the
bits of 'struct here_arc_cg_record' always wind up on at least four-byte
boundaries.
And so we have shown that placing an '__attribute__ ((aligned
(__alignof__(uint32_t))))' on the 'count' member in 'struct
here_cg_arc_record' only serves to inform the compiler of how things
will look at runtime so the compiler can generate proper code.
OK to commit?
-Nathan
2007-03-12 Nathan Froyd <froydnj@xxxxxxxxxxxxxxxx>
* elf/dl-profile.c (struct here_cg_arc_record): Declare 'count'
as being properly aligned.
Index: elf/dl-profile.c
===================================================================
--- elf/dl-profile.c (revision 165331)
+++ elf/dl-profile.c (working copy)
@@ -131,7 +131,7 @@ struct here_cg_arc_record
{
uintptr_t from_pc;
uintptr_t self_pc;
- uint32_t count;
+ uint32_t count __attribute__((aligned(__alignof__(uint32_t))));
} __attribute__ ((packed));
static struct here_cg_arc_record *data;