Message-ID: <lywt6cc04g.fsf_-_@ensc-pc.intern.sigma-chemnitz.de>
Date: Fri, 03 Nov 2006 11:09:35 +0100
From: Enrico Scholz <enrico.scholz@...ma-chemnitz.de>
To: linux-arm-kernel@...ts.arm.linux.org.uk
Cc: linux-kernel@...r.kernel.org, rpurdie@...ys.net
Subject: [ARM] Corrupted .got section with 2.6.18 and JFFS2 (solved)
[CC lkml; original issue at
http://article.gmane.org/gmane.linux.ports.arm.kernel/28068]
rpurdie@...ys.net (Richard Purdie) writes:
>> > I have a problem with JFFS2 filesystem and kernel 2.6.18. When
>> > starting a program which uses a certain library (libutil.so.1 in
>> > my case), the .got section of the library can be initialized
>> > wrongly when the used memory is uninitialized.
>>
>> Problem seems to be caused by
>>
>> | [PATCH] zlib_inflate: Upgrade library code to a recent version
>>
>> (4f3865fb57a04db7cca068fed1c15badc064a302)
>>
>> After reverting this (and related patches), things seem to work.
>>
>> I don't have an idea yet, which changes in this complex patch are
>> really responsible....
>
> I'm the author of the above change. I just ran your test program
> on a device (ARM PXA255 with 2.6.19-rc4 kernel, 2.3.5ish glibc,
> gcc 3.4.4, libraries on jffs2) and I can't reproduce the
> problem.
I can reproduce it 100% with:

  $ git checkout -b test v2.6.17.14
  $ git-am -3 tmp/000[1-8]*
    (see https://www.cvg.de/people/ensc/libutil/ for patches and
    used .config (config.txt); the physmap patches are from 2.6.18)
  $ make tftp
  --> 'fillmem ; test' sequences work without errors

  $ git-cherry-pick 4f3865fb57a04db7cca068fed1c15badc064a302
  $ make tftp
  --> 'fillmem ; test' sequences stop with a segfault

I compiled the kernel with both gcc-3.4.6 and gcc-4.1.1 and got the
same results.
The same applies to a recent 2.6.18.1 kernel: it segfaults as is, and
works after reverting all patches which modified lib/zlib_*.
I see segfaults with 2.6.19-rc4 too, but have not yet checked
whether removing the zlib patch fixes them.
Things get even stranger with the glibc-2.5 dynamic loader; three
identical invocations give three different outcomes:
| # ... copying ld-2.5.so and libc-2.5.so ...
| # LD_LIBRARY_PATH=`pwd` ./ld-2.5.so /bin/test2
| Inconsistency detected by ld.so: dynamic-link.h: 169: elf_get_dynamic_info: Assertion `info[19]->d_un.d_val == sizeof (Elf32_Rel)' failed!
| # LD_LIBRARY_PATH=`pwd` ./ld-2.5.so /bin/test2
| Segmentation fault
| # LD_LIBRARY_PATH=`pwd` ./ld-2.5.so /bin/test2
| #
> You mentioned elsewhere that reading the lib from flash gives
> consistent md5sums. There is only one inflation code path and
> if the md5sum is always consistent, I can't see how the
> inflation code is at fault. I therefore strongly suspect this
> is some userspace issue when handling the got.
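
For reference, the md5sum check mentioned above can be repeated with a
small script. This is only a sketch, assuming a Python interpreter is
available on the target; the function names are made up for
illustration, and MD5 is used simply because md5sum was used for the
original check. Note that repeated reads are normally served from the
page cache, so the JFFS2 inflate path is only exercised again if the
cache is dropped between rounds (e.g. via /proc/sys/vm/drop_caches,
available since 2.6.16) or the filesystem is remounted.

```python
import hashlib


def file_digest(path, chunk_size=64 * 1024):
    """Return the MD5 hex digest of the file at `path`, read in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops at the first empty read (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def distinct_digests(path, rounds=10):
    """Read `path` `rounds` times and return the set of digests seen.

    A single-element set means every read returned identical bytes.
    """
    return {file_digest(path) for _ in range(rounds)}
```

A result with more than one element would point at the read/inflate
path; a single element only shows that read() is consistent, not that
the pages mapped by the dynamic loader are.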
Issue:
* seems to be triggered by the zlib kernel patch
* seems to be triggered by my 'libutil.so' (I cannot see it with
other libraries)
* can be reproduced on two different PXA270 platforms (same
userspace, but different module vendors and different memory
timing setups)
I see the following possible causes:
* new zlib code has side effects (overflows?)
* new zlib code is so fast that it triggers a race somewhere else
* libutil.so's .init section is buggy (likely, but why does the
error not occur when libutil.so is on tmpfs or NFS?)
* new zlib code requires more/less memory bandwidth and changes the
power consumption of CPU/memory, which causes random errors
(unlikely, because only the same part of the .got table is affected
and it happens on two different platforms)
* some DCACHE issue
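
To check the observation that only the same part of the .got table is
affected, dumps of the section from failing runs could be compared
against a known-good copy. A minimal sketch, assuming both dumps are
already available as equal-length byte strings (how they are obtained
is not shown here, and `differing_ranges` is a hypothetical helper
name):

```python
def differing_ranges(expected, actual):
    """Return [(start, end), ...] half-open byte ranges where the buffers differ.

    Both arguments are bytes-like and must have the same length.
    """
    if len(expected) != len(actual):
        raise ValueError("buffers must have the same length")
    ranges = []
    start = None  # start offset of the current run of differing bytes
    for i, (e, a) in enumerate(zip(expected, actual)):
        if e != a:
            if start is None:
                start = i
        elif start is not None:
            # The run of differing bytes ended just before offset i.
            ranges.append((start, i))
            start = None
    if start is not None:
        # The buffers still differ at the very end.
        ranges.append((start, len(expected)))
    return ranges
```

If the reported ranges are identical across runs and across both
platforms, that argues against random memory errors and for a
deterministic cause.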
> Which other related patches did you remove?
For 2.6.18 tests, I reverted only the patches which changed
lib/zlib_* after 2.6.17:
| 31925c8857ba17c11129b766a980ff7c87780301 [PATCH] Fix ppc32 zImage inflate
| b762450e84e20a179ee5993b065caaad99a65fbf [PATCH] zlib inflate: fix function definitions
| 0ecbf4b5fc38479ba29149455d56c11a23b131c0 move acknowledgment for Mark Adler to CREDITS
| 4f3865fb57a04db7cca068fed1c15badc064a302 [PATCH] zlib_inflate: Upgrade library code to a recent version
Enrico