linux-kernel - Re: Socket-related problem in x86


Message-ID: <46E6D660.16004.15789926@Ulrich.Windl.rkdvmks1.ngate.uni-regensburg.de>
Date:	Tue, 11 Sep 2007 17:54:38 +0200
From:	"Ulrich Windl" <ulrich.windl@...uni-regensburg.de>
To:	Eric Dumazet <dada1@...mosbay.com>
CC:	linux-kernel@...r.kernel.org
Subject: Re: Socket-related problem in x86_64 Kernel (2.6.16.53-0.8-smp)?

On 11 Sep 2007 at 15:01, Eric Dumazet wrote:

> On Tue, 11 Sep 2007 11:30:38 +0200
> "Ulrich Windl" <ulrich.windl@...uni-regensburg.de> wrote:
> 
> > Hi,
> > 
> > since upgrading from SLES9 SP3 to SLES10 SP1 I see kernel segfaults which seem 
> > network-related: Most notably slapd does not run any more, and my sendmail-milter 
> > based virus scanner terminates now and then with kernel segfault.
> > 
> > Current kernel from SLES10 SP1 is:
> > 
> > # cat /proc/version
> > Linux version 2.6.16.53-0.8-smp (geeko@...ldhost) (gcc version 4.1.2 20070115 
> > (prerelease) (SUSE Linux)) #1 SMP Fri Aug 31 13:07:27 UTC 2007
> > 
> > The effects in syslog are:
> > Aug 31 15:04:40 kgate1 kernel: powersaved[10102]: segfault at 0000000000000008 rip 
> > 000000000042c17a rsp 00007fffea55de00 error 4
[...]
> Segfaults are written to syslog only on 64-bit kernels.
> 
> Maybe your slapd/hscan processes are doing bad things that make them
> core dump without notice on a 32-bit kernel.

A very wild guess: AFAIK SUSE distributions have recently been "XENified", that
is, they ship libraries that treat thread-local storage differently from the
default. If these programs (powersaved, slapd, hscan) are all multithreaded,
could the cause of the problem lie in that area?

If not, are there any hints on how to debug or trace this? There is a
/usr/src/linux/Documentation/oops-tracing.txt, but no "segfault-tracing"
counterpart.

I also learned (thanks to Emacs ediff) that the error code is documented only
in the i386 arch code:
 * error_code:
 *      bit 0 == 0 means no page found, 1 means protection fault
 *      bit 1 == 0 means read, 1 means write
 *      bit 2 == 0 means kernel, 1 means user-mode

So the problem (error 4: no page found, read, user-mode), together with the
fault address 0000000000000008, looks like a read through a NULL pointer at a
small offset, right? And the "rip" is a user-space address, correct?
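
To make the decoding above concrete, here is a minimal sketch in Python. It
assumes that the three i386 bits quoted above apply unchanged on x86_64 and
that the kernel prints the error code in hexadecimal; neither is confirmed in
this thread. The names decode_error_code and parse_segfault are hypothetical
helpers, and the sample line is the powersaved entry from the syslog excerpt,
joined onto one line.

```python
import re

def decode_error_code(code):
    """Decode the low three bits as described in the i386 comment."""
    return {
        "cause": "protection fault" if code & 1 else "no page found",
        "access": "write" if code & 2 else "read",
        "mode": "user-mode" if code & 4 else "kernel",
    }

# Pattern for the "segfault at ... rip ... rsp ... error ..." line format
# shown in the syslog excerpt (assumed to be on a single line).
SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)

def parse_segfault(line):
    """Return the fields of a segfault log line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    return {
        "prog": m.group("prog"),
        "pid": int(m.group("pid")),
        "addr": int(m.group("addr"), 16),
        "rip": int(m.group("rip"), 16),
        "rsp": int(m.group("rsp"), 16),
        # Assumption: the error code is printed in hex.
        "error": decode_error_code(int(m.group("err"), 16)),
    }

line = ("Aug 31 15:04:40 kgate1 kernel: powersaved[10102]: segfault at "
        "0000000000000008 rip 000000000042c17a rsp 00007fffea55de00 error 4")
info = parse_segfault(line)
print(info["prog"], hex(info["addr"]), info["error"])
```

For the line above, the decoded fields are a user-mode read of an unmapped
page at address 0x8, which fits a load from a small offset off a NULL pointer.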

Regards,
Ulrich
