Message-ID: <46E6D660.16004.15789926@Ulrich.Windl.rkdvmks1.ngate.uni-regensburg.de>
Date:	Tue, 11 Sep 2007 17:54:38 +0200
From:	"Ulrich Windl" <ulrich.windl@...uni-regensburg.de>
To:	Eric Dumazet <dada1@...mosbay.com>
CC:	linux-kernel@...r.kernel.org
Subject: Re: Socket-related problem in x86_64 Kernel (2.6.16.53-0.8-smp)?

On 11 Sep 2007 at 15:01, Eric Dumazet wrote:

> On Tue, 11 Sep 2007 11:30:38 +0200
> "Ulrich Windl" <ulrich.windl@...uni-regensburg.de> wrote:
> 
> > Hi,
> > 
> > since upgrading from SLES9 SP3 to SLES10 SP1 I see kernel segfaults which seem 
> > network-related: Most notably slapd does not run any more, and my sendmail-milter 
> > based virus scanner terminates now and then with kernel segfault.
> > 
> > Current kernel from SLES10 SP1 is: 
> > 
> > # cat /proc/version
> > Linux version 2.6.16.53-0.8-smp (geeko@...ldhost) (gcc version 4.1.2 20070115 
> > (prerelease) (SUSE Linux)) #1 SMP Fri Aug 31 13:07:27 UTC 2007
> > 
> > The effects in syslog are:
> > Aug 31 15:04:40 kgate1 kernel: powersaved[10102]: segfault at 0000000000000008 rip 
> > 000000000042c17a rsp 00007fffea55de00 error 4
[...]
> Segfaults are logged to syslog only on 64-bit kernels.
> 
> Maybe your slapd/hscan processes are doing bad things that make them 
> core dump without notice on a 32-bit kernel.

A very wild guess: AFAIK recent SUSE distributions are "XENified", that is, they 
ship libraries that treat thread-local storage differently from the default. If 
these programs (powersaved, slapd, hscan) are all multithreaded, could the 
cause of the problem be in that area?

If not, any clues on debugging/tracing? There's a 
/usr/src/linux/Documentation/oops-tracing.txt, but no "segfault-tracing".

I also learned that the error code is only documented for i386 arch (thanks to 
Emacs ediff):
 * error_code:
 *      bit 0 == 0 means no page found, 1 means protection fault
 *      bit 1 == 0 means read, 1 means write
 *      bit 2 == 0 means kernel, 1 means user-mode

So the problem (error 4: no page found, read, user mode), together with the 
fault address 0000000000000008, looks like a read through a NULL pointer, 
right? And the "rip" is a user-space address, correct?
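
To make that reading reproducible, here is a minimal sketch that picks apart a 
segfault line like the one quoted above and decodes its error code. The bit 
meanings are copied from the i386 comment; that they also hold on x86_64 is an 
assumption. The regular expression and the function names are hypothetical, 
modelled only on the one syslog line in this thread (with its line wrap 
removed), and reading the numeric fields as hexadecimal is likewise assumed.

```python
import re

# Bit meanings copied from the i386 comment quoted above; assumed here to
# apply to x86_64 as well.
ERROR_BITS = (
    (0x1, "no page found", "protection fault"),
    (0x2, "read", "write"),
    (0x4, "kernel mode", "user mode"),
)

# Hypothetical pattern modelled on the single syslog line quoted in this
# thread; other kernel versions may format the message differently.
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_error_code(code):
    """Return one description per documented bit, lowest bit first."""
    return [if_set if code & mask else if_clear
            for mask, if_clear, if_set in ERROR_BITS]


def parse_segfault(line):
    """Extract the fields of a segfault line, or None if it does not match."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    info = {"comm": match.group("comm"), "pid": int(match.group("pid"))}
    # Assumption: all numeric fields after the PID are hexadecimal.
    for key in ("addr", "rip", "rsp", "err"):
        info[key] = int(match.group(key), 16)
    return info


if __name__ == "__main__":
    line = ("Aug 31 15:04:40 kgate1 kernel: powersaved[10102]: segfault at "
            "0000000000000008 rip 000000000042c17a rsp 00007fffea55de00 error 4")
    info = parse_segfault(line)
    print(info["comm"], hex(info["addr"]), decode_error_code(info["err"]))
```

For the powersaved line this yields "no page found", "read" and "user mode" 
with a fault address of 0x8, which would fit a read of a structure member at 
offset 8 through a NULL pointer.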

Regards,
Ulrich
