Date:   Tue, 17 Apr 2018 14:12:07 +0200
From:   Jan Kara <jack@...e.cz>
To:     Pavlos Parissis <pavlos.parissis@...il.com>
Cc:     Jan Kara <jack@...e.cz>, Guillaume Morin <guillaume@...infr.org>,
        stable@...r.kernel.org, decui@...rosoft.com, jack@...e.com,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        mszeredi@...hat.com
Subject: Re: kernel panics with 4.14.X versions

On Tue 17-04-18 01:31:24, Pavlos Parissis wrote:
> On 16/04/2018 04:40 PM, Jan Kara wrote:

<snip>

> > How easily can you hit this?
> 
> Very easily, I only need to wait 1-2 days for a crash to occur.

I wouldn't call that very easily, but opinions may differ :). Anyway, it's
good (at least for debugging) that it's reproducible.

> > Are you able to run debug kernels
> 
> Well, I was under the impression I do as I have:
>   grep -E 'DEBUG_KERNEL|DEBUG_INFO' /boot/config-4.14.32-1.el7.x86_64
>   CONFIG_DEBUG_INFO=y
>   # CONFIG_DEBUG_INFO_REDUCED is not set
>   # CONFIG_DEBUG_INFO_SPLIT is not set
>   # CONFIG_DEBUG_INFO_DWARF4 is not set
>   CONFIG_DEBUG_KERNEL=y
> 
> Do you think that my kernel doesn't produce a proper crash dump?
> I have a production cluster where I can run any kernel we need, so if I need
> to compile again with different settings I can certainly do that.

OK, good. So please try running 4.16 as you mention below to verify whether
this is just a -stable regression or also a problem in the current upstream
kernel. Based on your results with 4.16 I'll prepare a debug patch for you to
apply on top of 4.14.32 so that we can debug this further.

> > / inspect
> > crash dumps when the issue occurs?
> 
> I can't do that as the server isn't responsive and I can only power cycle it.

Well, kernel crash dumps work in that situation as well - when the kernel
panics, it will kexec into a new kernel and dump the memory of the old
kernel to disk. The dump can then be investigated with the 'crash' utility.
But since you evidently don't have this set up and haven't used it before,
let's take the standard 'debug patch' route.
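
For reference, the crash dump mechanism described above depends on a few
preconditions that can be checked from a shell. The following is only a
minimal sketch, not a tested kdump setup: the function names are made up for
illustration, and it assumes the kernel config is installed as
/boot/config-$(uname -r), as in the grep output quoted earlier.

```shell
#!/bin/sh
# Sketch: check the usual preconditions for kexec-based crash dumps.
# All function names below are illustrative, not part of any tool.

# Succeeds if the given kernel command line reserves memory for a
# crash kernel via a crashkernel= parameter.
has_crashkernel() {
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds if option $2 (e.g. CONFIG_KEXEC) is set to "y" in the
# kernel config file $1.
config_enabled() {
    grep -q "^$2=y\$" "$1"
}

# Print a short report for the running system.
kdump_report() {
    cfg="/boot/config-$(uname -r)"   # assumed config location
    for opt in CONFIG_KEXEC CONFIG_CRASH_DUMP CONFIG_PROC_VMCORE; do
        if [ -r "$cfg" ] && config_enabled "$cfg" "$opt"; then
            echo "$opt: enabled"
        else
            echo "$opt: not enabled (or $cfg not readable)"
        fi
    done

    if has_crashkernel "$(cat /proc/cmdline)"; then
        echo "crashkernel= is present on the kernel command line"
    else
        echo "crashkernel= is missing: no memory reserved for the dump kernel"
    fi

    # A value of 1 means a crash kernel is loaded and will be booted
    # via kexec when the running kernel panics.
    if [ -r /sys/kernel/kexec_crash_loaded ]; then
        echo "kexec_crash_loaded: $(cat /sys/kernel/kexec_crash_loaded)"
    fi
}
```

Sourcing the file and running kdump_report prints one line per check. Once a
vmcore has actually been written, the 'crash' utility is normally given a
vmlinux with debug info (which is where CONFIG_DEBUG_INFO=y helps) together
with the dump file; where those files live depends on the distribution and on
how the kernel was packaged.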

> > Also, testing with the latest mainline kernel (4.16) would be welcome,
> > to see whether this is just an issue with the backport of fsnotify
> > fixes from Miklos.
> 
> I can try the kernel-ml-4.16.2 from elrepo (we use CentOS 7).

Yes, that would be good.

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
