Message-ID: <20180417120306.536sc3yzsndta5wb@quack2.suse.cz>
Date: Tue, 17 Apr 2018 14:03:06 +0200
From: Jan Kara <jack@...e.cz>
To: Amir Goldstein <amir73il@...il.com>
Cc: Greg KH <greg@...ah.com>, Dexuan Cui <decui@...rosoft.com>,
Jan Kara <jack@...e.cz>,
Guillaume Morin <guillaume@...infr.org>,
Haiyang Zhang <haiyangz@...rosoft.com>,
Pavlos Parissis <pavlos.parissis@...il.com>,
"stable@...r.kernel.org" <stable@...r.kernel.org>,
"jack@...e.com" <jack@...e.com>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"mszeredi@...hat.com" <mszeredi@...hat.com>
Subject: Re: kernel panics with 4.14.X versions

On Tue 17-04-18 14:48:35, Amir Goldstein wrote:
> On Tue, Apr 17, 2018 at 1:33 PM, Greg KH <greg@...ah.com> wrote:
> > On Mon, Apr 16, 2018 at 09:10:35PM +0000, Dexuan Cui wrote:
> >> > From: Jan Kara <jack@...e.cz>
> >> > Sent: Monday, April 16, 2018 07:41
> >> > ...
> >> > How easily can you hit this? Are you able to run debug kernels / inspect
> >> > crash dumps when the issue occurs? Also testing with the latest mainline
> >> > kernel (4.16) would be welcome whether this isn't just an issue with the
> >> > backport of fsnotify fixes from Miklos.
> >>
> >> It's not easy for us to reproduce the fsnotify() lockup issue, and actually
> >> we still don't have an easy & reliable way to reproduce it.
> >>
> >> According to our tests, v4.16 doesn't have the issue.
> >> And v4.15 doesn't have the issue either, if I recall correctly.
> >> I only know the issue happens to v4.14.x and 4.13.x kernels
> >
> > Any chance to run 'git bisect' between 4.14 and 4.15 to find the fix?
> >
>
> Looking at the changes between 4.14 and 4.15, that are not in 4.14.32,
> the only viable suspects are:
> 9cf90cef362d fsnotify: Protect bail out path of fsnotify_add_mark_locked()
> properly
> 3427ce715541 fsnotify: clean up fsnotify()
>
> Both don't claim to fix a known issue.

Yeah, and the second one is just a code refactoring, and I don't see how
the first fix could lead to anything like what's reported. So I don't think
picking these to 4.14 stable is really the right solution. We first need to
understand what's going wrong.

Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR