linux-ext4 - Re: Call trace in ext4_es_lru

Message-ID: <20140923094204.GB2359@quack.suse.cz>
Date:	Tue, 23 Sep 2014 11:42:04 +0200
From:	Jan Kara <jack@...e.cz>
To:	Stefan Priebe - Profihost AG <s.priebe@...fihost.ag>
Cc:	Theodore Ts'o <tytso@....edu>, linux-ext4@...r.kernel.org,
	"p.herz@...fihost.ag >> Philipp Herz - Profihost AG" 
	<p.herz@...fihost.ag>, stable@...r.kernel.org
Subject: Re: Call trace in ext4_es_lru_add on 3.10 stable

On Tue 23-09-14 09:50:25, Stefan Priebe - Profihost AG wrote:
> 
> On 22.09.2014 at 22:20, Theodore Ts'o wrote:
> > On Mon, Sep 22, 2014 at 08:29:54PM +0200, Stefan Priebe wrote:
> >> Hi,
> >> On 22.09.2014 at 18:47, Theodore Ts'o wrote:
> >>> On Mon, Sep 22, 2014 at 08:56:23AM +0200, Stefan Priebe wrote:
> >>>>> That's not the whole message; you just weren't able to capture it all.
> >>>>> How are you capturing these messages, by the way?  Serial console?
> >>>>
> >>>> Sorry this was an incomplete copy and paste by me.
> >>>>
> >>>> Here is the complete output:
> >>>> [1578544.839610] BUG: soft lockup - CPU#7 stuck for 22s! [mysqld:29281]
> >>>> [1578544.893450] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4
> >>>
> >>> OK, thanks, this is a known bug, where when ext4 is under heavy memory
> >>> pressure, we can end up stalling in reclaim.  This message indicates
> >>> that the system got stalled for 22 seconds, which is not good, since
> >>> it impacts the interactivity of your system, and increases the
> >>> long-tail latency of requests to servers running on your system, but
> >>> it doesn't cause any data loss, nor will it cause any of your processes
> >>> to crash or otherwise stop functioning (except temporarily).
> >>>
> >>> It's something that we are working on, and there are patches which
> >>> Zheng Liu submitted that still need a bit of polishing, but I hope to
> >>> have it addressed soon.
> >>
> >> Thanks for your feedback. Will those patches go to stable? Any link to
> >> those patches?
> > 
> > I'm not sure they will go to Stable when they are ready, because the
> > patches are somewhat complex and so they may not apply cleanly to much
> > older kernels.
> > 
> > The patches under discussion (some have been applied, others have been
> > waiting for some requested changes) can be found here:
> > 
> > http://patchwork.ozlabs.org/patch/377720
> > http://patchwork.ozlabs.org/patch/377721
> > http://patchwork.ozlabs.org/patch/377722
> > http://patchwork.ozlabs.org/patch/377723
> > http://patchwork.ozlabs.org/patch/377724
> > http://patchwork.ozlabs.org/patch/377725
> > http://patchwork.ozlabs.org/patch/377727
> 
> Whoa, that's a lot. Are they ALL needed to fix this?
  Yes, all of them are needed.

> No workaround possible?
  I don't know of any.

> What will Red Hat do with their 3.10 RHEL 7 kernel?
  Well, I cannot speak for RH guys but for SLES if there's a customer
request, we'll just go and backport the patches...
								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR