linux-kernel - Re: [PATCH] fs: avoid locking sb_lock in grab_super


Message-ID: <CALYGNiPCndnuJpfROdeP=a2cWofG5R1nXPPRegc8UYL=Jc1qZA@mail.gmail.com>
Date:	Tue, 24 Feb 2015 13:19:47 +0400
From:	Konstantin Khlebnikov <koct9i@...il.com>
To:	Al Viro <viro@...iv.linux.org.uk>
Cc:	Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH] fs: avoid locking sb_lock in grab_super_passive()

On Sat, Feb 21, 2015 at 5:37 AM, Al Viro <viro@...iv.linux.org.uk> wrote:
> On Thu, Feb 19, 2015 at 08:19:35PM +0300, Konstantin Khlebnikov wrote:
>> I've noticed significant locking contention in memory reclaimer around
>> sb_lock inside grab_super_passive(). Grab_super_passive() is called from
>> two places: in icache/dcache shrinkers (function super_cache_scan) and
>> from writeback (function __writeback_inodes_wb). Both are required for
>> progress in memory reclaimer.
>>
>> Also this lock isn't irq-safe. And I've seen suspicious livelock under
>> serious memory pressure where reclaimer was called from interrupt which
>> have happened right in place where sb_lock is held in normal context,
>> so all other cpus were stuck on that lock too.
>
> Excuse me, but this part is BS - its call is immediately preceded by
>         if (!(sc->gfp_mask & __GFP_FS))
>                 return SHRINK_STOP;
> and if we *ever* hit GFP_FS allocation from interrupt, we are really
> screwed.  If nothing else, both prune_dcache_sb() and prune_icache_sb()
> can wait for all kinds of IO; you really don't want that called in an
> interrupt context.  The same goes for writeback_sb_inodes(), while we
> are at it.
>
> If you ever see that in an interrupt context, you have a very bad problem
> on hands.
>
> Said that, not bothering with sb_lock (and ->s_count) in those two callers
> makes sense.  Applied, with name changed to trylock_super().

Ok, thanks. I'll pull this into our kernel and try to catch the livelock again.

It seems sb_lock became the hottest lock by accident: the system has no swap
and all of the page cache was gone, so all cpus were stuck reclaiming inodes
and dentries. For some reason the OOM killer wasn't invoked for an hour or so.

The part about the reclaimer being called from interrupt context was BS for
sure: I mixed up some stacks from that 30MB log of the kernel's suffering.
