Date:   Thu, 16 Nov 2017 09:56:22 +0900
From:   Minchan Kim <minchan@...nel.org>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        ying.huang@...el.com, mgorman@...hsingularity.net,
        vdavydov.dev@...il.com, hannes@...xchg.org,
        akpm@...ux-foundation.org, shakeelb@...gle.com, gthelen@...gle.com,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] mm,vmscan: Kill global shrinker lock.

On Wed, Nov 15, 2017 at 12:51:43PM +0100, Michal Hocko wrote:
< snip >
> > Since it is possible for a local unprivileged user to lock up the system, at least
> > due to mutex_trylock(&oom_lock) versus (printk() or schedule_timeout_killable(1)),
> > I suggest completely eliminating the scheduling priority problem (i.e. a thread with
> > a very low scheduling priority might take 100 seconds inside some do_shrink_slab()
> > call) by not relying on the assumption that do_shrink_slab() returns shortly.
> > My first patch plus my second patch will eliminate that assumption and avoid
> > potential khungtaskd warnings.
> 
> It doesn't, because the priority issues will still be there when anybody
> can preempt your shrinker for an extensive amount of time. So no, you are
> not fixing the problem. You are merely making it less probable and limiting
> it to the removed shrinker. You still do not have any control over
> what happens while that shrinker is executing, though.
> 
> Anyway, I do not claim your patch is a wrong approach. It is just quite
> complex, and maybe unnecessarily so for most workloads. Therefore, going
> with a simpler solution should be preferred until we see that it is
> insufficient.

That's exactly what I intended.

Try the simple one first, then wait until the simple one is shown to be broken.
If it breaks, we can add the more complicated thing at that point.

With that model, we move toward the complicated approach with good
justification, without losing the chance to understand and learn from new workloads.
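
For context, below is a heavily simplified sketch of the code path being
argued about, using the mm/vmscan.c names shrinker_rwsem, shrinker_list and
do_shrink_slab(). It only illustrates why a slow (or preempted, low-priority)
do_shrink_slab() call matters under a global lock; it is not the actual
kernel source and not either proposed patch:

/*
 * Simplified illustration only: the global shrinker_rwsem is read-held
 * across every shrinker callback, so one do_shrink_slab() call that runs
 * for a long time (or whose caller is preempted because it has a very low
 * scheduling priority) delays anyone waiting for the write side,
 * e.g. unregister_shrinker() during unmount.
 */
static unsigned long shrink_slab_sketch(gfp_t gfp_mask, int nid,
					struct mem_cgroup *memcg,
					int priority)
{
	struct shrinker *shrinker;
	unsigned long freed = 0;

	if (!down_read_trylock(&shrinker_rwsem))
		return 0;	/* lock is busy; skip slab shrinking */

	list_for_each_entry(shrinker, &shrinker_list, list) {
		struct shrink_control sc = {
			.gfp_mask = gfp_mask,
			.nid = nid,
			.memcg = memcg,
		};

		/* Can take a long time; shrinker_rwsem stays held meanwhile. */
		freed += do_shrink_slab(&sc, shrinker, priority);
	}

	up_read(&shrinker_rwsem);
	return freed;
}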
