Date:   Thu, 16 Nov 2017 09:56:22 +0900
From:   Minchan Kim <minchan@...nel.org>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        ying.huang@...el.com, mgorman@...hsingularity.net,
        vdavydov.dev@...il.com, hannes@...xchg.org,
        akpm@...ux-foundation.org, shakeelb@...gle.com, gthelen@...gle.com,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] mm,vmscan: Kill global shrinker lock.

On Wed, Nov 15, 2017 at 12:51:43PM +0100, Michal Hocko wrote:
< snip >
> > Since it is possible for a local unprivileged user to lock up the system, at least
> > due to mutex_trylock(&oom_lock) versus (printk() or schedule_timeout_killable(1)),
> > I suggest completely eliminating the scheduling priority problem (i.e. a thread with
> > a very low scheduling priority might take 100 seconds inside some do_shrink_slab()
> > call) by not relying on the assumption that do_shrink_slab() returns shortly.
> > My first patch plus my second patch will eliminate that assumption and avoid
> > potential khungtaskd warnings.
> 
> It doesn't, because the priority issues will still be there when anybody
> can preempt your shrinker for an extensive amount of time. So no, you are
> not fixing the problem. You are merely making it less probable and limiting
> it to the removed shrinker. You still do not have any control over
> what happens while that shrinker is executing, though.
> 
> Anyway, I do not claim your patch is a wrong approach. It is just quite
> complex, and maybe unnecessarily so for most workloads. Therefore, going
> with a simpler solution should be preferred until we see that it is
> insufficient.

That's exactly what I intended.

Try the simple one first, then wait until the simple one is shown to be broken.
If it breaks, we can add the more complicated thing at that point.

With that model, we move toward the complicated approach with good
justification, without losing the chance to understand and learn from new workloads.
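
For context, below is a heavily simplified sketch of the code path being
argued about, using the mm/vmscan.c names shrinker_rwsem, shrinker_list and
do_shrink_slab(). It only illustrates why a slow (or preempted, low-priority)
do_shrink_slab() call matters under a global lock; it is not the actual
kernel source and not either proposed patch:

/*
 * Simplified illustration only: the global shrinker_rwsem is read-held
 * across every shrinker callback, so one do_shrink_slab() call that runs
 * for a long time (or whose caller is preempted because it has a very low
 * scheduling priority) delays anyone waiting for the write side,
 * e.g. unregister_shrinker() during unmount.
 */
static unsigned long shrink_slab_sketch(gfp_t gfp_mask, int nid,
					struct mem_cgroup *memcg,
					int priority)
{
	struct shrinker *shrinker;
	unsigned long freed = 0;

	if (!down_read_trylock(&shrinker_rwsem))
		return 0;	/* lock is busy; skip slab shrinking */

	list_for_each_entry(shrinker, &shrinker_list, list) {
		struct shrink_control sc = {
			.gfp_mask = gfp_mask,
			.nid = nid,
			.memcg = memcg,
		};

		/* Can take a long time; shrinker_rwsem stays held meanwhile. */
		freed += do_shrink_slab(&sc, shrinker, priority);
	}

	up_read(&shrinker_rwsem);
	return freed;
}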
