Message-ID: <20180213122940.GS3443@dhcp22.suse.cz>
Date:   Tue, 13 Feb 2018 13:29:40 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Chris Wilson <chris@...is-wilson.co.uk>
Cc:     linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...nel.org>,
        Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] khungtaskd: Kick stuck processes

On Tue 13-02-18 12:08:12, Chris Wilson wrote:
> Quoting Michal Hocko (2018-02-13 11:56:42)
> > On Thu 08-02-18 19:07:53, Chris Wilson wrote:
> > > After spotting a stuck process, and having decided not to panic, give
> > > the task a kick to see if that helps it to recover (e.g. to paper over a
> > > missed wake up).
> > 
> > huh, this is just a no-no. The watchdog is there to report problems, not
> > to interfere. You can never know whether the sleeper is prepared for
> > spurious wakeups. Do not paper over bugs...
> 
> Aside from khungtaskd being a debug feature, we want to identify the bug
> by kicking the stuck process and seeing what squeals. Being told that
> khugepaged is stuck over and over again doesn't help resolve who is
> holding onto that lock_page, or if it was just a missed wakeup as all
> other processes are asleep.

And how exactly does kicking help here? If the waiter uses lock_page
then it would go back to sleep because PG_locked is still set. If the
page is not locked and this is a missed wake up then either unlock_page
is wrong (which doesn't seem to be the case AFAICS) or somebody is
messing with the page locking, and this patch doesn't achieve anything.
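
To make the lock_page case concrete, here is a rough userspace model of
what a waiter does when it is woken. This is only a sketch, not kernel
code: struct page_model, waiter_on_wakeup() and the state names are
hypothetical, and the single flag merely stands in for PG_locked.

```c
#include <stdbool.h>

/* Hypothetical model of a page; "locked" stands in for PG_locked. */
struct page_model {
    bool locked;
};

enum waiter_state { WAITER_SLEEPING, WAITER_RUNNING };

/*
 * Models what a lock_page()-style waiter does on any wakeup, real or
 * artificial: it re-checks the lock bit and proceeds only if it can
 * take the lock itself.
 */
enum waiter_state waiter_on_wakeup(struct page_model *page)
{
    if (page->locked)
        return WAITER_SLEEPING;  /* still locked: go back to sleep */

    page->locked = true;         /* lock was free: take it */
    return WAITER_RUNNING;
}
```

In this model a kick delivered while the flag is set changes nothing:
the waiter ends up asleep again and the holder is still unknown.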

> We are trying to paper over other bugs so that we can fix ours.

But you do not want to break existing code which might be sensitive to
spurious wakeups. You could argue that such code is broken already and
I would tend to agree, but an artificial wake up is just a no-go.
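
As an illustration of the difference, a minimal userspace sketch of the
two kinds of waiter follows. Everything in it is hypothetical
(struct event_model, fragile_on_wakeup(), robust_on_wakeup()); it only
models the decision each waiter makes after being woken and is not
taken from any real kernel path.

```c
#include <stdbool.h>

/* Hypothetical event: "done" is the condition the waiter sleeps on. */
struct event_model {
    bool done;
    int consumed;   /* how many times a waiter acted on the event */
};

/*
 * Fragile waiter: treats being woken as proof that the event happened.
 * An artificial wakeup makes it act on an event that never occurred.
 */
bool fragile_on_wakeup(struct event_model *ev)
{
    ev->consumed++;
    return true;
}

/*
 * Robust waiter: re-checks the condition after every wakeup and
 * reports false (meaning "sleep again") if it does not hold.
 */
bool robust_on_wakeup(struct event_model *ev)
{
    if (!ev->done)
        return false;

    ev->consumed++;
    return true;
}
```

Only the fragile variant misbehaves when kicked before the event has
happened, and that is exactly the code an artificial wakeup would break.
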
-- 
Michal Hocko
SUSE Labs
