Date:   Tue, 23 Oct 2018 10:54:45 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Petr Mladek <pmladek@...e.com>
Cc:     Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
        Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Johannes Weiner <hannes@...xchg.org>, linux-mm@...ck.org,
        syzkaller-bugs@...glegroups.com, guro@...com,
        kirill.shutemov@...ux.intel.com, linux-kernel@...r.kernel.org,
        rientjes@...gle.com, yang.s@...baba-inc.com,
        Andrew Morton <akpm@...ux-foundation.org>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        syzbot <syzbot+77e6b28a7a7106ad0def@...kaller.appspotmail.com>
Subject: Re: [PATCH v3] mm: memcontrol: Don't flood OOM messages with no
 eligible task.

[I strongly suspect this whole email thread has gone way beyond the scope
 the issue really deserves]

I didn't want to participate any further but let me clarify one thing
because I can see how the discussion could generate some confusion.

On Tue 23-10-18 10:37:38, Petr Mladek wrote:
[...]
> My understanding is that this situation happens when the system is
> misconfigured and unusable without manual intervention. If
> the user is able to see what the problem is then we are good.

Not really. The flood of _memcg_ OOM reports about no eligible tasks
should indeed happen only when the memcg is misconfigured. The system is,
and should be, still usable at this stage. The ratelimit is meant to reduce
pointless messages which do not help much in debugging the issue itself.
There is a race condition, as explained by Tetsuo, that could lead to this
situation even without a misconfiguration. That is clearly a bug and
something to deal with, and patches have been posted in that regard [1].

The rest of the discussion is about how to handle printk rate-limiting
properly, whether an ad-hoc solution is more appropriate than the real API
we have in place, and whether the latter needs some enhancements. That is
completely orthogonal to the issue at hand and as such it should really
be discussed separately.
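
For readers following the rate-limiting side of the thread, the general
interval/burst idea can be modeled in a few lines of userspace C. This is
only an illustrative sketch under stated assumptions: struct rl_state and
rl_allow() are hypothetical names, time is passed in by the caller in
arbitrary units, and none of this is the kernel's ratelimit code or
anything taken from the posted patches.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of an interval/burst rate limiter. */
struct rl_state {
	uint64_t interval;	/* window length, in caller-defined time units */
	unsigned int burst;	/* messages allowed per window */
	unsigned int printed;	/* messages allowed so far in this window */
	unsigned int missed;	/* total messages suppressed so far */
	uint64_t begin;		/* start time of the current window */
	bool started;		/* false until the first call opens a window */
};

/*
 * Return true if a message may be emitted at time 'now'.
 * A new window opens on the first call and whenever 'interval'
 * time units have passed since the current window began.
 */
static bool rl_allow(struct rl_state *rs, uint64_t now)
{
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->started = true;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}
```

A caller would guard each report with rl_allow() and could print the
missed counter when a new window opens, so that the user still learns
that messages were dropped.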

[1] http://lkml.kernel.org/r/20181022071323.9550-1-mhocko@kernel.org
-- 
Michal Hocko
SUSE Labs
