Message-ID: <20180130140104.GE21609@dhcp22.suse.cz>
Date: Tue, 30 Jan 2018 15:01:04 +0100
From: Michal Hocko <mhocko@...nel.org>
To: Dmitry Vyukov <dvyukov@...gle.com>
Cc: "Kirill A. Shutemov" <kirill@...temov.name>,
Florian Westphal <fw@...len.de>,
Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
David Miller <davem@...emloft.net>,
netfilter-devel@...r.kernel.org, coreteam@...filter.org,
netdev <netdev@...r.kernel.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Yang Shi <yang.s@...baba-inc.com>,
syzkaller-bugs@...glegroups.com,
LKML <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>, Linux-MM <linux-mm@...ck.org>,
David Rientjes <rientjes@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>, guro@...com,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: [netfilter-core] kernel panic: Out of memory and no killable
processes... (2)
On Tue 30-01-18 10:57:39, Michal Hocko wrote:
> On Tue 30-01-18 10:02:34, Dmitry Vyukov wrote:
> > On Tue, Jan 30, 2018 at 9:28 AM, Kirill A. Shutemov
> > <kirill@...temov.name> wrote:
> > > On Tue, Jan 30, 2018 at 09:11:27AM +0100, Florian Westphal wrote:
> > >> Michal Hocko <mhocko@...nel.org> wrote:
> > >> > On Mon 29-01-18 23:35:22, Florian Westphal wrote:
> > >> > > Kirill A. Shutemov <kirill@...temov.name> wrote:
> > >> > [...]
> > >> > > > I hate what I'm saying, but I guess we need some tunable here.
> > >> > > > Not sure what exactly.
> > >> > >
> > >> > > Would memcg help?
> > >> >
> > >> > That really depends. I would have to check whether vmalloc path obeys
> > >> > __GFP_ACCOUNT (I suspect it does except for page tables allocations but
> > >> > that shouldn't be a big deal). But then the other potential problem is
> > >> > the life time of the xt_table_info (or other potentially large) data
> > >> > structures. Are they bound to any process life time?
> > >>
> > >> No.
> > >
> > > Well, IIUC they are bound to net namespace life time, so killing all
> > > processes in the namespace would help to get memory back. :)
> >
> > ... unless the namespace is mounted into file system.
> >
> > Let's start with NOWARN as that's what kernel generally uses for
> > allocations with user-controllable size. ENOMEM is roughly as
> > informative as the WARNING message in this case.
>
> You want __GFP_NORETRY but that is not _fully_ supported by kvmalloc
> right now. More specifically kvmalloc doesn't guarantee that the request
> will not trigger the OOM killer (like regular __GFP_NORETRY). This is
> because of internal vmalloc restrictions. If you are however OK to
> simply bail out in most cases then __GFP_NORETRY should work reasonably
> fine.
>
> > I think we also need to consider setting up memory cgroup for
> > syzkaller test processes (we do RLIMIT_AS, but that's weak).
>
> Well, this is not about syzkaller, it merely pointed out a potential
> DoS... And that has to be addressed somehow.
So how about this?
---
From d48e950f1b04f234b57b9e34c363bdcfec10aeee Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@...e.com>
Date: Tue, 30 Jan 2018 14:51:07 +0100
Subject: [PATCH] net/netfilter/x_tables.c: make allocation less aggressive
syzbot has noticed that xt_alloc_table_info can allocate a lot of
memory. This is an admin only interface but an admin in a namespace
is sufficient as well. eacd86ca3b03 ("net/netfilter/x_tables.c: use
kvmalloc() in xt_alloc_table_info()") has changed the opencoded
kmalloc->vmalloc fallback into kvmalloc. It has dropped __GFP_NORETRY on
the way because vmalloc has simply never fully supported __GFP_NORETRY
semantics. This is still the case because e.g. page tables backing the
vmalloc area are hardcoded GFP_KERNEL.
Revert back to __GFP_NORETRY as a poor man's defence against excessively
large allocation requests here. We will not rule out the OOM killer
completely but __GFP_NORETRY should at least stop the large request
in most cases.
Fixes: eacd86ca3b03 ("net/netfilter/x_tables.c: use kvmalloc() in xt_alloc_table_info()")
Signed-off-by: Michal Hocko <mhocko@...e.com>
---
net/netfilter/x_tables.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index d8571f414208..a5f5c29bcbdc 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -1003,7 +1003,13 @@ struct xt_table_info *xt_alloc_table_info(unsigned int size)
 	if ((SMP_ALIGN(size) >> PAGE_SHIFT) + 2 > totalram_pages)
 		return NULL;
 
-	info = kvmalloc(sz, GFP_KERNEL);
+	/*
+	 * __GFP_NORETRY is not fully supported by kvmalloc but it should
+	 * work reasonably well if sz is too large and bail out rather
+	 * than shoot all processes down before realizing there is nothing
+	 * more to reclaim.
+	 */
+	info = kvmalloc(sz, GFP_KERNEL | __GFP_NORETRY);
 	if (!info)
 		return NULL;
 
--
2.15.1
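To make the intent of the patch easier to see, here is a rough userspace toy model of the two policies. It is only a sketch and not kernel code: struct toy_mm, toy_alloc(), TOY_NORETRY and the page and cache line constants are all made up for illustration, and the "OOM killer" is reduced to a counter. The real __GFP_NORETRY path still performs some reclaim, and as said above kvmalloc cannot fully rule out the OOM killer (e.g. because of page table allocations), so the model is stricter than the kernel actually is.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration; the kernel defines these per architecture. */
#define TOY_PAGE_SHIFT  12u
#define TOY_CACHE_BYTES 64u
/* Round x up to a cache line multiple, in the spirit of SMP_ALIGN(). */
#define TOY_ALIGN(x) \
	(((x) + TOY_CACHE_BYTES - 1) & ~((size_t)TOY_CACHE_BYTES - 1))

/* Hypothetical stand-in for __GFP_NORETRY. */
#define TOY_NORETRY 0x1u

/* Toy view of system memory; every field is an assumption of this model. */
struct toy_mm {
	size_t total_pages;      /* models totalram_pages */
	size_t free_pages;       /* pages available without killing anything */
	size_t killable;         /* processes the toy OOM killer may still kill */
	size_t pages_per_victim; /* pages freed by each kill */
	size_t kills;            /* number of kills performed so far */
};

/* Same shape as the sanity check in the diff context above. */
static bool toy_size_is_sane(const struct toy_mm *mm, size_t size)
{
	return (TOY_ALIGN(size) >> TOY_PAGE_SHIFT) + 2 <= mm->total_pages;
}

/* Returns true when the request could be satisfied. */
static bool toy_alloc(struct toy_mm *mm, size_t size, unsigned int flags)
{
	size_t page_size = (size_t)1 << TOY_PAGE_SHIFT;
	size_t need = (TOY_ALIGN(size) + page_size - 1) >> TOY_PAGE_SHIFT;

	if (!toy_size_is_sane(mm, size))
		return false;

	while (mm->free_pages < need) {
		/* With the no-retry flag: give up instead of killing. */
		if (flags & TOY_NORETRY)
			return false;
		/* Nothing left to kill; the real kernel would panic here. */
		if (mm->killable == 0)
			return false;
		mm->killable--;
		mm->kills++;
		mm->free_pages += mm->pages_per_victim;
	}

	mm->free_pages -= need;
	return true;
}
```

With 100 free pages and a request for 200 pages, the default policy in this model kills two victims before it succeeds, while the TOY_NORETRY variant fails right away and leaves every process alone. That is the trade-off the patch makes: the caller sees a failed allocation instead of the system shooting processes down for a request that was too large in the first place.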
--
Michal Hocko
SUSE Labs