lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 25 Jan 2017 14:21:24 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        David Rientjes <rientjes@...gle.com>,
        Mel Gorman <mgorman@...e.de>,
        Johannes Weiner <hannes@...xchg.org>,
        Al Viro <viro@...iv.linux.org.uk>, linux-mm@...ck.org,
        LKML <linux-kernel@...r.kernel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Anatoly Stepanov <astepanov@...udlinux.com>,
        Andreas Dilger <adilger@...ger.ca>,
        Andreas Dilger <andreas.dilger@...el.com>,
        Anton Vorontsov <anton@...msg.org>,
        Ben Skeggs <bskeggs@...hat.com>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Colin Cross <ccross@...roid.com>,
        Dan Williams <dan.j.williams@...el.com>,
        David Sterba <dsterba@...e.com>,
        Eric Dumazet <edumazet@...gle.com>,
        Eric Dumazet <eric.dumazet@...il.com>,
        Hariprasad S <hariprasad@...lsio.com>,
        Heiko Carstens <heiko.carstens@...ibm.com>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        Ilya Dryomov <idryomov@...il.com>,
        Kees Cook <keescook@...omium.org>,
        Kent Overstreet <kent.overstreet@...il.com>,
        Martin Schwidefsky <schwidefsky@...ibm.com>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Mike Snitzer <snitzer@...hat.com>,
        Oleg Drokin <oleg.drokin@...el.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Santosh Raspatur <santosh@...lsio.com>,
        Tariq Toukan <tariqt@...lanox.com>,
        Theodore Ts'o <tytso@....edu>,
        Tom Herbert <tom@...bertland.com>,
        Tony Luck <tony.luck@...el.com>,
        "Yan, Zheng" <zyan@...hat.com>,
        Yishai Hadas <yishaih@...lanox.com>,
        Daniel Borkmann <daniel@...earbox.net>
Subject: Re: [PATCH 0/6 v3] kvmalloc

On Wed 25-01-17 14:10:06, Michal Hocko wrote:
> On Tue 24-01-17 11:17:21, Alexei Starovoitov wrote:
> > On Tue, Jan 24, 2017 at 04:17:52PM +0100, Michal Hocko wrote:
> > > On Thu 12-01-17 16:37:11, Michal Hocko wrote:
> > > > Hi,
> > > > this has been previously posted as a single patch [1] but later on more
> > > > built on top. It turned out that there are users who would like to have
> > > > __GFP_REPEAT semantic. This is currently implemented for costly >64B
> > > > requests. Doing the same for smaller requests would require to redefine
> > > > __GFP_REPEAT semantic in the page allocator which is out of scope of
> > > > this series.
> > > > 
> > > > There are many open coded kmalloc with vmalloc fallback instances in
> > > > the tree.  Most of them are not careful enough or simply do not care
> > > > about the underlying semantic of the kmalloc/page allocator which means
> > > > that a) some vmalloc fallbacks are basically unreachable because the
> > > > kmalloc part will keep retrying until it succeeds b) the page allocator
> > > > can invoke a really disruptive steps like the OOM killer to move forward
> > > > which doesn't sound appropriate when we consider that the vmalloc
> > > > fallback is available.
> > > > 
> > > > As it can be seen implementing kvmalloc requires quite an intimate
> > > > knowledge if the page allocator and the memory reclaim internals which
> > > > strongly suggests that a helper should be implemented in the memory
> > > > subsystem proper.
> > > > 
> > > > Most callers I could find have been converted to use the helper instead.
> > > > This is patch 5. There are some more relying on __GFP_REPEAT in the
> > > > networking stack which I have converted as well but considering we do
> > > > not have a support for __GFP_REPEAT for requests smaller than 64kB I
> > > > have marked it RFC.
> > > 
> > > Are there any more comments? I would really appreciate to hear from
> > > networking folks before I resubmit the series.
> > 
> > while this patchset was baking the bpf side switched to use bpf_map_area_alloc()
> > which fixes the issue with missing __GFP_NORETRY that we had to fix quickly.
> > See commit d407bd25a204 ("bpf: don't trigger OOM killer under pressure with map alloc")
> > it covers all kmalloc/vmalloc pairs instead of just one place as in this set.
> > So please rebase and switch bpf_map_area_alloc() to use kvmalloc().
> 
> OK, will do. Thanks for the heads up.

Just for the record, I will fold the following into the patch 1
---
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 19b6129eab23..8697f43cf93c 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -53,21 +53,7 @@ void bpf_register_map_type(struct bpf_map_type_list *tl)
 
 void *bpf_map_area_alloc(size_t size)
 {
-	/* We definitely need __GFP_NORETRY, so OOM killer doesn't
-	 * trigger under memory pressure as we really just want to
-	 * fail instead.
-	 */
-	const gfp_t flags = __GFP_NOWARN | __GFP_NORETRY | __GFP_ZERO;
-	void *area;
-
-	if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
-		area = kmalloc(size, GFP_USER | flags);
-		if (area != NULL)
-			return area;
-	}
-
-	return __vmalloc(size, GFP_KERNEL | __GFP_HIGHMEM | flags,
-			 PAGE_KERNEL);
+	return kvzalloc(size, GFP_USER);
 }
 
 void bpf_map_area_free(void *area)

-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists