Message-ID: <alpine.DEB.2.10.1507141536590.16182@chino.kir.corp.google.com>
Date: Tue, 14 Jul 2015 15:45:40 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Dave Chinner <david@...morbit.com>
cc: Mike Snitzer <snitzer@...hat.com>,
Mikulas Patocka <mpatocka@...hat.com>,
Edward Thornber <thornber@...hat.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
dm-devel@...hat.com, Vivek Goyal <vgoyal@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
"Alasdair G. Kergon" <agk@...hat.com>
Subject: Re: [PATCH 2/7] mm: introduce kvmalloc and kvmalloc_node

On Wed, 15 Jul 2015, Dave Chinner wrote:

> > Sure, but it's not accomplishing the same thing: things like
> > ext4_kvmalloc() only want to fall back to vmalloc() when high-order
> > allocations fail: the function is used for different sizes. This cannot
> > be converted to kvmalloc_node() since it falls back immediately when
> > reclaim fails. Same issue with single_file_open() for the seq_file code.
> > We could go through every kmalloc() -> vmalloc() fallback for more
> > examples in the code, but those two instances were the first I looked at
> > and couldn't be converted to kvmalloc_node() without work.
> >
> > > It is always easier to shoehorn utility functions locally within a
> > > subsystem (be it ext4, dm, etc) but once enough do something in a
> > > similar but different way it really should get elevated.
> > >
> >
> > I would argue that
> >
> > void *ext4_kvmalloc(size_t size, gfp_t flags)
> > {
> > 	void *ret;
> >
> > 	ret = kmalloc(size, flags | __GFP_NOWARN);
> > 	if (!ret)
> > 		ret = __vmalloc(size, flags, PAGE_KERNEL);
> > 	return ret;
> > }
> >
> > is simple enough that we don't need to convert it to anything.
>
> Except that it will have problems with GFP_NOFS context when the pte
> code inside vmalloc does a GFP_KERNEL allocation. Hence we have
> stuff in other subsystems (such as XFS) where we've noticed lockdep
> whining about this:
>

Does anyone have an example of ext4_kvmalloc() having a lockdep violation?
Presumably the GFP_NOFS calls to ext4_kvmalloc() will never have
size > (1 << (PAGE_SHIFT + PAGE_ALLOC_COSTLY_ORDER)) so that kmalloc()
above actually never returns NULL and __vmalloc() only gets used for the
ext4_kvmalloc(..., GFP_KERNEL) call.
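
To put a number on that threshold, here is a small userspace sketch of the
arithmetic. It assumes 4 KiB pages (PAGE_SHIFT = 12, which is
architecture-dependent) and PAGE_ALLOC_COSTLY_ORDER = 3 as in
include/linux/mmzone.h. The MODEL_* macros and both helper names are made up
for illustration and are not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values: 4 KiB pages and a costly order of 3. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_ALLOC_COSTLY_ORDER 3

/* Size in bytes of one block at the costly order. */
static size_t costly_threshold(unsigned int page_shift,
			       unsigned int costly_order)
{
	/* Guard against shifting past the width of size_t. */
	assert(page_shift + costly_order < sizeof(size_t) * 8);
	return (size_t)1 << (page_shift + costly_order);
}

/*
 * True when a request is larger than a costly-order block, i.e. the
 * only case in which, by the reasoning above, kmalloc() could return
 * NULL and the __vmalloc() fallback could be reached.
 */
static bool may_reach_vmalloc(size_t size)
{
	return size > costly_threshold(MODEL_PAGE_SHIFT,
				       MODEL_PAGE_ALLOC_COSTLY_ORDER);
}
```

Under those assumptions the cutoff is 1 << 15 = 32768 bytes, so a GFP_NOFS
caller would need to ask for more than 32 KiB before the fallback matters.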

It should be fixed, though, probably in the same way as
kmem_zalloc_large() today, but it seems the real fix would be to attack
the whole vmalloc() GFP_KERNEL issue that has been talked about several
times in the past. Then the existing ext4_kvmalloc() implementation
should be fine.
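
For reference, the kmem_zalloc_large() approach, as I understand it, is to
wrap the __vmalloc() fallback in memalloc_noio_save() and
memalloc_noio_restore() when the caller is in NOFS context. The following is
only a userspace model of that save/restore pattern, not kernel code: every
model_* name, MODEL_NOIO (standing in for PF_MEMALLOC_NOIO), and the 32 KiB
limit are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_NOIO 0x1u				/* stands in for PF_MEMALLOC_NOIO */
#define MODEL_CONTIG_LIMIT ((size_t)32768)	/* assumed: larger first attempts fail */

static unsigned int model_task_flags;	/* stands in for current->flags */
static bool model_fallback_saw_noio;	/* flag state seen by the fallback */

/* Set the NOIO bit and return its previous state. */
static unsigned int model_noio_save(void)
{
	unsigned int old = model_task_flags & MODEL_NOIO;

	model_task_flags |= MODEL_NOIO;
	return old;
}

/* Put the NOIO bit back to the state returned by model_noio_save(). */
static void model_noio_restore(unsigned int old)
{
	assert((old & ~MODEL_NOIO) == 0);
	model_task_flags = (model_task_flags & ~MODEL_NOIO) | old;
}

/* First attempt: pretend contiguous memory is unavailable for large sizes. */
static void *model_kmalloc(size_t size)
{
	return size > MODEL_CONTIG_LIMIT ? NULL : malloc(size);
}

/* Fallback: record whether the task was marked NOIO while it ran. */
static void *model_vmalloc(size_t size)
{
	model_fallback_saw_noio = (model_task_flags & MODEL_NOIO) != 0;
	return malloc(size);
}

/* Try the first allocator, then fall back with the NOIO scope applied. */
static void *model_kvmalloc(size_t size, bool nofs)
{
	unsigned int saved = 0;
	void *ret = model_kmalloc(size);

	if (ret)
		return ret;

	if (nofs)
		saved = model_noio_save();
	ret = model_vmalloc(size);
	if (nofs)
		model_noio_restore(saved);
	return ret;
}
```

The point of the pattern is that the restriction travels with the task rather
than with the gfp mask, so it also covers allocations made internally by the
fallback, and the previous flag state is restored afterwards.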

Once that's done, we can revisit the idea of a generalized kvmalloc() or
kvmalloc_node(), but since implementations such as the one above differ
from the proposed kvmalloc_node() implementation in how they handle
high-order allocations, I doubt a generalized form will be helpful.