Message-ID: <CAKgT0UeOGPQO0viM0w3JD71357uj7uuZ+wX2tCUTP6u_ZQG78Q@mail.gmail.com>
Date: Fri, 15 Feb 2019 10:35:18 -0800
From: Alexander Duyck <alexander.duyck@...il.com>
To: Jann Horn <jannh@...gle.com>
Cc: Network Development <netdev@...r.kernel.org>,
kernel list <linux-kernel@...r.kernel.org>
Subject: Re: [RESEND PATCH net] mm: page_alloc: fix ref bias in
page_frag_alloc() for 1-byte allocs
On Fri, Feb 15, 2019 at 6:10 AM Jann Horn <jannh@...gle.com> wrote:
>
> On Thu, Feb 14, 2019 at 11:21 PM David Miller <davem@...emloft.net> wrote:
> >
> > From: Jann Horn <jannh@...gle.com>
> > Date: Thu, 14 Feb 2019 22:26:22 +0100
> >
> > > On Thu, Feb 14, 2019 at 6:13 PM David Miller <davem@...emloft.net> wrote:
> > >>
> > >> From: Jann Horn <jannh@...gle.com>
> > >> Date: Wed, 13 Feb 2019 22:45:59 +0100
> > >>
> > >> > The basic idea behind ->pagecnt_bias is: If we pre-allocate the maximum
> > >> > number of references that we might need to create in the fastpath later,
> > >> > the bump-allocation fastpath only has to modify the non-atomic bias value
> > >> > that tracks the number of extra references we hold instead of the atomic
> > >> > refcount. The maximum number of allocations we can serve (under the
> > >> > assumption that no allocation is made with size 0) is nc->size, so that's
> > >> > the bias used.
> > >> >
> > >> > However, even when all memory in the allocation has been given away, a
> > >> > reference to the page is still held; and in the `offset < 0` slowpath, the
> > >> > page may be reused if everyone else has dropped their references.
> > >> > This means that the necessary number of references is actually
> > >> > `nc->size+1`.
> > >> >
> > >> > Luckily, from a quick grep, it looks like the only path that can call
> > >> > page_frag_alloc(fragsz=1) is TAP with the IFF_NAPI_FRAGS flag, which
> > >> > requires CAP_NET_ADMIN in the init namespace and is only intended to be
> > >> > used for kernel testing and fuzzing.
> > >> >
> > >> > To test for this issue, put a `WARN_ON(page_ref_count(page) == 0)` in the
> > >> > `offset < 0` path, below the virt_to_page() call, and then repeatedly call
> > >> > writev() on a TAP device with IFF_TAP|IFF_NO_PI|IFF_NAPI_FRAGS|IFF_NAPI,
> > >> > with a vector consisting of 15 elements containing 1 byte each.
> > >> >
> > >> > Signed-off-by: Jann Horn <jannh@...gle.com>
> > >>
> > >> Applied and queued up for -stable.
> > >
> > > I had sent a v2 at Alexander Duyck's request an hour before you
> > > applied the patch (with a minor difference that, in Alexander's
> > > opinion, might be slightly more efficient). I guess the net tree
> > > doesn't work like the mm tree, where patches can get removed and
> > > replaced with newer versions? So if Alexander wants that change
> > > (s/size/PAGE_FRAG_CACHE_MAX_SIZE/ in the refcount), someone has to
> > > send that as a separate patch?
> >
> > Yes, please send a follow-up. Sorry about that.
>
> @Alexander Do you want to do that? It was your idea and I don't think
> I can reasonably judge the usefulness of the change.
I'll take care of it. I'm kind of annoyed that you resubmitted this to
netdev before anyone even had a chance to provide review comments,
though.

As is, this doesn't really address the underlying problem anyway, since
the bigger issue is the data alignment one I pointed out. I'll have
patches for both ready shortly.
- Alex
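
[Editor's note: the bias mechanism Jann describes above can be sketched as a
simplified user-space model. This is not the kernel code; the names
`frag_cache`, `cache_refill`, and `frag_alloc` and the fixed `CACHE_SIZE` are
stand-ins for illustration only.]

```c
/* Simplified user-space model of the ->pagecnt_bias idea.
 * Not the kernel implementation; names and sizes are made up. */
#include <assert.h>

#define CACHE_SIZE 8 /* stand-in for nc->size (really the page size) */

struct frag_cache {
    int page_refcount;  /* models the page's atomic refcount */
    unsigned int bias;  /* non-atomic count of pre-taken references */
    int offset;         /* bump-allocation cursor, counts down */
};

/* Refill: take size+1 references up front. The "+1" is the fix the
 * patch makes: one reference per possible 1-byte allocation, plus the
 * reference the cache itself keeps so it can try to reuse the page in
 * the `offset < 0` slow path. With only `size` references (the old
 * code), bias could reach 0 after size 1-byte allocations even though
 * the cache still held the page. */
static void cache_refill(struct frag_cache *nc)
{
    nc->page_refcount = CACHE_SIZE + 1;
    nc->bias = CACHE_SIZE + 1;
    nc->offset = CACHE_SIZE;
}

/* Fast path: hand out one fragment by touching only the non-atomic
 * bias, never the atomic page refcount. Returns the fragment offset,
 * or -1 when the cache is exhausted and the slow path would run. */
static int frag_alloc(struct frag_cache *nc, int fragsz)
{
    nc->offset -= fragsz;
    if (nc->offset < 0) {
        /* Slow path: the page may only be reused if refcount == bias,
         * i.e. nobody outside the cache still holds a reference. */
        return -1;
    }
    nc->bias--;
    return nc->offset;
}
```

After `CACHE_SIZE` 1-byte allocations the model still has `bias == 1`:
the cache's own reference survives, which is exactly the invariant the
`nc->size + 1` change restores.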