Message-ID: <CAKgT0UfsLFuCK0vQF70s=8XC8qwrzxag_NR2dCDvxqx84E0K=g@mail.gmail.com>
Date: Thu, 26 Jan 2023 11:48:08 -0800
From: Alexander Duyck <alexander.duyck@...il.com>
To: Toke Høiland-Jørgensen <toke@...hat.com>
Cc: nbd@....name, davem@...emloft.net, edumazet@...gle.com,
hawk@...nel.org, ilias.apalodimas@...aro.org, kuba@...nel.org,
linux-kernel@...r.kernel.org, linyunsheng@...wei.com,
lorenzo@...nel.org, netdev@...r.kernel.org, pabeni@...hat.com
Subject: Re: [net PATCH] skb: Do mix page pool and page referenced frags in GRO
On Thu, Jan 26, 2023 at 11:14 AM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
>
> Alexander Duyck <alexander.duyck@...il.com> writes:
>
> > From: Alexander Duyck <alexanderduyck@...com>
> >
> > GSO should not merge page pool recycled frames with standard reference
> > counted frames. Traditionally this didn't occur, at least not often.
> > However as we start looking at adding support for wireless adapters there
> > becomes the potential to mix the two due to A-MSDU repartitioning frames in
> > the receive path. There are possibly other places where this may have
> > occurred however I suspect they must be few and far between as we have not
> > seen this issue until now.
> >
> > Fixes: 53e0961da1c7 ("page_pool: add frag page recycling support in page pool")
> > Reported-by: Felix Fietkau <nbd@....name>
> > Signed-off-by: Alexander Duyck <alexanderduyck@...com>
>
> I know I'm pattern matching a bit crudely here, but we recently had
> another report where doing a get_page() on skb->head didn't seem to be
> enough; any chance they might be related?
>
> See: https://lore.kernel.org/r/Y9BfknDG0LXmruDu@JNXK7M3

Looking at it, I wouldn't think so. Doing get_page() on these frames is
fine. In the case you reference, it looks like get_page() is being
called on a slab-allocated skb head, so somehow a slab-allocated head
is leaking through.

What is causing the issue here is that, after get_page() is called and
the fragments are moved into a non-pp_recycle skb, they are then picked
out and merged back into a pp_recycle skb. As a result, we are seeing a
reference count leak from pp_frag_count into refcount.
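
The mismatch described above can be illustrated with a toy user-space
model. This is only a sketch: toy_page, toy_skb, toy_get_page,
toy_release_frag, toy_can_merge and toy_leaked_refs are hypothetical
names, not kernel structures or functions, and the two integer counters
merely stand in for the page reference count and pp_frag_count. The
check in toy_can_merge shows the general shape of a guard that refuses
to merge when the pp_recycle flags differ; it is not presented as the
patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a page: two independent counters. */
struct toy_page {
    int refcount;       /* models the ordinary page reference count */
    int pp_frag_count;  /* models the page pool fragment count */
};

/* Toy stand-in for an skb holding a single frag. */
struct toy_skb {
    bool pp_recycle;        /* true: frags are released via the page pool */
    struct toy_page *frag;
};

/* Take an ordinary reference, as get_page() would. */
static void toy_get_page(struct toy_page *pg)
{
    pg->refcount++;
}

/* Release the frag using whichever counter the owning skb's flag selects. */
static void toy_release_frag(struct toy_skb *skb)
{
    if (skb->pp_recycle)
        skb->frag->pp_frag_count--;
    else
        skb->frag->refcount--;
    skb->frag = NULL;
}

/* Guard: only merge skbs that release their frags the same way. */
static bool toy_can_merge(const struct toy_skb *recipient,
                          const struct toy_skb *donor)
{
    return recipient->pp_recycle == donor->pp_recycle;
}

/* Force a mixed merge and return the ordinary references left behind. */
static int toy_leaked_refs(void)
{
    struct toy_page pg = { .refcount = 1, .pp_frag_count = 1 };
    struct toy_skb plain = { .pp_recycle = false, .frag = &pg };
    struct toy_skb pooled = { .pp_recycle = true, .frag = NULL };

    toy_get_page(&pg);          /* reference held on behalf of 'plain' */
    pooled.frag = plain.frag;   /* mixed merge: the frag changes owner */
    plain.frag = NULL;
    toy_release_frag(&pooled);  /* drops pp_frag_count, not refcount */

    return pg.refcount - 1;     /* references above the baseline of 1 */
}
```

In this model the reference taken by toy_get_page() is never dropped,
because the frag ends up being released through the page pool counter.
A caller that consults toy_can_merge() first would skip the merge.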

This is the quick-n-dirty fix. I am debating whether we want to expand
this so that we could support the case where the donor frame is
pp_recycle but the recipient is a standard reference counted frame.
Fixing that would essentially mean adding logic to take a reference on
all donor frags, making certain that nr_frags on the donor skb isn't
altered, and lastly making sure that all cases use the NAPI_GRO_FREE
path to drop the page pool counts.
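
The three steps of that possible expansion can be sketched in the same
kind of toy user-space model. Everything here is an assumption for
illustration: demo_page, demo_skb, demo_share_frags, demo_free_skb and
demo_counts_balance are hypothetical names, and demo_free_skb only
stands in for whatever free path would drop the page pool counts. It is
not kernel code and not a proposed implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_FRAGS 4

/* Toy stand-in for a page with both counters. */
struct demo_page {
    int refcount;       /* models the ordinary page reference count */
    int pp_frag_count;  /* models the page pool fragment count */
};

/* Toy stand-in for an skb with a small frag array. */
struct demo_skb {
    bool pp_recycle;
    int nr_frags;
    struct demo_page *frags[DEMO_MAX_FRAGS];
};

/*
 * Share the donor's frags with a reference counted recipient.
 * Returns false if the recipient has no room for them.
 */
static bool demo_share_frags(struct demo_skb *recipient,
                             const struct demo_skb *donor)
{
    if (recipient->nr_frags + donor->nr_frags > DEMO_MAX_FRAGS)
        return false;

    for (int i = 0; i < donor->nr_frags; i++) {
        donor->frags[i]->refcount++;  /* the recipient's own reference */
        recipient->frags[recipient->nr_frags++] = donor->frags[i];
    }
    /* donor->nr_frags is deliberately left unchanged. */
    return true;
}

/* Free an skb, dropping the counter that its pp_recycle flag selects. */
static void demo_free_skb(struct demo_skb *skb)
{
    for (int i = 0; i < skb->nr_frags; i++) {
        if (skb->pp_recycle)
            skb->frags[i]->pp_frag_count--;
        else
            skb->frags[i]->refcount--;
    }
    skb->nr_frags = 0;
}

/* Returns true if both counters end where they should. */
static bool demo_counts_balance(void)
{
    struct demo_page pg = { .refcount = 1, .pp_frag_count = 1 };
    struct demo_skb donor = {
        .pp_recycle = true, .nr_frags = 1, .frags = { &pg }
    };
    struct demo_skb recipient = { .pp_recycle = false, .nr_frags = 0 };

    if (!demo_share_frags(&recipient, &donor))
        return false;

    demo_free_skb(&donor);      /* drops the page pool count */
    demo_free_skb(&recipient);  /* drops the ordinary reference */

    return pg.refcount == 1 && pg.pp_frag_count == 0;
}
```

Because the donor keeps its nr_frags, freeing it still walks every frag
and drops the page pool count, while the recipient drops only the
ordinary references it took for itself.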