linux-kernel - Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAKgT0UfOr7U-8T+Hr9NVPL7EMYaTzbx7w1-hUthjD9bXUFsqMw@mail.gmail.com>
Date:   Thu, 15 Jul 2021 07:57:57 -0700
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     Ilias Apalodimas <ilias.apalodimas@...aro.org>
Cc:     Yunsheng Lin <linyunsheng@...wei.com>,
        Netdev <netdev@...r.kernel.org>,
        Alexander Duyck <alexanderduyck@...com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Alexander Lobakin <alobakin@...me>,
        Jonathan Lemon <jonathan.lemon@...il.com>,
        Willem de Bruijn <willemb@...gle.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Guillaume Nault <gnault@...hat.com>,
        Cong Wang <cong.wang@...edance.com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Matteo Croce <mcroce@...rosoft.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling
 page_pool packets

On Thu, Jul 15, 2021 at 7:45 AM Ilias Apalodimas
<ilias.apalodimas@...aro.org> wrote:
>
> > > >           atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
>
> [...]
>
> > > >                             &shinfo->dataref))
> > > > -             return;
> > > > +             goto exit;
> > >
> > > Is it possible this patch may break the head frag page for the original skb,
> > > supposing it's head frag page is from the page pool and below change clears
> > > the pp_recycle for original skb, causing a page leaking for the page pool?
> >
> > I don't see how. The assumption here is that when atomic_sub_return
> > gets down to 0 we will still have an skb with skb->pp_recycle set and
> > it will flow down and encounter skb_free_head below. All we are doing
> > is skipping those steps and clearing skb->pp_recycle for all but the
> > last buffer and the last one to free it will trigger the recycling.
>
> I think the assumption here is that
> 1. We clone an skb
> 2. The original skb goes into pskb_expand_head()
> 3. skb_release_data() will be called for the original skb
>
> But with the dataref bumped, we'll skip the recycling for it but we'll also
> skip recycling or unmapping the current head (which is a page_pool mapped
> buffer)

Right, but in that case it is the clone that is left holding the
original head and the skb->pp_recycle flag is set on the clone as it
was copied from the original when we cloned it. What we have
essentially done is transferred the responsibility for freeing it from
the original to the clone.

If you think about it the result is the same as if step 2 was to go
into kfree_skb. We would still be calling skb_release_data and the
dataref would be decremented without the original freeing the page. We
have to wait until all the clones are freed and dataref reaches 0
before the head can be recycled.