[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7331d6d3f9044e386e425e89b1fc32d60b046cf3.camel@gmail.com>
Date: Tue, 04 Apr 2023 08:51:18 -0700
From: Alexander H Duyck <alexander.duyck@...il.com>
To: Liang Chen <liangchen.linux@...il.com>,
ilias.apalodimas@...aro.org, hawk@...nel.org
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, netdev@...r.kernel.org
Subject: Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
On Tue, 2023-04-04 at 15:47 +0800, Liang Chen wrote:
> Commit 1effe8ca4e34 allowed coalescing to proceed with non page pool page
> and page pool page when @from is cloned, i.e.
>
> to->pp_recycle --> false
> from->pp_recycle --> true
> skb_cloned(from) --> true
>
> However, it actually requires skb_cloned(@from) to hold true until
> coalescing finishes in this situation. If the other cloned SKB is
> released while the merging is in process, from_shinfo->nr_frags will be
> set to 0 toward the end of the function, causing the increment of frag
> page _refcount to be unexpectedly skipped resulting in inconsistent
> reference counts. Later when SKB(@to) is released, it frees the page
> directly even though the page pool page is still in use, leading to
> use-after-free or double-free errors. So it should be prohibitted.
>
> The double-free error message below prompted us to investigate:
> BUG: Bad page state in process swapper/1 pfn:0e0d1
> page:00000000c6548b28 refcount:-1 mapcount:0 mapping:0000000000000000
> index:0x2 pfn:0xe0d1
> flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
> raw: 000fffffc0000000 0000000000000000 ffffffff00000101 0000000000000000
> raw: 0000000000000002 0000000000000000 ffffffffffffffff 0000000000000000
> page dumped because: nonzero _refcount
>
> CPU: 1 PID: 0 Comm: swapper/1 Tainted: G E 6.2.0+
> Call Trace:
> <IRQ>
> dump_stack_lvl+0x32/0x50
> bad_page+0x69/0xf0
> free_pcp_prepare+0x260/0x2f0
> free_unref_page+0x20/0x1c0
> skb_release_data+0x10b/0x1a0
> napi_consume_skb+0x56/0x150
> net_rx_action+0xf0/0x350
> ? __napi_schedule+0x79/0x90
> __do_softirq+0xc8/0x2b1
> __irq_exit_rcu+0xb9/0xf0
> common_interrupt+0x82/0xa0
> </IRQ>
> <TASK>
> asm_common_interrupt+0x22/0x40
> RIP: 0010:default_idle+0xb/0x20
>
> Signed-off-by: Liang Chen <liangchen.linux@...il.com>
> ---
> net/core/skbuff.c | 17 +++++++----------
> 1 file changed, 7 insertions(+), 10 deletions(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 050a875d09c5..9be23ece5f03 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -5598,17 +5598,14 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
> return false;
>
> /* In general, avoid mixing slab allocated and page_pool allocated
> - * pages within the same SKB. However when @to is not pp_recycle and
> - * @from is cloned, we can transition frag pages from page_pool to
> - * reference counted.
> - *
> - * On the other hand, don't allow coalescing two pp_recycle SKBs if
> - * @from is cloned, in case the SKB is using page_pool fragment
> - * references (PP_FLAG_PAGE_FRAG). Since we only take full page
> - * references for cloned SKBs at the moment that would result in
> - * inconsistent reference counts.
> + * pages within the same SKB. However don't allow coalescing two
> + * pp_recycle SKBs if @from is cloned, in case the SKB is using
> + * page_pool fragment references (PP_FLAG_PAGE_FRAG). Since we only
> + * take full page references for cloned SKBs at the moment that would
> + * result in inconsistent reference counts.
> */
> - if (to->pp_recycle != (from->pp_recycle && !skb_cloned(from)))
> + if ((to->pp_recycle != from->pp_recycle)
> + || (from->pp_recycle && skb_cloned(from)))
> return false;
>
> if (len <= skb_tailroom(to)) {
I'm not quite sure I agree with the fix. Couldn't we just modify the
check further down that does:
if (!skb_cloned(from))
from_shinfo->nr_frags = 0;
And instead just make that:
if (!skb->cloned || (!skb_cloned(from) && !from->pp_recycle))
from_shinfo->nr_frags = 0;
With that we would retain the existing behavior and in the case of
cloned from frames we would take the references and let the original
from skb freed to take care of pulling the pages from the page pool.
Powered by blists - more mailing lists