netdev - Re: [Intel-wired-lan] [PATCH RFC net-next v4 3/9] iavf: drop page splitting and recycling


Message-ID: <6310c483-8c6e-8d34-763a-487157f6ff0c@intel.com>
Date: Thu, 6 Jul 2023 18:45:14 +0200
From: Alexander Lobakin <aleksander.lobakin@...el.com>
To: Alexander Duyck <alexander.duyck@...il.com>
CC: "David S. Miller" <davem@...emloft.net>, Eric Dumazet
	<edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
	<pabeni@...hat.com>, Paul Menzel <pmenzel@...gen.mpg.de>, "Jesper Dangaard
 Brouer" <hawk@...nel.org>, Larysa Zaremba <larysa.zaremba@...el.com>,
	<netdev@...r.kernel.org>, Alexander Duyck <alexanderduyck@...com>, "Ilias
 Apalodimas" <ilias.apalodimas@...aro.org>, <linux-kernel@...r.kernel.org>,
	Yunsheng Lin <linyunsheng@...wei.com>, Michal Kubiak
	<michal.kubiak@...el.com>, <intel-wired-lan@...ts.osuosl.org>, "David
 Christensen" <drc@...ux.vnet.ibm.com>
Subject: Re: [Intel-wired-lan] [PATCH RFC net-next v4 3/9] iavf: drop page
 splitting and recycling

From: Alexander Duyck <alexander.duyck@...il.com>
Date: Thu, 6 Jul 2023 07:47:03 -0700

> On Wed, Jul 5, 2023 at 8:57 AM Alexander Lobakin
> <aleksander.lobakin@...el.com> wrote:
>>
>> As an intermediate step, remove all page splitting/recyclig code. Just
> 
> Spelling issue: "recycling"

checkpatch w/codespell didn't catch this one =\

> 
>> always allocate a new page and don't touch its refcount, so that it gets
>> freed by the core stack later.
>> Same for the "in-place" recycling, i.e. when an unused buffer gets
>> assigned to a first needs-refilling descriptor. In some cases, this
>> was leading to moving up to 63 &iavf_rx_buf structures around the ring
>> on a per-field basis -- not something wanted on hotpath.
>> The change allows to greatly simplify certain parts of the code:

[...]

>> @@ -1317,21 +1200,10 @@ static void iavf_put_rx_buffer(struct iavf_ring *rx_ring,
>>         if (!rx_buffer)
>>                 return;
>>
>> -       if (iavf_can_reuse_rx_page(rx_buffer)) {
>> -               /* hand second half of page back to the ring */
>> -               iavf_reuse_rx_page(rx_ring, rx_buffer);
>> -               rx_ring->rx_stats.page_reuse_count++;
>> -       } else {
>> -               /* we are not reusing the buffer so unmap it */
>> -               dma_unmap_page_attrs(rx_ring->dev, rx_buffer->dma,
>> -                                    iavf_rx_pg_size(rx_ring),
>> -                                    DMA_FROM_DEVICE, IAVF_RX_DMA_ATTR);
>> -               __page_frag_cache_drain(rx_buffer->page,
>> -                                       rx_buffer->pagecnt_bias);
>> -       }
>> -
>> -       /* clear contents of buffer_info */
>> -       rx_buffer->page = NULL;
>> +       /* we are not reusing the buffer so unmap it */
>> +       dma_unmap_page_attrs(rx_ring->dev, rx_buffer->dma,
>> +                            iavf_rx_pg_size(rx_ring),
>> +                            DMA_FROM_DEVICE, IAVF_RX_DMA_ATTR);
> 
> Rather than reorder all this I would just do the dma_unmap_page_attrs
> and then leave the assignment of NULL to rx_buffer->page. It should
> make this a bit easier to clean up the code below.
> 
>>  }
>>
>>  /**
>> @@ -1431,15 +1303,18 @@ static int iavf_clean_rx_irq(struct iavf_ring *rx_ring, int budget)
>>                 else
>>                         skb = iavf_build_skb(rx_ring, rx_buffer, size);
>>
>> +               iavf_put_rx_buffer(rx_ring, rx_buffer);
>> +
> 
> This should stay below where it was.

Wait-wait-wait.

The "if (!skb) break" path exits the loop, and put_rx_buffer() is what
unmaps the page. So to take that early exit without leaking the mapping,
the unmap has to happen before the check.
Or did you mean "why unmap and free if we fail, just leave it in
place"? This is to make it easier to switch to Page Pool.
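
To illustrate the ordering argument, here is a rough userspace model, not
driver code. The mock_* struct and helpers are hypothetical stand-ins for
the rx buffer, iavf_put_rx_buffer() and the __free_pages() call in the
failure path. It only tracks whether a buffer ends up freed while its
mapping is still live.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the rx buffer state, not the real struct. */
struct mock_rx_buf {
	bool mapped;	/* models a live DMA mapping */
	bool has_page;	/* models rx_buffer->page != NULL */
};

/* Models iavf_put_rx_buffer() as changed by the patch: unmap only. */
static void mock_put_rx_buffer(struct mock_rx_buf *buf)
{
	buf->mapped = false;
}

/* Models the failure path in the patch: free the page, clear the pointer. */
static void mock_free_page(struct mock_rx_buf *buf)
{
	buf->has_page = false;
}

/*
 * Process one descriptor.
 * unmap_first: true  = put_rx_buffer() runs before the skb check,
 *              false = put_rx_buffer() stays below the skb check.
 * skb_ok:      whether the skb was built successfully.
 * Returns true if the page was freed while still mapped.
 */
static bool mock_rx_leaks_mapping(bool unmap_first, bool skb_ok)
{
	struct mock_rx_buf buf = { .mapped = true, .has_page = true };

	if (unmap_first)
		mock_put_rx_buffer(&buf);

	if (!skb_ok) {
		/* early exit: the page is freed and the loop would break */
		mock_free_page(&buf);
		return buf.mapped && !buf.has_page;
	}

	if (!unmap_first)
		mock_put_rx_buffer(&buf);

	return false;
}
```

In this model, mock_rx_leaks_mapping(false, false) reports a leak, while
mock_rx_leaks_mapping(true, false) does not. That only holds as long as
the failure path frees the page.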

> 
>>                 /* exit if we failed to retrieve a buffer */
>>                 if (!skb) {
>>                         rx_ring->rx_stats.alloc_buff_failed++;
>> -                       if (rx_buffer && size)
>> -                               rx_buffer->pagecnt_bias++;
>> +                       __free_pages(rx_buffer->page,
>> +                                    iavf_rx_pg_order(rx_ring));
>> +                       rx_buffer->page = NULL;
>>                         break;
>>                 }
> 
> This code was undoing the iavf_get_rx_buffer decrement of pagecnt_bias
> and then bailing since we have halted forward progress due to an skb
> allocation failure. As such we should just be removing the if
> statement and the increment of pagecnt_bias.
> 
>>
>> -               iavf_put_rx_buffer(rx_ring, rx_buffer);
>> +               rx_buffer->page = NULL;
>>                 cleaned_count++;
>>
>>                 if (iavf_is_non_eop(rx_ring, rx_desc, skb))
> 
> If iavf_put_rx_buffer just does the unmap and assignment of NULL then
> it could just be left here as is.

I guess those two are tied with the one above.
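
If I read the suggestion right, the alternative would look roughly like
the userspace sketch below. The sketch_* names are hypothetical, and this
is my reading of the review comments, not code from the series:
put_rx_buffer() does the unmap plus the NULL assignment and stays below
the skb check, and a failed skb only bumps the counter and leaves the
buffer in place.

```c
#include <stdbool.h>

/* Hypothetical stand-ins, not the driver's real definitions. */
struct sketch_rx_buf {
	bool mapped;	/* models a live DMA mapping */
	bool has_page;	/* models rx_buffer->page != NULL */
};

struct sketch_rx_stats {
	unsigned int alloc_buff_failed;
};

/* Models put_rx_buffer() doing both the unmap and the NULL assignment. */
static void sketch_put_rx_buffer(struct sketch_rx_buf *buf)
{
	buf->mapped = false;
	buf->has_page = false;	/* the page now belongs to the skb */
}

/*
 * Process one descriptor. Returns true if the descriptor was consumed,
 * false if the loop would break and retry the same buffer later.
 */
static bool sketch_clean_one(struct sketch_rx_buf *buf,
			     struct sketch_rx_stats *stats, bool skb_ok)
{
	if (!skb_ok) {
		/* leave the buffer mapped and in place */
		stats->alloc_buff_failed++;
		return false;
	}

	sketch_put_rx_buffer(buf);
	return true;
}
```

Here nothing is unmapped or freed on failure, so the same buffer can be
picked up again on the next poll.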

[...]

Thanks,
Olek