linux-kernel - Re: [net-next PATCH v4 7/7] net: ravb: Allocate RX buffers via page pool

Message-ID: <a59baa2d-0f00-18b9-bdea-0206b7a93f52@omp.ru>
Date: Fri, 31 May 2024 20:25:33 +0300
From: Sergey Shtylyov <s.shtylyov@....ru>
To: Paul Barker <paul.barker.ct@...renesas.com>, "David S. Miller"
	<davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski
	<kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Niklas Söderlund <niklas.soderlund+renesas@...natech.se>
CC: Biju Das <biju.das.jz@...renesas.com>, Claudiu Beznea
	<claudiu.beznea.uj@...renesas.com>, Yoshihiro Shimoda
	<yoshihiro.shimoda.uh@...esas.com>, <netdev@...r.kernel.org>,
	<linux-renesas-soc@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [net-next PATCH v4 7/7] net: ravb: Allocate RX buffers via page pool

On 5/30/24 12:21 PM, Paul Barker wrote:
[...]

>>> This patch makes multiple changes that can't be separated:
>>>
>>>   1) Allocate plain RX buffers via a page pool instead of allocating
>>>      SKBs, then use build_skb() when a packet is received.
>>>   2) For GbEth IP, reduce the RX buffer size to 2kB.
>>>   3) For GbEth IP, merge packets which span more than one RX descriptor
>>>      as SKB fragments instead of copying data.
>>>
>>> Implementing (1) without (2) would require the use of an order-1 page
>>> pool (instead of an order-0 page pool split into page fragments) for
>>> GbEth.
>>>
>>> Implementing (2) without (3) would leave us no space to re-assemble
>>> packets which span more than one RX descriptor.
>>>
>>> Implementing (3) without (1) would not be possible as the network stack
>>> expects to use put_page() or page_pool_put_page() to free SKB fragments
>>> after an SKB is consumed.
>>>
>>> RX checksum offload support is adjusted to handle both linear and
>>> nonlinear (fragmented) packets.
>>>
>>> This patch gives the following improvements during testing with iperf3.
>>>
>>>   * RZ/G2L:
>>>     * TCP RX: same bandwidth at -43% CPU load (70% -> 40%)
>>>     * UDP RX: same bandwidth at -17% CPU load (88% -> 74%)
>>>
>>>   * RZ/G2UL:
>>>     * TCP RX: +30% bandwidth (726Mbps -> 941Mbps)
>>>     * UDP RX: +417% bandwidth (108Mbps -> 558Mbps)
>>>
>>>   * RZ/G3S:
>>>     * TCP RX: +64% bandwidth (562Mbps -> 920Mbps)
>>>     * UDP RX: +420% bandwidth (90Mbps -> 468Mbps)
>>>
>>>   * RZ/Five:
>>>     * TCP RX: +217% bandwidth (145Mbps -> 459Mbps)
>>>     * UDP RX: +470% bandwidth (20Mbps -> 114Mbps)
>>>
>>> There is no significant impact on bandwidth or CPU load in testing on
>>> RZ/G2H or R-Car M3N.
>>>
>>> Signed-off-by: Paul Barker <paul.barker.ct@...renesas.com>
[...]
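
Editorial illustration, not part of the patch or of this reply: the point about
an order-0 page pool split into page fragments can be pictured with a small
userspace sketch. It assumes a 4096-byte page and uses the 2kB buffer size from
the commit message. Every name here (sketch_page, sketch_frag_alloc, and so on)
is hypothetical; the driver itself uses the kernel page pool API, which this
code does not call or reproduce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE   4096u /* assumption: order-0 page is 4kB */
#define SKETCH_RX_BUF_SIZE 2048u /* 2kB RX buffer, per the commit message */

/* One backing page shared by several RX buffers (hypothetical type). */
struct sketch_page {
	unsigned char *mem;       /* backing storage, NULL once released */
	unsigned int next_offset; /* where the next fragment starts */
	unsigned int refs;        /* fragments still in use */
};

/* One RX buffer: a fixed-size slice of a page. */
struct sketch_frag {
	struct sketch_page *page;
	unsigned int offset;
};

static int sketch_page_init(struct sketch_page *p)
{
	p->mem = malloc(SKETCH_PAGE_SIZE);
	p->next_offset = 0;
	p->refs = 0;
	return p->mem ? 0 : -1;
}

/* Hand out the next 2kB slice, or fail once the page is used up. */
static int sketch_frag_alloc(struct sketch_page *p, struct sketch_frag *f)
{
	if (p->next_offset + SKETCH_RX_BUF_SIZE > SKETCH_PAGE_SIZE)
		return -1;

	f->page = p;
	f->offset = p->next_offset;
	p->next_offset += SKETCH_RX_BUF_SIZE;
	p->refs++;
	return 0;
}

/* Drop one fragment; the page is released when the last one goes. */
static void sketch_frag_put(struct sketch_frag *f)
{
	if (--f->page->refs == 0) {
		free(f->page->mem);
		f->page->mem = NULL;
	}
}
```

With these sizes a page yields exactly two buffers. A buffer larger than 2kB
would not fit twice into 4kB, which is the situation the commit message
describes as requiring an order-1 pool.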

>>> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
>>> index dd92f074881a..bb7f7d44be6e 100644
>>> --- a/drivers/net/ethernet/renesas/ravb_main.c
>>> +++ b/drivers/net/ethernet/renesas/ravb_main.c
[...]
>>> +	return 0;
>>> +}
>>> +
>>>  static u32
>>>  ravb_rx_ring_refill(struct net_device *ndev, int q, u32 count, gfp_t gfp_mask)
>>>  {
>>>  	struct ravb_private *priv = netdev_priv(ndev);
>>> -	const struct ravb_hw_info *info = priv->info;
>>>  	struct ravb_rx_desc *rx_desc;
>>> -	dma_addr_t dma_addr;
>>>  	u32 i, entry;
>>>  
>>>  	for (i = 0; i < count; i++) {
>>>  		entry = (priv->dirty_rx[q] + i) % priv->num_rx_ring[q];
>>>  		rx_desc = ravb_rx_get_desc(priv, q, entry);
>>> -		rx_desc->ds_cc = cpu_to_le16(info->rx_max_desc_use);
>>>  
>>> -		if (!priv->rx_skb[q][entry]) {
>>> -			priv->rx_skb[q][entry] = ravb_alloc_skb(ndev, info, gfp_mask);
>>> -			if (!priv->rx_skb[q][entry])
>>> +		if (!priv->rx_buffers[q][entry].page) {
>>> +			if (unlikely(ravb_alloc_rx_buffer(ndev, q, entry,
>>
>>    Well, IIRC Greg KH is against using unlikely() unless you have actually
>> instrumented the code and this gives an improvement... have you? :-)
> 
> My understanding was that we should use unlikely() for error checking in
> hot code paths where we want the "good" path to be optimised. I can drop
> this if I'm wrong though.

   OK, keep it... :-)
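
Editorial illustration, not part of the patch: the refill pattern under
discussion can be sketched in userspace C. sketch_unlikely() stands in for the
kernel's unlikely(), which expands to __builtin_expect(!!(x), 0) on GCC and
Clang. The ring type, the allocator and the "return how many entries were
processed" convention are assumptions made for this sketch, not code taken
from ravb_main.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's unlikely() hint (GCC/Clang builtin). */
#define sketch_unlikely(x) __builtin_expect(!!(x), 0)

#define SKETCH_RING_SIZE 8u

/* Hypothetical RX ring: tracks which entries already own a buffer. */
struct sketch_ring {
	bool filled[SKETCH_RING_SIZE];
	unsigned int dirty;   /* first entry that needs refilling */
	unsigned int fail_at; /* entry whose allocation fails; >= ring size: never */
};

/* Pretend allocator: returns 0 on success, -1 on the configured failure. */
static int sketch_alloc(struct sketch_ring *r, unsigned int entry)
{
	if (entry == r->fail_at)
		return -1;
	r->filled[entry] = true;
	return 0;
}

/* Refill up to count entries starting at dirty, wrapping around the ring.
 * Returns the number of entries processed before any allocation failure.
 */
static unsigned int sketch_refill(struct sketch_ring *r, unsigned int count)
{
	unsigned int i, entry;

	for (i = 0; i < count; i++) {
		entry = (r->dirty + i) % SKETCH_RING_SIZE;

		if (!r->filled[entry]) {
			/* Failure is the rare case, so hint the compiler to
			 * lay out the success path as the fall-through.
			 */
			if (sketch_unlikely(sketch_alloc(r, entry)))
				break;
		}
	}

	return i;
}
```

The hint only affects code layout and static branch prediction; behaviour is
identical without it, which is why measuring before adding it is a reasonable
request.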

[...]
>>> @@ -865,7 +894,16 @@ static int ravb_rx_gbeth(struct net_device *ndev, int budget, int q)
>>>  				stats->rx_bytes += skb->len;
>>>  				napi_gro_receive(&priv->napi[q], skb);
>>>  				rx_packets++;
>>> +
>>> +				/* Clear rx_1st_skb so that it will only be
>>> +				 * non-NULL when valid.
>>> +				 */
>>> +				if (die_dt == DT_FEND)
>>> +					priv->rx_1st_skb = NULL;
>>
>>    Hm, can't we do this under *case* DT_FEND above?
> 
> It makes more logical sense to me to do this as the last step, but I
> guess it's a little more optimal to do it earlier. I'll move it.

   Looking at it once more, we can't... unless I'm missing something. :-)

> Thanks,

MBR, Sergey