[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87ecpczt8q.fsf@redhat.com>
Date: Tue, 02 Dec 2025 18:30:29 +0100
From: Paolo Valerio <pvalerio@...hat.com>
To: Théo Lebrun <theo.lebrun@...tlin.com>,
netdev@...r.kernel.org
Cc: Nicolas Ferre <nicolas.ferre@...rochip.com>, Claudiu Beznea
<claudiu.beznea@...on.dev>, Andrew Lunn <andrew+netdev@...n.ch>, "David
S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Jakub
Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Lorenzo
Bianconi <lorenzo@...nel.org>, Grégory Clement
<gregory.clement@...tlin.com>, Benoît Monin
<benoit.monin@...tlin.com>, Thomas Petazzoni
<thomas.petazzoni@...tlin.com>
Subject: Re: [PATCH RFC net-next 1/6] cadence: macb/gem: Add page pool support
On 27 Nov 2025 at 02:21:52 PM, Théo Lebrun <theo.lebrun@...tlin.com> wrote:
> Hello Paolo,
>
> On Wed Nov 19, 2025 at 2:53 PM CET, Paolo Valerio wrote:
>> Subject: [PATCH RFC net-next 1/6] cadence: macb/gem: Add page pool support
>
> `git log --oneline drivers/net/ethernet/cadence/` tells us all commits
> in this series should be prefixed with "net: macb: ".
>
ack
>> diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
>> index 87414a2ddf6e..dcf768bd1bc1 100644
>> --- a/drivers/net/ethernet/cadence/macb.h
>> +++ b/drivers/net/ethernet/cadence/macb.h
>> @@ -14,6 +14,8 @@
>> #include <linux/interrupt.h>
>> #include <linux/phy/phy.h>
>> #include <linux/workqueue.h>
>> +#include <net/page_pool/helpers.h>
>> +#include <net/xdp.h>
>>
>> #define MACB_GREGS_NBR 16
>> #define MACB_GREGS_VERSION 2
>> @@ -957,6 +959,10 @@ struct macb_dma_desc_ptp {
>> /* Scaled PPM fraction */
>> #define PPM_FRACTION 16
>>
>> +/* The buf includes headroom compatible with both skb and xdpf */
>> +#define MACB_PP_HEADROOM XDP_PACKET_HEADROOM
>> +#define MACB_PP_MAX_BUF_SIZE(num) (((num) * PAGE_SIZE) - MACB_PP_HEADROOM)
>
> From my previous review, you know I think MACB_PP_MAX_BUF_SIZE() should
> disappear.
>
> I also don't see the point of MACB_PP_HEADROOM. Maybe if it was
> max(XDP_PACKET_HEADROOM, NET_SKB_PAD) it would be more useful, but that
> isn't useful anyway. Can we drop it and use XDP_PACKET_HEADROOM directly
> in the code?
>
sure
>> /* struct macb_tx_skb - data about an skb which is being transmitted
>> * @skb: skb currently being transmitted, only set for the last buffer
>> * of the frame
>> @@ -1262,10 +1268,11 @@ struct macb_queue {
>> unsigned int rx_tail;
>> unsigned int rx_prepared_head;
>> struct macb_dma_desc *rx_ring;
>> - struct sk_buff **rx_skbuff;
>> + void **rx_buff;
>
> It would help review if the s/rx_skbuff/rx_buff/ renaming was done in a
> separate commit with a commit message being "this only renames X and
> implies no functional changes".
>
ack, will do
>> void *rx_buffers;
>> struct napi_struct napi_rx;
>> struct queue_stats stats;
>> + struct page_pool *page_pool;
>> };
>>
>> struct ethtool_rx_fs_item {
>> @@ -1289,6 +1296,7 @@ struct macb {
>> struct macb_dma_desc *rx_ring_tieoff;
>> dma_addr_t rx_ring_tieoff_dma;
>> size_t rx_buffer_size;
>> + u16 rx_offset;
>
> u16 makes me worried that we might do mistakes. For example the
> following propagates the u16 type.
>
> bp->rx_offset + data - page_address(page)
>
> We can spare the additional 6 bytes and turn it into a size_t. It'll
> fall in holes anyway, at least it does for my target according to pahole.
>
will do
>> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
>> index e461f5072884..985c81913ba6 100644
>> --- a/drivers/net/ethernet/cadence/macb_main.c
>> +++ b/drivers/net/ethernet/cadence/macb_main.c
>> @@ -1250,11 +1250,28 @@ static int macb_tx_complete(struct macb_queue *queue, int budget)
>> return packets;
>> }
>>
>> -static void gem_rx_refill(struct macb_queue *queue)
>> +static void *gem_page_pool_get_buff(struct page_pool *pool,
>> + dma_addr_t *dma_addr, gfp_t gfp_mask)
>> +{
>> + struct page *page;
>> +
>> + if (!pool)
>> + return NULL;
>> +
>> + page = page_pool_alloc_pages(pool, gfp_mask | __GFP_NOWARN);
>> + if (!page)
>> + return NULL;
>> +
>> + *dma_addr = page_pool_get_dma_addr(page) + MACB_PP_HEADROOM;
>> +
>> + return page_address(page);
>> +}
>
> Do we need a separate function called from only one location? Or we
> could change its name to highlight it allocates and does not just "get"
> a buffer.
>
I guess it's fine to open-code this.
>> @@ -1267,25 +1284,17 @@ static void gem_rx_refill(struct macb_queue *queue)
>>
>> desc = macb_rx_desc(queue, entry);
>>
>> - if (!queue->rx_skbuff[entry]) {
>> + if (!queue->rx_buff[entry]) {
>> /* allocate sk_buff for this free entry in ring */
>> - skb = netdev_alloc_skb(bp->dev, bp->rx_buffer_size);
>> - if (unlikely(!skb)) {
>> + data = gem_page_pool_get_buff(queue->page_pool, &paddr,
>> + napi ? GFP_ATOMIC : GFP_KERNEL);
>
> I don't get why the gfp flags computation is spread across
> gem_page_pool_get_buff() and gem_rx_refill().
>
Not sure I get the point here.
This will be open-coded, so atomic vs non-atomic flag will not be passed
to gem_page_pool_get_buff() anymore after the change.
Not sure this answer your question.
>> @@ -1349,12 +1344,16 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
>> int budget)
>> {
>> struct macb *bp = queue->bp;
>> + int buffer_size;
>> unsigned int len;
>> unsigned int entry;
>> + void *data;
>> struct sk_buff *skb;
>> struct macb_dma_desc *desc;
>> int count = 0;
>>
>> + buffer_size = DIV_ROUND_UP(bp->rx_buffer_size, PAGE_SIZE) * PAGE_SIZE;
>
> This looks like
>
> buffer_size = ALIGN(bp->rx_buffer_size, PAGE_SIZE);
>
> no? Anyway I think it should be dropped. It does get dropped next patch
> in this RFC.
>
will proceed squashing the two patches
>> @@ -1387,24 +1386,49 @@ static int gem_rx(struct macb_queue *queue, struct napi_struct *napi,
>> queue->stats.rx_dropped++;
>> break;
>> }
>> - skb = queue->rx_skbuff[entry];
>> - if (unlikely(!skb)) {
>> + data = queue->rx_buff[entry];
>> + if (unlikely(!data)) {
>> netdev_err(bp->dev,
>> "inconsistent Rx descriptor chain\n");
>> bp->dev->stats.rx_dropped++;
>> queue->stats.rx_dropped++;
>> break;
>> }
>> +
>> + skb = napi_build_skb(data, buffer_size);
>> + if (unlikely(!skb)) {
>> + netdev_err(bp->dev,
>> + "Unable to allocate sk_buff\n");
>> + page_pool_put_full_page(queue->page_pool,
>> + virt_to_head_page(data),
>> + false);
>> + break;
>
> We should `queue->rx_skbuff[entry] = NULL` here no?
> We free a page and keep a pointer to it.
>
This will be squashed, but the point is still valid as it's the same as
the other patch
>> + }
>> +
>> + /* Properly align Ethernet header.
>> + *
>> + * Hardware can add dummy bytes if asked using the RBOF
>> + * field inside the NCFGR register. That feature isn't
>> + * available if hardware is RSC capable.
>> + *
>> + * We cannot fallback to doing the 2-byte shift before
>> + * DMA mapping because the address field does not allow
>> + * setting the low 2/3 bits.
>> + * It is 3 bits if HW_DMA_CAP_PTP, else 2 bits.
>> + */
>> + skb_reserve(skb, bp->rx_offset);
>> + skb_mark_for_recycle(skb);
>
> I have a platform with RSC support and NET_IP_ALIGN=2. What is yours
> like? It'd be nice if we can test different cases of this RBOF topic.
>
my caps:
macb 1f00100000.ethernet: Cadence caps 0x00548061
Bit RSC looks unset and NET_IP_ALIGN is 0 in my case.
>> /* now everything is ready for receiving packet */
>> - queue->rx_skbuff[entry] = NULL;
>> + queue->rx_buff[entry] = NULL;
>> len = ctrl & bp->rx_frm_len_mask;
>>
>> netdev_vdbg(bp->dev, "gem_rx %u (len %u)\n", entry, len);
>>
>> + dma_sync_single_for_cpu(&bp->pdev->dev,
>> + addr, len,
>> + page_pool_get_dma_dir(queue->page_pool));
>
> Any reason for the call to dma_sync_single_for_cpu(), we could hardcode
> it to DMA_FROM_DEVICE no?
>
I saw in the other reply you found the answer
>> @@ -2477,13 +2497,13 @@ static int gem_alloc_rx_buffers(struct macb *bp)
>>
>> for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
>> size = bp->rx_ring_size * sizeof(struct sk_buff *);
>
> sizeof is called with the wrong type. Not that it matters because
> pointers are pointers, but we can take this opportunity to move to
>
> sizeof(*queue->rx_buff)
>
definitely
>> - queue->rx_skbuff = kzalloc(size, GFP_KERNEL);
>> - if (!queue->rx_skbuff)
>> + queue->rx_buff = kzalloc(size, GFP_KERNEL);
>> + if (!queue->rx_buff)
>> return -ENOMEM;
>> else
>> netdev_dbg(bp->dev,
>> - "Allocated %d RX struct sk_buff entries at %p\n",
>> - bp->rx_ring_size, queue->rx_skbuff);
>> + "Allocated %d RX buff entries at %p\n",
>> + bp->rx_ring_size, queue->rx_buff);
>
> Opportunity to deindent this block?
>
ack
> Thanks,
>
> --
> Théo Lebrun, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com
Powered by blists - more mailing lists