Message-ID: <CAF=yD-KBc=1SvpLET_NKjdaCTUP4r6P9hRU8QteBkw6W3qeP_A@mail.gmail.com>
Date: Fri, 29 Jan 2021 17:11:20 -0500
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Sven Van Asbroeck <thesven73@...il.com>
Cc: Bryan Whitehead <bryan.whitehead@...rochip.com>,
UNGLinuxDriver@...rochip.com, David S Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Andrew Lunn <andrew@...n.ch>,
Alexey Denisov <rtgbnm@...il.com>,
Sergej Bauer <sbauer@...ckbox.su>,
Tim Harvey <tharvey@...eworks.com>,
Anders Rønningen <anders@...ningen.priv.no>,
Network Development <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets
On Fri, Jan 29, 2021 at 2:56 PM Sven Van Asbroeck <thesven73@...il.com> wrote:
>
> From: Sven Van Asbroeck <thesven73@...il.com>
>
> Multi-buffer packets enable us to use rx ring buffers smaller than
> the mtu. This will allow us to change the mtu on-the-fly, without
> having to stop the network interface in order to re-size the rx
> ring buffers.
>
> This is a big change touching a key driver function (process_packet),
> so care has been taken to test this extensively:
>
> Tests with debug logging enabled (add #define DEBUG).
>
> 1. Limit rx buffer size to 500, so mtu (1500) takes 3 buffers.
> Ping to chip, verify correct packet size is sent to OS.
> Ping large packets to chip (ping -s 1400), verify correct
> packet size is sent to OS.
> Ping using packets around the buffer size, verify number of
> buffers is changing, verify correct packet size is sent
> to OS:
> $ ping -s 472
> $ ping -s 473
> $ ping -s 992
> $ ping -s 993
> Verify that each packet is followed by extension processing.
>
> 2. Limit rx buffer size to 500, so mtu (1500) takes 3 buffers.
> Run iperf3 -s on chip, verify that packets come in 3 buffers
> at a time.
> Verify that packet size is equal to mtu.
> Verify that each packet is followed by extension processing.
>
> 3. Set chip and host mtu to 2000.
> Limit rx buffer size to 500, so mtu (2000) takes 4 buffers.
> Run iperf3 -s on chip, verify that packets come in 4 buffers
> at a time.
> Verify that packet size is equal to mtu.
> Verify that each packet is followed by extension processing.
>
> Tests with debug logging DISabled (remove #define DEBUG).
>
> 4. Limit rx buffer size to 500, so mtu (1500) takes 3 buffers.
> Run iperf3 -s on chip, note sustained rx speed.
> Set chip and host mtu to 2000, so mtu takes 4 buffers.
> Run iperf3 -s on chip, note sustained rx speed.
> Verify no packets are dropped in both cases.
>
> Tests with DEBUG_KMEMLEAK on:
> $ mount -t debugfs nodev /sys/kernel/debug/
> $ echo scan > /sys/kernel/debug/kmemleak
>
> 5. Limit rx buffer size to 500, so mtu (1500) takes 3 buffers.
> Run the following tests concurrently for at least one hour:
> - iperf3 -s on chip
> - ping -> chip
> Monitor reported memory leaks.
>
> 6. Set chip and host mtu to 2000.
> Limit rx buffer size to 500, so mtu (2000) takes 4 buffers.
> Run the following tests concurrently for at least one hour:
> - iperf3 -s on chip
> - ping -> chip
> Monitor reported memory leaks.
>
> 7. Simulate low-memory in lan743x_rx_allocate_skb(): fail every
> 100 allocations.
> Repeat (5) and (6).
> Monitor reported memory leaks.
>
> 8. Simulate low-memory in lan743x_rx_allocate_skb(): fail 10
> allocations in a row in every 100.
> Repeat (5) and (6).
> Monitor reported memory leaks.
>
> 9. Simulate low-memory in lan743x_rx_trim_skb(): fail 1 allocation
> in every 100.
> Repeat (5) and (6).
> Monitor reported memory leaks.
>
> Signed-off-by: Sven Van Asbroeck <thesven73@...il.com>
> ---
>
> Tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git # 46eb3c108fe1
>
> To: Bryan Whitehead <bryan.whitehead@...rochip.com>
> To: UNGLinuxDriver@...rochip.com
> To: "David S. Miller" <davem@...emloft.net>
> To: Jakub Kicinski <kuba@...nel.org>
> Cc: Andrew Lunn <andrew@...n.ch>
> Cc: Alexey Denisov <rtgbnm@...il.com>
> Cc: Sergej Bauer <sbauer@...ckbox.su>
> Cc: Tim Harvey <tharvey@...eworks.com>
> Cc: Anders Rønningen <anders@...ningen.priv.no>
> Cc: netdev@...r.kernel.org
> Cc: linux-kernel@...r.kernel.org (open list)
>
> drivers/net/ethernet/microchip/lan743x_main.c | 321 ++++++++----------
> drivers/net/ethernet/microchip/lan743x_main.h | 2 +
> 2 files changed, 143 insertions(+), 180 deletions(-)
> +static struct sk_buff *
> +lan743x_rx_trim_skb(struct sk_buff *skb, int frame_length)
> +{
> + if (skb_linearize(skb)) {
Is this needed? That will be quite expensive.
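If the goal is just to drop the head padding and FCS from the tail, pskb_trim()
already walks the frag_list, so the copy could perhaps be avoided entirely.
A rough, untested sketch (same function, minus the linearize):

```c
/* Untested sketch: trim a possibly non-linear skb without flattening
 * it first. pskb_trim() handles paged data and frag_list skbs itself,
 * so skb_linearize() and the manual tail/len adjustment can go.
 */
static struct sk_buff *
lan743x_rx_trim_skb(struct sk_buff *skb, int frame_length)
{
	frame_length = max_t(int, 0, frame_length - RX_HEAD_PADDING - 2);
	if (pskb_trim(skb, frame_length)) {
		dev_kfree_skb_irq(skb);
		return NULL;
	}
	return skb;
}
```

pskb_trim() can still allocate (and thus fail) for cloned skbs, so the
error path is kept.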
> + dev_kfree_skb_irq(skb);
> + return NULL;
> + }
> + frame_length = max_t(int, 0, frame_length - RX_HEAD_PADDING - 2);
> + if (skb->len > frame_length) {
> + skb->tail -= skb->len - frame_length;
> + skb->len = frame_length;
> + }
> + return skb;
> +}
> +
> static int lan743x_rx_process_packet(struct lan743x_rx *rx)
> {
> - struct skb_shared_hwtstamps *hwtstamps = NULL;
> + struct lan743x_rx_descriptor *descriptor, *desc_ext;
> int result = RX_PROCESS_RESULT_NOTHING_TO_DO;
> int current_head_index = le32_to_cpu(*rx->head_cpu_ptr);
> struct lan743x_rx_buffer_info *buffer_info;
> - struct lan743x_rx_descriptor *descriptor;
> + struct skb_shared_hwtstamps *hwtstamps;
> + int frame_length, buffer_length;
> + struct sk_buff *skb;
> int extension_index = -1;
> - int first_index = -1;
> - int last_index = -1;
> + bool is_last, is_first;
>
> if (current_head_index < 0 || current_head_index >= rx->ring_size)
> goto done;
> @@ -2068,170 +2075,126 @@ static int lan743x_rx_process_packet(struct lan743x_rx *rx)
> if (rx->last_head < 0 || rx->last_head >= rx->ring_size)
> goto done;
>
> - if (rx->last_head != current_head_index) {
> - descriptor = &rx->ring_cpu_ptr[rx->last_head];
> - if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_OWN_)
> - goto done;
> + if (rx->last_head == current_head_index)
> + goto done;
Is it possible to avoid the large indentation change, or else do that
in a separate patch? It makes it harder to follow the functional
change.
>
> - if (!(le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_FS_))
> - goto done;
> + descriptor = &rx->ring_cpu_ptr[rx->last_head];
> + if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_OWN_)
> + goto done;
> + buffer_info = &rx->buffer_info[rx->last_head];
>
> - first_index = rx->last_head;
> - if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_LS_) {
> - last_index = rx->last_head;
> - } else {
> - int index;
> -
> - index = lan743x_rx_next_index(rx, first_index);
> - while (index != current_head_index) {
> - descriptor = &rx->ring_cpu_ptr[index];
> - if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_OWN_)
> - goto done;
> -
> - if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_LS_) {
> - last_index = index;
> - break;
> - }
> - index = lan743x_rx_next_index(rx, index);
> - }
> - }
> - if (last_index >= 0) {
> - descriptor = &rx->ring_cpu_ptr[last_index];
> - if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_EXT_) {
> - /* extension is expected to follow */
> - int index = lan743x_rx_next_index(rx,
> - last_index);
> - if (index != current_head_index) {
> - descriptor = &rx->ring_cpu_ptr[index];
> - if (le32_to_cpu(descriptor->data0) &
> - RX_DESC_DATA0_OWN_) {
> - goto done;
> - }
> - if (le32_to_cpu(descriptor->data0) &
> - RX_DESC_DATA0_EXT_) {
> - extension_index = index;
> - } else {
> - goto done;
> - }
> - } else {
> - /* extension is not yet available */
> - /* prevent processing of this packet */
> - first_index = -1;
> - last_index = -1;
> - }
> - }
> - }
> + is_last = le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_LS_;
> + is_first = le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_FS_;
> +
> + if (is_last && le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_EXT_) {
> + /* extension is expected to follow */
> + int index = lan743x_rx_next_index(rx, rx->last_head);
> +
> + if (index == current_head_index)
> + /* extension not yet available */
> + goto done;
> + desc_ext = &rx->ring_cpu_ptr[index];
> + if (le32_to_cpu(desc_ext->data0) & RX_DESC_DATA0_OWN_)
> + /* extension not yet available */
> + goto done;
> + if (!(le32_to_cpu(desc_ext->data0) & RX_DESC_DATA0_EXT_))
> + goto move_forward;
> + extension_index = index;
> }
> - if (first_index >= 0 && last_index >= 0) {
> - int real_last_index = last_index;
> - struct sk_buff *skb = NULL;
> - u32 ts_sec = 0;
> - u32 ts_nsec = 0;
> -
> - /* packet is available */
> - if (first_index == last_index) {
> - /* single buffer packet */
> - struct sk_buff *new_skb = NULL;
> - int packet_length;
> -
> - new_skb = lan743x_rx_allocate_skb(rx);
> - if (!new_skb) {
> - /* failed to allocate next skb.
> - * Memory is very low.
> - * Drop this packet and reuse buffer.
> - */
> - lan743x_rx_reuse_ring_element(rx, first_index);
> - goto process_extension;
> - }
>
> - buffer_info = &rx->buffer_info[first_index];
> - skb = buffer_info->skb;
> - descriptor = &rx->ring_cpu_ptr[first_index];
> -
> - /* unmap from dma */
> - packet_length = RX_DESC_DATA0_FRAME_LENGTH_GET_
> - (descriptor->data0);
> - if (buffer_info->dma_ptr) {
> - dma_sync_single_for_cpu(&rx->adapter->pdev->dev,
> - buffer_info->dma_ptr,
> - packet_length,
> - DMA_FROM_DEVICE);
> - dma_unmap_single_attrs(&rx->adapter->pdev->dev,
> - buffer_info->dma_ptr,
> - buffer_info->buffer_length,
> - DMA_FROM_DEVICE,
> - DMA_ATTR_SKIP_CPU_SYNC);
> - buffer_info->dma_ptr = 0;
> - buffer_info->buffer_length = 0;
> - }
> - buffer_info->skb = NULL;
> - packet_length = RX_DESC_DATA0_FRAME_LENGTH_GET_
> - (le32_to_cpu(descriptor->data0));
> - skb_put(skb, packet_length - 4);
> - skb->protocol = eth_type_trans(skb,
> - rx->adapter->netdev);
> - lan743x_rx_init_ring_element(rx, first_index, new_skb);
> - } else {
> - int index = first_index;
> + /* Only the last buffer in a multi-buffer frame contains the total frame
> + * length. All other buffers have a zero frame length. The chip
> + * occasionally sends more buffers than strictly required to reach the
> + * total frame length.
> + * Handle this by adding all buffers to the skb in their entirety.
> + * Once the real frame length is known, trim the skb.
> + */
> + frame_length =
> + RX_DESC_DATA0_FRAME_LENGTH_GET_(le32_to_cpu(descriptor->data0));
> + buffer_length = buffer_info->buffer_length;
>
> - /* multi buffer packet not supported */
> - /* this should not happen since buffers are allocated
> - * to be at least the mtu size configured in the mac.
> - */
> + netdev_dbg(rx->adapter->netdev, "%s%schunk: %d/%d",
> + is_first ? "first " : " ",
> + is_last ? "last " : " ",
> + frame_length, buffer_length);
>
> - /* clean up buffers */
> - if (first_index <= last_index) {
> - while ((index >= first_index) &&
> - (index <= last_index)) {
> - lan743x_rx_reuse_ring_element(rx,
> - index);
> - index = lan743x_rx_next_index(rx,
> - index);
> - }
> - } else {
> - while ((index >= first_index) ||
> - (index <= last_index)) {
> - lan743x_rx_reuse_ring_element(rx,
> - index);
> - index = lan743x_rx_next_index(rx,
> - index);
> - }
> - }
> - }
> + /* unmap from dma */
> + if (buffer_info->dma_ptr) {
> + dma_unmap_single(&rx->adapter->pdev->dev,
> + buffer_info->dma_ptr,
> + buffer_info->buffer_length,
> + DMA_FROM_DEVICE);
> + buffer_info->dma_ptr = 0;
> + buffer_info->buffer_length = 0;
> + }
> + skb = buffer_info->skb;
>
> -process_extension:
> - if (extension_index >= 0) {
> - descriptor = &rx->ring_cpu_ptr[extension_index];
> - buffer_info = &rx->buffer_info[extension_index];
> -
> - ts_sec = le32_to_cpu(descriptor->data1);
> - ts_nsec = (le32_to_cpu(descriptor->data2) &
> - RX_DESC_DATA2_TS_NS_MASK_);
> - lan743x_rx_reuse_ring_element(rx, extension_index);
> - real_last_index = extension_index;
> - }
> + /* allocate new skb and map to dma */
> + if (lan743x_rx_init_ring_element(rx, rx->last_head)) {
> + /* failed to allocate next skb.
> + * Memory is very low.
> + * Drop this packet and reuse buffer.
> + */
> + lan743x_rx_reuse_ring_element(rx, rx->last_head);
> + goto process_extension;
> + }
> +
> + /* add buffers to skb via skb->frag_list */
> + if (is_first) {
> + skb_reserve(skb, RX_HEAD_PADDING);
> + skb_put(skb, buffer_length - RX_HEAD_PADDING);
> + if (rx->skb_head)
> + dev_kfree_skb_irq(rx->skb_head);
> + rx->skb_head = skb;
> + } else if (rx->skb_head) {
> + skb_put(skb, buffer_length);
> + if (skb_shinfo(rx->skb_head)->frag_list)
> + rx->skb_tail->next = skb;
> + else
> + skb_shinfo(rx->skb_head)->frag_list = skb;
Instead of chaining skbs into frag_list, you could perhaps delay skb
alloc until after reception, allocate buffers stand-alone, and link
them into the skb as skb_frags? That might avoid a few skb allocs +
frees. Though it's a big change, not sure how feasible.
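Something along these lines (very rough sketch only; buffer_info->page,
buffer_info->page_offset and rx->napi are assumptions about how the ring
could be restructured, not current driver fields):

```c
/* Illustrative sketch of the page-frag idea: keep plain DMA-mapped
 * pages in the ring, allocate one skb per frame on the first buffer,
 * and attach each completed buffer as a page fragment instead of
 * chaining whole skbs via frag_list.
 */
if (is_first) {
	skb = napi_alloc_skb(&rx->napi, RX_HEAD_PADDING);
	if (!skb)
		goto drop;
	rx->skb_head = skb;
}
/* skb_add_rx_frag() bumps len, data_len and truesize for us */
skb_add_rx_frag(rx->skb_head, skb_shinfo(rx->skb_head)->nr_frags,
		buffer_info->page, buffer_info->page_offset,
		buffer_length, PAGE_SIZE);
```

That would also make the trim cheap, since the frame stays in frags
rather than a chain.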
> + rx->skb_tail = skb;
> + rx->skb_head->len += skb->len;
> + rx->skb_head->data_len += skb->len;
> + rx->skb_head->truesize += skb->truesize;
> + } else {
> + rx->skb_head = skb;
> + }
>
> - if (!skb) {
> - result = RX_PROCESS_RESULT_PACKET_DROPPED;
> - goto move_forward;
> +process_extension:
> + if (extension_index >= 0) {
> + u32 ts_sec;
> + u32 ts_nsec;
> +
> + ts_sec = le32_to_cpu(desc_ext->data1);
> + ts_nsec = (le32_to_cpu(desc_ext->data2) &
> + RX_DESC_DATA2_TS_NS_MASK_);
> + if (rx->skb_head) {
> + hwtstamps = skb_hwtstamps(rx->skb_head);
> + if (hwtstamps)
This is always true: skb_hwtstamps() just points into skb_shinfo(), so
the NULL check is dead code.
You can just call skb_hwtstamps(rx->skb_head)->hwtstamp = ktime_set(ts_sec, ts_nsec);
Though I see that this is existing code just moved due to
aforementioned indentation change.
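i.e. the extension block could shrink to something like:

```c
/* skb_hwtstamps() returns &skb_shinfo(skb)->hwtstamps and is never
 * NULL, so no check is needed before assigning the timestamp.
 */
if (rx->skb_head)
	skb_hwtstamps(rx->skb_head)->hwtstamp =
		ktime_set(ts_sec, ts_nsec);
```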