lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAF=yD-KBc=1SvpLET_NKjdaCTUP4r6P9hRU8QteBkw6W3qeP_A@mail.gmail.com>
Date:   Fri, 29 Jan 2021 17:11:20 -0500
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Sven Van Asbroeck <thesven73@...il.com>
Cc:     Bryan Whitehead <bryan.whitehead@...rochip.com>,
        UNGLinuxDriver@...rochip.com, David S Miller <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>, Andrew Lunn <andrew@...n.ch>,
        Alexey Denisov <rtgbnm@...il.com>,
        Sergej Bauer <sbauer@...ckbox.su>,
        Tim Harvey <tharvey@...eworks.com>,
        Anders Rønningen <anders@...ningen.priv.no>,
        Network Development <netdev@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets

On Fri, Jan 29, 2021 at 2:56 PM Sven Van Asbroeck <thesven73@...il.com> wrote:
>
> From: Sven Van Asbroeck <thesven73@...il.com>
>
> Multi-buffer packets enable us to use rx ring buffers smaller than
> the mtu. This will allow us to change the mtu on-the-fly, without
> having to stop the network interface in order to re-size the rx
> ring buffers.
>
> This is a big change touching a key driver function (process_packet),
> so care has been taken to test this extensively:
>
> Tests with debug logging enabled (add #define DEBUG).
>
> 1. Limit rx buffer size to 500, so mtu (1500) takes 3 buffers.
>    Ping to chip, verify correct packet size is sent to OS.
>    Ping large packets to chip (ping -s 1400), verify correct
>      packet size is sent to OS.
>    Ping using packets around the buffer size, verify number of
>      buffers is changing, verify correct packet size is sent
>      to OS:
>      $ ping -s 472
>      $ ping -s 473
>      $ ping -s 992
>      $ ping -s 993
>    Verify that each packet is followed by extension processing.
>
> 2. Limit rx buffer size to 500, so mtu (1500) takes 3 buffers.
>    Run iperf3 -s on chip, verify that packets come in 3 buffers
>      at a time.
>    Verify that packet size is equal to mtu.
>    Verify that each packet is followed by extension processing.
>
> 3. Set chip and host mtu to 2000.
>    Limit rx buffer size to 500, so mtu (2000) takes 4 buffers.
>    Run iperf3 -s on chip, verify that packets come in 4 buffers
>      at a time.
>    Verify that packet size is equal to mtu.
>    Verify that each packet is followed by extension processing.
>
> Tests with debug logging DISabled (remove #define DEBUG).
>
> 4. Limit rx buffer size to 500, so mtu (1500) takes 3 buffers.
>    Run iperf3 -s on chip, note sustained rx speed.
>    Set chip and host mtu to 2000, so mtu takes 4 buffers.
>    Run iperf3 -s on chip, note sustained rx speed.
>    Verify no packets are dropped in both cases.
>
> Tests with DEBUG_KMEMLEAK on:
>  $ mount -t debugfs nodev /sys/kernel/debug/
>  $ echo scan > /sys/kernel/debug/kmemleak
>
> 5. Limit rx buffer size to 500, so mtu (1500) takes 3 buffers.
>    Run the following tests concurrently for at least one hour:
>    - iperf3 -s on chip
>    - ping -> chip
>    Monitor reported memory leaks.
>
> 6. Set chip and host mtu to 2000.
>    Limit rx buffer size to 500, so mtu (2000) takes 4 buffers.
>    Run the following tests concurrently for at least one hour:
>    - iperf3 -s on chip
>    - ping -> chip
>    Monitor reported memory leaks.
>
> 7. Simulate low-memory in lan743x_rx_allocate_skb(): fail every
>      100 allocations.
>    Repeat (5) and (6).
>    Monitor reported memory leaks.
>
> 8. Simulate  low-memory in lan743x_rx_allocate_skb(): fail 10
>      allocations in a row in every 100.
>    Repeat (5) and (6).
>    Monitor reported memory leaks.
>
> 9. Simulate  low-memory in lan743x_rx_trim_skb(): fail 1 allocation
>      in every 100.
>    Repeat (5) and (6).
>    Monitor reported memory leaks.
>
> Signed-off-by: Sven Van Asbroeck <thesven73@...il.com>
> ---
>
> Tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git # 46eb3c108fe1
>
> To: Bryan Whitehead <bryan.whitehead@...rochip.com>
> To: UNGLinuxDriver@...rochip.com
> To: "David S. Miller" <davem@...emloft.net>
> To: Jakub Kicinski <kuba@...nel.org>
> Cc: Andrew Lunn <andrew@...n.ch>
> Cc: Alexey Denisov <rtgbnm@...il.com>
> Cc: Sergej Bauer <sbauer@...ckbox.su>
> Cc: Tim Harvey <tharvey@...eworks.com>
> Cc: Anders Rønningen <anders@...ningen.priv.no>
> Cc: netdev@...r.kernel.org
> Cc: linux-kernel@...r.kernel.org (open list)
>
>  drivers/net/ethernet/microchip/lan743x_main.c | 321 ++++++++----------
>  drivers/net/ethernet/microchip/lan743x_main.h |   2 +
>  2 files changed, 143 insertions(+), 180 deletions(-)


> +static struct sk_buff *
> +lan743x_rx_trim_skb(struct sk_buff *skb, int frame_length)
> +{
> +       if (skb_linearize(skb)) {

Is this needed? That will be quite expensive

> +               dev_kfree_skb_irq(skb);
> +               return NULL;
> +       }
> +       frame_length = max_t(int, 0, frame_length - RX_HEAD_PADDING - 2);
> +       if (skb->len > frame_length) {
> +               skb->tail -= skb->len - frame_length;
> +               skb->len = frame_length;
> +       }
> +       return skb;
> +}
> +
>  static int lan743x_rx_process_packet(struct lan743x_rx *rx)
>  {
> -       struct skb_shared_hwtstamps *hwtstamps = NULL;
> +       struct lan743x_rx_descriptor *descriptor, *desc_ext;
>         int result = RX_PROCESS_RESULT_NOTHING_TO_DO;
>         int current_head_index = le32_to_cpu(*rx->head_cpu_ptr);
>         struct lan743x_rx_buffer_info *buffer_info;
> -       struct lan743x_rx_descriptor *descriptor;
> +       struct skb_shared_hwtstamps *hwtstamps;
> +       int frame_length, buffer_length;
> +       struct sk_buff *skb;
>         int extension_index = -1;
> -       int first_index = -1;
> -       int last_index = -1;
> +       bool is_last, is_first;
>
>         if (current_head_index < 0 || current_head_index >= rx->ring_size)
>                 goto done;
> @@ -2068,170 +2075,126 @@ static int lan743x_rx_process_packet(struct lan743x_rx *rx)
>         if (rx->last_head < 0 || rx->last_head >= rx->ring_size)
>                 goto done;
>
> -       if (rx->last_head != current_head_index) {
> -               descriptor = &rx->ring_cpu_ptr[rx->last_head];
> -               if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_OWN_)
> -                       goto done;
> +       if (rx->last_head == current_head_index)
> +               goto done;

Is it possible to avoid the large indentation change, or else do that
in a separate patch? It makes it harder to follow the functional
change.

>
> -               if (!(le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_FS_))
> -                       goto done;
> +       descriptor = &rx->ring_cpu_ptr[rx->last_head];
> +       if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_OWN_)
> +               goto done;
> +       buffer_info = &rx->buffer_info[rx->last_head];
>
> -               first_index = rx->last_head;
> -               if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_LS_) {
> -                       last_index = rx->last_head;
> -               } else {
> -                       int index;
> -
> -                       index = lan743x_rx_next_index(rx, first_index);
> -                       while (index != current_head_index) {
> -                               descriptor = &rx->ring_cpu_ptr[index];
> -                               if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_OWN_)
> -                                       goto done;
> -
> -                               if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_LS_) {
> -                                       last_index = index;
> -                                       break;
> -                               }
> -                               index = lan743x_rx_next_index(rx, index);
> -                       }
> -               }
> -               if (last_index >= 0) {
> -                       descriptor = &rx->ring_cpu_ptr[last_index];
> -                       if (le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_EXT_) {
> -                               /* extension is expected to follow */
> -                               int index = lan743x_rx_next_index(rx,
> -                                                                 last_index);
> -                               if (index != current_head_index) {
> -                                       descriptor = &rx->ring_cpu_ptr[index];
> -                                       if (le32_to_cpu(descriptor->data0) &
> -                                           RX_DESC_DATA0_OWN_) {
> -                                               goto done;
> -                                       }
> -                                       if (le32_to_cpu(descriptor->data0) &
> -                                           RX_DESC_DATA0_EXT_) {
> -                                               extension_index = index;
> -                                       } else {
> -                                               goto done;
> -                                       }
> -                               } else {
> -                                       /* extension is not yet available */
> -                                       /* prevent processing of this packet */
> -                                       first_index = -1;
> -                                       last_index = -1;
> -                               }
> -                       }
> -               }
> +       is_last = le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_LS_;
> +       is_first = le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_FS_;
> +
> +       if (is_last && le32_to_cpu(descriptor->data0) & RX_DESC_DATA0_EXT_) {
> +               /* extension is expected to follow */
> +               int index = lan743x_rx_next_index(rx, rx->last_head);
> +
> +               if (index == current_head_index)
> +                       /* extension not yet available */
> +                       goto done;
> +               desc_ext = &rx->ring_cpu_ptr[index];
> +               if (le32_to_cpu(desc_ext->data0) & RX_DESC_DATA0_OWN_)
> +                       /* extension not yet available */
> +                       goto done;
> +               if (!(le32_to_cpu(desc_ext->data0) & RX_DESC_DATA0_EXT_))
> +                       goto move_forward;
> +               extension_index = index;
>         }
> -       if (first_index >= 0 && last_index >= 0) {
> -               int real_last_index = last_index;
> -               struct sk_buff *skb = NULL;
> -               u32 ts_sec = 0;
> -               u32 ts_nsec = 0;
> -
> -               /* packet is available */
> -               if (first_index == last_index) {
> -                       /* single buffer packet */
> -                       struct sk_buff *new_skb = NULL;
> -                       int packet_length;
> -
> -                       new_skb = lan743x_rx_allocate_skb(rx);
> -                       if (!new_skb) {
> -                               /* failed to allocate next skb.
> -                                * Memory is very low.
> -                                * Drop this packet and reuse buffer.
> -                                */
> -                               lan743x_rx_reuse_ring_element(rx, first_index);
> -                               goto process_extension;
> -                       }
>
> -                       buffer_info = &rx->buffer_info[first_index];
> -                       skb = buffer_info->skb;
> -                       descriptor = &rx->ring_cpu_ptr[first_index];
> -
> -                       /* unmap from dma */
> -                       packet_length = RX_DESC_DATA0_FRAME_LENGTH_GET_
> -                                       (descriptor->data0);
> -                       if (buffer_info->dma_ptr) {
> -                               dma_sync_single_for_cpu(&rx->adapter->pdev->dev,
> -                                                       buffer_info->dma_ptr,
> -                                                       packet_length,
> -                                                       DMA_FROM_DEVICE);
> -                               dma_unmap_single_attrs(&rx->adapter->pdev->dev,
> -                                                      buffer_info->dma_ptr,
> -                                                      buffer_info->buffer_length,
> -                                                      DMA_FROM_DEVICE,
> -                                                      DMA_ATTR_SKIP_CPU_SYNC);
> -                               buffer_info->dma_ptr = 0;
> -                               buffer_info->buffer_length = 0;
> -                       }
> -                       buffer_info->skb = NULL;
> -                       packet_length = RX_DESC_DATA0_FRAME_LENGTH_GET_
> -                                       (le32_to_cpu(descriptor->data0));
> -                       skb_put(skb, packet_length - 4);
> -                       skb->protocol = eth_type_trans(skb,
> -                                                      rx->adapter->netdev);
> -                       lan743x_rx_init_ring_element(rx, first_index, new_skb);
> -               } else {
> -                       int index = first_index;
> +       /* Only the last buffer in a multi-buffer frame contains the total frame
> +        * length. All other buffers have a zero frame length. The chip
> +        * occasionally sends more buffers than strictly required to reach the
> +        * total frame length.
> +        * Handle this by adding all buffers to the skb in their entirety.
> +        * Once the real frame length is known, trim the skb.
> +        */
> +       frame_length =
> +               RX_DESC_DATA0_FRAME_LENGTH_GET_(le32_to_cpu(descriptor->data0));
> +       buffer_length = buffer_info->buffer_length;
>
> -                       /* multi buffer packet not supported */
> -                       /* this should not happen since buffers are allocated
> -                        * to be at least the mtu size configured in the mac.
> -                        */
> +       netdev_dbg(rx->adapter->netdev, "%s%schunk: %d/%d",
> +                  is_first ? "first " : "      ",
> +                  is_last  ? "last  " : "      ",
> +                  frame_length, buffer_length);
>
> -                       /* clean up buffers */
> -                       if (first_index <= last_index) {
> -                               while ((index >= first_index) &&
> -                                      (index <= last_index)) {
> -                                       lan743x_rx_reuse_ring_element(rx,
> -                                                                     index);
> -                                       index = lan743x_rx_next_index(rx,
> -                                                                     index);
> -                               }
> -                       } else {
> -                               while ((index >= first_index) ||
> -                                      (index <= last_index)) {
> -                                       lan743x_rx_reuse_ring_element(rx,
> -                                                                     index);
> -                                       index = lan743x_rx_next_index(rx,
> -                                                                     index);
> -                               }
> -                       }
> -               }
> +       /* unmap from dma */
> +       if (buffer_info->dma_ptr) {
> +               dma_unmap_single(&rx->adapter->pdev->dev,
> +                                buffer_info->dma_ptr,
> +                                buffer_info->buffer_length,
> +                                DMA_FROM_DEVICE);
> +               buffer_info->dma_ptr = 0;
> +               buffer_info->buffer_length = 0;
> +       }
> +       skb = buffer_info->skb;
>
> -process_extension:
> -               if (extension_index >= 0) {
> -                       descriptor = &rx->ring_cpu_ptr[extension_index];
> -                       buffer_info = &rx->buffer_info[extension_index];
> -
> -                       ts_sec = le32_to_cpu(descriptor->data1);
> -                       ts_nsec = (le32_to_cpu(descriptor->data2) &
> -                                 RX_DESC_DATA2_TS_NS_MASK_);
> -                       lan743x_rx_reuse_ring_element(rx, extension_index);
> -                       real_last_index = extension_index;
> -               }
> +       /* allocate new skb and map to dma */
> +       if (lan743x_rx_init_ring_element(rx, rx->last_head)) {
> +               /* failed to allocate next skb.
> +                * Memory is very low.
> +                * Drop this packet and reuse buffer.
> +                */
> +               lan743x_rx_reuse_ring_element(rx, rx->last_head);
> +               goto process_extension;
> +       }
> +
> +       /* add buffers to skb via skb->frag_list */
> +       if (is_first) {
> +               skb_reserve(skb, RX_HEAD_PADDING);
> +               skb_put(skb, buffer_length - RX_HEAD_PADDING);
> +               if (rx->skb_head)
> +                       dev_kfree_skb_irq(rx->skb_head);
> +               rx->skb_head = skb;
> +       } else if (rx->skb_head) {
> +               skb_put(skb, buffer_length);
> +               if (skb_shinfo(rx->skb_head)->frag_list)
> +                       rx->skb_tail->next = skb;
> +               else
> +                       skb_shinfo(rx->skb_head)->frag_list = skb;

Instead of chaining skbs into frag_list, you could perhaps delay skb
alloc until after reception, allocate buffers stand-alone, and link
them into the skb as skb_frags? That might avoid a few skb alloc +
frees. Though a bit change, not sure how feasible.

> +               rx->skb_tail = skb;
> +               rx->skb_head->len += skb->len;
> +               rx->skb_head->data_len += skb->len;
> +               rx->skb_head->truesize += skb->truesize;
> +       } else {
> +               rx->skb_head = skb;
> +       }
>
> -               if (!skb) {
> -                       result = RX_PROCESS_RESULT_PACKET_DROPPED;
> -                       goto move_forward;
> +process_extension:
> +       if (extension_index >= 0) {
> +               u32 ts_sec;
> +               u32 ts_nsec;
> +
> +               ts_sec = le32_to_cpu(desc_ext->data1);
> +               ts_nsec = (le32_to_cpu(desc_ext->data2) &
> +                         RX_DESC_DATA2_TS_NS_MASK_);
> +               if (rx->skb_head) {
> +                       hwtstamps = skb_hwtstamps(rx->skb_head);
> +                       if (hwtstamps)

This is always true.

You can just call skb_hwtstamps(skb)->hwtstamp = ktime_set(ts_sec, ts_nsec);

Though I see that this is existing code just moved due to
aforementioned indentation change.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ