netdev - Re: [PATCH net-next] net: Avoid spurious rx_dropped increases with tap and rx

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CANn89iLdfOzRuhC--MALZuTDSoU6ncX7Xu_0iJnjZs1-9_gwmQ@mail.gmail.com>
Date:   Sun, 24 May 2020 22:41:26 -0700
From:   Eric Dumazet <edumazet@...gle.com>
To:     Benjamin Poirier <bpoirier@...ulusnetworks.com>
Cc:     netdev <netdev@...r.kernel.org>,
        Nikolay Aleksandrov <nikolay@...ulusnetworks.com>,
        Jiri Pirko <jiri@...nulli.us>
Subject: Re: [PATCH net-next] net: Avoid spurious rx_dropped increases with
 tap and rx_handler

On Sun, May 24, 2020 at 10:02 PM Benjamin Poirier
<bpoirier@...ulusnetworks.com> wrote:
>
> Consider an skb which doesn't match a ptype_base/ptype_specific handler. If
> this skb is delivered to a ptype_all handler, it does not count as a drop.
> However, if the skb is also processed by an rx_handler which returns
> RX_HANDLER_PASS, the frame is now counted as a drop because pt_prev was
> reset. An example of this situation is an LLDP frame received on a bridge
> port while lldpd is listening on a packet socket with ETH_P_ALL (ex. by
> specifying `lldpd -c`).
>
> Fix by adding an extra condition variable to record if the skb was
> delivered to a packet tap before running an rx_handler.
>
> The situation is similar for RX_HANDLER_EXACT frames so their accounting is
> also changed. OTOH, the behavior is unchanged for RX_HANDLER_ANOTHER frames
> - they are accounted according to what happens with the new skb->dev.
>
> Fixes: caf586e5f23c ("net: add a core netdev->rx_dropped counter")

I disagree.

> Message-Id: <20200522011420.263574-1-bpoirier@...ulusnetworks.com>
> Signed-off-by: Benjamin Poirier <bpoirier@...ulusnetworks.com>
> ---
>  net/core/dev.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index ae37586f6ee8..07957a0f57e6 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5061,11 +5061,11 @@ static inline int nf_ingress(struct sk_buff *skb, struct packet_type **pt_prev,
>  static int __netif_receive_skb_core(struct sk_buff **pskb, bool pfmemalloc,
>                                     struct packet_type **ppt_prev)
>  {
> +       bool deliver_exact = false, rx_tapped = false;
>         struct packet_type *ptype, *pt_prev;
>         rx_handler_func_t *rx_handler;
>         struct sk_buff *skb = *pskb;
>         struct net_device *orig_dev;
> -       bool deliver_exact = false;
>         int ret = NET_RX_DROP;
>         __be16 type;
>
> @@ -5158,12 +5158,14 @@ static int __netif_receive_skb_core(struct sk_buff **pskb, bool pfmemalloc,
>                 if (pt_prev) {
>                         ret = deliver_skb(skb, pt_prev, orig_dev);
>                         pt_prev = NULL;
> +                       rx_tapped = true;
>                 }
>                 switch (rx_handler(&skb)) {
>                 case RX_HANDLER_CONSUMED:
>                         ret = NET_RX_SUCCESS;
>                         goto out;
>                 case RX_HANDLER_ANOTHER:
> +                       rx_tapped = false;
>                         goto another_round;
>                 case RX_HANDLER_EXACT:
>                         deliver_exact = true;
> @@ -5234,11 +5236,13 @@ static int __netif_receive_skb_core(struct sk_buff **pskb, bool pfmemalloc,
>                         goto drop;
>                 *ppt_prev = pt_prev;
>         } else {
> +               if (!rx_tapped) {
>  drop:
> -               if (!deliver_exact)
> -                       atomic_long_inc(&skb->dev->rx_dropped);
> -               else
> -                       atomic_long_inc(&skb->dev->rx_nohandler);
> +                       if (!deliver_exact)
> +                               atomic_long_inc(&skb->dev->rx_dropped);
> +                       else
> +                               atomic_long_inc(&skb->dev->rx_nohandler);
> +               }

This does not make sense to me.

Here we call kfree_skb() meaning this packet is _dropped_.
I understand it does not please some people, because they do not
always understand the meaning of this counter, but it is a mere fact.

Fact that a packet capture made a clone of this packet should not
matter, tcpdump should not hide that a packet is _dropped_.

>                 kfree_skb(skb);
>                 /* Jamal, now you will not able to escape explaining
>                  * me how you were going to use this. :-)


Can we please not add code in the fast path ?

Why are LLDP packets so special, why are they not consumed ?

I would suggest fixing the layer that handles LLDP packets so that we
do not have to workaround this issue in a critical fast path.

Thanks.