Message-ID: <CAHmME9oHFzL6CYVh8nLGkNKOkMeWi2gmxs_f7S8PATWwc6uQsw@mail.gmail.com>
Date: Fri, 18 Mar 2022 12:19:45 -0600
From: "Jason A. Donenfeld" <Jason@...c4.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: Netdev <netdev@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Eric Dumazet <edumazet@...gle.com>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
Toke Høiland-Jørgensen <toke@...hat.com>
Subject: Re: [PATCH net-next] net: Add lockdep asserts to ____napi_schedule().
Hi Sebastian,
On Fri, Mar 18, 2022 at 4:57 AM Sebastian Andrzej Siewior
<bigeasy@...utronix.de> wrote:
> > Hi Sebastian,
> Hi Jason,
>
> > I stumbled upon this commit when noticing a new failure in WireGuard's
> > test suite:
> …
> > [ 1.339289] WARNING: CPU: 0 PID: 11 at ../../../../../../../../net/core/dev.c:4268 __napi_schedule+0xa1/0x300
> …
> > [ 1.352417] wg_packet_decrypt_worker+0x2ac/0x470
> …
> > Sounds like wg_packet_decrypt_worker() might be doing something wrong? I
> > vaguely recall a thread where you started looking into some things there
> > that seemed non-optimal, but I didn't realize there were correctness
> > issues. If your patch is correct, and wg_packet_decrypt_worker() is
> > wrong, do you have a concrete idea of how we should approach fixing
> > wireguard? Or do you want to send a patch for that?
>
> In your case it is "okay", since ptr_ring_consume_bh() does a BH
> disable/enable, which forces the pending softirq to run. It is just not
> obvious.
In that case, isn't the lockdep assertion you added wrong, and shouldn't it
be reverted? If correct code is hitting it, something seems wrong...
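
For readers following along: if I remember include/linux/ptr_ring.h
correctly, the BH toggle comes from the spin_lock_bh()/spin_unlock_bh()
pair inside the consume helper, roughly:

static inline void *ptr_ring_consume_bh(struct ptr_ring *r)
{
	void *ptr;

	spin_lock_bh(&r->consumer_lock);
	ptr = __ptr_ring_consume(r);
	spin_unlock_bh(&r->consumer_lock);

	return ptr;
}

The spin_unlock_bh() ends in local_bh_enable(), which is where any pending
softirq gets to run, so each loop iteration implicitly services softirqs.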
> What
> about the following:
>
> diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
> index 7b8df406c7737..26ffa3afa542e 100644
> --- a/drivers/net/wireguard/receive.c
> +++ b/drivers/net/wireguard/receive.c
> @@ -502,15 +502,21 @@ void wg_packet_decrypt_worker(struct work_struct *work)
> struct crypt_queue *queue = container_of(work, struct multicore_worker,
> work)->ptr;
> struct sk_buff *skb;
> + unsigned int packets = 0;
>
> - while ((skb = ptr_ring_consume_bh(&queue->ring)) != NULL) {
> + local_bh_disable();
> + while ((skb = ptr_ring_consume(&queue->ring)) != NULL) {
> enum packet_state state =
> likely(decrypt_packet(skb, PACKET_CB(skb)->keypair)) ?
> PACKET_STATE_CRYPTED : PACKET_STATE_DEAD;
> wg_queue_enqueue_per_peer_rx(skb, state);
> - if (need_resched())
> + if (!(++packets % 4)) {
> + local_bh_enable();
> cond_resched();
> + local_bh_disable();
> + }
> }
> + local_bh_enable();
> }
>
> static void wg_packet_consume_data(struct wg_device *wg, struct sk_buff *skb)
>
> It would decrypt 4 packets in a row; then, after local_bh_enable(), it
> would invoke wg_packet_rx_poll() (assuming so, since it is the only NAPI
> handler in wireguard), and after that it would attempt cond_resched() and
> then continue with the next batch.
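
That mechanism makes sense: local_bh_enable() is the point where pending
softirqs run. Paraphrasing kernel/softirq.c from memory, the core of
__local_bh_enable_ip() is roughly the following, minus the instrumentation:

	preempt_count_sub(cnt - 1);
	if (unlikely(!in_interrupt() && local_softirq_pending()))
		do_softirq();	/* services NET_RX_SOFTIRQ, i.e. the NAPI poll */
	preempt_count_dec();

So dropping the BH count between batches is what gives wg_packet_rx_poll()
a chance to run on the same CPU.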
I'm willing to consider batching and all sorts of heuristics in there,
though probably for 5.19 rather than 5.18. Indeed there's some
interesting optimization work to be done. But if you want to propose a
change like this, can you send some benchmarks with it, preferably
taken with something like flent so we can see if it negatively affects
latency?
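For instance (a hypothetical invocation, with a placeholder host; adjust the
test and duration to taste):

	flent rrul -l 60 -H <host-behind-the-tunnel> -t wg-batch-4 -o wg-batch-4.png

run before and after the patch would show whether the throughput gained from
batching costs us latency under load.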
Regards,
Jason