Message-ID: <CANn89iL99-uzrktK=hKLrmoWhhqCnH5rNUC9K4W9cOs+CQrbdA@mail.gmail.com>
Date: Fri, 31 Mar 2023 04:41:23 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: davem@...emloft.net, netdev@...r.kernel.org, pabeni@...hat.com,
Roman Gushchin <roman.gushchin@...ux.dev>, leitao@...ian.org,
shemminger@...ux.foundation.org
Subject: Re: [PATCH net] net: don't let netpoll invoke NAPI if in xmit context
On Fri, Mar 31, 2023 at 4:23 AM Jakub Kicinski <kuba@...nel.org> wrote:
>
> Commit 0db3dc73f7a3 ("[NETPOLL]: tx lock deadlock fix") narrowed
> down the region under netif_tx_trylock() inside netpoll_send_skb().
> (At that point in time netif_tx_trylock() would lock all queues of
> the device.) Taking the tx lock was problematic because the driver's
> cleanup method may take the same lock. So the change made us hold
> the xmit lock only around xmit, and expected the driver to take
> care of locking within ->ndo_poll_controller().
>
> Unfortunately, this only works if netpoll isn't itself called with
> the xmit lock already held. Netpoll code is careful and uses
> trylock(). The drivers, however, may be using plain lock().
> Printing while holding the xmit lock is going to result in rare
> deadlocks.
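
To make the deadlock concrete, here is a minimal sketch of the pattern
described above; the foo_* driver names and the cleanup helper are
hypothetical, not taken from any real driver:

#include <linux/netdevice.h>

/* Core xmit path: __dev_queue_xmit() already holds txq->_xmit_lock
 * (via HARD_TX_LOCK) when it calls the driver's ndo_start_xmit.
 */
static netdev_tx_t foo_start_xmit(struct sk_buff *skb,
                                  struct net_device *dev)
{
        /* With netconsole active, this printk can recurse into
         * netpoll_send_skb() -> netpoll_poll_dev() on this same CPU.
         */
        netdev_warn(dev, "TX ring full\n");
        return NETDEV_TX_BUSY;
}

/* netpoll may then invoke the driver's poll_controller: */
static void foo_poll_controller(struct net_device *dev)
{
        /* Plain lock(), not trylock(): this spins forever on the
         * _xmit_lock the local CPU already holds above.
         */
        netif_tx_lock(dev);
        foo_clean_tx_ring(dev);  /* hypothetical cleanup helper */
        netif_tx_unlock(dev);
}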
>
> Luckily, we record the xmit lock owners, so we can scan all the
> queues the same way we scan NAPI owners. If any of the xmit locks
> is held by the local CPU, we had better not attempt any polling.
>
> It would be nice if we could narrow down the check to only the NAPIs
> and the queue we're trying to use. I don't see a way to do that now.
>
> Reported-by: Roman Gushchin <roman.gushchin@...ux.dev>
> Fixes: 0db3dc73f7a3 ("[NETPOLL]: tx lock deadlock fix")
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> ---
> CC: leitao@...ian.org
> CC: shemminger@...ux.foundation.org
> ---
> net/core/netpoll.c | 19 ++++++++++++++++++-
> 1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/net/core/netpoll.c b/net/core/netpoll.c
> index a089b704b986..e6a739b1afa9 100644
> --- a/net/core/netpoll.c
> +++ b/net/core/netpoll.c
> @@ -137,6 +137,20 @@ static void queue_process(struct work_struct *work)
> }
> }
>
> +static int netif_local_xmit_active(struct net_device *dev)
> +{
> + int i;
> +
> + for (i = 0; i < dev->num_tx_queues; i++) {
> + struct netdev_queue *txq = netdev_get_tx_queue(dev, i);
> +
> + if (READ_ONCE(txq->xmit_lock_owner) == smp_processor_id())
> + return 1;
> + }
> +
Resending in plain text mode for the list, sorry for the duplicate.
Note that we update WRITE_ONCE(txq->xmit_lock_owner, cpu) _after_
taking spin_lock(&txq->_xmit_lock), so there is a tiny window, I
think, in which this check can miss that we already got the spinlock,
but I do not see how to avoid it without excessive cost.
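
For reference, a rough sketch of the helper in question (shape as in
include/linux/netdevice.h around the time of this thread; check the
tree you are on):

static inline void __netif_tx_lock(struct netdev_queue *txq, int cpu)
{
        spin_lock(&txq->_xmit_lock);
        /* An interrupt landing between the spin_lock() above and this
         * store still sees the previous owner (-1), so the
         * netif_local_xmit_active() scan can miss a lock the local
         * CPU already holds.
         */
        WRITE_ONCE(txq->xmit_lock_owner, cpu);
}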
> + return 0;
> +}
> +
> static void poll_one_napi(struct napi_struct *napi)
> {
> int work;
> @@ -183,7 +197,10 @@ void netpoll_poll_dev(struct net_device *dev)
> if (!ni || down_trylock(&ni->dev_lock))
> return;
>
> - if (!netif_running(dev)) {
> + /* Some drivers will take the same locks in poll and xmit,
> + * we can't poll if local CPU is already in xmit.
> + */
> + if (!netif_running(dev) || netif_local_xmit_active(dev)) {
> up(&ni->dev_lock);
> return;
> }
> --
> 2.39.2
>