netdev - Re: [PATCH net-next v3 2/7] net: lan966x: Split lan966x_fdb_event

Message-ID: <20220702140834.gyqmtmaru6ecdamb@skbuf>
Date:   Sat, 2 Jul 2022 14:08:34 +0000
From:   Vladimir Oltean <vladimir.oltean@....com>
To:     Horatiu Vultur <horatiu.vultur@...rochip.com>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "UNGLinuxDriver@...rochip.com" <UNGLinuxDriver@...rochip.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "edumazet@...gle.com" <edumazet@...gle.com>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "pabeni@...hat.com" <pabeni@...hat.com>
Subject: Re: [PATCH net-next v3 2/7] net: lan966x: Split
 lan966x_fdb_event_work

On Fri, Jul 01, 2022 at 10:52:22PM +0200, Horatiu Vultur wrote:
> Split the function lan966x_fdb_event_work. One case for when the
> orig_dev is a bridge and one case when orig_dev is lan966x port.
> This is preparation for lag support. There is no functional change.
> 
> Signed-off-by: Horatiu Vultur <horatiu.vultur@...rochip.com>
> ---

> -static void lan966x_fdb_event_work(struct work_struct *work)
> +void lan966x_fdb_flush_workqueue(struct lan966x *lan966x)
> +{
> +	flush_workqueue(lan966x->fdb_work);
> +}
> +

> diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c b/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c
> index df2bee678559..d9fc6a9a3da1 100644
> --- a/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c
> +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c
> @@ -320,9 +320,10 @@ static int lan966x_port_prechangeupper(struct net_device *dev,
>  {
>  	struct lan966x_port *port = netdev_priv(dev);
>  
> -	if (netif_is_bridge_master(info->upper_dev) && !info->linking)
> -		switchdev_bridge_port_unoffload(port->dev, port,
> -						NULL, NULL);
> +	if (netif_is_bridge_master(info->upper_dev) && !info->linking) {
> +		switchdev_bridge_port_unoffload(port->dev, port, NULL, NULL);
> +		lan966x_fdb_flush_workqueue(port->lan966x);
> +	}

Very curious as to why you decided to stuff this change in here.
There was no functional change in v2, now there is. And it's a change
you might need to come back to later (probably sooner than you'd like),
since the flushing of the workqueue is susceptible to causing deadlocks
if done improperly - let's see how you blame a commit that was only
supposed to move code, in that case ;)

The deadlock that I'm talking about comes from the fact that
lan966x_port_prechangeupper() runs with rtnl_lock() held. So the code of
the flushed workqueue item must not hold rtnl_lock(), or any other lock
that is blocked by the rtnl_lock(). Otherwise, the flush will wait
for a workqueue item to complete, which in turn waits to acquire
rtnl_lock, which is held by the thread waiting for the workqueue to
complete.
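
As a rough userspace analogy of that wait cycle (not kernel code), the
sketch below uses pthreads. Every name in it (fake_rtnl, fake_work_item,
flush_under_lock_would_deadlock) is hypothetical: the mutex stands in for
rtnl_lock(), and pthread_join() stands in for flush_workqueue(). The work
item uses trylock instead of lock, so the example reports the problem
instead of hanging.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for rtnl_lock(); not the kernel's rtnl mutex. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical stand-in for a queued work item that needs "rtnl".
 * A real work item would block here; trylock only reports that it would.
 */
static void *fake_work_item(void *arg)
{
	bool *would_block = arg;
	int ret = pthread_mutex_trylock(&fake_rtnl);

	if (ret == 0) {
		*would_block = false;
		pthread_mutex_unlock(&fake_rtnl);
	} else {
		*would_block = (ret == EBUSY);
	}
	return NULL;
}

/*
 * The "flusher" holds the lock and then waits for the work item,
 * which is the pattern of calling flush_workqueue() under rtnl_lock().
 * Returns true if the work item found the lock busy.
 */
static bool flush_under_lock_would_deadlock(void)
{
	pthread_t worker;
	bool would_block = false;

	pthread_mutex_lock(&fake_rtnl);
	if (pthread_create(&worker, NULL, fake_work_item, &would_block) != 0) {
		pthread_mutex_unlock(&fake_rtnl);
		return false;
	}
	pthread_join(worker, NULL);	/* analogous to flush_workqueue() */
	pthread_mutex_unlock(&fake_rtnl);

	return would_block;
}
```

If the work item used pthread_mutex_lock() instead, pthread_join() would
never return, which is the shape of the deadlock described above.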

Analyzing your code, lan966x_mac_notifiers() takes rtnl_lock().
It is called from threaded interrupt context, lan966x_mac_irq_process(),
while spin_lock(&lan966x->mac_lock) is held.

There are 2 problems with that already. The first: rtnl_lock() is a
mutex => can sleep, but &lan966x->mac_lock is a spin lock => is atomic.
You can't take rtnl_lock() from atomic context. Lockdep and/or
CONFIG_DEBUG_ATOMIC_SLEEP will tell you as much.

The second problem is the lock ordering inversion that this causes.
There exists a threaded IRQ which takes the locks in the order mac_lock
-> rtnl_lock, and there exists this new fdb_flush_workqueue which takes
the locks in the order rtnl_lock -> mac_lock. If they run at the same
time, kaboom. Again, lockdep will tell you as much.
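
To illustrate the kind of check that flags such an inversion, here is a
minimal sketch of a lock-order recorder. It is not how lockdep is
implemented (lockdep tracks whole dependency chains between lock
classes); this toy only remembers direct "held, then acquired" pairs.
The names lock_class, seen_before, record_order and orders_conflict are
made up for the example.

```c
#include <stdbool.h>

/* Hypothetical lock classes for the two locks discussed above. */
enum lock_class { LOCK_RTNL, LOCK_MAC, LOCK_CLASS_COUNT };

/* seen_before[a][b] is true once some path acquired b while holding a. */
static bool seen_before[LOCK_CLASS_COUNT][LOCK_CLASS_COUNT];

/*
 * Record the order "held, then wanted" and report whether the
 * opposite order was already recorded by another path.
 */
static bool record_order(enum lock_class held, enum lock_class wanted)
{
	seen_before[held][wanted] = true;
	return seen_before[wanted][held];
}

/* Replay the two paths from the review and report an inversion. */
static bool orders_conflict(void)
{
	/* Threaded IRQ path: mac_lock, then rtnl_lock. */
	bool irq_path = record_order(LOCK_MAC, LOCK_RTNL);

	/* Flush path: rtnl_lock, then (via the flushed work) mac_lock. */
	bool flush_path = record_order(LOCK_RTNL, LOCK_MAC);

	return irq_path || flush_path;
}
```

The first call records mac_lock -> rtnl_lock without complaint; the
second finds the reverse pair already present and reports the conflict,
regardless of whether the two paths ever actually race.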

I'm sorry, but you need to solve the existing locking problems with the
code first.

>  
>  	return NOTIFY_DONE;
>  }
> -- 
> 2.33.0
>