lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20240913202916.t7bpdc6ubfdpv47s@skbuf>
Date: Fri, 13 Sep 2024 23:29:16 +0300
From: Vladimir Oltean <olteanv@...il.com>
To: "Sverdlin, Alexander" <alexander.sverdlin@...mens.com>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"vladimir.oltean@....com" <vladimir.oltean@....com>,
	"andrew@...n.ch" <andrew@...n.ch>,
	"vivien.didelot@...il.com" <vivien.didelot@...il.com>,
	"claudiu.manoil@....com" <claudiu.manoil@....com>,
	"woojung.huh@...rochip.com" <woojung.huh@...rochip.com>,
	"dqfext@...il.com" <dqfext@...il.com>,
	"davem@...emloft.net" <davem@...emloft.net>,
	"UNGLinuxDriver@...rochip.com" <UNGLinuxDriver@...rochip.com>,
	"alexandre.belloni@...tlin.com" <alexandre.belloni@...tlin.com>,
	"linux-mediatek@...ts.infradead.org" <linux-mediatek@...ts.infradead.org>,
	"george.mccollister@...il.com" <george.mccollister@...il.com>,
	"linux@...linux.org.uk" <linux@...linux.org.uk>,
	"linux@...pel-privat.de" <linux@...pel-privat.de>,
	"linus.walleij@...aro.org" <linus.walleij@...aro.org>,
	"hkallweit1@...il.com" <hkallweit1@...il.com>,
	"hauke@...ke-m.de" <hauke@...ke-m.de>,
	"LinoSanfilippo@....de" <LinoSanfilippo@....de>,
	"kuba@...nel.org" <kuba@...nel.org>,
	"sean.wang@...iatek.com" <sean.wang@...iatek.com>,
	"kurt@...utronix.de" <kurt@...utronix.de>,
	"m.grzeschik@...gutronix.de" <m.grzeschik@...gutronix.de>,
	"linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"matthias.bgg@...il.com" <matthias.bgg@...il.com>,
	"Landen.Chao@...iatek.com" <Landen.Chao@...iatek.com>,
	"f.fainelli@...il.com" <f.fainelli@...il.com>
Subject: Re: [PATCH v2 net 2/5] net: dsa: be compatible with masters which
 unregister on shutdown

Hi Alexander,

On Wed, Sep 04, 2024 at 08:31:13AM +0000, Sverdlin, Alexander wrote:
> > +static void lan9303_mdio_shutdown(struct mdio_device *mdiodev)
> > +{
> > +	struct lan9303_mdio *sw_dev = dev_get_drvdata(&mdiodev->dev);
> > +
> > +	if (!sw_dev)
> > +		return;
> > +
> > +	lan9303_shutdown(&sw_dev->chip);
> > +
> > +	dev_set_drvdata(&mdiodev->dev, NULL);
> >  }
> 
> This unfortunately didn't work well with LAN9303 and probably will not work
> with others:
> 
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> CPU: 0 PID: 442 Comm: kworker/0:3 Tainted: G           O       6.1.99+gitb7793b7d9b35 #1
> Workqueue: events_power_efficient phy_state_machine
> pc : lan9303_mdio_phy_read+0x1c/0x34
> lr : lan9303_phy_read+0x50/0x100
> Call trace:
>  lan9303_mdio_phy_read+0x1c/0x34
>  lan9303_phy_read+0x50/0x100
>  dsa_slave_phy_read+0x40/0x50
>  __mdiobus_read+0x34/0x130
>  mdiobus_read+0x44/0x70
>  genphy_update_link+0x2c/0x104
>  genphy_read_status+0x2c/0x120
>  phy_check_link_status+0xb8/0xcc
>  phy_state_machine+0x198/0x27c
>  process_one_work+0x1dc/0x450
>  worker_thread+0x154/0x450
> 
> as long as the ports are not down (and dsa_switch_shutdown() doesn't ensure it),
> we cannot just zero drvdata, because PHY polling will eventually call
> 
> static int lan9303_mdio_phy_read(struct lan9303 *chip, int addr, int reg)
> {
>         struct lan9303_mdio *sw_dev = dev_get_drvdata(chip->dev);
> 
>         return mdiobus_read_nested(sw_dev->device->bus, addr, reg);
> 
> There are however multiple other unsafe patterns.
> I suppose current
> 
> dsa_switch_shutdown();
> dev_set_drvdata(...->dev, NULL);
> 
> pattern is broken in many cases...

Unfortunately the code portion which you've quoted for your reply does not
show the full story. dsa_switch_shutdown(), at the time of this patch,
was implemented like this (stripped of comments):

void dsa_switch_shutdown(struct dsa_switch *ds)
{
	struct net_device *master, *slave_dev;
	LIST_HEAD(unregister_list);
	struct dsa_port *dp;

	mutex_lock(&dsa2_mutex);
	rtnl_lock();

	list_for_each_entry(dp, &ds->dst->ports, list) {
		if (dp->ds != ds)
			continue;

		if (!dsa_port_is_user(dp))
			continue;

		master = dp->cpu_dp->master;
		slave_dev = dp->slave;

		netdev_upper_dev_unlink(master, slave_dev);
		unregister_netdevice_queue(slave_dev, &unregister_list);
	}
	unregister_netdevice_many(&unregister_list);

	rtnl_unlock();
	mutex_unlock(&dsa2_mutex);
}

I believe you would be wrong to blame this patch for exiting with the
slave/user ports still running (and thus ds->ops->phy_read() still
callable), because, as you can see, it doesn't do that - it unregisters
them, which also stops the net_device prior. So, both phylink_stop() and
phylink_destroy() would be called.

The patch had other problems though, and that led to the rework in
commit ee534378f005 ("net: dsa: fix panic when DSA master device unbinds
on shutdown"), rework which is in fact to blame for what you're reporting.

Given that we are talking about a fix to a fix, it doesn't really matter
in terms of backporting targets which one it is, but for correctness sake,
it is the later patch that fixed some things while introducing the race
condition.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ