lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20220218104330.g3vfbpdqltdkp4sr@skbuf>
Date:   Fri, 18 Feb 2022 12:43:30 +0200
From:   Vladimir Oltean <olteanv@...il.com>
To:     Oleksij Rempel <o.rempel@...gutronix.de>
Cc:     Woojung Huh <woojung.huh@...rochip.com>,
        UNGLinuxDriver@...rochip.com, Andrew Lunn <andrew@...n.ch>,
        Florian Fainelli <f.fainelli@...il.com>,
        Vivien Didelot <vivien.didelot@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>, kernel@...gutronix.de,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next v1 1/1] net: dsa: microchip: ksz9477: export HW
 stats over stats64 interface

On Fri, Feb 18, 2022 at 09:39:56AM +0100, Oleksij Rempel wrote:
> On Thu, Feb 17, 2022 at 05:55:54PM +0200, Vladimir Oltean wrote:
> > On Thu, Feb 17, 2022 at 03:07:26PM +0100, Oleksij Rempel wrote:
> > > +static void ksz9477_get_stats64(struct dsa_switch *ds, int port,
> > > +				struct rtnl_link_stats64 *stats)
> > > +{
> > > +	struct ksz_device *dev = ds->priv;
> > > +	struct ksz9477_stats_raw *raw;
> > > +	struct ksz_port_mib *mib;
> > > +	int ret;
> > > +
> > > +	mib = &dev->ports[port].mib;
> > > +	raw = (struct ksz9477_stats_raw *)mib->counters;
> > > +
> > > +	mutex_lock(&mib->cnt_mutex);
> > 
> > The eternal problem, ndo_get_stats64 runs in atomic context,
> > mutex_lock() sleeps. Please test your patches with
> > CONFIG_DEBUG_ATOMIC_SLEEP=y.
> 
> Good point, thx! I reworked the code.
> 
> Beside, I get this warning with differnt locking validators:
> [  153.140000] br0: port 1(lan2) entered blocking state
> [  153.190000] br0: port 1(lan2) entered disabled state
> [  153.320000] device lan2 entered promiscuous mode
> [  153.350000] ------------[ cut here ]------------
> [  153.350000] WARNING: CPU: 0 PID: 71 at net/core/dev.c:7913 __dev_set_promiscuity+0x10c/0x138
> [  153.360000] RTNL: assertion failed at net/core/dev.c (7913)
> [  153.370000] Modules linked in: atmel_aes atmel_tdes atmel_sha
> [  153.370000] CPU: 0 PID: 71 Comm: kworker/u2:5 Not tainted 5.17.0-rc2-00714-g845f6fa17e48-dirty #33
> [  153.370000] Hardware name: Atmel SAMA5
> [  153.370000] Workqueue: dsa_ordered dsa_slave_switchdev_event_work
> [  153.370000]  unwind_backtrace from show_stack+0x18/0x1c
> [  153.370000]  show_stack from dump_stack_lvl+0x58/0x70
> [  153.370000]  dump_stack_lvl from __warn+0xd8/0x228
> [  153.370000]  __warn from warn_slowpath_fmt+0x98/0xc8
> [  153.370000]  warn_slowpath_fmt from __dev_set_promiscuity+0x10c/0x138
> [  153.370000]  __dev_set_promiscuity from __dev_set_rx_mode+0x8c/0x98
> [  153.370000]  __dev_set_rx_mode from dev_uc_add+0x84/0x8c
> [  153.370000]  dev_uc_add from dsa_port_host_fdb_add+0x48/0x80
> [  153.370000]  dsa_port_host_fdb_add from dsa_slave_switchdev_event_work+0x1dc/0x254
> [  153.370000]  dsa_slave_switchdev_event_work from process_one_work+0x2b0/0x7d4
> [  153.370000]  process_one_work from worker_thread+0x4c/0x53c
> [  153.370000]  worker_thread from kthread+0xf8/0x12c
> [  153.370000]  kthread from ret_from_fork+0x14/0x2c
> [  153.370000] Exception stack(0xc1f13fb0 to 0xc1f13ff8)
> [  153.370000] 3fa0:                                     00000000 00000000 00000000 00000000
> [  153.370000] 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [  153.370000] 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [  153.380000] irq event stamp: 0
> [  153.390000] hardirqs last  enabled at (0): [<00000000>] 0x0
> [  153.390000] hardirqs last disabled at (0): [<c0124b38>] copy_process+0x7d8/0x194c
> [  153.400000] softirqs last  enabled at (0): [<c0124b38>] copy_process+0x7d8/0x194c
> [  153.410000] softirqs last disabled at (0): [<00000000>] 0x0
> [  153.410000] ---[ end trace 0000000000000000 ]---
> [  153.420000] device eth0 entered promiscuous mode
> [  153.770000] ksz9477-switch spi1.0 lan2: configuring for phy/gmii link mode
> [  155.040000] ksz9477-switch spi1.0 lan4: Link is Down
> [  156.960000] ksz9477-switch spi1.0 lan2: Link is Up - 1Gbps/Full - flow control rx/tx
> 
> 
> Is it something known?

Thanks for reporting. It isn't something known (at least to me - who
knows whether somebody may have been chuckling while I was gloating that
I removed rtnl_lock() from the DSA switchdev FDB notifiers).

It turns out that dsa_port_host_fdb_add() needs rtnl_lock() around
dev_uc_add(), at least if the master doesn't support IFF_UNICAST_FLT.

It's pretty nasty. If we add an rtnl_lock() there, we'll deadlock from
the code paths that call dsa_flush_workqueue(), which have rtnl_lock()
already held.

The only way I think we can get out of this is if we call dev_uc_add()
only if the master has IFF_UNICAST_FLT. If it doesn't, we should force a
call to dsa_master_set_promiscuity() during dsa_master_setup().
Not exactly the cleanest solution, but at least it shouldn't deadlock
and it should work.

Andrew, Florian, Vivien?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ