Date:   Tue, 6 Oct 2020 19:21:25 +0300
From:   Vladimir Oltean <olteanv@...il.com>
To:     Christian Eggers <ceggers@...i.de>
Cc:     Woojung Huh <woojung.huh@...rochip.com>,
        Microchip Linux Driver Support <UNGLinuxDriver@...rochip.com>,
        Andrew Lunn <andrew@...n.ch>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        "David S . Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [net v2] net: dsa: microchip: fix race condition

On Tue, Oct 06, 2020 at 05:56:51PM +0200, Christian Eggers wrote:
> Between queuing the delayed work and finishing the setup of the dsa
> ports, the process may sleep in request_module() (via
> phy_device_create()) and the queued work may be executed prior to the
> switch net devices being registered. In ksz_mib_read_work(), a NULL
> dereference will happen within netif_carrier_ok(dp->slave).
> 
> Not queuing the delayed work in ksz_init_mib_timer() makes things even
> worse because the work will now be queued for immediate execution
> (instead of 2000 ms) in ksz_mac_link_down() via
> dsa_port_link_register_of().
> 
> Call tree:
> ksz9477_i2c_probe()
> \--ksz9477_switch_register()
>    \--ksz_switch_register()
>       +--dsa_register_switch()
>       |  \--dsa_switch_probe()
>       |     \--dsa_tree_setup()
>       |        \--dsa_tree_setup_switches()
>       |           +--dsa_switch_setup()
>       |           |  +--ksz9477_setup()
>       |           |  |  \--ksz_init_mib_timer()
>       |           |  |     |--/* Start the timer 2 seconds later. */
>       |           |  |     \--schedule_delayed_work(&dev->mib_read, msecs_to_jiffies(2000));
>       |           |  \--__mdiobus_register()
>       |           |     \--mdiobus_scan()
>       |           |        \--get_phy_device()
>       |           |           +--get_phy_id()
>       |           |           \--phy_device_create()
>       |           |              |--/* sleeping, ksz_mib_read_work() can be called meanwhile */
>       |           |              \--request_module()
>       |           |
>       |           \--dsa_port_setup()
>       |              +--/* Called for non-CPU ports */
>       |              +--dsa_slave_create()
>       |              |  +--/* Too late, ksz_mib_read_work() may be called beforehand */
>       |              |  \--port->slave = ...
>       |             ...
>       |              +--/* Called for CPU port */
>       |              \--dsa_port_link_register_of()
>       |                 \--ksz_mac_link_down()
>       |                    +--/* mib_read must be initialized here */
>       |                    +--/* work is already scheduled, so it will be executed after 2000 ms */
>       |                    \--schedule_delayed_work(&dev->mib_read, 0);
>       \-- /* here port->slave is setup properly, scheduling the delayed work should be safe */
> 
> Solution:
> 1. Do not queue (only initialize) delayed work in ksz_init_mib_timer().
> 2. Only queue delayed work in ksz_mac_link_down() if init is completed.
> 3. Queue work once in ksz_switch_register(), after dsa_register_switch()
> has completed.
> 
> Fixes: 7c6ff470aa ("net: dsa: microchip: add MIB counter reading support")
> Signed-off-by: Christian Eggers <ceggers@...i.de>
> ---

Reviewed-by: Vladimir Oltean <olteanv@...il.com>

You forgot to copy Florian's review tag from v1.
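
For readers following along, the three steps of the solution can be modelled in plain userspace C. This is only a sketch under assumptions, not the patch itself: every sketch_* name is hypothetical, the 30000 value is a placeholder, and I am assuming that mib_read_interval doubles as the "init completed" flag, as the later remark about ksz_switch_remove() suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins; these are NOT the kernel's delayed_work types. */
struct sketch_work {
	bool initialized;
	bool queued;
	unsigned long delay_ms;
};

struct sketch_switch {
	struct sketch_work mib_read;
	unsigned long mib_read_interval;	/* 0 until registration is done */
};

/* Step 1: only initialize the work, do not queue it. */
static void sketch_init_mib_timer(struct sketch_switch *dev)
{
	dev->mib_read.initialized = true;
	dev->mib_read.queued = false;
}

/* Like schedule_delayed_work(): a no-op when the work is already pending. */
static void sketch_schedule(struct sketch_switch *dev, unsigned long delay_ms)
{
	assert(dev->mib_read.initialized);
	if (dev->mib_read.queued)
		return;
	dev->mib_read.queued = true;
	dev->mib_read.delay_ms = delay_ms;
}

/* Step 2: queue from the link-down path only once init has completed. */
static void sketch_mac_link_down(struct sketch_switch *dev)
{
	if (dev->mib_read_interval)
		sketch_schedule(dev, 0);
}

/* Step 3: queue once, after the (modelled) switch registration returns. */
static int sketch_switch_register(struct sketch_switch *dev)
{
	dev->mib_read_interval = 0;
	sketch_init_mib_timer(dev);	/* runs during setup */
	sketch_mac_link_down(dev);	/* CPU port link setup: must not queue */
	if (dev->mib_read.queued)
		return -1;

	dev->mib_read_interval = 30000;	/* hypothetical placeholder value */
	sketch_schedule(dev, 0);
	return 0;
}
```

The point of the ordering is that nothing can run the work while the slave net devices are still missing, because the only place that queues it is reached after registration.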

> v2:
> ---------
> - no changes in the patch itself
> - use correct subject-prefix
> - changed wording of commit description
> - added call tree to commit description
> - added "Fixes:" tag
> 
[...]
> 		/* Only read MIB counters when the port is told to do.
> 		 * If not, read only dropped counters when link is not up.
> 		 */
> port_r_cnt() is called independently of p->read and netif_carrier_ok()... What
> is correct here (comment or code)?

port_r_cnt() iterates with mib->cnt_ptr through 2 loops.
Check how mib->cnt_ptr is set before port_r_cnt() is called.
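
To make that concrete, here is a simplified userspace model of the two-loop structure. It is a sketch from my reading of the driver, not a copy of it: the MIBSK_* limits and the mibsk_* names are hypothetical, and I am assuming the caller moves cnt_ptr past the first range when the port was not told to read and the link is down.

```c
/* Hypothetical limits: counters 0..2 belong to the first loop,
 * counters 3..4 (the dropped counters) to the second loop.
 */
#define MIBSK_REG_CNT	3
#define MIBSK_TOTAL_CNT	5

struct mibsk_mib {
	int cnt_ptr;
	int reads[MIBSK_TOTAL_CNT];	/* how often each counter was read */
};

/* Model of port_r_cnt(): always called, but where it starts depends
 * on the value of cnt_ptr on entry.
 */
static void mibsk_port_r_cnt(struct mibsk_mib *mib)
{
	/* First loop: skipped entirely if cnt_ptr starts at MIBSK_REG_CNT. */
	while (mib->cnt_ptr < MIBSK_REG_CNT) {
		mib->reads[mib->cnt_ptr]++;
		mib->cnt_ptr++;
	}

	/* Second loop: the dropped counters, read in either case. */
	while (mib->cnt_ptr < MIBSK_TOTAL_CNT) {
		mib->reads[mib->cnt_ptr]++;
		mib->cnt_ptr++;
	}

	mib->cnt_ptr = 0;
}

/* Model of the caller (assumption): advance cnt_ptr before the call
 * when no read was requested and the carrier is down.
 */
static void mibsk_read_port(struct mibsk_mib *mib, int read_requested,
			    int carrier_ok)
{
	if (!read_requested && !carrier_ok)
		mib->cnt_ptr = MIBSK_REG_CNT;
	mibsk_port_r_cnt(mib);
}
```

Under these assumptions the comment and the code agree: the function is called unconditionally, but the starting value of cnt_ptr decides whether only the dropped counters are read.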

> I needed some amount of time to understand the segfault and to draw the
> call stack...

I'm sure you did.

> I am definitely not an expert for this driver. For starting/stopping the
> delayed work on demand, a separate work struct for each port could be useful.
> In this case, struct ksz_port would need a pointer to the ksz_device struct,
> as the ports are allocated separately and container_of() cannot be used.

Me neither, I'm just a spectator.
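
For illustration only, the layout problem described above can be shown in a small userspace sketch. All bp_* names are hypothetical and bp_container_of() is a local stand-in for the kernel macro; it assumes, as the quoted text says, that the port array is allocated separately from the device struct.

```c
#include <stddef.h>
#include <stdlib.h>

/* Local stand-in for the kernel's container_of(). */
#define bp_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct bp_work { int pending; };

struct bp_device;

struct bp_port {
	struct bp_work mib_read;	/* hypothetical per-port work */
	struct bp_device *dev;		/* back-pointer to the owning device */
};

struct bp_device {
	struct bp_work mib_read;	/* embedded, so container_of() works */
	struct bp_port *ports;		/* separate allocation */
	int port_cnt;
};

/* Device-wide work: the device is recovered from the embedded member. */
static struct bp_device *bp_dev_from_dev_work(struct bp_work *w)
{
	return bp_container_of(w, struct bp_device, mib_read);
}

/* Per-port work: container_of() only reaches the port, because the
 * port does not live inside the device; the back-pointer does the rest.
 */
static struct bp_device *bp_dev_from_port_work(struct bp_work *w)
{
	struct bp_port *p = bp_container_of(w, struct bp_port, mib_read);

	return p->dev;
}

static int bp_alloc_ports(struct bp_device *dev, int n)
{
	dev->ports = calloc((size_t)n, sizeof(*dev->ports));
	if (!dev->ports)
		return -1;
	dev->port_cnt = n;
	for (int i = 0; i < n; i++)
		dev->ports[i].dev = dev;
	return 0;
}
```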

> Using a bool variable has the property that reading the MIB will not be
> performed "immediately" after phylink_mac_down(). But if I am correct, this
> is also not the case today as the work is typically already queued when
> ksz_mac_link_down() is executed.
> 
> - First call of ksz_mac_link_down:
> Work is already queued (prior to this patch) or will not be queued (after this
> patch).
> 
> - Further calls:
> Work is already queued (it requeues itself).
> 
> Result (please verify):

I can't verify this. Please ask the Microchip people. But the fix makes
sense.

> - Not scheduling the work in ksz_mac_link_down() won't change anything.
> - Checking for mib_read_interval in ksz_switch_remove() can be omitted,
>   as the condition is always true when ksz_switch_remove() is called.

If there's an error in the probe path, I expect that the
mib_read_interval will not get set, and the delayed work will not
be scheduled, will it? So I think the check is ok there.
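
My reasoning, as a userspace sketch. The rm_* names and the 30000 value are hypothetical, and whether the remove path can really be reached in that state is something the Microchip people would have to confirm.

```c
#include <stdbool.h>

struct rm_switch {
	bool work_queued;
	unsigned long mib_read_interval;	/* stays 0 if probe failed */
	int cancel_calls;			/* counts modelled cancel calls */
};

/* Modelled probe: on failure the interval is never set and the work
 * is never queued.
 */
static int rm_probe(struct rm_switch *dev, bool fail_registration)
{
	dev->mib_read_interval = 0;
	dev->work_queued = false;

	if (fail_registration)
		return -1;

	dev->mib_read_interval = 30000;	/* hypothetical placeholder value */
	dev->work_queued = true;
	return 0;
}

/* Modelled remove: cancel only work that could have been queued. */
static void rm_remove(struct rm_switch *dev)
{
	if (dev->mib_read_interval) {
		dev->cancel_calls++;
		dev->work_queued = false;
	}
}
```

With the check in place, a teardown after a failed probe does not touch work that was never queued.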

Thanks,
-Vladimir
