lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220326122552.GA31022@wunner.de>
Date:   Sat, 26 Mar 2022 13:25:52 +0100
From:   Lukas Wunner <lukas@...ner.de>
To:     Oleksij Rempel <o.rempel@...gutronix.de>
Cc:     Andrew Lunn <andrew@...n.ch>, Oliver Neukum <oneukum@...e.com>,
        Oleksij Rempel <linux@...pel-privat.de>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Heiner Kallweit <hkallweit1@...il.com>
Subject: Re: ordering of call to unbind() in usbnet_disconnect

On Tue, Mar 15, 2022 at 12:38:41PM +0100, Oleksij Rempel wrote:
> On Tue, Mar 15, 2022 at 09:32:34AM +0100, Lukas Wunner wrote:
> > > > > > On Thu, Mar 10, 2022 at 12:25:08PM +0100, Oliver Neukum wrote:
> > > > > > > I got bug reports that 2c9d6c2b871d ("usbnet: run unbind() before
> > > > > > > unregister_netdev()") is causing regressions.
> > 
> > Is it illegal to disconnect a PHY from an unregistered, but not yet freed
> > net_device?
> > 
> > Oleksij, the commit message of 2c9d6c2b871d says that disconnecting the
> > PHY "fails" in that situation.  Please elaborate what the failure looked
> > like.  Did you get a stacktrace?

Oleksij, I cannot reproduce your stacktrace (included in full length below).
I've tested with kernel 5.13 (since the stacktrace was with 5.13-rc3)
with all your (and other people's) asix patches applied on top,
except for 2c9d6c2b871d.  Tried unplugging an AX88772A multiple times,
never got a stacktrace.

I've also walked down the code paths from usbnet_disconnect() and cannot
see how the stacktrace could occur.

Normally an unregistering netdev is removed from the linkwatch event list
(lweventlist) via this call stack:

          usbnet_disconnect()
            unregister_netdev()
              rtnl_unlock()
                netdev_run_todo()
                  netdev_wait_allrefs()
                    linkwatch_forget_dev()
                      linkwatch_do_dev()

For the stacktrace to occur, the netdev would have to be subsequently
re-added to the linkwatch event list via linkwatch_fire_event().

That is called, among other places, from netif_carrier_off().  However,
netif_carrier_off() is already called *before* linkwatch_forget_dev()
when unregister_netdev() stops the netdev before unregistering it:

          usbnet_disconnect()
            unregister_netdev()
              unregister_netdevice()
                unregister_netdevice_queue(dev, NULL)
                  unregister_netdevice_many()
                    dev_close_many()
                      __dev_close_many()
                        usbnet_stop()
                          ax88772_stop()
                            phy_stop() # state = PHY_HALTED
                              phy_state_machine()
                                phy_link_down()
                                  phy_link_change()
                                    netif_carrier_off()
                                      linkwatch_fire_event()

Again, this is *before* the subsequent call to linkwatch_forget_dev().
I don't see how netif_carrier_off() could be called once more after that.

Perhaps the stacktrace occurred with an earlier version of your patches
which lacked the call to phy_stop() in ax88772_stop()?

I'm going to submit a revert of 2c9d6c2b871d.  If you *are* able to
reproduce the stacktrace after the revert, please add a dump_stack()
to linkwatch_fire_event() so that we can find the root cause.

Thanks,

Lukas

> 
> [   15.459655] asix 2-1.2:1.0 eth1: Link is Up - 100Mbps/Full - flow control off
> [   30.600242] usb 2-1.2: USB disconnect, device number 3
> [   30.611962] asix 2-1.2:1.0 eth1: unregister 'asix' usb-ci_hdrc.1-1.2, ASIX AX88772B USB 2.0 Ethernet
> [   30.649173] asix 2-1.2:1.0 eth1 (unregistered): Failed to write reg index 0x0000: -19
> [   30.657027] asix 2-1.2:1.0 eth1 (unregistered): Failed to write Medium Mode mode to 0x0000: ffffffed
> [   30.683006] asix 2-1.2:1.0 eth1 (unregistered): Link is Down
> [   30.689512] asix 2-1.2:1.0 eth1 (unregistered): Failed to write reg index 0x0000: -19
> [   30.697359] asix 2-1.2:1.0 eth1 (unregistered): Failed to enable software MII access
> [   30.706009] asix 2-1.2:1.0 eth1 (unregistered): Failed to write reg index 0x0000: -19
> [   30.714277] asix 2-1.2:1.0 eth1 (unregistered): Failed to enable software MII access
> [   30.732689] 8<--- cut here ---
> [   30.735757] Unable to handle kernel paging request at virtual address 2e839000
> [   30.742984] pgd = af824ad7
> [   30.745695] [2e839000] *pgd=00000000
> [   30.749282] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
> [   30.754602] Modules linked in:
> [   30.757663] CPU: 0 PID: 77 Comm: kworker/0:2 Not tainted 5.13.0-rc3-00818-g06edf1a940be #2
> [   30.765934] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> [   30.772466] Workqueue: events linkwatch_event
> [   30.776841] PC is at linkwatch_do_dev+0x6c/0x88
> [   30.781380] LR is at __local_bh_enable_ip+0x6c/0x100
> [   30.786356] pc : [<c08637f4>]    lr : [<c013e5c4>]    psr: 60030093
> [   30.792625] sp : c1d2bed8  ip : 00000001  fp : c28d2000
> [   30.797852] r10: c0fcf19c  r9 : c0fcf170  r8 : 00000000
> [   30.803080] r7 : c102f044  r6 : c1d2beec  r5 : 00000063  r4 : c28d2000
> [   30.809611] r3 : 00000000  r2 : 00000000  r1 : 2e839000  r0 : 60030013
> [   30.816140] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
> [   30.823367] Control: 10c5387d  Table: 12b54059  DAC: 00000051
> [   30.829114] Register r0 information: non-paged memory
> [   30.834174] Register r1 information: non-paged memory
> [   30.839231] Register r2 information: NULL pointer
> [   30.843941] Register r3 information: NULL pointer
> [   30.848649] Register r4 information: slab kmalloc-2k start c28d2000 pointer offset 0 size 2048
> [   30.857287] Register r5 information: non-paged memory
> [   30.862344] Register r6 information: non-slab/vmalloc memory
> [   30.868008] Register r7 information: non-slab/vmalloc memory
> [   30.873673] Register r8 information: NULL pointer
> [   30.878381] Register r9 information: non-slab/vmalloc memory
> [   30.884045] Register r10 information: non-slab/vmalloc memory
> [   30.889797] Register r11 information: slab kmalloc-2k start c28d2000 pointer offset 0 size 2048
> [   30.898516] Register r12 information: non-paged memory
> [   30.903662] Process kworker/0:2 (pid: 77, stack limit = 0xded42e9b)
> [   30.909935] Stack: (0xc1d2bed8 to 0xc1d2c000)
> [   30.914298] bec0:                                                       c28d22cc c0863a48
> [   30.922481] bee0: 00000000 c1d2a000 00000008 c1d2beec c1d2beec 73675768 c1d4da84 c0fcf170
> [   30.930663] bf00: c1ce0d80 ef6d28c0 ef6d5c00 00000000 00000000 c0fed000 ef6d28c0 c0863b8c
> [   30.938844] bf20: c0fcf170 c0155550 c1d2a000 c0a13fd4 ef6d28d8 c1ce0d80 ef6d28c0 c1ce0d94
> [   30.947026] bf40: ef6d28d8 c0f03d00 00000008 c1d2a000 ef6d28c0 c0155dc0 c1003e48 c0fec754
> [   30.955208] bf60: c1ce75e4 c1ce75c0 c1ce7fc0 c1d2a000 00000000 c1931eb4 c0155d5c c1ce0d80
> [   30.963389] bf80: c1ce75e4 c015bacc 00000000 c1ce7fc0 c015b954 00000000 00000000 00000000
> [   30.971570] bfa0: 00000000 00000000 00000000 c0100150 00000000 00000000 00000000 00000000
> [   30.979750] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [   30.987931] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> [   30.996114] [<c08637f4>] (linkwatch_do_dev) from [<c0863a48>] (__linkwatch_run_queue+0xe0/0x1f0)
> [   31.004917] [<c0863a48>] (__linkwatch_run_queue) from [<c0863b8c>] (linkwatch_event+0x34/0x3c)
> [   31.013540] [<c0863b8c>] (linkwatch_event) from [<c0155550>] (process_one_work+0x20c/0x5d0)
> [   31.021911] [<c0155550>] (process_one_work) from [<c0155dc0>] (worker_thread+0x64/0x570)
> [   31.030010] [<c0155dc0>] (worker_thread) from [<c015bacc>] (kthread+0x178/0x190)
> [   31.037421] [<c015bacc>] (kthread) from [<c0100150>] (ret_from_fork+0x14/0x24)
> [   31.044654] Exception stack(0xc1d2bfb0 to 0xc1d2bff8)
> [   31.049710] bfa0:                                     00000000 00000000 00000000 00000000
> [   31.057891] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [   31.066071] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [   31.072692] Code: e10f0000 f10c0080 e59432c8 ee1d1f90 (e7932001) 
> [   31.078788] ---[ end trace f80581862631ce84 ]---

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ