lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <697f7861-1365-2f1c-e983-ddbc661df1e4@gmail.com>
Date:   Wed, 30 Aug 2017 17:16:39 -0700
From:   David Daney <ddaney.cavm@...il.com>
To:     David Miller <davem@...emloft.net>, f.fainelli@...il.com
Cc:     netdev@...r.kernel.org, andrew@...n.ch, slash.tmp@...e.fr,
        marc_gonzalez@...madesigns.com, rmk+kernel@...linux.org.uk
Subject: Re: [PATCH net v2] net: phy: Correctly process PHY_HALTED in
 phy_stop_machine()

And of course I mess up my pretty picture, see below.

On 08/30/2017 05:13 PM, David Daney wrote:
> On 07/31/2017 05:28 PM, David Miller wrote:
>> From: Florian Fainelli <f.fainelli@...il.com>
>> Date: Fri, 28 Jul 2017 11:58:36 -0700
>>
>>> Marc reported that he was not getting the PHY library adjust_link()
>>> callback function to run when calling phy_stop() + phy_disconnect()
>>> which does not indeed happen because we set the state machine to
>>> PHY_HALTED but we don't get to run it to process this state past that
>>> point.
>>>
>>> Fix this with a synchronous call to phy_state_machine() in order to have
>>> the state machine actually act on PHY_HALTED, set the PHY device's link
>>> down, turn the network device's carrier off and finally call the
>>> adjust_link() function.
>>>
>>> Reported-by: Marc Gonzalez <marc_gonzalez@...madesigns.com>
>>> Fixes: a390d1f379cf ("phylib: convert state_queue work to delayed_work")
>>> Signed-off-by: Florian Fainelli <f.fainelli@...il.com>
>>> ---
>>> Changes in v2:
>>>
>>> - reword subject and commit message based on changes
>>> - dropped flush_scheduled_work() since it is redundant
>>
>> Applied and queued up for -stable, thanks.
>>
> 
> 
> This is broken.  Please revert.
> 
> Upstream commit 7ad813f20853 and in the stable branches as well.
> 
> When ndo_stop() is called we call:
> 
> 
>   phy_disconnect()
>      +---> phy_stop_interrupts() implies: phydev->irq = PHY_POLL;
>      +---> phy_stop_machine()
>      |      +---> phy_stop_machine()

s/phy_stop_machine/phy_state_machine/

The call that the offending patch adds.


>      |              +----> queue_delayed_work(): Work queued.
>      +--->phy_detach() implies: phydev->attached_dev = NULL;
> 
> Now at a later time the queued work does:
> 
>   phy_state_machine()
>      +---->netif_carrier_off(phydev->attached_dev): Oh no! It is NULL:
> 
> 
>   CPU 12 Unable to handle kernel paging request at virtual address
> 0000000000000048, epc == ffffffff80de37ec, ra == ffffffff80c7c
> Oops[#1]:
> CPU: 12 PID: 1502 Comm: kworker/12:1 Not tainted 4.9.43-Cavium-Octeon+ #1
> Workqueue: events_power_efficient phy_state_machine
> task: 80000004021ed100 task.stack: 8000000409d70000
> $ 0   : 0000000000000000 ffffffff84720060 0000000000000048 0000000000000004
> $ 4   : 0000000000000000 0000000000000001 0000000000000004 0000000000000000
> $ 8   : 0000000000000000 0000000000000000 00000000ffff98f3 0000000000000000
> $12   : 8000000409d73fe0 0000000000009c00 ffffffff846547c8 000000000000af3b
> $16   : 80000004096bab68 80000004096babd0 0000000000000000 80000004096ba800
> $20   : 0000000000000000 0000000000000000 ffffffff81090000 0000000000000008
> $24   : 0000000000000061 ffffffff808637b0
> $28   : 8000000409d70000 8000000409d73cf0 80000000271bd300 ffffffff80c7804c
> Hi    : 000000000000002a
> Lo    : 000000000000003f
> epc   : ffffffff80de37ec netif_carrier_off+0xc/0x58
> ra    : ffffffff80c7804c phy_state_machine+0x48c/0x4f8
> Status: 14009ce3        KX SX UX KERNEL EXL IE
> Cause : 00800008 (ExcCode 02)
> BadVA : 0000000000000048
> PrId  : 000d9501 (Cavium Octeon III)
> Modules linked in:
> Process kworker/12:1 (pid: 1502, threadinfo=8000000409d70000,
> task=80000004021ed100, tls=0000000000000000)
> Stack : 8000000409a54000 80000004096bab68 80000000271bd300 80000000271c1e00
>          0000000000000000 ffffffff808a1708 8000000409a54000 
> 80000000271bd300
>          80000000271bd320 8000000409a54030 ffffffff80ff0f00 
> 0000000000000001
>          ffffffff81090000 ffffffff808a1ac0 8000000402182080 
> ffffffff84650000
>          8000000402182080 ffffffff84650000 ffffffff80ff0000 
> 8000000409a54000
>          ffffffff808a1970 0000000000000000 80000004099e8000 
> 8000000402099240
>          0000000000000000 ffffffff808a8598 0000000000000000 
> 8000000408eeeb00
>          8000000409a54000 00000000810a1d00 0000000000000000 
> 8000000409d73de8
>          8000000409d73de8 0000000000000088 000000000c009c00 
> 8000000409d73e08
>          8000000409d73e08 8000000402182080 ffffffff808a84d0 
> 8000000402182080
>          ...
> Call Trace:
> [<ffffffff80de37ec>] netif_carrier_off+0xc/0x58
> [<ffffffff80c7804c>] phy_state_machine+0x48c/0x4f8
> [<ffffffff808a1708>] process_one_work+0x158/0x368
> [<ffffffff808a1ac0>] worker_thread+0x150/0x4c0
> [<ffffffff808a8598>] kthread+0xc8/0xe0
> [<ffffffff808617f0>] ret_from_kernel_thread+0x14/0x1c

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ