lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87o83eg170.fsf@bang-olufsen.dk>
Date:   Thu, 10 Feb 2022 16:01:55 +0000
From:   Alvin Šipraga <ALSI@...g-olufsen.dk>
To:     Luiz Angelo Daros de Luca <luizluca@...il.com>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Linus Walleij <linus.walleij@...aro.org>,
        Arınç ÜNAL <arinc.unal@...nc9.com>
Subject: Re: net: dsa: realtek: silent indirect reg read failures on SMP

Hi again Luiz,

Alvin Šipraga <ALSI@...g-olufsen.dk> writes:

> Hi Luiz,
>
> Thanks for the info.
>
> Luiz Angelo Daros de Luca <luizluca@...il.com> writes:
>

<snip>

>>
>> Sorry but I believe my non-SMP device is not "affected enough" to help
>> debug this issue. Is your device SMP?
>
> Yes, it's SMP and also with PREEMPT_RT ;-)
>
> But no problem, I will look into this and get back to you in this thread
> if I need any more feedback. Thanks for your help.

So I had a closer look this afternoon. First some general statements.

1. Generally the switch driver is not accessing registers unless an
   administrator is reconfiguring the interfaces
2. The exception to the above rule is the delayed_work for stats64,
   which you can see in the code is there because get_stats64 runs in
   atomic context. This runs every 3 seconds.
3. Without the interrupt from the switch enabled, the PHY subsystem
   resorts to polling the PHY registers (every 1 second I think you
   said).

As a result of (2) and (3) above, you are bound at some point to have
both the stats work and the PHY register read callbacks executing in
parallel. As I mentioned in my last email, a simple busy loop with
phytool would return some nonsense pretty quickly. On my SMP/PREEMPT_RT
system this happens every 3 seconds while everything else is idle.

I tried to disable the stats delayed_work just to see, and in this case
I did not observe any PHY read issues. The PHY register value was always
as expected.

In that setup I then tried to provoke the error again, this time by
reading a single register with dd via regmap debugfs. And while it's
unlikely for a single such read to interfere with my busy phytool loop,
putting the dd read in a tight loop almost immediately provokes the same
bug. This time I noticed that the value returned by phytool is the same
value that I read out with dd from the other non-PHY-related register.

In general what I found is that if we read from an arbitrary register A
and this read coincides with the multi-register access in
rtl8365mb_phy_ocp_read, then the final read from
RTL8365MB_INDIRECT_ACCESS_READ_DATA_REG will always return the value in
register A. It is quite reliably the case in all my testing, and it
explains the nonsense values we sometimes happened to see during PHY
register access, because of the stats work going on in the
background. Probably we got some MIB counter data and this corrupted the
PHY register value held in the INDIRECT_ACCESS_READ_DATA_REG register.

I am not sure why this happens - likely some peculiarity of the ASIC
hardware - but I wanted to check if this is also the case for the MIB
register access, because we also have a kind of indirect lookup
algorithm there. But in that case I did not see any corruption of the
data in the MIB counter data registers (RTL8365MB_MIB_COUNTER_REG(_x)).

So my conclusion is that this problem is unique to the indirect PHY
register access pattern. It should be pointed out that the regmap is
already protected by a lock, so I don't expect to see any weird data
races for non-PHY register access.

One more thing I wanted to point out: you mentioned that on your system
you conducted multiple phytool read loops and did not observe any
issues. I think this is easily explained by some higher-level locking
in the kernel which prevents concurrent PHY register reads.

****

With all of that said, I think the solution here is simply to guard
against stray register access while we are in the indirect PHY register
access callbacks. You also posted a patch to forego the whole indirect
access method for MDIO-connected switches, and I think that is also a
good thing. My reply to that patch was just taking issue with your
explanation, both because the diagnosis of the bug was rather nebulous,
and also because it did not actually fix the bug - it just worked around
it.

I will take it upon myself to fix this issue of indirect PHY register
access yielding corrupt values and post a patch to the list next week.
I already have a quick-n-dirty change which ensures proper locking and
now I cannot reproduce the issue for several hours.

In the mean time, could you resend your MDIO direct-PHY-register-access
patch and I will give it one more review. Please do not suggest that it
is a fix for this bug (because it's not) -- better yet, just add a Link:
to this thread and explain why you bothered implementing it to begin
with. You can mention that the issue is not seen with direct-access
(which also corroborates our findings here). Then I will base my changes
on your patch.

Alternatively you can drop the patch and we can just fix the indirect
access wholesale - both for SMI and MDIO. That would mean adding less
code (since MDIO with indirect access also works), albeit at the expense
of some technically unnecessary gymnastics in the driver (since MDIO
direct access is simpler). But I'll leave that up to you :-)

What do you think?

Kind regards,
Alvin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ