lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 17 Feb 2022 08:16:44 +0000
From:   Alvin Šipraga <ALSI@...g-olufsen.dk>
To:     Luiz Angelo Daros de Luca <luizluca@...il.com>
CC:     Vladimir Oltean <olteanv@...il.com>,
        Alvin Šipraga <alvin@...s.dk>,
        Linus Walleij <linus.walleij@...aro.org>,
        Andrew Lunn <andrew@...n.ch>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Arınç ÜNAL <arinc.unal@...nc9.com>,
        Michael Rasmussen <MIR@...g-olufsen.dk>,
        "open list:NETWORKING DRIVERS" <netdev@...r.kernel.org>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net-next 2/2] net: dsa: realtek: rtl8365mb: serialize
 indirect PHY register access

Luiz Angelo Daros de Luca <luizluca@...il.com> writes:

> Hi Vladimir,
>
>> This implementation where the indirect PHY access blocks out every other
>> register read and write is only justified if you can prove that you can
>> stuff just about any unrelated register read or write before
>> RTL8365MB_INDIRECT_ACCESS_READ_DATA_REG, and this, in and of itself,
>> will poison what gets read back from RTL8365MB_INDIRECT_ACCESS_READ_DATA_REG.
>
> I was the first one trying to fix this issue reported by Arinç with
> SMP devices. At first I thought it was caused by two parallel indirect
> access reads polling the interface (it was not using interrupts). With
> no lock, they will eventually collide and one reads the result of the
> other one. However, a simple lock over the indirect access didn't
> solve the issue. Alvin tested it much further to isolate that indirect
> register access is messed up by any other register read. The fails
> while polling the interface status or the other test Alvin created
> only manifests in a device with multiple cores and mine is single
> core. I do get something similar in a single core device by reading an
> unused register address but it is hard to blame Realtek when we are
> doing something we were not supposed to do. Anyway, that indicates
> that "reading a register" is not an atomic operation inside the switch
> asic.

I never observed any issue which suggests that switch register reads are
not atomic... I mean, they are (and always have been) protected by the
default regmap lock. So what makes you say this?

I have only seen issues related to PHY register access, please enlighten
us if there are other issues.

>
>> rtl8365mb_mib_counter_read() doesn't seem like a particularly good
>> example to prove this, since it appears to be an indirect access
>> procedure as well. Single register reads or writes would be ideal, like
>> RTL8365MB_CPU_CTRL_REG, artificially inserted into strategic places.
>> Ideally you wouldn't even have a DSA or MDIO or PHY driver running.
>> Just a simple kernel module with access to the regmap, and try to read
>> something known, like the PHY ID of one of the internal PHYs, via an
>> open-coded function. Then add extra regmap accesses and see what
>> corrupts the indirect PHY access procedure.
>
> The MIB might be just another example where the issue happens. It was
> first noticed with a SMP device without interruptions configured. I
> believe it will always fail with that configuration.

As I stated in the last thread, I tested MIB access and the problem did
not manifest itself there.

>
>> Are Realtek aware of this and do they confirm the issue? Sounds like
>> erratum material to me, and a pretty severe one, at that. Alternatively,
>> we may simply not be understanding the hardware architecture, like for
>> example the fact that MIB indirect access and PHY indirect access may
>> share some common bus and must be sequential w.r.t. each other.
>
> The realtek "API/driver" does exactly how the driver was doing. They
> do have a lock/unlock placeholder, but only in the equivalent
> regmap_{read,write} functions. Indirect access does not use locks at
> all (in fact, there is no other mention of "lock" elsewhere), even
> being obvious that it is not thread-safe. It was just with a DSA
> driver that we started to exercise register access for real, specially
> without interruptions. And even in that case, we could only notice
> this issue in multicore devices. I believe that, if they know about
> this issue, they might not be worried because it has never affected a
> real device. It would be very interesting to hear from Realtek but I
> do not have the contacts.

This is not true, at least with the sources I am reading. As I said in
my reply to Vladimir, the Realtek code takes a lock around each
top-level API call. Example:

rtk_api_ret_t rtk_port_phyStatus_get(...)
{
    rtk_api_ret_t retVal;

    if (NULL == RT_MAPPER->port_phyStatus_get)
        return RT_ERR_DRIVER_NOT_FOUND;

    RTK_API_LOCK();
    retVal = RT_MAPPER->port_phyStatus_get(port, pLinkStatus, pSpeed, pDuplex);
    RTK_API_UNLOCK();

    return retVal;
}

Deep down in this port_phyStatus_get() callback, the indirect PHY
register access takes place.

Kind regards,
Alvin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ