lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220716123608.chdzbvpinso546oh@skbuf>
Date:   Sat, 16 Jul 2022 15:36:08 +0300
From:   Vladimir Oltean <olteanv@...il.com>
To:     "Russell King (Oracle)" <linux@...linux.org.uk>
Cc:     Andrew Lunn <andrew@...n.ch>,
        Heiner Kallweit <hkallweit1@...il.com>,
        Alexandre Belloni <alexandre.belloni@...tlin.com>,
        Alvin __ipraga <alsi@...g-olufsen.dk>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Claudiu Manoil <claudiu.manoil@....com>,
        Daniel Scally <djrscally@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        DENG Qingfang <dqfext@...il.com>,
        Eric Dumazet <edumazet@...gle.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        George McCollister <george.mccollister@...il.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Hauke Mehrtens <hauke@...ke-m.de>,
        Heikki Krogerus <heikki.krogerus@...ux.intel.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Kurt Kanzenbach <kurt@...utronix.de>,
        Landen Chao <Landen.Chao@...iatek.com>,
        Linus Walleij <linus.walleij@...aro.org>,
        linux-acpi@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-mediatek@...ts.infradead.org,
        Matthias Brugger <matthias.bgg@...il.com>,
        netdev@...r.kernel.org, Paolo Abeni <pabeni@...hat.com>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Sakari Ailus <sakari.ailus@...ux.intel.com>,
        Sean Wang <sean.wang@...iatek.com>,
        UNGLinuxDriver@...rochip.com,
        Vivien Didelot <vivien.didelot@...il.com>,
        Woojung Huh <woojung.huh@...rochip.com>,
        Marek BehĂșn <kabel@...nel.org>
Subject: Re: [PATCH net-next 3/6] net: dsa: add support for retrieving the
 interface mode

On Sat, Jul 16, 2022 at 12:13:55PM +0100, Russell King (Oracle) wrote:
> > > and is exactly what I'm trying to get rid of, so we have _consistency_
> > > in the implementation, to prevent fuckups like I've created by
> > > converting many DSA drivers to use phylink_pcs. Any DSA driver that
> > > used a PCS for the DSA or CPu port and has been converted to
> > > phylink_pcs support has been broken in the last few kernel cycles. I'm
> > > trying to address that breakage before converting the Marvell DSA
> > > driver - which is the driver that highlighted the problem.
> > 
> > You are essentially saying that it's of no use to keep in DSA the
> > fallback logic of not registering with phylink, because the phylink_pcs
> > conversions have broken the defaulting functionality already in all
> > other drivers.
> 
> Correct, and I don't want these exceptions precisely because it creates
> stupid mistakes like the one I've highlighted. If we have one way of
> doing something, then development becomes much easier. When we have
> multiple different ways, mistakes happen, stuff breaks.

Agree that when we have multiple different ways, mistakes happen and
stuff breaks. This is also why I want to avoid the proliferation of the
default_interface reporting when most of the drivers were just fine
without it. It needs to be opt-in, and checking for the presence of a
dedicated function is an easy way to check for that opt-in.

> > I may have missed something, but this is new information to me.
> > Specifically, before you've said that it is *this* patch set which would
> > risk introducing breakage (by forcing a link down + a phylink creation).
> > https://lore.kernel.org/netdev/YsCqFM8qM1h1MKu%2F@shell.armlinux.org.uk/
> > What you're saying now directly contradicts that.
> 
> No, it is not contradictory at all.
> 
> There is previous breakage caused by converting DSA drivers to
> phylink_pcs - because the PCS code will *not* be called where a driver
> makes use of the "default-to-fastest-speed" mechanism, so is likely
> broken. Some drivers work around this by doing things manually.

Marvell conversion to phylink_pcs doesn't count as "previous" breakage
if it's not in net-next.

> This series *also* risks breakage, because it means phylink gets used
> in more situations which could confuse drivers - such as those which
> have manually brought stuff up and are not expecting phylink to also
> do it. Or those which do not fill in the mac_capabilities but do
> default to this "fastest-speed" mechanism. Or some other unforseen
> scenario.

I don't exactly understand the mechanics through which a phylink_pcs
conversion would break a driver that used defaulting behavior,
but I can only imagine it involves moving code that didn't depend on
phylink, to phylink callbacks. It's hard to say whether that happened
for any of the phylink_pcs conversions in net-next.

But drivers could also have their CPU port working simply because those
are internal to an SoC and don't need any software configuration to pass
traffic. In their case, there is no breakage caused by the phylink_pcs
conversion, but breakage caused by sudden registration of phylink is
plausible, if phylink doesn't get the link parameters right.

And that breakage is preventable. Gradually more drivers could be
converted to create a fixed-link software node by printing a warning
that they should, and still keep the logic to avoid phylink registration
and putting the respective port down. Driver authors might not be very
responsive to RFC patch sets, but they do look at new warnings in dmesg
and try to see what they're about.

What I'm missing is the proof that the phylink_pcs conversion has broken
those kinds of switches, and that it's therefore pointless to keep the
existing logic for them. Sorry, but you didn't provide it.

> So there's known breakage - which we know because we've tripped over
> the issue with mv88e6xxx pcs conversion which isn't in net-next yet,
> but illustrates the issue, and there's unknown breakage through
> applying this patch and not having had test feedback from all the
> DSA driver authors.
> 
> There are no contradiction there what so ever.
> 
> > Do you have concrete evidence that there is actually any regression of
> > this kind introduced by prior phylink_pcs conversions? Because if there
> > is, I retract the proposal to keep the fallback logic.
> 
> Since we have no idea which DSA drivers make use of this "default-to-
> fastest-speed", and Andrew who has been promoting this doesn't seem to
> know which drivers make use of this, I do not, but we know that the
> breakage does occur if someone does make use of this with one of the
> converted DSA drivers through the behaviour of mv88e6xxx. Just because
> no one has reported it yet does not mean it doesn't exist. Not everyone
> updates their kernels each time Linus releases a final kernel version,
> especially in the embedded world. Sometimes it can take until a LTS
> release until people move forward, which we think will be 5.21.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ