Message-ID: <Yv6z5HTyenpJ+pex@lunn.ch>
Date:   Thu, 18 Aug 2022 23:49:24 +0200
From:   Andrew Lunn <andrew@...n.ch>
To:     Vladimir Oltean <vladimir.oltean@....com>
Cc:     netdev@...r.kernel.org, Vivien Didelot <vivien.didelot@...il.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        Vladimir Oltean <olteanv@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Kevin Hilman <khilman@...nel.org>,
        Ulf Hansson <ulf.hansson@...aro.org>,
        Len Brown <len.brown@...el.com>, Pavel Machek <pavel@....cz>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [RFC PATCH net-next 00/10] Use robust notifiers in DSA

> I am posting this as RFC because something still feels off, but I can't
> exactly pinpoint what, and I'm looking for some feedback. Since most DSA
> switches are behind I/O protocols that can fail or time out (SPI, I2C,
> MDIO etc), everything can fail; that's a fact. On the other hand, when
> a network device or the entire system is torn down, nobody cares that
> SPI I/O failed - the system is still shutting down; that is also a fact.
> I'm not quite sure how to reconcile the two. On one hand we're
> suppressing errors emitted by DSA drivers in the non-robust form of
> notifiers, and on the other hand there's nothing we can do about them
> either way (upper layers don't necessarily care).

I would split it into two classes of errors:

Bus transactions fail. This very likely means the hardware design is
bad, connectors are loose, etc. There is not much we can do about
this; bad things are going to happen no matter what.

We have consumed all of some sort of resource. Out of memory, the ATU
is full, too many LAGs, etc. We try to roll back in order to get out
of this resource problem.

So I would say we don't care too much about -EIO or -ETIMEDOUT. For
-ENOMEM, -ENOBUFS, -EOPNOTSUPP or whatever, we should try to do a
robust rollback.
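
As a rough illustration of that split, here is a minimal user-space
sketch. The helper name err_wants_rollback() is hypothetical and not an
existing DSA or switchdev function, and treating every other negative
errno as "worth rolling back" is an assumption based on the "or
whatever" above.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether an error returned by a driver
 * operation is worth a robust rollback.
 */
static bool err_wants_rollback(int err)
{
	switch (err) {
	case -EIO:
	case -ETIMEDOUT:
		/* Bus transaction failed; undoing would hit the same bus. */
		return false;
	case -ENOMEM:
	case -ENOBUFS:
	case -EOPNOTSUPP:
		/* Resource exhausted or unsupported; try to undo. */
		return true;
	default:
		/* Assumption: any other negative errno is also rolled back. */
		return err < 0;
	}
}
```

A caller would check this after a failed notifier step and only walk
the undo path when it returns true.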

The original design of switchdev was two phase:

1) Allocate whatever resources are needed, can fail
2) Put those resources into use, must not fail

At some point that all got thrown away.
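
For reference, the two-phase idea can be sketched in plain user-space C
along these lines. This is not the old switchdev API; struct table,
TABLE_SIZE and the table_add_*() functions are made-up names, with the
fixed-size table standing in for something like the ATU.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define TABLE_SIZE 4	/* hypothetical capacity of the table */

struct entry {
	int key;
};

struct table {
	struct entry *slots[TABLE_SIZE];
	size_t used;		/* entries in use */
	size_t reserved;	/* prepared but not yet committed */
};

/* Phase 1: reserve a slot and allocate memory. This can fail. */
static int table_add_prepare(struct table *t, struct entry **out)
{
	struct entry *e;

	if (t->used + t->reserved >= TABLE_SIZE)
		return -ENOSPC;

	e = malloc(sizeof(*e));
	if (!e)
		return -ENOMEM;

	t->reserved++;
	*out = e;
	return 0;
}

/* Phase 2: put the prepared entry into use. Returns void: cannot fail. */
static void table_add_commit(struct table *t, struct entry *e, int key)
{
	e->key = key;
	t->reserved--;
	t->slots[t->used++] = e;
}

/* Drop a prepared entry when a later prepare step elsewhere failed. */
static void table_add_abort(struct table *t, struct entry *e)
{
	t->reserved--;
	free(e);
}
```

All the failure handling sits in the prepare step, so the commit step
needs no error path and there is nothing to roll back once it starts.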

	Andrew
