Date:   Mon, 23 Aug 2021 17:29:53 +0300
From:   Vladimir Oltean <olteanv@...il.com>
To:     Ido Schimmel <idosch@...sch.org>
Cc:     Nikolay Aleksandrov <nikolay@...dia.com>,
        Vladimir Oltean <vladimir.oltean@....com>,
        netdev@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Roopa Prabhu <roopa@...dia.com>, Andrew Lunn <andrew@...n.ch>,
        Florian Fainelli <f.fainelli@...il.com>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Vadym Kochan <vkochan@...vell.com>,
        Taras Chornyi <tchornyi@...vell.com>,
        Jiri Pirko <jiri@...dia.com>, Ido Schimmel <idosch@...dia.com>,
        UNGLinuxDriver@...rochip.com,
        Grygorii Strashko <grygorii.strashko@...com>,
        Marek Behun <kabel@...ckhole.sk>,
        DENG Qingfang <dqfext@...il.com>,
        Kurt Kanzenbach <kurt@...utronix.de>,
        Hauke Mehrtens <hauke@...ke-m.de>,
        Woojung Huh <woojung.huh@...rochip.com>,
        Sean Wang <sean.wang@...iatek.com>,
        Landen Chao <Landen.Chao@...iatek.com>,
        Claudiu Manoil <claudiu.manoil@....com>,
        Alexandre Belloni <alexandre.belloni@...tlin.com>,
        George McCollister <george.mccollister@...il.com>,
        Ioana Ciornei <ioana.ciornei@....com>,
        Saeed Mahameed <saeedm@...dia.com>,
        Leon Romanovsky <leon@...nel.org>,
        Lars Povlsen <lars.povlsen@...rochip.com>,
        Steen Hegelund <Steen.Hegelund@...rochip.com>,
        Julian Wiedmann <jwi@...ux.ibm.com>,
        Karsten Graul <kgraul@...ux.ibm.com>,
        Heiko Carstens <hca@...ux.ibm.com>,
        Vasily Gorbik <gor@...ux.ibm.com>,
        Christian Borntraeger <borntraeger@...ibm.com>,
        Ivan Vecera <ivecera@...hat.com>,
        Vlad Buslov <vladbu@...dia.com>,
        Jianbo Liu <jianbol@...dia.com>,
        Mark Bloch <mbloch@...dia.com>, Roi Dayan <roid@...dia.com>,
        Tobias Waldekranz <tobias@...dekranz.com>,
        Vignesh Raghavendra <vigneshr@...com>,
        Jesse Brandeburg <jesse.brandeburg@...el.com>
Subject: Re: [PATCH v2 net-next 0/5] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE
 blocking

On Mon, Aug 23, 2021 at 03:16:48PM +0300, Ido Schimmel wrote:
> I was thinking about the following case:
>
> t0 - <MAC1,VID1,P1> is added in syscall context under 'hash_lock'
> t1 - br_fdb_delete_by_port() flushes entries under 'hash_lock' in
>      response to STP state. Notifications are added to 'deferred' list
> t2 - switchdev_deferred_process() is called in syscall context
> t3 - <MAC1,VID1,P1> is notified as blocking
>
> Updates to the SW FDB are protected by 'hash_lock', but updates to the
> HW FDB are not. In this case, <MAC1,VID1,P1> does not exist in SW, but
> it will exist in HW.
>
> Another case assuming switchdev_deferred_process() is called first:
>
> t0 - switchdev_deferred_process() is called in syscall context
> t1 - <MAC1,VID,P1> is learned under 'hash_lock'. Notification is added
>      to 'deferred' list
> t2 - <MAC1,VID1,P1> is modified in syscall context under 'hash_lock' to
>      <MAC1,VID1,P2>
> t3 - <MAC1,VID1,P2> is notified as blocking
> t4 - <MAC1,VID1,P1> is notified as blocking (next time the 'deferred'
>      list is processed)
>
> In this case, the HW will have <MAC1,VID1,P1>, but SW will have
> <MAC1,VID1,P2>
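
To make the two quoted timelines concrete, here is a minimal single-threaded replay in userspace C. It is only a sketch under stated assumptions, not bridge or switchdev code: the FDB is reduced to one <MAC1,VID1> entry whose only attribute is its port, and every name (sw_update, hw_notify, defer_notify, process_deferred, replay_*) is hypothetical. It models just the ordering of the SW update and the HW notification, not the locking itself.

```c
#include <assert.h>

#define NO_ENTRY	0	/* port value meaning "entry absent" */
#define MAX_DEFERRED	8

/* Hypothetical model: the port of <MAC1,VID1> in SW and in HW. */
static int sw_port, hw_port;
static int deferred[MAX_DEFERRED];
static int n_deferred;

/* SW FDB update, which in the real bridge happens under 'hash_lock'. */
static void sw_update(int port)
{
	sw_port = port;
}

/* Blocking notification to HW, issued outside of 'hash_lock'. */
static void hw_notify(int port)
{
	hw_port = port;
}

/* Atomic context cannot sleep, so the HW update is queued for later. */
static void defer_notify(int port)
{
	assert(n_deferred < MAX_DEFERRED);
	deferred[n_deferred++] = port;
}

/* Stand-in for draining the 'deferred' list in sleepable context. */
static void process_deferred(void)
{
	for (int i = 0; i < n_deferred; i++)
		hw_notify(deferred[i]);
	n_deferred = 0;
}

static void reset(void)
{
	sw_port = hw_port = NO_ENTRY;
	n_deferred = 0;
}

/* First case: returns 1 if SW has no entry while HW still has P1. */
int replay_first_case(void)
{
	reset();
	sw_update(1);			/* t0: add in syscall context */
	sw_update(NO_ENTRY);		/* t1: flush by port ... */
	defer_notify(NO_ENTRY);		/* t1: ... deletion is deferred */
	process_deferred();		/* t2: deletion reaches HW first */
	hw_notify(1);			/* t3: the add is notified last */
	return sw_port == NO_ENTRY && hw_port == 1;
}

/* Second case: returns 1 if SW has P2 while HW ends up with P1. */
int replay_second_case(void)
{
	reset();
	process_deferred();		/* t0: list is still empty */
	sw_update(1);			/* t1: learned in atomic context */
	defer_notify(1);		/* t1: notification is deferred */
	sw_update(2);			/* t2: modified in syscall context */
	hw_notify(2);			/* t3: P2 notified as blocking */
	process_deferred();		/* t4: stale P1 overwrites P2 in HW */
	return sw_port == 2 && hw_port == 1;
}
```

Both replay functions return 1 in this model, i.e. SW and HW disagree at the end of each timeline, because the HW notification is ordered independently of the SW update.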

Ok, so if the hardware FDB entry needs to be updated under the same
hash_lock as the software FDB entry, then it seems that the goal of
updating the hardware FDB synchronously and in a sleepable manner can
only be achieved if the data path defers the learning to sleepable
context too. That in turn means that there will be 'dead time' between
the reception of a packet from a given {MAC SA, VID} flow and the
learning of that address. I don't think that is really desirable, so I
don't know if it is actually realistic to do this.

Can we drop it from the requirements of this change, or do you feel like
it's not worth it to make my change if this problem is not solved?

There is of course the option of going half-way too, just like for
SWITCHDEV_PORT_ATTR_SET. You notify it once, synchronously, on the
atomic chain, the switchdev driver throws as many errors as it
reasonably can, then you defer the actual installation, which means a
hardware access.
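
As a rough illustration of that half-way option, the sketch below splits an FDB add into a synchronous validation step and a deferred installation step. It is plain userspace C under stated assumptions, not the switchdev API: struct fdb_req, fdb_validate_atomic(), fdb_add_notify(), fdb_install_deferred() and the validation rules are all hypothetical. Note that it only gives the caller early error reporting; it does not by itself order the deferred installation against later SW FDB updates.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical request describing one FDB entry to install. */
struct fdb_req {
	unsigned char mac[6];
	unsigned short vid;
	int port;
};

#define QUEUE_LEN 16

static struct fdb_req queue[QUEUE_LEN];	/* pending installations */
static size_t q_len;
static size_t hw_installed;		/* counts simulated HW writes */

/*
 * Phase 1: checks that need no hardware access, so they could run on
 * an atomic notifier chain. The rules themselves are made up.
 */
static int fdb_validate_atomic(const struct fdb_req *req)
{
	if (req->vid == 0 || req->vid >= 4095)
		return -EINVAL;		/* VID outside 1..4094 */
	if (req->mac[0] & 1)
		return -EOPNOTSUPP;	/* multicast MAC, assumed unsupported */
	return 0;
}

/* Synchronous entry point: report errors now, queue the real work. */
int fdb_add_notify(const struct fdb_req *req)
{
	int err;

	assert(req != NULL);
	err = fdb_validate_atomic(req);
	if (err)
		return err;
	if (q_len == QUEUE_LEN)
		return -ENOSPC;
	queue[q_len++] = *req;
	return 0;
}

/*
 * Phase 2: runs later in sleepable context and stands in for the
 * actual hardware access. Returns how many entries were installed.
 */
size_t fdb_install_deferred(void)
{
	size_t n = q_len;

	hw_installed += n;
	q_len = 0;
	return n;
}
```

A caller would see -EINVAL or -EOPNOTSUPP immediately from fdb_add_notify(), while any failure of the hardware access itself could still only surface when fdb_install_deferred() runs.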
