Message-ID: <20220521113857.48aeffbb@kernel.org>
Date: Sat, 21 May 2022 11:38:57 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Saeed Mahameed <saeedm@...dia.com>
Cc: Andrew Lunn <andrew@...n.ch>, netdev@...r.kernel.org,
linux@...linux.org.uk, olteanv@...il.com, hkallweit1@...il.com,
f.fainelli@...il.com, michael.chan@...adcom.com
Subject: Re: [RFC net-next] net: track locally triggered link loss
On Fri, 20 May 2022 22:08:34 -0700 Saeed Mahameed wrote:
> >> It's impossible from the driver level to know if a FW link event is
> >> due to configuration causes or external forces!
> >
> >You mean because FW or another entity (other than the local host) asked
> >for the link to be reset? How is that different from a switch taking it
> >down? Either way the host has lost link due to a non-local event. (3a) or (3b)
> >
>
> I was talking about (1) vs (2): when the IRQ/FW event arrives, how do
> you know what caused it? Maybe I just don't understand how you plan to
> use the new API when a re-config brings the link down.
>
> for example:
> driver_reconfig() {
> maybe_close_rings();
> exec_fw_command(); // link will flap, events are triggered asynchronously
> maybe_open_rings();
> }
>
> how do you wrap this with netif_carrier_change_start/end() when the
> link events are async?
Yeah :/ I was worried that we may need to do some queue flushing or
waiting in the _end() to make sure the event has arrived. Remains to
be seen.
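Roughly what I have in mind, as a sketch only: drv_priv, the ring
helpers and exec_fw_command() are the stand-ins from your example, the
completion is hypothetical, and netif_carrier_change_start/end() is the
pair proposed in this RFC:

/* sketch: bracket the re-config and wait for the async link-up
 * event before closing the bracket
 */
static void drv_fw_link_event(struct drv_priv *priv, bool link_up)
{
	if (link_up) {
		netif_carrier_on(priv->netdev);
		complete(&priv->link_up_done);
	} else {
		netif_carrier_off(priv->netdev);
	}
}

static void drv_reconfig(struct drv_priv *priv)
{
	netif_carrier_change_start(priv->netdev);

	maybe_close_rings(priv);
	reinit_completion(&priv->link_up_done);
	exec_fw_command(priv);		/* link flaps asynchronously */
	maybe_open_rings(priv);

	/* don't end the "local change" window until the async
	 * link-up event has actually arrived
	 */
	if (!wait_for_completion_timeout(&priv->link_up_done, HZ))
		netdev_warn(priv->netdev, "link did not come back up\n");

	netif_carrier_change_end(priv->netdev);
}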
> >> We should keep current carrier logic as is and add new state/counter
> >> to count real phy link state.
> >>
> >> netif_phy_link_down(netdev) {
> >> 	set_bit(__LINK_STATE_NOPHYLINK, &netdev->state);
> >> 	atomic_inc(&netdev->phy_link_down);
> >> 	netif_carrier_off(netdev);
> >> }
> >>
> >> netif_phy_link_up(netdev) {...}
> >>
> >> such API should be maintained by real HW device drivers.
> >
> >"phy_link_down" has a ring of "API v2 this time we'll get it right".
> >
>
> C'mon... the same goes for netif_carrier_local_changes_start/end and
> netif_carrier_admin_off().
Not exactly - admin_off() is different because it tells you local
vs non-local. That information is not provided by current APIs.
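To make it concrete, a rough sketch of what admin_off() could do (the
bit name and the counter are made up here, not what the patch adds):

void netif_carrier_admin_off(struct net_device *dev)
{
	/* remember the carrier loss was requested by the local host,
	 * so it can be told apart from a real PHY link loss
	 */
	set_bit(__LINK_STATE_CARRIER_LOCAL_OFF, &dev->state);
	atomic_inc(&dev->local_carrier_down_count);
	netif_carrier_off(dev);
}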
> >Does this differentiate between locally vs non-locally generated events?
>
> no
>
> >PTAL at the categorization in the commit message. There are three
> >classes of events, we need three counters. local vs non-local and
> >link went down vs flow was paused by SW are independent and overlapping.
> >Doesn't matter what the counters are called, translating between them
> >is basic math.
>
> OK, if you want to go with this I am fine, just let's not pepper the
> drivers with confusing single-purpose API calls to get some counters
> right; let's implement one multi-purpose infrastructure and use it
> where it's needed.
> It appears that all you need is for the driver to notify the stack when
> it's going down for a re-config and when it comes back up, and to put
> all the interesting heavy-lifting logic in the stack, e.g. link event
> classification, freezing/flushing TX queues, flushing offloaded
> resources (VXLAN, TLS, IPsec), etc.
> Currently all of the above is, and will keep being, duplicated in every
> driver, when all you need is a generic hint from the driver.
How would the core know which resources need to be re-programmed?
This seems hard: we can't even get caps properly expressed in most
places, let alone pull orchestration up into the stack.
Maybe I'm being unfair, but I can't think of many examples of reworks
pulling management logic out of NIC drivers. All I can think of is
my VXLAN rework. I'm happy to start nacking all the shim APIs which do
little to nothing in the core and just punt all requests to the drivers
or FW :/ I think we veered off course tho...
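FWIW the "basic math" I mean, with hypothetical counter names: if
carrier_down_total counts every carrier-off transition and
carrier_down_local counts the ones the local host triggered (admin
down, re-config), then

	carrier_down_nonlocal = carrier_down_total - carrier_down_local

and whether the flow was merely paused by SW is the independent second
axis, hence the third counter.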