Message-ID: <CACcJQnSmQQtZoPAEOwAGsvaLHzb4_fU2dxrgAXDXAjHsgkaFhA@mail.gmail.com>
Date:	Wed, 29 Apr 2015 15:04:09 -0700
From:	Anuradha Karuppiah <anuradhak@...ulusnetworks.com>
To:	Scott Feldman <sfeldma@...il.com>
Cc:	"David S. Miller" <davem@...emloft.net>,
	Netdev <netdev@...r.kernel.org>,
	Roopa Prabhu <roopa@...ulusnetworks.com>,
	Andy Gospodarek <gospo@...ulusnetworks.com>,
	Wilson Kok <wkok@...ulusnetworks.com>
Subject: Re: [RFC PATCH net-next v3 0/4] net: Introduce IFF_PROTO_DOWN flag.

On Tue, Apr 28, 2015 at 5:28 PM, Scott Feldman <sfeldma@...il.com> wrote:
> On Tue, Apr 28, 2015 at 1:04 PM, Anuradha Karuppiah
> <anuradhak@...ulusnetworks.com> wrote:
>> On Tue, Apr 28, 2015 at 12:37 PM, Scott Feldman <sfeldma@...il.com> wrote:
>>> On Tue, Apr 28, 2015 at 8:39 AM, Anuradha Karuppiah
>>> <anuradhak@...ulusnetworks.com> wrote:
>>>>
>>>>
>>>> On Mon, Apr 27, 2015 at 10:45 PM, Scott Feldman <sfeldma@...il.com> wrote:
>>>>>
>>>>> On Mon, Apr 27, 2015 at 10:38 AM,  <anuradhak@...ulusnetworks.com> wrote:
>>>>> > From: Anuradha Karuppiah <anuradhak@...ulusnetworks.com>
>>>>> >
>>>>> > User space daemons can detect errors in the network that need to be
>>>>> > notified to the switch device drivers.
>>>>> >
>>>>> > Drivers can react to this error state by doing a phy-down on the
>>>>> > switch-port which would result in a carrier-off locally and on the
>>>>> > directly connected switch. Doing that would prevent loops and
>>>>> > black-holes in the network.
>>>>>
>>>>> (Sorry if this was asked earlier)
>>>>>
>>>>> Can the application simply send a SETLINK with IFF_UP clear and the
>>>>> port driver's ndo_stop would bring the PHY link down?
>>>>
>>>>
>>>> Yes, Clearing IFF_UP on detecting errors (PROTO_DOWN) is possible and we
>>>> tried that implementation as well. Unfortunately it failed because of the
>>>> following reasons -
>>>>
>>>> 1. There is no way to disambiguate between admin_down (!IFF_UP) and an
>>>> APP/driver enforced error_down (IFF_PROTO_DOWN). Administrator or
>>>> automation-scripts that monitor the config assumed that switch-port
>>>> configuration had somehow fallen out of sync (and attempted to reinstate the
>>>> admin_up repeatedly).
>>>>
>>>> 2. Automatic error recovery was not possible; consider the following
>>>> scenario for e.g.
>>>>    a. The MLAG peer-link is down so the MLAG app on the secondary switch has
>>>>       proto_down’ed all the MLAG ports (including switch-port swp1) by
>>>>       clearing IFF_UP.
>>>>    b. At the same time the administrator is in the process of making some
>>>>       changes on the network connected to swp1. To avoid doing it live he
>>>>       would admin_disable swp1 (!IFF_UP) by doing an "ip link set swp1 down"
>>>>       (this is a no-op as event #a has already cleared IFF_UP on swp1).
>>>>    c. If the MLAG peer-link recovers at this point the MLAG app on the
>>>>       secondary switch would try to automatically recover the MLAG ports
>>>>       by clearing proto_down (i.e. setting IFF_UP); including on swp1. Doing
>>>>       that overrides the administrator’s directive to keep swp1 admin_down.
>>>>       Overriding an admin-down in a live network can be very dangerous so it
>>>>       is not possible to do auto-error-recovery unless we have a way to
>>>>       disambiguate between the admin and error states
>>>
>>> That makes sense.
>>>
>>> Dang, this is so close to IFF_DORMANT.  The interface can be IFF_UP
>>> and link mode can be DORMANT.  Can the port driver kill PHY link if
>>> dev->flags&IFF_DORMANT in ndo_set_rx_mode()?  Would require
>>> IFF_DORMANT is included in dev->flags in __dev_change_flags().
>>
>> Yes, IFF_DORMANT does seem close to what is needed; in the current/standard
>> interpretation IFF_DORMANT keeps the switch port phy-up and running (and most
>> PDUs are also exchanged in the dormant state). Like you said we could
>> re-interpret IFF_DORMANT in this context to phy-down the switch-port;
>> unfortunately we are already using IFF_DORMANT as well (in its standard
>> interpretation)...
>
> That makes sense; best to not confuse IFF_DORMANT with this new need.
>
>> We are using the dormant mode (for the MLAG app itself) to hold the MLAG port
>> in a brief/transitional suspended state when the switch-port link/carrier up
>> happens. This has been done to co-ordinate states across the MLAG peer switches
>> and to ensure that egress port block masks are programmed on the peer switch
>> before transitioning the local switch port to an OPER_UP state. If we didn't do
>> that the dual-connected server would see duplicate packets every time a
>> link-down to link-up happened on an MLAG port.
>
> How can we see this in action?  I didn't find where the kernel egress
> blocks the port when dormant.  What are the requirements for a kernel
> port driver to support your MLAG app?  Is this MLAG app available
> somewhere?

Forwarding on local dormant switch ports is controlled implicitly by MSTP,
which puts !OPER_UP (OPER_DOWN, OPER_DORMANT) ports in an STP
disabled/blocking state. Egress traffic blocking is really needed on the peer
switch, to prevent traffic arriving over the peer link from being sent again
to the server. We are currently using ebtables for this purpose, so there are
no additional kernel requirements.
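
To make the !OPER_UP rule concrete, here is a minimal sketch in Python. It is
only an illustration of the rule described above, not code from the MSTP
daemon or the MLAG app; the function names are hypothetical. The only real
interface it relies on is the sysfs "operstate" attribute, which reports
strings such as "up", "down" and "dormant".

```python
from pathlib import Path


def read_operstate(ifname: str, sysfs_root: str = "/sys/class/net") -> str:
    """Return the kernel-reported operstate string for an interface."""
    return (Path(sysfs_root) / ifname / "operstate").read_text().strip()


def stp_should_block(operstate: str) -> bool:
    """Hypothetical helper: treat every state other than "up" as blocking.

    This mirrors the rule in the text: OPER_DOWN and OPER_DORMANT ports are
    kept in an STP disabled/blocking state.
    """
    return operstate.strip().lower() != "up"
```

On a Linux box this could be called as stp_should_block(read_operstate("swp1")),
where swp1 is just the example port name used earlier in the thread.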

Further details on the MLAG app can also be found at -
https://www.netdev01.org/sessions/23

We are actively working on consolidating the MLAG app and making it available
for everybody's use soon. Getting proto_down out was part of that process.

PROTO_DOWN also has other use cases - like a link-dampening app which can
monitor (and proto_down) flapping or otherwise-misbehaving switch ports and
attempt paced/periodic auto-recovery.
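
As a rough sketch of how such a dampening app could use a separate
proto_down state, here is a toy model in Python. Everything in it is an
assumption for illustration: the Port and Dampener classes, the thresholds
and the timing are hypothetical and are not part of the patch set or of any
kernel API. The point it tries to show is that recovery only ever clears
proto_down, so it cannot override an admin-down.

```python
from dataclasses import dataclass


@dataclass
class Port:
    name: str
    admin_up: bool = True      # models IFF_UP, owned by the administrator
    proto_down: bool = False   # models the proposed flag, owned by the app

    @property
    def carrier_allowed(self) -> bool:
        # The port may come up only if neither party holds it down.
        return self.admin_up and not self.proto_down


class Dampener:
    """Hypothetical policy: hold a port down after too many flaps."""

    def __init__(self, max_flaps: int = 3, window: float = 10.0,
                 hold: float = 30.0) -> None:
        self.max_flaps = max_flaps
        self.window = window
        self.hold = hold
        self._flaps: dict[str, list[float]] = {}
        self._held_until: dict[str, float] = {}

    def on_flap(self, port: Port, now: float) -> None:
        # Keep only the flaps that fall inside the sliding window.
        recent = [t for t in self._flaps.get(port.name, [])
                  if now - t < self.window]
        recent.append(now)
        self._flaps[port.name] = recent
        if len(recent) >= self.max_flaps:
            port.proto_down = True
            self._held_until[port.name] = now + self.hold

    def tick(self, port: Port, now: float) -> None:
        # Paced recovery: clear proto_down once the hold time has expired.
        # admin_up is never touched here.
        until = self._held_until.get(port.name)
        if until is not None and now >= until:
            port.proto_down = False
            del self._held_until[port.name]
            self._flaps[port.name] = []
```

If the administrator sets admin_up to False while the port is held down, a
later tick() clears proto_down but carrier_allowed stays False, which is the
behaviour that could not be achieved by toggling IFF_UP alone.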
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html