lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <259f9a02-ad13-838e-00e2-4742f6a69a86@redhat.com>
Date:   Tue, 27 Nov 2018 13:12:08 -0500
From:   Jonathan Toppins <jtoppins@...hat.com>
To:     Toni Peltonen <peltzi@...tzi.fi>, netdev@...r.kernel.org
Cc:     jay.vosburgh@...onical.com
Subject: Re: [PATCH v2 net] bonding: fix 802.3ad state sent to partner when
 unbinding slave

On 11/27/2018 09:56 AM, Toni Peltonen wrote:
> Previously when unbinding a slave the 802.3ad implementation only told
> partner that the port is not suitable for aggregation by setting the port
> aggregation state from aggregatable to individual. This is not enough. If the
> physical layer still stays up and we only unbinded this port from the bond there
> is nothing in the aggregation status alone to prevent the partner from sending
> traffic towards us. To ensure that the partner doesn't consider this
> port at all anymore we should also disable collecting and distributing to
> signal that this actor is going away. Also clear AD_STATE_SYNCHRONIZATION to
> ensure partner exits collecting + distributing state.
> 
> I have tested this behaviour againts Arista EOS switches with mlx5 cards
> (physical link stays up even when interface is down) and simulated
> the same situation virtually Linux <-> Linux with two network namespaces
> running two veth device pairs. In both cases setting aggregation to
> individual doesn't alone prevent traffic from being to sent towards this
> port given that the link stays up in partners end. Partner still keeps
> it's end in collecting + distributing state and continues until timeout is
> reached. In most cases this means we are losing the traffic partner sends
> towards our port while we wait for timeout. This is most visible with slow
> periodic time (LACP rate slow).
> 
> Other open source implementations like Open VSwitch and libreswitch, and
> vendor implementations like Arista EOS, seem to disable collecting +
> distributing to when doing similar port disabling/detaching/removing change.
> With this patch kernel implementation would behave the same way and ensure
> partner doesn't consider our actor viable anymore.
> 
> Signed-off-by: Toni Peltonen <peltzi@...tzi.fi>
> ---
> v2 changes:
> * Fix typo in commit message
> * Also clear AD_STATE_SYNCHRONIZATION
> ---
>  drivers/net/bonding/bond_3ad.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
> index f43fb2f958a5..93dfcef8afc4 100644
> --- a/drivers/net/bonding/bond_3ad.c
> +++ b/drivers/net/bonding/bond_3ad.c
> @@ -2086,6 +2086,9 @@ void bond_3ad_unbind_slave(struct slave *slave)
>  		   aggregator->aggregator_identifier);
>  
>  	/* Tell the partner that this port is not suitable for aggregation */
> +	port->actor_oper_port_state &= ~AD_STATE_SYNCHRONIZATION;

required by IEEE std 801.AX 2014; figure 6-18 port disabled state

> +	port->actor_oper_port_state &= ~AD_STATE_COLLECTING;
> +	port->actor_oper_port_state &= ~AD_STATE_DISTRIBUTING;

required by IEEE std 801.AX 2014; figure 6-21 detached state

>  	port->actor_oper_port_state &= ~AD_STATE_AGGREGATION;>  	__update_lacpdu_from_port(port);
>  	ad_lacpdu_send(port);
> 

Acked-by: Jonathan Toppins <jtoppins@...hat.com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ