Message-Id: <20200818.155824.2292310502481809055.davem@davemloft.net>
Date:   Tue, 18 Aug 2020 15:58:24 -0700 (PDT)
From:   David Miller <davem@...emloft.net>
To:     jwiesner@...e.com
Cc:     netdev@...r.kernel.org, j.vosburgh@...il.com, vfalico@...il.com,
        andy@...yhouse.net, kuba@...nel.org, Andreas.Taschner@...e.com,
        mkubecek@...e.cz
Subject: Re: [PATCH net] bonding: fix active-backup failover for current
 ARP slave

From: Jiri Wiesner <jwiesner@...e.com>
Date: Sun, 16 Aug 2020 20:52:44 +0200

> When the ARP monitor is used for link detection, ARP replies are
> validated for all slaves (arp_validate=3) and fail_over_mac is set to
> active, two slaves of an active-backup bond may get stuck in a state
> where both of them are active and pass packets that they receive to
> the bond. This state makes IPv6 duplicate address detection fail. The
> state is reached thus:
> 1. The current active slave goes down because the ARP target
>    is not reachable.
> 2. The current ARP slave is chosen and made active.
> 3. A new slave is enslaved. This new slave becomes the current active
>    slave and can reach the ARP target.
> As a result, the current ARP slave stays active after the enslave
> action has finished and the log is littered with "PROBE BAD" messages:
>> bond0: PROBE: c_arp ens10 && cas ens11 BAD
> The workaround is to remove the slave with "going back" status from
> the bond and re-enslave it. This issue was encountered when DPDK PMD
> interfaces were being enslaved to an active-backup bond.
> 
> It would be possible to fix the issue in bond_enslave() or
> bond_change_active_slave() but the ARP monitor was fixed instead to
> keep most of the actions changing the current ARP slave in the ARP
> monitor code. The current ARP slave is set as inactive and backup
> during the commit phase. A new state, BOND_LINK_FAIL, has been
> introduced for slaves in the context of the ARP monitor. This allows
> administrators to see how slaves are rotated for sending ARP requests
> and how attempts are made to find a new active slave.
> 
> Fixes: b2220cad583c9 ("bonding: refactor ARP active-backup monitor")
> Signed-off-by: Jiri Wiesner <jwiesner@...e.com>

Applied and queued up for -stable, thanks Jiri.
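
For readers following the thread, the failure sequence in the quoted commit
message (steps 1-3) and the commit-phase fix can be sketched as a small toy
model. This is only an illustration under stated assumptions: the types
toy_slave and toy_bond and the functions toy_commit() and toy_scenario() are
hypothetical names invented here, not the kernel's struct bonding, struct
slave, or the actual patch. The interface names ens10 and ens11 are taken
from the log line in the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not the kernel's data structures. */
struct toy_slave {
    const char *name;
    bool active;                    /* passes received packets to the bond */
};

struct toy_bond {
    struct toy_slave *curr_active;  /* "cas" in the PROBE log message */
    struct toy_slave *curr_arp;     /* "c_arp" in the PROBE log message */
};

/* Sketch of the commit-phase behaviour the commit message describes: if an
 * active slave exists and a different slave is still the current ARP slave,
 * that ARP slave is set as inactive (backup). */
void toy_commit(struct toy_bond *bond)
{
    if (bond->curr_active && bond->curr_arp &&
        bond->curr_arp != bond->curr_active)
        bond->curr_arp->active = false;
}

/* Replays steps 1-3 and returns how many slaves end up active. */
int toy_scenario(bool apply_fix)
{
    struct toy_slave ens10 = { "ens10", false };
    struct toy_slave ens11 = { "ens11", false };
    struct toy_bond bond = { NULL, NULL };

    /* Steps 1-2: the old active slave went down; ens10 is chosen as the
     * current ARP slave and made active. */
    bond.curr_arp = &ens10;
    ens10.active = true;

    /* Step 3: ens11 is enslaved and becomes the current active slave. */
    bond.curr_active = &ens11;
    ens11.active = true;

    if (apply_fix)
        toy_commit(&bond);

    return (int)ens10.active + (int)ens11.active;
}
```

In this model, toy_scenario(false) leaves both slaves active, which corresponds
to the stuck state reported above, while toy_scenario(true) leaves only the
current active slave active.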
