Message-Id: <20170525.145040.1004688199988361146.davem@davemloft.net>
Date: Thu, 25 May 2017 14:50:40 -0400 (EDT)
From: David Miller <davem@...emloft.net>
To: nsujir@...tri.com
Cc: netdev@...r.kernel.org, maheshb@...gle.com,
jay.vosburgh@...onical.com
Subject: Re: [PATCH] bonding: Don't update slave->link until ready to commit

From: Nithin Nayak Sujir <nsujir@...tri.com>
Date: Wed, 24 May 2017 19:45:17 -0700

> In the loadbalance arp monitoring scheme, when a slave link change is
> detected, the slave->link is immediately updated and slave_state_changed
> is set. Later down the function, the rtnl_lock is acquired and the
> changes are committed, updating the bond link state.
>
> However, the acquisition of the rtnl_lock can fail. The next time the
> monitor runs, since slave->link is already updated, it determines that
> the link is unchanged. This results in the bond link state being
> permanently out of sync with the slave link.
>
> This patch modifies bond_loadbalance_arp_mon() to handle link changes
> identically to bond_ab_arp_{inspect/commit}(). The new link state is
> maintained in slave->new_link until we're ready to commit, at which
> point it's copied into slave->link.
>
> NOTE: miimon_{inspect/commit}() has a more complex state machine
> requiring the use of the bond_{propose,commit}_link_state() functions,
> which maintain the intermediate state in slave->link_new_state. The arp
> monitors don't require that.
>
> Testing: This bug is very easy to reproduce with the following steps.
> 1. In a loop, toggle a slave link of a bond slave interface.
> 2. In a separate loop, do ifconfig up/down of an unrelated interface to
> create contention for rtnl_lock.
> Within a few iterations, the bond link goes out of sync with the slave
> link.
>
> Signed-off-by: Nithin Nayak Sujir <nsujir@...tri.com>
Applied, thank you.
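
For readers of the archive, the race described in the quoted patch
description can be sketched as a small userspace model. This is not the
kernel code: the struct and function names (model_slave, model_bond,
monitor_buggy, monitor_fixed, run_scenario) and the LINK_* constants are
hypothetical stand-ins, and the failed rtnl_lock acquisition is modelled
as a plain boolean argument. It only illustrates why writing slave->link
before the lock is held loses the change, while staging it in
slave->new_link does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the bonding link states. */
enum { LINK_NOCHANGE = -1, LINK_DOWN = 0, LINK_UP = 1 };

struct model_slave {
	int link;      /* committed state, models slave->link */
	int new_link;  /* staged state, models slave->new_link */
};

struct model_bond {
	struct model_slave slave;
	int bond_link; /* bond-level state, updated only at commit time */
};

/* Old behaviour: slave.link is written before the lock is known to be held. */
static void monitor_buggy(struct model_bond *b, int observed, bool lock_ok)
{
	bool changed = false;

	if (b->slave.link != observed) {
		b->slave.link = observed; /* updated too early */
		changed = true;
	}
	if (!changed || !lock_ok)
		return; /* lock not taken: the pending change is forgotten */
	b->bond_link = b->slave.link;
}

/* Patched behaviour: stage the change, copy it over only when committing. */
static void monitor_fixed(struct model_bond *b, int observed, bool lock_ok)
{
	b->slave.new_link = LINK_NOCHANGE;
	if (b->slave.link != observed)
		b->slave.new_link = observed;
	if (b->slave.new_link == LINK_NOCHANGE || !lock_ok)
		return; /* slave.link untouched, so the next run detects it again */
	b->slave.link = b->slave.new_link;
	b->slave.new_link = LINK_NOCHANGE;
	b->bond_link = b->slave.link;
}

/*
 * The slave link drops; the first monitor run fails to get the lock and
 * the second run succeeds. Returns the bond-level link state afterwards.
 */
static int run_scenario(void (*monitor)(struct model_bond *, int, bool))
{
	struct model_bond b = {
		.slave = { .link = LINK_UP, .new_link = LINK_NOCHANGE },
		.bond_link = LINK_UP,
	};

	assert(monitor);
	monitor(&b, LINK_DOWN, false); /* lock contention on the first run */
	monitor(&b, LINK_DOWN, true);  /* lock available on the second run */
	return b.bond_link;
}
```

In this model, run_scenario(monitor_buggy) returns LINK_UP (the bond
never learns that the slave went down), while run_scenario(monitor_fixed)
returns LINK_DOWN once the second run gets the lock.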