Message-ID: <ZdF04GYACphhBCwl@Laptop-X1>
Date: Sun, 18 Feb 2024 11:09:20 +0800
From: Hangbin Liu <liuhangbin@...il.com>
To: Praveen Kannoju <praveen.kannoju@...cle.com>
Cc: "j.vosburgh@...il.com" <j.vosburgh@...il.com>,
"andy@...yhouse.net" <andy@...yhouse.net>,
"davem@...emloft.net" <davem@...emloft.net>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"kuba@...nel.org" <kuba@...nel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Rajesh Sivaramasubramaniom <rajesh.sivaramasubramaniom@...cle.com>,
Rama Nichanamatlu <rama.nichanamatlu@...cle.com>,
Manjunath Patil <manjunath.b.patil@...cle.com>
Subject: Re: [PATCH RFC] bonding: rate-limit bonding driver inspect messages
On Sat, Feb 17, 2024 at 12:39:44PM +0000, Praveen Kannoju wrote:
> > -----Original Message-----
> > From: Hangbin Liu <liuhangbin@...il.com>
> > Sent: 16 February 2024 02:33 PM
> > To: Praveen Kannoju <praveen.kannoju@...cle.com>
> > Cc: j.vosburgh@...il.com; andy@...yhouse.net; davem@...emloft.net; edumazet@...gle.com; kuba@...nel.org;
> > pabeni@...hat.com; netdev@...r.kernel.org; linux-kernel@...r.kernel.org; Rajesh Sivaramasubramaniom
> > <rajesh.sivaramasubramaniom@...cle.com>; Rama Nichanamatlu <rama.nichanamatlu@...cle.com>; Manjunath Patil
> > <manjunath.b.patil@...cle.com>
> > Subject: Re: [PATCH RFC] bonding: rate-limit bonding driver inspect messages
> >
> > On Thu, Feb 15, 2024 at 10:55:54PM +0530, Praveen Kumar Kannoju wrote:
> > > Rate limit bond driver log messages to prevent a log flood in a
> > > run-away situation, e.g. when the rtnl lock cannot be acquired. A message
> > > flood leads to system instability and the loss of other crucial messages.
> >
> > Hi Praveen,
> >
> > The patch looks good to me. But would you please explain why these
> > slave_info() calls were chosen to go under net_ratelimit?
> >
> > Thanks
> > Hangbin
>
> Thank you, Hangbin.
>
> The routine bond_mii_monitor() periodically inspects the slave carrier state to detect state changes. On a state change, it records the change internally and performs the corresponding state change action.
>
> A change from one parked state to another goes through a transient state. For example, Up to Down (BOND_LINK_UP to BOND_LINK_DOWN) goes through BOND_LINK_FAIL. To reach the next parked or transient state, the bond needs the rtnl mutex. If it cannot get the mutex, the state change does not happen. To complete the state change as quickly as possible, bond_mii_monitor() reschedules itself to run again after 1 msec.
I think a large miimon downdelay/updelay setting could reduce this.
> And every such run reinspects the link and sees a state change relative to its internally recorded state, which in reality was never updated because the earlier attempt failed to get the rtnl lock, and so it logs the state change again. If acquiring the rtnl mutex takes long, say a hypothetical 5 secs, then the bond logs 5000 state change messages, 1 message every 1 msec.
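To make the arithmetic in the quoted explanation concrete, here is a minimal
userspace sketch, not kernel code. It only counts how many messages would be
logged while the lock stays unavailable. The names ratelimit_model,
ratelimit_allow() and count_messages() are hypothetical, and the 5 second
window with a burst of 10 is an assumed setting chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a windowed rate limiter: at most
 * `burst` messages are allowed per `interval_ms` window. */
struct ratelimit_model {
    long interval_ms;   /* window length (assumed value) */
    int burst;          /* messages allowed per window (assumed value) */
    long window_start;  /* start time of the current window */
    int printed;        /* messages emitted in the current window */
};

static bool ratelimit_allow(struct ratelimit_model *rl, long now_ms)
{
    /* Open a new window once the current one has expired. */
    if (now_ms - rl->window_start >= rl->interval_ms) {
        rl->window_start = now_ms;
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    return false;
}

/* Count messages logged while the lock stays unavailable for
 * `blocked_ms`, with the monitor retrying every `retry_ms`.
 * Pass rl == NULL to model the case without rate limiting. */
static long count_messages(long blocked_ms, long retry_ms,
                           struct ratelimit_model *rl)
{
    long logged = 0;

    for (long now = 0; now < blocked_ms; now += retry_ms) {
        /* Each retry sees the same, still unrecorded, state change. */
        if (rl == NULL || ratelimit_allow(rl, now))
            logged++;
    }
    return logged;
}
```

With these assumed values, count_messages(5000, 1, NULL) returns 5000, matching
the 5000 messages in the example, while passing a limiter initialised as
{ 5000, 10, 0, 0 } returns 10.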
Anyway, setting the rate limit does reduce the message flood. Would you please
summarise the explanation above and add it to the commit description when you
post the formal patch?
thanks
Hangbin