Date: Fri, 28 Jun 2024 17:55:35 +0800
From: Hangbin Liu <liuhangbin@...il.com>
To: Nikolay Aleksandrov <razor@...ckwall.org>
Cc: Jay Vosburgh <jay.vosburgh@...onical.com>,
	Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
	Andy Gospodarek <andy@...yhouse.net>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
	Ido Schimmel <idosch@...dia.com>, Jiri Pirko <jiri@...nulli.us>,
	Amit Cohen <amcohen@...dia.com>
Subject: Re: [PATCHv3 net-next] bonding: 3ad: send ifinfo notify when mux
 state changed

Hi Nikolay,
On Fri, Jun 28, 2024 at 10:22:25AM +0300, Nikolay Aleksandrov wrote:
> > Actually I was talking about:
> >  /sys/class/net/<bond port>/bonding_slave/ad_actor_oper_port_state
> >  /sys/class/net/<bond port>/bonding_slave/ad_partner_oper_port_state
> > etc
> > 
> > Wouldn't these work for you?
> > 
> 
> But it gets much more complicated, I guess it will be easier to read the
> proc bond file with all the LACP information. That is under RCU only as
> well.

Good question. The monitoring application wants a more elegant and general
way to handle the LACP state and to drive other network reconfigurations.
Here are the requirements I got from the customer.

1) As a server administrator, I want ip monitor to show state change events
   related to LACP bonds so that I can react quickly to network reconfigurations.
2) As a network monitoring application developer, I want my application to be
   notified about LACP bond operational state changes without having to
   poll /proc/net/bonding/<bond> and parse its output so that it can trigger
   predefined failover remediation policies.
3) As a server administrator, I want my LACP bond monitoring application to
   receive a Netlink-based notification whenever the number of member
   interfaces is reduced so that the operations support system can provision
   a member interface replacement.

My understanding is that the user/admin needs to know the latest stable state
so they can do other network configuration based on it. Losing an
intermediate state notification during fast changes is acceptable.
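
For requirement 2), a user-space listener could look roughly like the sketch
below. It assumes the notification arrives as an ordinary RTM_NEWLINK on the
RTMGRP_LINK multicast group, as the rtmsg_ifinfo() call in the patch suggests.
open_link_monitor() and count_newlink() are made-up helper names, and decoding
the LACP attributes themselves is left out.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: open a netlink socket subscribed to link events.
 * Returns the socket fd, or -1 on error. */
int open_link_monitor(void)
{
	struct sockaddr_nl addr;
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

	if (fd < 0)
		return -1;
	memset(&addr, 0, sizeof(addr));
	addr.nl_family = AF_NETLINK;
	addr.nl_groups = RTMGRP_LINK;	/* link up/down/change notifications */
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Hypothetical helper: count the RTM_NEWLINK messages in one received
 * buffer that refer to the given ifindex (e.g. a bond port). */
int count_newlink(void *buf, int len, int ifindex)
{
	struct nlmsghdr *nh;
	int n = 0;

	for (nh = buf; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
		struct ifinfomsg *ifi;

		if (nh->nlmsg_type != RTM_NEWLINK)
			continue;
		if (nh->nlmsg_len < NLMSG_LENGTH(sizeof(*ifi)))
			continue;	/* too short to carry an ifinfomsg */
		ifi = NLMSG_DATA(nh);
		if (ifi->ifi_index == ifindex)
			n++;
	}
	return n;
}
```

The application would recv() from the fd returned by open_link_monitor() and
pass each buffer to count_newlink(), so no polling of /proc/net/bonding/<bond>
is needed to learn that something changed.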

> Well, you mentioned administrators want to see the state changes, please
> better clarify the exact end goal. Note that technically may even not be
> the last state as the state change itself happens in parallel (different
> locks) and any update could be delayed depending on rtnl availability
> and workqueue re-scheduling. But sure, they will get some update at some point. :)

Would you please explain why we may not get the latest state? From what
I understand:

1) State A -> B, queue notify
      rtnl_trylock, fail, queue again
2) State B -> C, queue notify
      rtnl_trylock, success, post current state C
3) State C -> D, queue notify
      rtnl_trylock, fail, queue again
4) State D -> A, queue notify
      rtnl_trylock, fail, queue again
      rtnl_trylock, fail, queue again
      rtnl_trylock, success, post current state A

So how could the state from step 3) be sent but the state from step 4) not
be sent?
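
To make the sequence concrete, here is a small user-space model of it. This is
only a sketch: struct port_model and the function names are invented for
illustration, lock_ok stands in for the result of rtnl_trylock(), and
everything runs sequentially, so it does not model state changes happening in
parallel under different locks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not bonding driver code. */
struct port_model {
	char state;		/* current mux state, 'A'..'D' */
	bool should_notify;	/* models slave->should_notify_lacp */
	char last_posted;	/* last state a listener saw, 0 if none */
	int posts;		/* number of notifications sent */
};

/* A state change records the new state and requests a notification. */
void change_state(struct port_model *p, char new_state)
{
	p->state = new_state;
	p->should_notify = true;
}

/* One run of the notify work. Returns true if it must be queued again. */
bool worker_run(struct port_model *p, bool lock_ok)
{
	if (!p->should_notify)
		return false;
	if (!lock_ok)
		return true;		/* trylock failed, queue again */
	p->should_notify = false;	/* clear the flag before sending */
	p->last_posted = p->state;	/* post whatever the state is now */
	p->posts++;
	return false;
}

/* Replays steps 1) to 4) above and returns the last posted state. */
char replay_example(struct port_model *p)
{
	*p = (struct port_model){ .state = 'A' };

	change_state(p, 'B'); worker_run(p, false);
	change_state(p, 'C'); worker_run(p, true);
	change_state(p, 'D'); worker_run(p, false);
	change_state(p, 'A');
	worker_run(p, false);
	worker_run(p, false);
	worker_run(p, true);

	assert(!p->should_notify);	/* nothing left pending */
	return p->last_posted;
}
```

In this model the intermediate states B and D are never posted, but the last
posted state matches the final state because the worker reads the state at
send time rather than at queue time.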

BTW, in my code, I should set should_notify_lacp = 0 before sending the
ifinfo message. That way, even if the should_notify_lacp = 1 set in
ad_mux_machine() is overwritten here, the latest status is still sent.

> +
> +             if (slave->should_notify_lacp) {
> +                     slave->should_notify_lacp = 0;
> +                     rtmsg_ifinfo(RTM_NEWLINK, slave->dev, 0, GFP_KERNEL, 0, NULL);
> +             }

The side effect is that we may send the same latest LACP status twice
(should_notify_lacp is overwritten to 1 and the work is queued again), which
should be OK.
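
The difference between the two orderings can be shown with a user-space toy
model. Again this is only a sketch with invented names: racing_state stands
for a mux state change that lands in the window between clearing the flag and
sending the message, and the model is sequential rather than truly concurrent.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not bonding driver code. */
struct port_model {
	char state;		/* current mux state */
	bool should_notify;	/* models slave->should_notify_lacp */
	char last_posted;	/* last state a listener saw */
	int posts;		/* number of notifications sent */
};

enum order { CLEAR_BEFORE_SEND, CLEAR_AFTER_SEND };

/* Models rtmsg_ifinfo(): the listener sees the state as it is right now. */
void send_current(struct port_model *p)
{
	p->last_posted = p->state;
	p->posts++;
}

/* Models ad_mux_machine() changing the state and requesting a notify. */
void racing_change(struct port_model *p, char new_state)
{
	p->state = new_state;
	p->should_notify = true;
}

/* One notify pass. A nonzero racing_state is applied in the window
 * between the flag update and the send. */
void notify_once(struct port_model *p, enum order o, char racing_state)
{
	if (!p->should_notify)
		return;
	if (o == CLEAR_BEFORE_SEND) {
		p->should_notify = false;
		if (racing_state)
			racing_change(p, racing_state);
		send_current(p);
	} else {
		send_current(p);
		if (racing_state)
			racing_change(p, racing_state);
		p->should_notify = false;	/* wipes the racing request */
	}
}

/* State C is pending, then a change to D races with the first pass. */
struct port_model race_demo(enum order o)
{
	struct port_model p = { .state = 'C', .should_notify = true };

	notify_once(&p, o, 'D');	/* racing pass */
	notify_once(&p, o, 0);		/* requeued pass, no further changes */
	assert(p.posts >= 1);
	return p;
}
```

With CLEAR_BEFORE_SEND the model posts D twice, which is the duplicate
described above. With CLEAR_AFTER_SEND it posts C once and never posts D, so
the listener is left with a stale state.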

Thanks
Hangbin
