lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <4976cd2b-d782-4b01-8957-133d1b37a9c8@default>
Date:   Thu, 25 Oct 2018 23:49:51 -0700 (PDT)
From:   Manish Kumar Singh <mk.singh@...cle.com>
To:     Michal Kubecek <mkubecek@...e.cz>
Cc:     Eric Dumazet <eric.dumazet@...il.com>,
        Mahesh Bandewar (महेश बंडेवार) 
        <maheshb@...gle.com>, linux-netdev <netdev@...r.kernel.org>,
        Jay Vosburgh <j.vosburgh@...il.com>,
        Veaceslav Falico <vfalico@...il.com>,
        Andy Gospodarek <andy@...yhouse.net>,
        "David S. Miller" <davem@...emloft.net>,
        linux-kernel@...r.kernel.org
Subject: RE: [PATCH] bonding:avoid repeated display of same link status change



> -----Original Message-----
> From: Michal Kubecek [mailto:mkubecek@...e.cz]
> Sent: 25 अक्तूबर 2018 14:59
> To: Manish Kumar Singh
> Cc: Eric Dumazet; Mahesh Bandewar (महेश बंडेवार); linux-netdev; Jay
> Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller; linux-
> kernel@...r.kernel.org
> Subject: Re: [PATCH] bonding:avoid repeated display of same link status
> change
> 
> On Thu, Oct 25, 2018 at 02:21:05AM -0700, Manish Kumar Singh wrote:
> > > From: Michal Kubecek [mailto:mkubecek@...e.cz]
> > > IMHO it does not. AFAICS multiple instances of bond_mii_monitor()
> cannot
> > > run simultaneously for the same bond so that there doesn't seem to be
> > > anything to collide with. (And if they could, we would need to test and
> > > set the flag atomically in bond_miimon_inspect().)
> > >
> > Yes, Michal, we are inline with your understanding.
> > when the -original- patch was posted to upstream there was no
> > -synchronization- nor -racing- addressing code was in read/write of this
> > added filed, as we -never- saw need for either.
> >
> > -only- writer of the added field is bond_mii_monitor.
> > -only- reader of the added field is bond_miimon_inspect.
> > -this writer & reader -never- can run concurrently.
> > -writer invokes the reader.
> >
> > hence, imo uint_8 rtnl_needed is all what is needed; with
> bond_mii_monitor doing rtnl_needed = 1; and bond_miimon_inspect doing
> if rtnl_needed.
> >
> > here is the gravity of the situation with multiple customers whose names
> including machine names redacted:
> >
> >  4353 May 31 02:38:57 hostname kernel: ixgbe 0000:03:00.0: removed PHC
> on p2p1
> >  4354 May 31 02:38:57 hostname kernel: public: link status down for active
> interface p2p1, disabling it in 100 ms
> >  4355 May 31 02:38:57 hostname kernel: public: link status down for active
> interface p2p1, disabling it in 100 ms
> >  4356 May 31 02:38:57 hostname kernel: public: link status definitely down
> for interface p2p1, disabling it
> >  4357 May 31 02:38:57 hostname kernel: public: making interface p2p2 the
> new active one
> >  4358 May 31 02:38:59 hostname kernel: ixgbe 0000:03:00.0: registered PHC
> device on p2p1
> >  4359 May 31 02:39:00 hostname kernel: ixgbe 0000:03:00.0 p2p1: NIC Link is
> Up 10 Gbps, Flow Control: RX/TX
> >  4360 May 31 02:39:00 hostname kernel: public: link status up for interface
> p2p1, enabling it in 200 ms
> >  4361 May 31 02:39:00 hostname kernel: public: link status definitely up for
> interface p2p1, 10000 Mbps full duplex
> >  4362 May 31 02:45:37 hostname journal: Missed 217723 kernel messages
> >  4363 May 31 02:45:37 hostname kernel: public: link status down for active
> interface p2p2, disabling it in 100 ms
> >  	---------------------
> >                       11000+ APPROX SAME REPEATED MESSAGES in second
> > 	---------------------
> > 15877 May 31 02:45:37 hostname kernel: public: link status down for active
> interface p2p2, disabling it in 100 ms
> > 15878 May 31 02:45:37 hostname kernel: public: link status definitely down
> for interface p2p2, disabling it
> > 15879 May 31 02:45:37 hostname kernel: public: making interface p2p1 the
> new active one
> 
> When I was replying, I didn't know this was a v2 and I haven't seen the
> v1 discussion. I have read it since and I think I understand Eric's
> point now. The thing is that just adding e.g. u8 is OK as it is now.
> However, someone could later add another u8 next to it which would also
> be perfectly OK on its own but reads/writes to these two could collide
> between each other.
> 
> And as pointed out by a colleague, even having atomic_t and u8 flag in
> one 64-bit word could be a problem on architectures which cannot do an
> atomic read/write from/to a 32-bit word (sparc seems to be one).

Thanks Michal for explaining it, now we understand the problem what Eric was referring to in v1 of the patch.
I could think of fixing it in 3 ways, Please suggest which one would be safe and optimal fix:

1. Use type unit64_t for rtnl_needed .
2. Use type atomic64_t for rtnl_needed and atomic64_set/read.
3. Use type uint64_t for rtnl_needed with spinlock protection.

I think option 3 would be overkill keeping in mind the frequency of bond_mii_monitor.

Thanks,
Manish
> 
> Michal Kubecek

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ