lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 1 Aug 2020 03:14:54 +0200
From:   Andrew Lunn <andrew@...n.ch>
To:     Asmaa Mnebhi <Asmaa@...lanox.com>
Cc:     David Thompson <dthompson@...lanox.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "kuba@...nel.org" <kuba@...nel.org>, Jiri Pirko <jiri@...lanox.com>
Subject: Re: [PATCH net-next] Add Mellanox BlueField Gigabit Ethernet driver

> > > > +static int mlxbf_gige_mdio_read(struct mii_bus *bus, int phy_add,
> > > > +int
> > >
> > > > +phy_reg) {
> > >
> > > > +         struct mlxbf_gige *priv = bus->priv;
> > >
> > > > +         u32 cmd;
> > >
> > > > +         u32 ret;
> > >
> > > > +
> > >
> > > > +         /* If the lock is held by something else, drop the request.
> > >
> > > > +         * If the lock is cleared, that means the busy bit was cleared.
> > >
> > > > +         */
> > >
> > >
> > >
> > > How can this happen? The mdio core has a mutex which prevents parallel
> > access?
> > >
> > >
> > >
> > > This is a HW Lock. It is an actual register. So another HW entity can
> > > be holding that lock and reading/changing the values in the HW registers.
> > 
> > You have not explains how that can happen? Is there something in the driver
> > i missed which takes a backdoor to read/write MDIO transactions?
> 
> Ah ok! There is a HW entity (called YU) within the BlueField which is connected to the PHY device.
> I think the YU is what you are calling "backdoor" here. The YU contains several registers which control reads/writes
> To the PHY. So it is like an extra layer for reading MDIO registers. One of the YU registers is the gateway register (aka GW or
> MLXBF_GIGE_MDIO_GW_OFFSET in the code). If the GW register's LOCK bit is not cleared, we cannot write anything to the actual PHY MDIO registers.
> Did I answer your question?

Nope.

How can two transactions happen at the same time, causing this lock
bit to be locked? Given that the MDIO core has a mutex and serialises
all transactions. How can the lock bit every be set?

> > > > +         ret = mlxbf_gige_mdio_poll_bit(priv,
> > > > + MLXBF_GIGE_MDIO_GW_LOCK_MASK);
> > >
> > > > +         if (ret)
> > >
> > > > +                       return -EBUSY;
> > >
> > >
> > >
> > > PHY drivers are not going to like that. They are not going to retry.
> > > What is likely to happen is that phylib moves into the ERROR state,
> > > and the PHY driver grinds to a halt.
> > >
> > >
> > >
> > > This is a fairly quick HW transaction. So I don’t think it would cause
> > > and issue for the PHY drivers. In this case, we use the micrel
> > > KSZ9031. We haven’t seen issues.
> > 
> > So you have happy to debug hard to find and reproduce issues when it does
> > happen? Or would you like to spend a little bit of time now and just prevent
> > it happening at all?
> 
> I think I misunderstood your comment. Did you ask why we are polling here? Or that we shouldn't be returning -EBUSY?

I think you should not be returning EBUSY. If it every happens, bad
things will happen.

This lock bit seems to server no purpose. Software will ensure that
transactions are serialized. If it serves no purpose, just ensure it
is unlocked at probe time, and then ignore it. If you ignore it, you
will never return -EBUSY and so bad things will never happen.

Just because hardware exists does not mean you have to use it or that
it adds any value.

       Andrew

Powered by blists - more mailing lists