lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 8 Aug 2008 11:43:59 -0700
From:	"Matt Carlson" <mcarlson@...adcom.com>
To:	"Benjamin Herrenschmidt" <benh@...nel.crashing.org>
cc:	"Segher Boessenkool" <segher@...nel.crashing.org>,
	mcarlson@...adcom.com,
	"linuxppc-dev list" <linuxppc-dev@...abs.org>,
	netdev <netdev@...r.kernel.org>, "Nathan Lynch" <ntl@...ox.com>,
	"Michael Chan" <mchan@...adcom.com>
Subject: Re: Strange tg3 regression with UMP fw. link reporting

On Fri, Aug 08, 2008 at 07:18:31PM +1000, Benjamin Herrenschmidt wrote:
> On Fri, 2008-08-08 at 10:58 +0200, Segher Boessenkool wrote:
> > > I don't know yet for sure what happens, but a quick look at the commit
> > > seems to show that the driver synchronously spin-waits for up to 2.5ms
> > 
> > That's what the comment says, but the code says 2.5 _seconds_:
> > 
> > +       /* Wait for up to 2.5 milliseconds */
> > +       for (i = 0; i < 250000; i++) {
> > +               if (!(tr32(GRC_RX_CPU_EVENT) & GRC_RX_CPU_DRIVER_EVENT))
> > +                       break;
> > +               udelay(10);
> > +       }
> > 
> > (not that milliseconds wouldn't be bad already...)
> 
> Right, indeed. I think we have a good candidate for the problem :-) I'll
> verify that on monday. Now, that leads to two questions:
> 
>  - What such a synchronous and potentially horribly slow code is
> going in a locked section or a timer interrupts ? Ie, the link
> watch should probably move to a workqueue if that is to remain,
> or the code turned into a state machine that periodically check for
> events, or whatever is more sane than the above.
> 
>  - The code should at least display some error and do something sane in
> case of timeout such as disabling the new UMP feature instead of
> repeatedly looping ...
> 
>  - If this is indeed our problem (timing out in the code above), why is
> our firmware not emitting the requested event -> maybe the PowerStations
> need a tg3 firmware update.
> 
> Matt, what's your take on this ?

Segher is right.  The code should be 2.5 milliseconds but is actually
much longer.  This fix is actually already in my patch queue and needs
to be sent upstream.

We really shouldn't be displaying any error messages in the event of a
timeout though.  Earlier versions of the UMP firmware did not support
the link update interface.  The best thing the driver can do for all
cases is give the firmware a chance to service the event but continue
as if the event were serviced if it did not get an explicit ACK.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ