Message-ID: <20120625124045.GA30571@gmail.com>
Date:	Mon, 25 Jun 2012 14:40:45 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Cliff Wickman <cpw@....com>
Cc:	linux-kernel@...r.kernel.org, x86@...nel.org,
	Jack Steiner <steiner@....com>, Mike Travis <travis@....com>
Subject: Re: [PATCH 3/3] x86: UV2 BAU hang workarounds


* Cliff Wickman <cpw@....com> wrote:

> On Mon, Jun 25, 2012 at 12:03:21PM +0200, Ingo Molnar wrote:
> > 
> > * Cliff Wickman <cpw@....com> wrote:
> > 
> > > On SGI's UV2 the BAU (Broadcast Assist Unit) driver can hang under a
> > > heavy load.  To cure this: 
> > > 
> > > - Disable the UV2 extended status mode (see UV2_EXT_SHFT), as this
> > > >   mode changes BAU behavior in more ways than just delivering an extra bit
> > >   of status.  Revert status to just two meaningful bits, like UV1.
> > > > - Use no IPI-style resets on UV2.  Just give up the request, whatever the
> > > >   reason it failed, and let it be accomplished with the legacy IPI method.
> > > - Use no alternate sending descriptor (the former UV2 workaround
> > >   bcp->using_desc and handle_uv2_busy() stuff).  Just disable the use of the
> > >   BAU for a period of time in favor of the legacy IPI method when the h/w bug
> > >   leaves a descriptor busy.
> > >   -- new tunable: giveup_limit determines the threshold at which a hub is 
> > >      so plugged that it should do all requests with the legacy IPI method for a
> > >      period of time
> > >   -- generalize disable_for_congestion() (renamed disable_for_period()) for
> > >      use whenever a hub should avoid using the BAU for a period of time
> > > 
> > > Misc:
> > > - fix find_another_by_swack(), which is part of the UV2 bug workaround
> > > - correct and clarify the statistics (new stats s_overipilimit s_giveuplimit
> > >   s_enters s_ipifordisabled s_plugged s_congested)
> > 
> > Sigh, it looks like something that ought to be 7 successive, 
> > easy to review commits got mixed up into a single, huge, hard to 
> > review commit. How did that happen?
> > 
> > Thanks,
> > 
> > 	Ingo
> 
> Hi Ingo,
> 
>   Yes, admittedly large.
>   This patch was the 'bottom line' of a great deal of experimentation on
>   how to work around some hardware problems with the BAU.  This is what
>   remains after pulling out the unnecessary or unhelpful attempts.

Ok - this happens sometimes.

>   I could break it up for review purposes, if you think anyone would
>   want to examine each component.
>   You sound like you're willing to spend that time and effort.  Yes?

I had a look already and it didn't look fundamentally 
objectionable - besides its size. As long as it wasn't actually 
the result of merging multiple patches I'll apply it to 
tip:x86/uv. If there's a problem with the patch we could still 
break it up and re-try.

Thanks,

	Ingo