Message-ID: <alpine.LFD.2.00.1003251441420.3147@localhost.localdomain>
Date:	Thu, 25 Mar 2010 15:16:10 +0100 (CET)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	Andi Kleen <andi@...stfloor.org>
cc:	x86@...nel.org, LKML <linux-kernel@...r.kernel.org>,
	jesse.brandeburg@...el.com,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH] Prevent nested interrupts when the IRQ stack is near
 overflowing v2

On Thu, 25 Mar 2010, Andi Kleen wrote:
> > > > > Anyways if such a thing was done it would be a long term project
> > > > > and that short term fix would be still needed.
> > > > 
> > > > Your patch is not a fix. It's a lousy, horrible and unreliable
> > > > workaround. It's not fixing the root cause of the problem at hand.
> > > 
> > > It fixes the bug in a minimally intrusive way.
> > 
> > It papers over the problem. We already know that the NIC driver floods
> > the machine with interrupts, so why are you insisting that we need to
> 
> Well in this case it's simply because it has 4 ports and they are all
> active and have a lot of MSI-X vectors for each stream. 
> 
> Even if you had the perfect interrupt handler that ran in
> one cycle, if you had enough of them in parallel from different ports
> there could be still a stack overflow problem on individual CPUs.

Not at all if the handler runs with irqs disabled.
 
> > The minimally intrusive way is a one liner in that very driver code and
> > if it causes problems for that very driver then we don't fix them with
> > adding a callback in the generic interrupt code path.
> 
> Ok.
> 
> > 
> > The message which we would send out with applying that band aid would
> > be simply: Go ahead driver writers and let your handlers run as long
> 
> Well it's simply the current state of affairs today. I'm merely
> attempting to make the current state slightly safer without breaking
> anything in the process.

Well, I'd agree if those stack overflows were a widely reported
problem.

Right now they happen with a weird test case which points out a
trouble spot. Multi vector NICs under heavy load. So why not go there
and change the handful of drivers to run their handlers with irqs
disabled?
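
To make the nesting argument concrete, here is a toy user-space model, not
kernel code. It assumes a simplified picture in which every pending vector
can interrupt a handler that has re-enabled irqs, and none can interrupt a
handler that keeps them disabled. The names irq_model, handle_irq and
max_nesting are hypothetical and exist only for this sketch.

```c
#include <assert.h>

/* Toy model of one CPU: counts how deeply handlers nest on the IRQ stack. */
struct irq_model {
    int pending;    /* interrupts raised but not yet delivered */
    int depth;      /* handlers currently active on the stack */
    int max_depth;  /* deepest nesting observed so far */
};

/* Run one handler; if it re-enables irqs, pending interrupts nest inside it. */
static void handle_irq(struct irq_model *m, int handler_enables_irqs)
{
    m->depth++;
    if (m->depth > m->max_depth)
        m->max_depth = m->depth;

    if (handler_enables_irqs) {
        /* Assumption: every pending vector preempts the running handler. */
        while (m->pending > 0) {
            m->pending--;
            handle_irq(m, handler_enables_irqs);
        }
    }
    /* With irqs kept disabled, nothing can nest: the handler just returns. */
    m->depth--;
}

/* Deliver `vectors` simultaneous interrupts and report the worst nesting. */
static int max_nesting(int vectors, int handler_enables_irqs)
{
    struct irq_model m = { vectors, 0, 0 };

    assert(vectors >= 0);
    while (m.pending > 0) {
        m.pending--;
        handle_irq(&m, handler_enables_irqs);
    }
    return m.max_depth;
}
```

In this model max_nesting(n, 1) grows with the number of vectors, while
max_nesting(n, 0) never exceeds one handler frame, regardless of how many
vectors fire or how short each handler is.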

Band aids are the last resort if we can't deal with a problem by other
sane means. And this problem does not fall into that category; it can be
solved in the affected drivers with zero effort.

Thanks,

	tglx


