lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <86802c440710052323q6cb08799nc26dcafb0324e17c@mail.gmail.com> Date: Fri, 5 Oct 2007 23:23:56 -0700 From: "Yinghai Lu" <yhlu.kernel@...il.com> To: "Eric W. Biederman" <ebiederm@...ssion.com> Cc: "Stephen Hemminger" <shemminger@...ux-foundation.org>, netdev@...r.kernel.org, linux-kernel@...r.kernel.org Subject: Re: MSI interrupts and disable_irq On 10/5/07, Eric W. Biederman <ebiederm@...ssion.com> wrote: > Stephen Hemminger <shemminger@...ux-foundation.org> writes: > > > On Fri, 28 Sep 2007 22:47:16 -0400 > > Jeff Garzik <jgarzik@...ox.com> wrote: > > > >> Ayaz Abdulla wrote: > >> > I am trying to track down a forcedeth driver issue described by bug 9047 > >> > in bugzilla (2.6.23-rc7-git1 forcedeth w/ MCP55 oops under heavy load). > >> > I added a patch to synchronize the timer handlers so that one handler > >> > doesn't accidently enable the IRQ while another timer handler is running > >> > (see attachment 'Add timer lock' in bug report) and for other processing > >> > protection. > >> > > >> > However, the system still had an Oops. So I added a lock around the > >> > nv_rx_process_optimized() and the Oops has not happened (see attachment > >> > 'New patch for locking' in bug report). This would imply a > >> > synchronization issue. However, the only callers of that function are > >> > the IRQ handler and the timer handlers (in non-NAPI case). The timer > >> > handlers use disable_irq so that the IRQ handler does not contend with > >> > them. It looks as if disable_irq is not working properly. > >> > > >> > This issue repros only with MSI interrupt and not legacy INTx > >> > interrupts. Any ideas? > >> > >> (added linux-kernel to CC, since I think it's more of a general kernel > >> issue) > > I didn't see anything in disable_irq that would cause it to fail in > the suggested way. But I couldn't quite convince myself we were > race free either. I didn't see anything that was specific to MSI > that would cause something. But switching from level to edge > triggered, and to a lower latency delivery path may have caused > some behavior changes. > > >> To be brutally frank, I always thought this disable_irq() mess was a > >> hack both ugly and fragile. This disable_irq() work that appeared in a > >> couple net drivers was correct at the time, so I didn't feel I had the > >> justification to reject it, but it still gave me a bad feeling. > >> > >> I think the scenario you outline is an illustration of the approach's > >> fragility: disable_irq() is a heavy hammer that originated with INTx, > >> and it relies on a chip-specific disable method (kernel/irq/manage.c) > >> that practically guarantees behavior will vary across MSI/INTx/etc. > >> > >> > >> Based on your report, it is certainly possible that there is a problem > >> with MSI's desc->chip->disable() method... but I would actually > >> recommend working around the problem by making the forcedeth locking > >> more standardized by removing all those disable_irq() hacks. > >> > > > > I'll try and clean it up if the author doesn't get to it first. > > I took a look at the underlying side of this. > > I don't know if the MSI capability for the forcedeth supports a mask > bit or not. Mine doesn't even have a msi capability. If it doesn't > support a mask bit the pci spec provides not valid way to mask the > interrupt, so what we do is actually disable the msi capability. > At which point we might get weird INTx interactions. > > We have a similar case with ioapics and INTx that also turns > a hardware level disable into a reroute to another irq command. > So I'm going to take a look and see how infrequently we can use > hardware level disabled. also the driver need to have seperate handlers for ioapic, msi, msi-x... current nv_nic_irq_optimized keep checking msi_flag... also nv_enable_hw_interrupts and nv_disable_hw_interrupts is not corresponding in MSI side.. YH - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists