linux-kernel - Re: Re: Re: 2.6.19-rc5: known regressions :SMP kernel can not generate ISA irq


Message-ID: <Pine.LNX.4.64.0611130742440.22714@g5.osdl.org>
Date:	Mon, 13 Nov 2006 08:02:28 -0800 (PST)
From:	Linus Torvalds <torvalds@...l.org>
To:	Komuro <komurojun-mbn@...ty.com>
cc:	tglx@...utronix.de, "Eric W. Biederman" <ebiederm@...ssion.com>,
	Adrian Bunk <bunk@...sta.de>, Andrew Morton <akpm@...l.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	mingo@...hat.com
Subject: Re: Re: Re: 2.6.19-rc5: known regressions :SMP kernel can not generate
 ISA irq

On Fri, 10 Nov 2006, Komuro wrote:
> 
> I tried the 2.6.19-rc5,  the problem still happens.

Ok, that's good data, and especially:

> But,
> I remove the disable_irq_nosync() , enable_irq()
> from the linux/drivers/net/pcmcia/axnet_cs.c
> the interrupt is generated properly.

All RIGHT. That's a very good clue. The major difference between PCI and 
ISA irq's is that they have different trigger types (they also have 
different polarity, but that tends to be just a small detail). In 
particular, ISA IRQ's are edge-triggered, and PCI IRQ's are
level-triggered.

Now, edge-triggered interrupts are a _lot_ harder to mask, because the 
Intel APIC is an unbelievable piece of sh*t, and has the edge-detect logic 
_before_ the mask logic, so if an edge happens _while_ the device is 
masked, you'll never ever see the edge ever again (unmasking will not 
cause a new edge, so you simply lost the interrupt).

So when you "mask" an edge-triggered IRQ, you can't really mask it at all, 
because if you did that, you'd lose it forever if the IRQ comes in while 
you masked it. Instead, we're supposed to leave it active, and set a flag, 
and IF the IRQ comes in, we just remember it, and mask it at that point 
instead, and then on unmasking, we have to replay it by sending a 
self-IPI.
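
The scheme described above can be sketched as a small state machine. This is a toy model in plain C, not kernel code: the struct, the field names and the model_* functions are all hypothetical, and the "replay" step only stands in for what the self-IPI would do on real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one edge-triggered IRQ line. All names are hypothetical. */
struct edge_irq {
    bool soft_disabled;  /* "disabled" by software, hardware still unmasked */
    bool hw_masked;      /* masked at the controller, only after an edge came in */
    bool pending;        /* an edge arrived while soft-disabled */
    bool replay_queued;  /* stands in for a self-IPI that has been sent */
    int  handled;        /* how many times the handler has run */
};

/* "Mask": just set a flag and leave the line active in hardware. */
static void model_disable_irq(struct edge_irq *irq)
{
    assert(irq != NULL);
    irq->soft_disabled = true;
}

/* The device raises an edge on the line. */
static void model_edge_arrives(struct edge_irq *irq)
{
    assert(irq != NULL);
    if (irq->hw_masked)
        return;                  /* edge while really masked: lost for good */
    if (irq->soft_disabled) {
        irq->pending = true;     /* remember it ... */
        irq->hw_masked = true;   /* ... and only now mask the hardware */
        return;
    }
    irq->handled++;              /* normal case: run the handler */
}

/* Unmask: if an edge was remembered, request a replay instead of
 * calling the handler directly. */
static void model_enable_irq(struct edge_irq *irq)
{
    assert(irq != NULL);
    irq->soft_disabled = false;
    irq->hw_masked = false;
    if (irq->pending) {
        irq->pending = false;
        irq->replay_queued = true;
    }
}

/* The replayed interrupt arrives later through the normal entry path. */
static void model_replay_delivered(struct edge_irq *irq)
{
    assert(irq != NULL);
    if (irq->replay_queued) {
        irq->replay_queued = false;
        irq->handled++;
    }
}
```

In this model, an edge that arrives between model_disable_irq() and model_enable_irq() is handled exactly once, after the replay is delivered. If the line were hard-masked before the edge arrived, the same edge would never be handled.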

Maybe that part got broken by some of the IRQ changes by Eric. 

Eric, can you please double-check this all? I suspect you disable 
edge-triggered interrupts when moving them, or something, and maybe you 
didn't realize that if you disable them on the IO-APIC level, they can be 
gone forever.

[ Note: this is true EVEN IF we are in the interrupt handler right then - 
  if we get another edge while in the interrupt handler, the interrupt 
  will normally be _delayed_ until we've ACK'ed it, but if we have 
  _masked_ it, it will simply be lost entirely. So a simple "mask" 
  operation is always incorrect for edge-triggered interrupts.

  One option might be to do a simple mask, and on unmask, turn the edge 
  trigger into a level trigger at the same time. Then, the first time you 
  get the interrupt, you turn it back into an edge trigger _before_ you 
  call the interrupt handlers. That might actually be simpler than doing 
  the "irq replay" dance with self-IPI, because we can't actually just 
  fake the IRQ handling - when enable_irq() is called, irq's are normally 
  disabled on the CPU, so we can't just call the irq handler at that 
  point: we really do need to "replay" the dang thing.

  Did I mention that the Intel APIC's are a piece of cr*p already? ]
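
The "turn it into a level trigger on unmask" option from the note above could look roughly like this. It is a sketch in plain C with hypothetical names (struct line_irq, the lt_* functions), and it rests on one loud assumption that the mail does not state: the device keeps the line asserted until its handler has serviced it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum trigger { TRIG_EDGE, TRIG_LEVEL };

/* Toy model of one IRQ line. All names are hypothetical. */
struct line_irq {
    enum trigger mode;
    bool masked;
    bool line;      /* device is currently asserting the line */
    int  handled;
};

/* Interrupt entry: switch back to edge mode _before_ the handlers run. */
static void lt_interrupt(struct line_irq *irq)
{
    assert(irq != NULL);
    irq->mode = TRIG_EDGE;
    irq->handled++;
    irq->line = false;  /* assumption: servicing the device drops the line */
}

/* The device asserts the line. */
static void lt_raise(struct line_irq *irq)
{
    assert(irq != NULL);
    bool was_low = !irq->line;
    irq->line = true;
    if (irq->masked)
        return;         /* in edge mode, this edge is never seen again */
    if (irq->mode == TRIG_LEVEL || was_low)
        lt_interrupt(irq);
}

/* A plain hardware mask. */
static void lt_mask(struct line_irq *irq)
{
    assert(irq != NULL);
    irq->masked = true;
}

/* Unmask and flip to level mode at the same time, so a line that is
 * still asserted fires even though its edge was missed. */
static void lt_unmask(struct line_irq *irq)
{
    assert(irq != NULL);
    irq->masked = false;
    irq->mode = TRIG_LEVEL;
    if (irq->line)
        lt_interrupt(irq);
}
```

With this model, a line raised while masked is handled once at lt_unmask() time and the line is back in edge mode afterwards. If nothing was pending, the line stays in level mode until the first interrupt arrives.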

> So I think enable_irq does not enable the irq.

It probably does enable it (that's the easy part), but see above: if any 
of the support structure for the APIC crapola is subtly broken, we'll have 
lost the IRQ anyway.

(Many other IRQ controllers get this right: the "old and broken" Intel 
i8259 interrupt controller was a much better IRQ controller than the APIC 
in this regard, because it simply had the edge-detect logic after the 
masking logic, so if you unmasked an active interrupt that had been 
masked, you would always see it as an edge, and the i8259 controller needs 
none of the subtle code at _all_. It just works.)
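
The difference between the two orderings can be shown with a toy model. This is plain C with hypothetical names, and it is not register-accurate for either chip: it only models whether the edge detector looks at the raw line (the APIC behaviour described above) or at the line after masking (the i8259 behaviour described above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum ctrl_kind {
    EDGE_BEFORE_MASK,  /* APIC-like: edge detect sits in front of the mask */
    EDGE_AFTER_MASK    /* i8259-like: edge detect sits behind the mask */
};

/* Toy interrupt controller input. All names are hypothetical. */
struct ctrl {
    enum ctrl_kind kind;
    bool line;       /* raw level driven by the device */
    bool masked;
    bool prev_seen;  /* last value the edge detector saw */
    int  delivered;  /* interrupts that reached the CPU */
};

/* Re-evaluate the edge detector after any input change. */
static void ctrl_update(struct ctrl *c)
{
    assert(c != NULL);
    bool seen = (c->kind == EDGE_AFTER_MASK) ? (c->line && !c->masked)
                                             : c->line;
    bool rising = seen && !c->prev_seen;
    c->prev_seen = seen;
    if (rising && !c->masked)
        c->delivered++;  /* a rising edge while masked is simply dropped */
}

static void ctrl_set_line(struct ctrl *c, bool level)
{
    assert(c != NULL);
    c->line = level;
    ctrl_update(c);
}

static void ctrl_set_mask(struct ctrl *c, bool masked)
{
    assert(c != NULL);
    c->masked = masked;
    ctrl_update(c);
}
```

Masking, raising the line and then unmasking delivers nothing in the EDGE_BEFORE_MASK model, because the detector already consumed the edge while the mask was set. In the EDGE_AFTER_MASK model the unmask itself looks like a rising edge, so the interrupt is delivered.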

Anyway, if you _can_ bisect the exact point where this started happening, 
that would be good. But I would not be surprised in the least if this is 
all introduced by Eric Biederman's dynamic IRQ handling.

Eric?

			Linus