lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9d248ea6-f861-850-ba71-ac2cdd5596ff@telegraphics.com.au>
Date:   Thu, 11 Feb 2021 12:11:49 +1100 (AEDT)
From:   Finn Thain <fthain@...egraphics.com.au>
To:     "Song Bao Hua (Barry Song)" <song.bao.hua@...ilicon.com>
cc:     tanxiaofei <tanxiaofei@...wei.com>,
        "jejb@...ux.ibm.com" <jejb@...ux.ibm.com>,
        "martin.petersen@...cle.com" <martin.petersen@...cle.com>,
        "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linuxarm@...neuler.org" <linuxarm@...neuler.org>,
        "linux-m68k@...r.kernel.org" <linux-m68k@...r.kernel.org>
Subject: RE: [Linuxarm] Re: [PATCH for-next 00/32] spin lock usage optimization
 for SCSI drivers

On Wed, 10 Feb 2021, Song Bao Hua (Barry Song) wrote:

> > On Wed, 10 Feb 2021, Song Bao Hua (Barry Song) wrote:
> > 
> > > > On Wed, 10 Feb 2021, Song Bao Hua (Barry Song) wrote:
> > > >
> > > > > > There is no warning from m68k builds. That's because 
> > > > > > arch_irqs_disabled() returns true when the IPL is non-zero.
> > > > >
> > > > > So for m68k, the case is arch_irqs_disabled() is true, but 
> > > > > interrupts can still come?
> > > > >
> > > > > Then it seems it is very confusing. If prioritized interrupts 
> > > > > can still come while arch_irqs_disabled() is true,
> > > >
> > > > Yes, on m68k CPUs, an IRQ having a priority level higher than the 
> > > > present priority mask will get serviced.
> > > >
> > > > Non-Maskable Interrupt (NMI) is not subject to this rule and gets 
> > > > serviced regardless.
> > > >
> > > > > how could spin_lock_irqsave() block the prioritized interrupts?
> > > >
> > > > It raises the the mask level to 7. Again, please see 
> > > > arch/m68k/include/asm/irqflags.h
> > >
> > > Hi Finn,
> > > Thanks for your explanation again.
> > >
> > > TBH, that is why m68k is so confusing. irqs_disabled() on m68k 
> > > should just reflect the status of all interrupts have been disabled 
> > > except NMI.
> > >
> > > irqs_disabled() should be consistent with the calling of APIs such 
> > > as local_irq_disable, local_irq_save, spin_lock_irqsave etc.
> > >
> > 
> > When irqs_disabled() returns true, we cannot infer that 
> > arch_local_irq_disable() was called. But I have not yet found driver 
> > code or core kernel code attempting that inference.
> > 
> > > >
> > > > > Isn't arch_irqs_disabled() a status reflection of irq disable 
> > > > > API?
> > > > >
> > > >
> > > > Why not?
> > >
> > > If so, arch_irqs_disabled() should mean all interrupts have been 
> > > masked except NMI as NMI is unmaskable.
> > >
> > 
> > Can you support that claim with a reference to core kernel code or 
> > documentation? (If some arch code agrees with you, that's neither here 
> > nor there.)
> 
> I think those links I share you have supported this. Just you don't 
> believe :-)
> 

Your links show that the distinction between fast and slow handlers was 
removed. Your links don't support your claim that "arch_irqs_disabled() 
should mean all interrupts have been masked". Where is the code that makes 
that inference? Where is the documentation that supports your claim?

> > 
> > > >
> > > > Are all interrupts (including NMI) masked whenever 
> > > > arch_irqs_disabled() returns true on your platforms?
> > >
> > > On my platform, once irqs_disabled() is true, all interrupts are 
> > > masked except NMI. NMI just ignore spin_lock_irqsave or 
> > > local_irq_disable.
> > >
> > > On ARM64, we also have high-priority interrupts, but they are 
> > > running as PESUDO_NMI:
> > > https://lwn.net/Articles/755906/
> > >
> > 
> > A glance at the ARM GIC specification suggests that your hardware 
> > works much like 68000 hardware.
> > 
> >    When enabled, a CPU interface takes the highest priority pending 
> >    interrupt for its connected processor and determines whether the 
> >    interrupt has sufficient priority for it to signal the interrupt 
> >    request to the processor. [...]
> > 
> >    When the processor acknowledges the interrupt at the CPU interface, 
> >    the Distributor changes the status of the interrupt from pending to 
> >    either active, or active and pending. At this point the CPU 
> >    interface can signal another interrupt to the processor, to preempt 
> >    interrupts that are active on the processor. If there is no pending 
> >    interrupt with sufficient priority for signaling to the processor, 
> >    the interface deasserts the interrupt request signal to the 
> >    processor.
> > 
> > https://developer.arm.com/documentation/ihi0048/b/
> > 
> > Have you considered that Linux/arm might benefit if it could fully 
> > exploit hardware features already available, such as the interrupt 
> > priority masking feature in the GIC in existing arm systems?
> 
> I guess no:-) there are only two levels: IRQ and NMI. Injecting a 
> high-prio IRQ level between them makes no sense.
> 
> To me, arm64's design is quite clear and has no any confusion.
> 

Are you saying that the ARM64 hardware design is confusing because it 
implements a priority mask, and that's why you had to simplify it with a 
pseudo-nmi scheme in software?

> > 
> > > On m68k, it seems you mean:
> > > irq_disabled() is true, but high-priority interrupts can still come;
> > > local_irq_disable() can disable high-priority interrupts, and at that
> > > time, irq_disabled() is also true.
> > >
> > > TBH, this is wrong and confusing on m68k.
> > >
> > 
> > Like you, I was surprised when I learned about it. But that doesn't mean
> > it's wrong. The fact that it works should tell you something.
> > 
> 
> The fact is that m68k lets arch_irq_disabled() return true to pretend 
> all IRQs are disabled while high-priority IRQ is still open, thus "pass" 
> all sanitizing check in genirq and kernel core.
> 

The fact is that m68k has arch_irq_disabled() return false when all IRQs 
are enabled. So there is no bug.

> > Things could always be made simpler. But discarding features isn't 
> > necessarily an improvement.
> 
> This feature could be used by calling local_irq_enable_in_hardirq() in 
> those IRQ handlers who hope high-priority interrupts to preempt it for a 
> while.
> 

So, if one handler is sensitive to interrupt latency, all other handlers 
should be modified? I don't think that's workable.

In anycase, what you're describing is a completely different nested 
interrupt scheme that would defeat the priority level mechanism that the 
hardware provides us with.

> It shouldn't hide somewhere and make confusion.
> 

The problem is hiding so well that no-one has found it! I say it doesn't 
exist.

> On the other hand, those who care about realtime should use threaded IRQ 
> and let IRQ threads preempt each other.
> 

Yes. And those threads also have priority levels.

> Thanks
> Barry
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ