linux-kernel - Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes

Message-ID: <20081104140030.GA16178@elte.hu>
Date:	Tue, 4 Nov 2008 15:00:30 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Alexander van Heukelum <heukelum@...tmail.fm>
Cc:	Alexander van Heukelum <heukelum@...lshack.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>, lguest@...abs.org,
	jeremy@...source.com, Steven Rostedt <srostedt@...hat.com>,
	Cyrill Gorcunov <gorcunov@...il.com>,
	Mike Travis <travis@....com>,
	Jeremy Fitzhardinge <jeremy@...p.org>
Subject: Re: [PATCH RFC/RFB] x86_64, i386: interrupt dispatch changes


* Alexander van Heukelum <heukelum@...tmail.fm> wrote:

> On Tue, 4 Nov 2008 13:42:42 +0100, "Ingo Molnar" <mingo@...e.hu> said:
> > 
> > * Alexander van Heukelum <heukelum@...lshack.com> wrote:
> > 
> > > Hi all,
> > > 
> > > An x86 processor handles an interrupt (from an external source, 
> > > software generated or due to an exception), depending on the 
> > > contents of the IDT. Normally the IDT contains mostly interrupt 
> > > gates. Linux points each interrupt gate to a unique function. Some 
> > > are specific to some task (handling traps, IPIs, ...), the others 
> > > are stubs that push the interrupt number to the stack and jump to 
> > > 'common_interrupt'.
> > > 
> > > This patch removes the need for the stubs.
> > 
> > hm, the cost would be this new code:
> > 
> > > +.p2align
> > > +ENTRY(maininterrupt)
> > >  	RING0_INT_FRAME
> > > -vector=0
> > > -.rept NR_VECTORS
> > > -	ALIGN
> > > - .if vector
> > > -	CFI_ADJUST_CFA_OFFSET -4
> > > - .endif
> > > -1:	pushl $~(vector)
> > > -	CFI_ADJUST_CFA_OFFSET 4
> > > +	push %eax
> > > +	push %eax
> > > +	mov %cs,%eax
> > > +	shr $3,%eax
> > > +	and $0xff,%eax
> > > +	not %eax
> > > +	mov %eax,4(%esp)
> > > +	pop %eax
> > >  	jmp common_interrupt
> > 
> > .. which we were able to avoid before. A couple of segment register 
> > accesses, shifts, etc to calculate the vector - each of which can be 
> > quite costly (especially the segment register access - this is a 
> > relatively rare instruction pattern).
> 
> The way it is written now is just so I did not have to change 
> common_interrupt (to keep changes small). All those accesses so 
> close together will cost some cycles, but much can be avoided if it 
> is integrated. If the precise content of the stack can be changed, 
> this could be as simple as "push %cs". Even that can be delayed, 
> because the content of the cs register will still be there.
> 
> Note that the specialized interrupts (including page fault, etc.) 
> will not go via this path. As far as I understand now, it is only 
> the interrupts from external devices that normally go via 
> common_interrupt. There I think the overhead is really tiny compared 
> to the rest of the handling of the interrupt.
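
For reference, the arithmetic done by the quoted 'maininterrupt' stub can be written out in C. This is only an illustrative sketch, not code from the patch: the helper names are hypothetical, and it assumes (as the stub's shift and mask imply) that the low 8 bits of the code segment selector index equal the vector number.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the stub: selector -> word placed on the stack.
 * A selector is (index << 3) | TI | RPL, so shifting right by 3 yields the index. */
static inline uint32_t vector_word_from_cs(uint16_t cs)
{
    uint32_t vector = ((uint32_t)cs >> 3) & 0xff;  /* shr $3,%eax ; and $0xff,%eax */
    return ~vector;                                /* not %eax */
}

/* Hypothetical inverse: recover the vector from the stored word,
 * which has the same form as the old stubs' "pushl $~(vector)". */
static inline unsigned vector_from_word(uint32_t word)
{
    unsigned vector = ~word;
    assert(vector < 256);  /* only 256 vectors exist */
    return vector;
}
```

The stored word has the same value the removed per-vector stubs pushed, which is why common_interrupt would not need to change.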

no complaints from me about the cleanup/simplification effect - that's 
really great. To make the reasoning all iron-clad please post timings 
of "push %cs" costs measured via RDTSC or so - can be done in 
user-space as well. (you can simulate the entry+exit sequence in 
user-space as well and prove that the overhead is near zero.) In the 
end it could all even be faster (perhaps), besides smaller.
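
A rough user-space timing along these lines might look like the sketch below. It is an assumption-laden illustration rather than a finished benchmark: it needs GCC or Clang on x86, the function names and BATCH size are made up, and it times "mov %cs,%reg" instead of "push %cs" because the latter cannot be encoded in 64-bit mode. RDTSC is not serializing, so the numbers are only approximate.

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc(); GCC/Clang on x86 only */

#define BATCH 64  /* segment-register reads per timed region (arbitrary choice) */

/* Read the current code segment selector. */
static inline unsigned read_cs(void)
{
    unsigned sel;
    __asm__ volatile("mov %%cs, %0" : "=r"(sel));
    return sel;
}

/* Smallest TSC delta seen over `rounds` timed regions. Each region does
 * BATCH reads of %cs (with_read != 0) or an empty loop of the same
 * shape (with_read == 0), which serves as the baseline. */
static uint64_t min_batch_delta(int rounds, int with_read)
{
    uint64_t best = UINT64_MAX;
    for (int r = 0; r < rounds; r++) {
        uint64_t t0 = __rdtsc();
        for (int i = 0; i < BATCH; i++) {
            if (with_read)
                (void)read_cs();
            else
                __asm__ volatile("" ::: "memory"); /* keep the empty loop alive */
        }
        uint64_t t1 = __rdtsc();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}
```

Subtracting min_batch_delta(rounds, 0) from min_batch_delta(rounds, 1) and dividing by BATCH gives a crude per-access estimate; taking the minimum over many rounds filters out regions that were disturbed by interrupts.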

( another advantage is that the 6 bytes GDT descriptor is more 
  compressed and hence uses up less L1/L2 cache footprint than the 
  larger (~7 byte) trampolines we have at the moment. )

plus it's possible to observe the typical cost of irqs from user-space 
as well: run a task on a single CPU and save away all the RDTSC deltas 
that are larger than ~10 cycles - these will be the IRQ entry costs. 
Print out these deltas after 60 seconds of runtime (or something like 
that), and look at the histogram.
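
The measurement loop described above could be sketched roughly as follows. The names, the bucket layout and the choice to bin gaps into power-of-two buckets on the fly (instead of saving every delta) are assumptions made for brevity; the threshold is a parameter because the cost of back-to-back RDTSC reads differs between CPUs.

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc(); GCC/Clang on x86 only */

#define NBUCKETS 32   /* bucket b counts gaps in [2^b, 2^(b+1)) cycles */

/* Index of the highest set bit of gap, clamped to the last bucket. */
static unsigned bucket_of(uint64_t gap)
{
    unsigned b = 0;
    while (gap >>= 1)
        b++;
    return b < NBUCKETS ? b : NBUCKETS - 1;
}

/* Spin for roughly total_cycles TSC ticks and count every gap between
 * consecutive RDTSC reads that exceeds threshold. Returns how many
 * gaps were recorded; hist must be zeroed by the caller. */
static uint64_t collect_gaps(uint64_t total_cycles, uint64_t threshold,
                             uint64_t hist[NBUCKETS])
{
    uint64_t start = __rdtsc(), prev = start, recorded = 0;
    for (;;) {
        uint64_t now = __rdtsc();
        uint64_t gap = now - prev;
        if (gap > threshold) {       /* something interrupted the loop */
            hist[bucket_of(gap)]++;
            recorded++;
        }
        prev = now;
        if (now - start >= total_cycles)
            break;
    }
    return recorded;
}
```

To follow the suggestion, the task would be pinned to one CPU (for example with taskset), run for about 60 seconds worth of cycles, and the non-empty buckets printed at the end. Gaps caused by preemption or SMIs land in the same histogram, so the IRQ entry cost shows up as the dominant low bucket rather than as every sample.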

	Ingo