linux-kernel - Re: Mainline kernel OLTP performance update

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <496F8421.3070703@novell.com>
Date:	Thu, 15 Jan 2009 13:44:49 -0500
From:	Gregory Haskins <ghaskins@...ell.com>
To:	Steven Rostedt <srostedt@...hat.com>
CC:	Matthew Wilcox <matthew@....cx>,
	Andrew Morton <akpm@...ux-foundation.org>,
	James Bottomley <James.Bottomley@...senPartnership.com>,
	"Wilcox, Matthew R" <matthew.r.wilcox@...el.com>,
	chinang.ma@...el.com, linux-kernel@...r.kernel.org,
	sharad.c.tripathi@...el.com, arjan@...ux.intel.com,
	andi.kleen@...el.com, suresh.b.siddha@...el.com,
	harita.chilukuri@...el.com, douglas.w.styner@...el.com,
	peter.xihong.wang@...el.com, hubert.nueckel@...el.com,
	chris.mason@...cle.com, linux-scsi@...r.kernel.org,
	Andrew Vasquez <andrew.vasquez@...gic.com>,
	Anirban Chakraborty <anirban.chakraborty@...gic.com>
Subject: Re: Mainline kernel OLTP performance update

Steven Rostedt wrote:
> On Thu, 2009-01-15 at 11:00 -0700, Matthew Wilcox wrote:
>   
>> On Thu, Jan 15, 2009 at 09:44:42AM -0800, Andrew Morton wrote:
>>     
>>>> Me too.  Anecdotally, I haven't noticed this in my lab machines, but
>>>> what I have noticed is on someone else's laptop (a hyperthreaded atom)
>>>> that I was trying to demo powertop on was that IPI reschedule interrupts
>>>> seem to be out of control ... they were ticking over at a really high
>>>> rate and preventing the CPU from spending much time in the low C and P
>>>> states.  To me this implicates some scheduler problem since that's the
>>>> primary producer of IPI reschedules ... I think it wouldn't be a
>>>> significant extrapolation to predict that the scheduler might be the
>>>> cause of the above problem as well.
>>>>
>>>>         
>>> Good point.
>>>
>>> The context switch rate actually went down a bit.
>>>
>>> I wonder if the Intel test people have records of /proc/interrupts for
>>> the various kernel versions.
>>>       
>> I think Chinang does, but he's out of office today.  He did say in an
>> earlier reply:
>>
>>     
>>> I took a quick look at the interrupts figure between 2.6.24 and 2.6.27.
>>> i/o interuputs is slightly down in 2.6.27 (due to reduce throughput).
>>> But both NMI and reschedule interrupt increased.  Reschedule interrupts
>>> is 2x of 2.6.24.
>>>       
>> So if the reschedule interrupt is happening twice as often, and the
>> context switch rate is basically unchanged, I guess that means the
>> scheduler is doing a lot more work to get approximately the same
>> results.  And that seems like a bad thing.
>>     

I would be very interested in gathering some data in this area.  One
thing that pops to mind is to instrument the resched-ipi with
ftrace_printk() and gather a trace of this system in action.  I assume
that I wouldn't have access to this OLTP suite, so I may need a
volunteer to try this for me.  I could put together an instrumentation
patch for the testers convenience if they prefer.

Another data-point I wouldn't mind seeing is  looking at the scheduler
statistics, particularly with my sched-top utility, which you can find here:

http://rt.wiki.kernel.org/index.php/Schedtop_utility

(Note you may want to exclude the sched_info stats, as they are
inherently noisy and make it hard to see the real trends.  To do this
run it with: 'schedtop -x "sched_info"'

In the meantime, I will try similar approaches here on other non-OLTP
based workloads to see if I spy anything that looks amiss.
 
-Greg



Download attachment "signature.asc" of type "application/pgp-signature" (258 bytes)