Message-id: <49C3E03E.10506@acm.org>
Date:	Fri, 20 Mar 2009 13:28:14 -0500
From:	Corey Minyard <minyard@....org>
To:	Greg KH <greg@...ah.com>
Cc:	Martin Wilck <martin.wilck@...itsu-siemens.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	openipmi-developer@...ts.sourceforge.net
Subject: Re: [PATCH] limit CPU time spent in kipmid

Greg KH wrote:
> On Fri, Mar 20, 2009 at 10:30:45AM -0500, Corey Minyard wrote:
>> Greg KH wrote:
>>> On Thu, Mar 19, 2009 at 04:31:00PM -0500, Corey Minyard wrote:
>>>> Martin, thanks for the patch.  I had actually implemented something like 
>>>> this before, and it didn't really help very much with the hardware I had, 
>>>> so I had abandoned this method.  There's even a comment about it in 
>>>> si_sm_result smi_event_handler(). Maybe making it tunable is better, I 
>>>> don't know.  But I'm afraid this will kill performance on a lot of 
>>>> systems.
>>>>
>>>> Did you test throughput on this?  The main problem people had without 
>>>> kipmid was that things like firmware upgrades took a *long* time; adding 
>>>> kipmid improved speeds by an order of magnitude or more.
>>>>
>>>> It's my opinion that if you want this interface to work efficiently with 
>>>> good performance, you should design the hardware to be used efficiently 
>>>> by using interrupts (which are supported and disable kipmid).  With the 
>>>> way the hardware is defined, you cannot have both good performance and 
>>>> low CPU usage without interrupts.
>>>>
>>>> It may be possible to add an option to choose between performance and 
>>>> efficiency, but it will have to default to performance.
>>> I would think that very infrequent things, like firmware upgrades, would
>>> not take priority over a long-term "keep the cpu busy" type system, like
>>> what we currently have.
>>>
>>> Is there any way to switch between the different modes dynamically?
>>>
>>> I like the idea of this change, as I have got a lot of complaints lately
>>> about kipmi taking way too much cpu time up on idle systems, messing up
>>> some user's process accounting rules in their management systems.  But I
>>> worry about making it a module parameter, why can't this be a
>>> "self-tunable" thing?
>> It's actually already sort of self-tuning.  kipmid sleeps unless there is 
>> IPMI activity.  It only spins if it is expecting something from the 
>> controller.
>>
>> I've been thinking about this a little more.  Assuming that the self-tuning 
>> is working (and it appears to be working fine on my systems), that means 
>> that something is causing the IPMI driver to constantly talk to the 
>> management controller.  I can think of three things:
>>
>>   1. The user is constantly sending messages to management controller.
>>   2. There is something wrong with the hardware, like the ATTN bit is
>>      stuck high, causing the driver to constantly poll the management
>>      controller.
>>   3. The driver either has a bug or needs some more work to account for
>>      something the hardware needs it to do to clear the ATTN bit.
>>
>> If it's #1 above, then I don't know if there is anything we can do about 
>> it.  The patch Martin sent will simply slow things down.
>
> Do the "normal" ipmi userspace tools do #1?
>   
That depends on how they are used and configured.  If you make them 
constantly poll for events or grab sensor values, then they will keep 
the driver busy and use CPU.  By default they shouldn't do anything.

> For #2, this might make sense, as I have had reports of some hardware
> working just fine, while others have the load issue.  Both were
> different hardware manufacturers.
>
>> #2 and #3 will require someone to do some debugging.  If the ATTN bit is 
>> stuck, you should see the "attentions" field in /proc/ipmi/0/si_stats 
>> constantly going up.  Actually, the contents of that file would be helpful, 
>> along with /proc/ipmi/0/stats.
>
> Martin has one of these machines, right?  If not, I can dig and try to
> get some information as well.
>   
I'll wait for Martin, hopefully he can get the info.
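
For anyone else gathering this, here is a minimal sketch of checking
whether the "attentions" counter keeps climbing.  It assumes the stats
file consists of "name: value" lines with integer values, which should be
verified against the running kernel; the helper names (parse_stats,
counter_delta, read_stats) and the 5 second default interval are made up
for illustration and are not part of the driver or any IPMI tool.

```python
import sys
import time

SI_STATS_PATH = "/proc/ipmi/0/si_stats"  # path named in this thread


def parse_stats(text):
    """Parse 'name: value' lines into a dict (assumed file format).

    Lines without a colon or with a non-integer value are skipped.
    """
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        try:
            stats[name.strip()] = int(value.strip())
        except ValueError:
            continue
    return stats


def counter_delta(before, after, field="attentions"):
    """Return how much one counter grew between two samples."""
    return after.get(field, 0) - before.get(field, 0)


def read_stats(path=SI_STATS_PATH):
    """Read and parse the stats file at the given path."""
    with open(path) as f:
        return parse_stats(f.read())


if __name__ == "__main__":
    # Optional first argument: seconds to wait between the two samples.
    interval = float(sys.argv[1]) if len(sys.argv) > 1 else 5.0
    first = read_stats()
    time.sleep(interval)
    second = read_stats()
    print("attentions rose by %d in %.1f s"
          % (counter_delta(first, second), interval))
```

Run it on an otherwise idle system with no IPMI tools active.  A count
that keeps rising under those conditions would point toward #2 or #3
rather than #1.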

Thanks,

-corey