Date:   Wed, 21 Aug 2019 08:20:48 +0000
From:   Long Li <longli@...rosoft.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        "longli@...uxonhyperv.com" <longli@...uxonhyperv.com>
CC:     Ingo Molnar <mingo@...hat.com>,
        Keith Busch <keith.busch@...el.com>, Jens Axboe <axboe@...com>,
        Christoph Hellwig <hch@....de>,
        Sagi Grimberg <sagi@...mberg.me>,
        "linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH 1/3] sched: define a function to report the number of
 context switches on a CPU

>>>Subject: Re: [PATCH 1/3] sched: define a function to report the number of
>>>context switches on a CPU
>>>
>>>On Mon, Aug 19, 2019 at 11:14:27PM -0700, longli@...uxonhyperv.com
>>>wrote:
>>>> From: Long Li <longli@...rosoft.com>
>>>>
>>>> The number of context switches on a CPU is useful to determine how
>>>> busy this CPU is on processing IRQs. Export this information so it can
>>>> be used by device drivers.
>>>
>>>Please do explain that; because I'm not seeing how number of switches
>>>relates to processing IRQs _at_all_!

Some kernel components rely on context switches to make progress, for example the watchdog and RCU. A CPU under a reasonable interrupt load keeps making context switches, normally several per second.

While observing a CPU under heavy interrupt load, I see that it spends all its time in IRQ and softIRQ, and does not get a chance to do a switch (calling __schedule()) for a long time. This makes the system unresponsive at times. The purpose is to find out whether a CPU is in this state, and to implement some throttling mechanism to help reduce the number of interrupts. I don't think the number of switches is the most precise way to detect this condition, but it may be good enough.
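
To make the idea concrete, here is a rough userspace sketch of the check described above; it is not code from the patch. It assumes the caller can periodically sample a per-CPU context switch count. The struct, the function name cpu_looks_swamped() and the 100 ms threshold are all hypothetical and chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU bookkeeping; none of these names come from the patch. */
struct irq_load_state {
	uint64_t last_switches;	/* context switch count seen at the previous sample */
	uint64_t stalled_ns;	/* time accumulated without any context switch */
};

/* Assumed threshold: 100 ms without a switch counts as "swamped". Arbitrary. */
#define STALL_THRESHOLD_NS 100000000ULL

/*
 * Sample the switch count for one CPU. Returns true when the count has not
 * moved for at least STALL_THRESHOLD_NS, which is the hint that the CPU is
 * stuck in IRQ/softIRQ context and interrupts should be throttled.
 */
static bool cpu_looks_swamped(struct irq_load_state *st,
			      uint64_t switches_now, uint64_t elapsed_ns)
{
	if (switches_now != st->last_switches) {
		/* The CPU scheduled at least once since the last sample. */
		st->last_switches = switches_now;
		st->stalled_ns = 0;
		return false;
	}

	st->stalled_ns += elapsed_ns;
	return st->stalled_ns >= STALL_THRESHOLD_NS;
}
```

A driver-side caller would feed this with whatever per-CPU switch counter is available and back off on interrupts while it returns true. A CPU that is simply idle would also show no switches, so a real check would probably need to account for that as well.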

I agree this may not be the best way. If you have other ideas for detecting that a CPU is swamped by interrupts, please point me to where to look.

Thanks

Long

