linux-kernel - Re: report a bug about sched


Date:	Sat, 25 Jul 2009 21:44:12 -0500
From:	Bill Gatliff <bgat@...lgatliff.com>
To:	Jamie Lokier <jamie@...reable.org>
CC:	Peter Zijlstra <peterz@...radead.org>,
	sen wang <wangsen.linux@...il.com>, mingo@...e.hu,
	akpm@...ux-foundation.org, kernel@...ivas.org, npiggin@...e.de,
	arjan@...radead.org, linux-arm-kernel@...ts.arm.linux.org.uk,
	linux-kernel@...r.kernel.org
Subject: Re: report a bug about sched_rt

Jamie Lokier wrote:
> Bill Gatliff wrote:
>   
>> Jamie Lokier wrote:
>>     
>>> For simple things like "try to keep the buffer to my DVD writer full"
>>> (no I don't know how much CPU that requires - it's a kind of "best
>>> effort but try very hard!"), it would be quite useful to have
>>> something like RT-bandwidth which grants a certain percentage of time
>>> as an RT task, and effectively downgrades it to SCHED_OTHER when that
>>> time is exceeded to permit some fairness with the rest of the system.
>>>  
>>>       
>> Useful perhaps, but an application design that explicitly communicates 
>> your desires to the scheduler will be more robust, even if it does seem 
>> more complex at the outset.
>>     
>
> I agree with communicating the desire explicitly to the scheduler.
>
> In the above example, the exact desire is "give me as much CPU as I
> ask for, because my hardware servicing will be adversely but
> non-fatally affected if you don't, and the amount of CPU needed to
> service the hardware cannot be determined in advance, but prevent me
> from blocking progress in the rest of the system by limiting my
> exclusive ownership of the CPU".
>
> How do you propose to communicate that to the scheduler, if not by
> something rather like RT-bandwidth with downgrading to SCHED_OTHER
> when a policy limit is exceeded?
>   

This is a great real-world problem.  And there's no one-size-fits-all 
answer, unfortunately.

RT-bandwidth will give you the system behavior you are after, but it's a 
pretty blunt instrument.

I'd consider putting some throttling in your interrupt handler that 
prevents it from doing more than a certain amount of work per 
interrupt event.  It could also look at execution timestamps to 
determine how often it's running, and from that do a rough 
calculation of how much CPU it's eating.  At least until threaded 
interrupt scheduling is widespread, a runaway interrupt handler is 
definitely an opportunity to hang up a system.

Tasklets are nice for this, because the kernel won't re-queue one if 
it's already pending.  So if your interrupt handler's job is just to 
launch the tasklet, and you know how much time the tasklet takes to run, 
then if you get a burst of interrupts you don't end up launching an 
equivalent burst of scheduled work: eventually the interrupt handler 
overtakes the tasklet, and the additional interrupt events get dropped.  
That's often a decent way to deal with system overload, especially if it 
leaves the system functional enough to take some sort of "evasive 
action" like reverting to polled I/O, issuing a diagnostic message, or 
doing an orderly transition to a safe mode.
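
The coalescing effect can be modelled in a few lines of userspace C.  
This is only a sketch under stated assumptions: tasklet_model, 
model_irq and model_run_pending are hypothetical names, and a plain 
bool stands in for the "scheduled" bit that the kernel tests and sets 
atomically when a tasklet is queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct tasklet_model {
    bool pending;            /* stands in for the kernel's "scheduled" bit */
    unsigned long irqs;      /* interrupts seen */
    unsigned long coalesced; /* interrupts that found work already pending */
    unsigned long runs;      /* times the deferred work actually ran */
};

void model_init(struct tasklet_model *m)
{
    assert(m != NULL);
    m->pending = false;
    m->irqs = 0;
    m->coalesced = 0;
    m->runs = 0;
}

/* Interrupt handler: its only job is to request the deferred work.
 * Returns true if this interrupt queued a new run. */
bool model_irq(struct tasklet_model *m)
{
    m->irqs++;
    if (m->pending) {
        m->coalesced++;      /* already queued: this event is absorbed */
        return false;
    }
    m->pending = true;
    return true;
}

/* Deferred side: run the work once if it was requested.
 * The flag is cleared before the work, so an interrupt that arrives
 * while the work runs queues exactly one more run. */
bool model_run_pending(struct tasklet_model *m)
{
    if (!m->pending)
        return false;
    m->pending = false;
    m->runs++;
    return true;
}
```

Feeding this model a burst of five interrupts before the deferred work 
gets a chance to run yields one queued run and four absorbed events, 
which is the "burst in, single run out" behaviour described above.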

A flood ping, lots of paging, and driver bugs are just a few ways you 
can encounter an unexpected burst of interrupt activity that might, if 
not dealt with on some level, cause the system to suddenly destabilize.

Point is, keep a mentality that you want to fall back onto RT-bandwidth 
(or any other type of watchdog timer expiration) only after you've 
exhausted all other options.  Pretend it isn't there, but definitely 
know what will happen if it ever steps in.  A system coded that way is 
much more resistant to breakage, in my experience anyway.

b.g.

-- 
Bill Gatliff
bgat@...lgatliff.com
