Message-ID: <20130823110254.GU31370@twins.programming.kicks-ass.net>
Date:	Fri, 23 Aug 2013 13:02:54 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Martin Mokrejs <mmokrejs@...d.natur.cuni.cz>
Cc:	Theodore Tso <tytso@....edu>, Thomas Gleixner <tglx@...utronix.de>,
	mingo@...hat.com, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [sched_delayed] sched: RT throttling activated

On Fri, Aug 23, 2013 at 12:38:53PM +0200, Martin Mokrejs wrote:
> > It means you have (a) real-time task(s) that consume significant amount
> 
> How can I find them? 

ps -deo pid,cls,cmd | grep -e RR -e FF

That should do, I suppose.
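
For reference, a rough Python equivalent of that ps pipeline, as a sketch
only. It assumes Linux with /proc mounted and Python 3.3 or later; the
function name realtime_tasks is made up for illustration. Like the ps
command without -L, it looks at each process's main thread only.

```python
import os

# Short names as they appear in the CLS column of ps.
RT_POLICIES = {os.SCHED_FIFO: "FF", os.SCHED_RR: "RR"}

# The kernel may OR this flag into the reported policy; mask it off.
RESET_ON_FORK = getattr(os, "SCHED_RESET_ON_FORK", 0)


def realtime_tasks():
    """Return a list of (pid, cls, comm) for processes running as FIFO or RR."""
    found = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        pid = int(entry)
        try:
            policy = os.sched_getscheduler(pid) & ~RESET_ON_FORK
            with open("/proc/%d/comm" % pid) as f:
                comm = f.read().strip()
        except OSError:
            continue  # process exited or is not accessible
        if policy in RT_POLICIES:
            found.append((pid, RT_POLICIES[policy], comm))
    return found


if __name__ == "__main__":
    for pid, cls, comm in realtime_tasks():
        print(pid, cls, comm)
```

Unlike the grep, this matches on the policy itself, so a command line that
merely contains "RR" or "FF" is not reported.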

> I don't think I need the RT, I have two CPU-bound
> processes and want to run them at max speed. Rest of the system is unimportant.
> 
> I still don't understand what the $subj message actually says. Does it say
> the RT-requiring task was slowed down? I am a bit lost here.

Yeah, they were forcibly stopped from running for a little while.

> > of time. At some point we throttle them in an attempt to keep the system
> > from falling over.
> 
> Will I get companion "[sched_delayed] sched: RT throttling deactivated"
> at some point?

Nope, you get that message once to tell you that we throttle RT tasks.

> Are python-based apps requiring the realtime features?

I'm fairly sure python could use the relevant scheduling classes, but I
don't speak snake so I really wouldn't know.
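
As a hedged illustration that the language can reach the scheduling classes:
Python 3.3 and later expose os.sched_getscheduler() and
os.sched_setscheduler() on Linux. The libpython2.7 in the logs below predates
these, so this says nothing about what blah.py actually does. The helper name
policy_name is an assumption for the example.

```python
import os

POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER",  # the normal time-sharing class
    os.SCHED_FIFO: "SCHED_FIFO",    # real-time, first-in first-out
    os.SCHED_RR: "SCHED_RR",        # real-time, round-robin
}

# The kernel may OR this flag into the reported policy; mask it off.
RESET_ON_FORK = getattr(os, "SCHED_RESET_ON_FORK", 0)


def policy_name(pid=0):
    """Name the scheduling policy of pid (0 means the calling process)."""
    policy = os.sched_getscheduler(pid) & ~RESET_ON_FORK
    return POLICY_NAMES.get(policy, "policy %d" % policy)


if __name__ == "__main__":
    print(policy_name())
```

A program that wanted a real-time class would call something like
os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(10)), which normally
needs root or a suitable RLIMIT_RTPRIO.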

> I used to get the messages below which are now gone with my CPU cooler being replaced yesterday:
> 
> [ 4172.717272] CPU1: Core temperature above threshold, cpu clock throttled (total events = 153727)

> mcelog report in such cases:
> 
> Hardware event. This is not a software error.
> MCE 0
> CPU 1 THERMAL EVENT TSC 1bf82e2a146 
> TIME 1375536062 Sat Aug  3 15:21:02 2013
> Processor 1 heated above trip temperature. Throttling enabled.
> Please check your system cooling. Performance will be impacted
> STATUS 880003c3 MCGSTATUS 0
> MCGCAP c07 APICID 2 SOCKETID 0 
> CPUID Vendor Intel Family 6 Model 42

Right, those are thermal events throttling the speed of your CPU to keep
the thing from heat damaging itself.

> While my CPU cooler got replaced even now I still get (hence this email thread):
> 
> [39564.452795] blah.py[14396]: segfault at 7ff67af34a58 ip 00007ff67badff00 sp 00007fff771ce798 error 4 in libpython2.7.so.1.0[7ff67b9cf000+173000]
> [44520.259205] [sched_delayed] sched: RT throttling activated
> [48956.057816] blah.py[16623]: segfault at 2f ip 00007fd462e5d046 sp 00007fff638431e0 error 4 in libpython2.7.so.1.0[7fd462d7c000+173000]
> [49288.388797] blah.py[28631]: segfault at 7fe254b6aa58 ip 00007fe255715f00 sp 00007fff6ddaaff8 error 4 in libpython2.7.so.1.0[7fe255605000+173000]
> [49942.020084] blah.py[6950]: segfault at d0 ip 00007f3e8a9acf9c sp 00007fffa72288a0 error 4 in libpython2.7.so.1.0[7f3e8a904000+173000]
> [66696.443342] blah.py[8015]: segfault at cf ip 00007f798f708f9c sp 00007fff420336e0 error 4 in libpython2.7.so.1.0[7f798f660000+173000]
> [67561.587383] blah.py[7483]: segfault at 7f7b16e01540 ip 00007f7b17a85f00 sp 00007fffe663d9b8 error 4 in libpython2.7.so.1.0[7f7b17975000+173000]
> [77262.490502] blah.py[29107]: segfault at 21e1458 ip 00007fc54cd17f00 sp 00007fff283c5c38 error 4 in libpython2.7.so.1.0[7fc54cc07000+173000]
> 
> 
> So, what does this "[sched_delayed] sched: RT throttling activated" tell me?

That of the past 1s, 0.95s were spent running RR/FIFO tasks. It is a
warning that is printed only once per boot and should prompt you to
investigate.

You can turn the throttle off, but be advised that running a RR/FIFO
task at 100% can (and generally does) negatively affect the running of
your system (as in, these tasks can prevent system duties from taking
place and eventually make the system come to a halt).
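
The 0.95s-per-1s budget corresponds to two sysctls,
kernel.sched_rt_period_us and kernel.sched_rt_runtime_us; writing -1 to the
latter turns the throttle off. A minimal sketch of the arithmetic follows.
The function names are made up for illustration, and the default arguments
simply encode the 1s and 0.95s figures quoted above.

```python
from pathlib import Path

SYSCTL_DIR = Path("/proc/sys/kernel")


def rt_budget(period_us=1000000, runtime_us=950000):
    """Fraction of each period RR/FIFO tasks may run; None if unthrottled."""
    if runtime_us < 0:
        return None  # -1 means the throttle is disabled
    return runtime_us / period_us


def read_rt_sysctls(sysctl_dir=SYSCTL_DIR):
    """Read (period_us, runtime_us) from the running kernel."""
    period = int((sysctl_dir / "sched_rt_period_us").read_text())
    runtime = int((sysctl_dir / "sched_rt_runtime_us").read_text())
    return period, runtime


if __name__ == "__main__":
    period, runtime = read_rt_sysctls()
    print("RT budget:", rt_budget(period, runtime))
```

With the values quoted above, the remaining 5% of each period is what stays
available to non-real-time tasks.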


As to those faults, investigate whether your python program does something
particularly weird and whether your runtime is in order. Otherwise I would
advise you to run memtest for a while to make sure your machine is in
proper working order.
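
One possible starting point for that investigation, as a sketch: parse the
kernel's segfault lines and compute where inside libpython each fault
happened (ip minus the mapping base). The regex is written against the lines
quoted above, the function name parse_segfault is hypothetical, and the
error-code decoding assumes the x86 page-fault bits (1 = protection
violation, 2 = write, 4 = user mode).

```python
import re

SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\[\s]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return a dict describing one segfault line, or None if it doesn't match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # the kernel prints the error code in hex
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "object": m.group("obj"),
        # Offset of the faulting instruction within the mapped object.
        "offset": int(m.group("ip"), 16) - int(m.group("base"), 16),
        "access": "write" if err & 2 else "read",
        "cause": "protection violation" if err & 1 else "page not mapped",
        "mode": "user" if err & 4 else "kernel",
    }


if __name__ == "__main__":
    line = ("[39564.452795] blah.py[14396]: segfault at 7ff67af34a58 "
            "ip 00007ff67badff00 sp 00007fff771ce798 error 4 in "
            "libpython2.7.so.1.0[7ff67b9cf000+173000]")
    info = parse_segfault(line)
    print(info["object"], hex(info["offset"]), info["access"], info["cause"])
```

Offsets that repeat across crashes can then be looked up in the library with
a tool such as addr2line or gdb.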