lists.openwall.net - Open Source and information security mailing list archives
Date: Tue, 1 Nov 2011 12:14:30 +0800
From: Zhu Yanhai <zhu.yanhai@...il.com>
To: Henrique de Moraes Holschuh <hmh@....eng.br>
Cc: "Artem S. Tashkinov" <t.artem@...os.com>, linux-kernel@...r.kernel.org
Subject: Re: HT (Hyper Threading) aware process scheduling doesn't work as it should

Hi,

I think the imbalance is much less severe on the mainline kernel than on OS
vendors' kernels, i.e. RHEL6. In case you are interested, below is a very
simple test case I have used against NUMA + CFS group scheduling. I tested
it on a dual-socket Xeon E5620 server.

cat bbb.c
int main()
{
	while (1) {
	}
}

cat run.sh
#!/bin/sh
count=0
pids=""
while [ $count -lt 32 ]
do
	mkdir /cgroup/$count
	echo 1024 > /cgroup/$count/cpu.shares
	# taskset -c 4,5,6,7,12,13,14,15 ./bbb &
	./bbb &
	pid=$!
	echo $pid > /cgroup/$count/tasks
	pids="$pids $pid"
	count=`expr $count + 1`
done
echo "for pid in $pids; do cat /proc/\$pid/sched | grep sum_exec_runtime; done" > show.sh
watch -n1 sh show.sh

(Note the \$pid inside the echo: without the escape, show.sh would be
written with the last pid hard-coded and would print the same task 32
times.)

Since one E5620 with HT enabled has 8 logical CPUs, this dual-socket box has
16 logical CPUs in total. The above test script starts 32 processes, so the
intuitive guess is that two of them will run on each logical CPU. However,
that's not what happens on the current RHEL6 kernel: top shows that they
keep migrating and are often unbalanced, sometimes worse and sometimes
better. If you watch it for a long time, you may find that at times one
process occupies a whole logical CPU for a moment, while several processes
(far more than two) crowd onto a single CPU slot. Also, the 'watch' output
shows that sum_exec_runtime is almost the same for all of them, so it seems
the RHEL6 kernel keeps moving a lucky guy to a free CPU slot, letting it
hold that position for a while, then moving the next lucky guy there and
kicking the previous one off to a crowded slot, which is not a good policy
for such totally independent processes.
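Eyeballing the 'watch' output works, but the spread can also be summarized
numerically. The helper below is a sketch of mine, not something posted in
the thread: it reads sum_exec_runtime samples on stdin, one number per line,
and prints the min, max, and max/min ratio; a ratio close to 1.00 means the
hogs are getting fair CPU time.

```shell
#!/bin/sh
# Sketch (not from the thread): summarize the spread of sum_exec_runtime
# samples instead of eyeballing the watch output.  Feed it one numeric
# value per line.
spread() {
    sort -n | awk 'NR == 1 { min = $1 }
                   { max = $1 }
                   END { printf "min=%s max=%s ratio=%.2f\n", min, max, max / min }'
}

# Example with made-up runtimes; a balanced run stays close to ratio=1.00:
printf '1000\n1100\n900\n1050\n' | spread
```

Assuming the usual "name : value" layout of /proc/PID/sched lines, something
like `awk '/sum_exec_runtime/ {print $3}' /proc/$pid/sched` for each pid,
piped into spread, would give a single number to track over time.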
And on the mainline kernel (3.0.0+), they run much more balanced than the
above, although I can't identify which commits made the difference.

--
Regards,
Zhu Yanhai

2011/10/31 Henrique de Moraes Holschuh <hmh@....eng.br>:
> On Mon, 31 Oct 2011, Artem S. Tashkinov wrote:
>> > On Oct 31, 2011, Henrique de Moraes Holschuh wrote:
>> >
>> > On Sun, 30 Oct 2011, Artem S. Tashkinov wrote:
>> > > > Please make sure both are set to 0. If they were not 0 at the time you
>> > > > ran your tests, please retest and report back.
>> > >
>> > > That's 0 & 0 for me.
>> >
>> > How idle is your system during the test?
>>
>> load average: 0.00, 0.00, 0.00
>
> I believe cpuidle will interfere with the scheduling in that case.  Could
> you run your test with higher loads (start with one, and go up to eight
> tasks that are CPU hogs, measuring each step)?
>
>> I have to insist that people conduct this test on their own without trusting my
>> words. Probably there's something I overlook or don't fully understand but from
>
> What you should attempt to do is give us a reproducible test case.  A
> shell script or C/perl/python/whatever program that, when run, clearly
> shows the problem you're complaining about on your system.  Failing that,
> a very detailed description (read: step by step) of how you're testing
> things.
>
> I can't see anything wrong on my X5550 workstation (4 cores, 8 threads,
> single processor, i.e. not NUMA) running 3.0.8.
>
>> what I see, there's a serious issue here (at least Microsoft XP and 7 work exactly
>
> So far it looks like that, since your system is almost entirely idle, it
> could be trying to minimize task-run latency by scheduling work to the few
> cores/threads that are not in deep sleep (they take time to wake up, are
> often cache-cold, etc).
>
> Please use tools/power/x86/turbostat to track core usage and idle states
> instead of top/htop.  That might give you better information, and I think
> you will appreciate getting to know that tool.
> Note: turbostat reports *averages* for each thread.
>
> --
>  "One disk to rule them all, One disk to find them. One disk to bring
>  them all and in the darkness grind them. In the Land of Redmond
>  where the shadows lie." -- The Silicon Valley Tarot
>  Henrique Holschuh
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
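Henrique's suggestion of stepping the load from one up to eight CPU hogs can
be scripted. The sketch below is my own rendering of that suggestion, not
something posted in the thread; the hog command and per-step observation
window are parameters of my choosing, since the thread doesn't fix them.

```shell
#!/bin/sh
# Sketch of the stepped-load test suggested upthread: run 1..8 copies of a
# CPU hog, pausing at each level so turbostat/top can be watched.
# Arguments (this sketch's choice, not from the thread):
#   $1 - hog command, e.g. the busy-loop ./bbb from the test case above
#   $2 - seconds to observe at each load level
step_load() {
    hog=$1
    window=$2
    n=1
    while [ $n -le 8 ]; do
        pids=""
        i=0
        while [ $i -lt $n ]; do
            $hog &
            pids="$pids $!"
            i=`expr $i + 1`
        done
        echo "running $n hog(s)"
        sleep "$window"      # observation window: watch turbostat here
        kill $pids 2>/dev/null
        wait
        n=`expr $n + 1`
    done
}

# e.g.: step_load ./bbb 30
```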