Message-ID: <20080406234833.GA12131@deepthought>
Date: Mon, 7 Apr 2008 00:48:33 +0100
From: Ken Moffat <zarniwhoop@...world.com>
To: Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@...e.hu>, "Rafael J. Wysocki" <rjw@...k.pl>,
	lkml <linux-kernel@...r.kernel.org>, a.p.zijlstra@...llo.nl,
	aneesh.kumar@...ux.vnet.ibm.com, dhaval@...ux.vnet.ibm.com,
	Balbir Singh <balbir@...ibm.com>, skumar@...ux.vnet.ibm.com
Subject: Re: Regression in gdm-2.18 since 2.6.24

On Sat, Apr 05, 2008 at 10:03:47PM +0100, Ken Moffat wrote:
> On Sat, Apr 05, 2008 at 08:10:43PM +0530, Srivatsa Vaddagiri wrote:
> >
> > Given that you seem to be seeing the problem even without
> > CONFIG_GROUP_SCHED, only the second hunk of the patch seems to be
> > making a difference for your problem, i.e. just the hunk below applied
> > on 2.6.25-rc8 (to kernel/sched_fair.c) should fix your problem too:
> >
> > @@ -1145,7 +1145,7 @@ static void check_preempt_wakeup(struct
> >  		 * More easily preempt - nice tasks, while not making
> >  		 * it harder for + nice tasks.
> >  		 */
> > -		if (unlikely(se->load.weight > NICE_0_LOAD))
> > +		if (unlikely(se->load.weight != NICE_0_LOAD))
> >  			gran = calc_delta_fair(gran, &se->load);
> >
> >  		if (pse->vruntime + gran < se->vruntime)
> >
> > [The first hunk is a no-op under !CONFIG_GROUP_SCHED, since
> > entity_is_task() is always 1 for !CONFIG_GROUP_SCHED.]
> >
> > This second hunk changes how fast + or - niced tasks get preempted.
> >
> > 2.6.25-rc8 (Bad case):
> > 	Sets preempt granularity for + niced tasks at 5ms (1 CPU)
> >
> > 2.6.25-rc8 + the hunk above (Good case):
> > 	Sets preempt granularity for + niced tasks at >5ms
> >
> Well, I'm no longer sure exactly what was in the config, but after I
> had confirmed the reversion would fix 2.6.24.4 I _did_ try just the
> second part of the patch applied to 2.6.25-rc8, and it gave a 60%
> success rate across 10 tests.
>
> > So bumping up preempt granularity for + niced tasks seems to make
> > things work for you. IMO the deeper problem lies somewhere else
> > (perhaps some race issue in gdm itself), which is easily exposed by
> > 2.6.25-rc8, which lets + niced tasks be preempted quickly.
> >
> I agree this is probably exposing a problem somewhere else.
>
> > To help validate this, can you let us know the result of tuning
> > preempt granularity on native 2.6.25-rc8 (without any patches applied
> > and CONFIG_GROUP_SCHED disabled)?
> >
> > # echo 100000000 > /proc/sys/kernel/sched_wakeup_granularity_ns
> >
> > To check that the echo command worked, do:
> >
> > # cat /proc/sys/kernel/sched_wakeup_granularity_ns
> >
> > It should return 100000000.
> >
> > Now try shutting down through gdm and please let me know if it makes
> > a difference.
> >
> > --
> > Regards,
> > vatsa
>
> Will do, but it might be a day or so before I can get to this.
>
> Thanks.
>
> Ken

Well, I found your analysis convincing. Unfortunately, my hardware
disagreed.

Testing -rc8 with CONFIG_GROUP_SCHED disabled (a test is a mixture of
5 attempts to restart and 5 to shut down):

1. base version: success is 4/10
2. granularity increased by a factor of 10 as you requested: success
   is 8/10
3. second part of the patch applied (granularity unaltered): success
   is 3/10
4. both parts of the patch applied (granularity unaltered): success
   is 5/10

Clearly, 3/10 and 5/10 may not be meaningfully different on such a
small sample (but 10 attempts is probably as much as my mind and
blood pressure can stand!).
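For concreteness, the arithmetic behind vatsa's bad/good cases can be
shown in user space. calc_delta_fair() scales a delta by
NICE_0_LOAD / se->load.weight, so the '>' test leaves lighter (+ nice)
tasks at the base granularity while the '!=' test enlarges it for them.
The sketch below is illustrative only, not kernel code: scale_gran()
is a simplified stand-in for calc_delta_fair(), the weights 335/1024/3121
are the stock nice +5/0/-5 values from kernel/sched.c of that era, and
the 5 ms base is taken from the mail above.

	#include <stdio.h>

	#define NICE_0_LOAD 1024UL

	/* Simplified stand-in for the kernel's calc_delta_fair():
	 * scale 'gran' by NICE_0_LOAD / weight, 64-bit intermediate. */
	static unsigned long scale_gran(unsigned long gran, unsigned long weight)
	{
		return (unsigned long)(((unsigned long long)gran * NICE_0_LOAD) / weight);
	}

	int main(void)
	{
		unsigned long gran = 5000000UL;	/* 5 ms base, 1 CPU, per the mail */
		struct { int nice; unsigned long weight; } t[] = {
			{ +5, 335 }, { 0, 1024 }, { -5, 3121 },
		};

		for (int i = 0; i < 3; i++) {
			unsigned long rc8 = gran, patched = gran;

			if (t[i].weight > NICE_0_LOAD)	/* 2.6.25-rc8: only - nice scaled */
				rc8 = scale_gran(gran, t[i].weight);
			if (t[i].weight != NICE_0_LOAD)	/* patched: every non-0 nice scaled */
				patched = scale_gran(gran, t[i].weight);

			printf("nice %+d: rc8 gran %lu ns, patched gran %lu ns\n",
			       t[i].nice, rc8, patched);
		}
		return 0;
	}

For nice +5 this prints roughly 5000000 ns (rc8) versus 15283582 ns
(patched): under rc8 a + niced task keeps the 5 ms granularity and is
preempted sooner, which is exactly the bad case described above.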
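Ken's caution about sample size can also be put on a number. A
two-sided Fisher's exact test (an editorial back-of-the-envelope
check, not something from the thread; build with -lm) gives
p = 0.650 for 3/10 vs 5/10 and p = 0.170 even for 4/10 vs 8/10, so
none of these differences are statistically conclusive at 10 trials
apiece:

	#include <math.h>
	#include <stdio.h>

	static double logfact(int n)
	{
		double s = 0.0;
		for (int i = 2; i <= n; i++)
			s += log((double)i);
		return s;
	}

	/* Probability of one exact 2x2 table (a b / c d), margins fixed. */
	static double table_p(int a, int b, int c, int d)
	{
		return exp(logfact(a + b) + logfact(c + d) + logfact(a + c) +
			   logfact(b + d) - logfact(a) - logfact(b) -
			   logfact(c) - logfact(d) - logfact(a + b + c + d));
	}

	/* Two-sided Fisher's exact test: k1/n1 successes vs k2/n2. */
	static double fisher(int k1, int n1, int k2, int n2)
	{
		double p0 = table_p(k1, n1 - k1, k2, n2 - k2);
		int succ = k1 + k2;			/* total successes, fixed */
		int lo = succ > n2 ? succ - n2 : 0;	/* feasible range for k1 */
		int hi = succ < n1 ? succ : n1;
		double p = 0.0;

		for (int a = lo; a <= hi; a++) {
			double pa = table_p(a, n1 - a, succ - a, n2 - succ + a);
			if (pa <= p0 + 1e-12)		/* sum tables at least as extreme */
				p += pa;
		}
		return p;
	}

	int main(void)
	{
		printf("3/10 vs 5/10: p = %.3f\n", fisher(3, 10, 5, 10));
		printf("4/10 vs 8/10: p = %.3f\n", fisher(4, 10, 8, 10));
		return 0;
	}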
Whether 8/10 is meaningfully better I don't know; the point is that it
still failed some of the time.

At this point I started to doubt my previous results, so I retested
rc8 with CONFIG_GROUP_SCHED=y and both parts of the patch, and again
success is 10/10. So that combination has run through at least 20
shutdowns or restarts without a problem.

Summary: if I apply the patch to revert both hunks AND use
CONFIG_GROUP_SCHED, everything is good. All other variations fail
sooner or later within 10 tests (for the little it's worth, the
longest string of successful runs between failures is 6, so a minimum
of 10 tests is probably necessary before saying a version seems OK).

If I was confused earlier, I guess I must be dazed and confused now!

Ken
-- 
das eine Mal als Tragödie, das andere Mal als Farce
[the first time as tragedy, the second time as farce]