Message-ID: <CAPM31RKnCkjL1Rf8_0-9ok_ZMi0G2iQtxi0KWnW4rSbsEtbQbw@mail.gmail.com>
Date:	Wed, 16 Oct 2013 15:16:44 -0700
From:	Paul Turner <pjt@...gle.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Ben Segall <bsegall@...gle.com>, Ingo Molnar <mingo@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 4/5] sched: Guarantee new group-entities always have weight

On Wed, Oct 16, 2013 at 3:01 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Wed, Oct 16, 2013 at 11:16:27AM -0700, Ben Segall wrote:
>> From: Paul Turner <pjt@...gle.com>
>>
>> Currently, group entity load-weights are initialized to zero. This
>> admits some races with respect to the first time they are re-weighted in
>> early use. ( Let g[x] denote the se for "g" on cpu "x". )
>>
>> Suppose that we have root->a and that a enters a throttled state,
>> immediately followed by a[0]->t1 (the only task running on cpu[0])
>> blocking:
>>
>> put_prev_task(group_cfs_rq(a[0]), t1)
>> put_prev_entity(..., t1)
>> check_cfs_rq_runtime(group_cfs_rq(a[0]))
>> throttle_cfs_rq(group_cfs_rq(a[0]))
>>
>> Then, before unthrottling occurs, let a[0]->b[0]->t2 wake for the first
>> time:
>>
>> enqueue_task_fair(rq[0], t2)
>> enqueue_entity(group_cfs_rq(b[0]), t2)
>> enqueue_entity_load_avg(group_cfs_rq(b[0]), t2)
>> account_entity_enqueue(group_cfs_rq(b[0]), t2)
>> update_cfs_shares(group_cfs_rq(b[0]))
>> < skipped because b is part of a throttled hierarchy >
>> enqueue_entity(group_cfs_rq(a[0]), b[0])
>> ...
>>
>> We now have b[0] enqueued, yet group_cfs_rq(a[0])->load.weight == 0
>> which violates invariants in several code-paths. Eliminate the
>> possibility of this by initializing group entity weight.
>>
>> Signed-off-by: Paul Turner <pjt@...gle.com>
>> ---
>>  kernel/sched/fair.c |    3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index fc44cc3..424c294 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -7207,7 +7207,8 @@ void init_tg_cfs_entry(struct task_group *tg, struct cfs_rq *cfs_rq,
>>               se->cfs_rq = parent->my_q;
>>
>>       se->my_q = cfs_rq;
>> -     update_load_set(&se->load, 0);
>> +     /* guarantee group entities always have weight */
>> +     update_load_set(&se->load, NICE_0_LOAD);
>>       se->parent = parent;
>>  }
>
> Hurm.. this gives new groups a massive weight; nr_cpus * NICE_0. ISTR
> there being some issues with this; or was that on the wakeup path where
> a task woke on a cpu whose group entity had '0' load because it used to
> run on another cpu -- I can't remember.

This could also arbitrarily be MIN_WEIGHT.

I don't think it actually matters; in practice the set of conditions
for this weight to ever see use is very specific (e.g. the race
described above). Otherwise it's always going to be re-initialized (on
first actual enqueue) to the right value. (NICE_0_LOAD seemed to make
sense since it's what you'd "expect" for a new, single-thread
autogroup/group.)

>
> But please do expand how this isn't a problem. I suppose for the regular
> cgroup case, group creation is a rare event so nobody cares, but
> autogroups can come and go far too quickly I think.
>

This isn't typically a problem since the first enqueue will almost
universally set a weight.

An alternative considered was to remove the throttled-hierarchy check
in the re-weight path; however, it seemed better to make this an actual
invariant (i.e. an entity ALWAYS has some weight).