lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <xm26pnkxhz9g.fsf@bsegall-linux.svl.corp.google.com>
Date:   Thu, 22 Aug 2019 10:43:23 -0700
From:   bsegall@...gle.com
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Valentin Schneider <valentin.schneider@....com>,
        linux-kernel@...r.kernel.org, mingo@...nel.org,
        liangyan.peng@...ux.alibaba.com, shanpeic@...ux.alibaba.com,
        xlpang@...ux.alibaba.com, pjt@...gle.com, stable@...r.kernel.org
Subject: Re: [PATCH] sched/fair: Add missing unthrottle_cfs_rq()

Peter Zijlstra <peterz@...radead.org> writes:

> On Tue, Aug 20, 2019 at 11:54:20AM +0100, Valentin Schneider wrote:
>> Turns out a cfs_rq->runtime_remaining can become positive in
>> assign_cfs_rq_runtime(), but this codepath has no call to
>> unthrottle_cfs_rq().
>> 
>> This can leave us in a situation where we have a throttled cfs_rq with
>> positive ->runtime_remaining, which breaks the math in
>> distribute_cfs_runtime(): this function expects a negative value so that
>> it may safely negate it into a positive value.
>> 
>> Add the missing unthrottle_cfs_rq(). While at it, add a WARN_ON where
>> we expect negative values, and pull in a comment from the mailing list
>> that didn't make it in [1].

This didn't exist because it's not supposed to be possible to call
account_cfs_rq_runtime on a throttled cfs_rq at all, so that's the
invariant being violated. Do you know what the code path causing this
looks like?

This would allow both list del and add while distribute is doing a
foreach, but I think that the racing behavior would just be to restart
the distribute loop, which is fine.



>> 
>> [1]: https://lkml.kernel.org/r/BANLkTi=NmCxKX6EbDQcJYDJ5kKyG2N1ssw@mail.gmail.com
>> 
>> Cc: <stable@...r.kernel.org>
>> Fixes: ec12cb7f31e2 ("sched: Accumulate per-cfs_rq cpu usage and charge against bandwidth")
>> Reported-by: Liangyan <liangyan.peng@...ux.alibaba.com>
>> Signed-off-by: Valentin Schneider <valentin.schneider@....com>
>
> Thanks!
>
>> ---
>>  kernel/sched/fair.c | 17 ++++++++++++-----
>>  1 file changed, 12 insertions(+), 5 deletions(-)
>> 
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 1054d2cf6aaa..219ff3f328e5 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -4385,6 +4385,11 @@ static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq)
>>  	return rq_clock_task(rq_of(cfs_rq)) - cfs_rq->throttled_clock_task_time;
>>  }
>>  
>> +static inline int cfs_rq_throttled(struct cfs_rq *cfs_rq)
>> +{
>> +	return cfs_bandwidth_used() && cfs_rq->throttled;
>> +}
>> +
>>  /* returns 0 on failure to allocate runtime */
>>  static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
>>  {
>> @@ -4411,6 +4416,9 @@ static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
>>  
>>  	cfs_rq->runtime_remaining += amount;
>>  
>> +	if (cfs_rq->runtime_remaining > 0 && cfs_rq_throttled(cfs_rq))
>> +		unthrottle_cfs_rq(cfs_rq);
>> +
>>  	return cfs_rq->runtime_remaining > 0;
>>  }
>>  
>> @@ -4439,11 +4447,6 @@ void account_cfs_rq_runtime(struct cfs_rq *cfs_rq, u64 delta_exec)
>>  	__account_cfs_rq_runtime(cfs_rq, delta_exec);
>>  }
>>  
>> -static inline int cfs_rq_throttled(struct cfs_rq *cfs_rq)
>> -{
>> -	return cfs_bandwidth_used() && cfs_rq->throttled;
>> -}
>> -
>>  /* check whether cfs_rq, or any parent, is throttled */
>>  static inline int throttled_hierarchy(struct cfs_rq *cfs_rq)
>>  {
>> @@ -4628,6 +4631,10 @@ static u64 distribute_cfs_runtime(struct cfs_bandwidth *cfs_b, u64 remaining)
>>  		if (!cfs_rq_throttled(cfs_rq))
>>  			goto next;
>>  
>> +		/* By the above check, this should never be true */
>> +		WARN_ON(cfs_rq->runtime_remaining > 0);
>> +
>> +		/* Pick the minimum amount to return to a positive quota state */
>>  		runtime = -cfs_rq->runtime_remaining + 1;
>>  		if (runtime > remaining)
>>  			runtime = remaining;
>> -- 
>> 2.22.0
>> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ