Date:	Tue, 4 Jun 2013 17:29:39 +0200
From:	Vincent Guittot <vincent.guittot@...aro.org>
To:	Frederic Weisbecker <fweisbec@...il.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	"linaro-kernel@...ts.linaro.org" <linaro-kernel@...ts.linaro.org>,
	Ingo Molnar <mingo@...nel.org>
Subject: Re: [PATCH] sched: fix clear NOHZ_BALANCE_KICK

On 4 June 2013 16:44, Frederic Weisbecker <fweisbec@...il.com> wrote:
> On Tue, Jun 04, 2013 at 01:48:47PM +0200, Vincent Guittot wrote:
>> On 4 June 2013 13:19, Frederic Weisbecker <fweisbec@...il.com> wrote:
>> > On Tue, Jun 04, 2013 at 01:11:47PM +0200, Vincent Guittot wrote:
>> >> On 4 June 2013 12:26, Frederic Weisbecker <fweisbec@...il.com> wrote:
>> >> > On Tue, Jun 04, 2013 at 11:36:11AM +0200, Peter Zijlstra wrote:
>> >> >>
>> >> >> The best I can seem to come up with is something like the below; but I think
>> >> >> its ghastly. Surely we can do something saner with that bit.
>> >> >>
>> >> >> Having to clear it at 3 different places is just wrong.
>> >> >
>> >> > We could clear the flag early in scheduler_ipi() and set some
>> >> > specific value in rq->idle_balance that tells we want nohz idle
>> >> > balancing from the softirq, something like this untested:
>> >>
>> >> I'm not sure that we can have fewer than 2 places to clear it: the
>> >> cancel place and the acknowledge place. Otherwise we can face a
>> >> situation where idle load balance is triggered 2 consecutive times
>> >> because NOHZ_BALANCE_KICK is cleared before the idle load balance
>> >> has been done and had a chance to migrate tasks.
>> >
>> > I guess it depends what is the minimum value of rq->next_balance, it seems
>> > to be large enough to avoid this kind of incident. Although I don't
>> > know well the whole logic with rq->next_balance and ilb trigger so I must
>> > defer to you.
>>
>> In the trace that was showing the issue, I can see that both CPU0 and
>> CPU1 were trying to trigger ILB almost simultaneously and the
>> test_and_set of NOHZ_BALANCE_KICK filters one request, so I would say
>> that clearing the bit before the end of the idle load balance sequence
>> can generate such a sequence.
>
> I see.
>
>>
>> In the sequence below, I have reduced the clearing of NOHZ_BALANCE_KICK
>> to 2 places: acknowledge and cancel. I have reused part of the
>> proposal from Peter, which clears the bit if the condition doesn't
>> match, but I have reordered the tests so that this is done only if all
>> other conditions match.
>>
>>  static inline bool got_nohz_idle_kick(void)
>>  {
>>   int cpu = smp_processor_id();
>> - return idle_cpu(cpu) && test_bit(NOHZ_BALANCE_KICK, nohz_flags(cpu));
>> + bool nohz_kick = test_bit(NOHZ_BALANCE_KICK, nohz_flags(cpu));
>> +
>> +       if (!nohz_kick)
>> +               return false;
>> +
>> +       if (idle_cpu(cpu) && !need_resched())
>> +               return true;
>> +
>> +       clear_bit(NOHZ_BALANCE_KICK, nohz_flags(cpu));
>> +       return false;
>>  }
>>
>>  #else /* CONFIG_NO_HZ_COMMON */
>> @@ -1393,8 +1401,9 @@ static void sched_ttwu_pending(void)
>>
>>  void scheduler_ipi(void)
>>  {
>> - if (llist_empty(&this_rq()->wake_list) && !got_nohz_idle_kick()
>> -    && !tick_nohz_full_cpu(smp_processor_id()))
>> + if (llist_empty(&this_rq()->wake_list)
>> + && !tick_nohz_full_cpu(smp_processor_id())
>> + && !got_nohz_idle_kick())
>>   return;
>
> But we still need got_nohz_idle_kick() to be the first check, don't we? Otherwise
> if we run an "idle -> quick task slice -> idle" sequence we may keep the flag
> but lose the notifying IPI in between.

I'm not sure I follow the sequence you are describing above: "idle ->
quick task slice -> idle".
In addition, got_nohz_idle_kick() must be the last condition tested (in
my proposal) so that NOHZ_BALANCE_KICK is cleared only when we are sure
that we are going to return without any possibility of triggering the
idle load balance.
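
To illustrate the ordering argument, here is a minimal userspace sketch. It
is not kernel code: struct fake_cpu, its fields, scheduler_ipi_returns_early()
and demo() are hypothetical stand-ins for the per-CPU state and for the early
return test in scheduler_ipi(). It only models the short-circuit evaluation:
with got_nohz_idle_kick() tested last, the bit is cleared only on the path
where the handler is about to return early.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the state tested in scheduler_ipi(). */
struct fake_cpu {
	atomic_bool balance_kick;  /* models NOHZ_BALANCE_KICK */
	bool idle;                 /* models idle_cpu(cpu) */
	bool need_resched;         /* models need_resched() */
	bool wake_list_empty;      /* models llist_empty(&rq->wake_list) */
	bool nohz_full;            /* models tick_nohz_full_cpu(cpu) */
};

/* Same structure as the proposal: acknowledge, or cancel and clear. */
static bool got_nohz_idle_kick(struct fake_cpu *c)
{
	if (!atomic_load(&c->balance_kick))
		return false;

	if (c->idle && !c->need_resched)
		return true;    /* ILB can run; the bit is cleared later */

	/* ILB cannot run on this CPU now: cancel the request. */
	atomic_store(&c->balance_kick, false);
	return false;
}

/* True when the modelled IPI handler would return without doing work. */
static bool scheduler_ipi_returns_early(struct fake_cpu *c)
{
	return c->wake_list_empty
	       && !c->nohz_full
	       && !got_nohz_idle_kick(c);  /* tested last on purpose */
}

static void demo(void)
{
	/* Kicked but busy, nothing else pending: cancel and return early. */
	struct fake_cpu a = { .balance_kick = true, .wake_list_empty = true };
	assert(scheduler_ipi_returns_early(&a));
	assert(!atomic_load(&a.balance_kick));

	/* Kicked but busy, wakeups pending: the kick test is never reached,
	 * so the bit is left untouched by this test. */
	struct fake_cpu b = { .balance_kick = true, .wake_list_empty = false };
	assert(!scheduler_ipi_returns_early(&b));
	assert(atomic_load(&b.balance_kick));

	/* Kicked and idle: no early return, bit kept for the acknowledge path. */
	struct fake_cpu c = { .balance_kick = true, .idle = true,
			      .wake_list_empty = true };
	assert(!scheduler_ipi_returns_early(&c));
	assert(atomic_load(&c.balance_kick));
}
```

Calling demo() from a small main() walks through the three cases. The model
says nothing about what happens to the bit after the early return test, which
is the part still under discussion in this thread.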

Vincent