lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <xm26fw6xd7cu.fsf@sword-of-the-dawn.mtv.corp.google.com>
Date:	Tue, 04 Sep 2012 10:29:37 -0700
From:	Benjamin Segall <bsegall@...gle.com>
To:	Preeti Murthy <preeti.lkml@...il.com>
Cc:	pjt@...gle.com, linux-kernel@...r.kernel.org,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Ingo Molnar <mingo@...e.hu>,
	Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>,
	Srivatsa Vaddagiri <vatsa@...ibm.com>,
	Kamalesh Babulal <kamalesh@...ux.vnet.ibm.com>,
	Venki Pallipadi <venki@...gle.com>,
	Mike Galbraith <efault@....de>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Nikunj A Dadhania <nikunj@...ux.vnet.ibm.com>,
	Morten Rasmussen <Morten.Rasmussen@....com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Namhyung Kim <namhyung@...nel.org>
Subject: Re: [patch 06/16] sched: account for blocked load waking back up

Preeti Murthy <preeti.lkml@...il.com> writes:

> Hi Paul,
>
>     @@ -1170,20 +1178,42 @@ static inline void enqueue_entity_load_avg(struct cfs_rq *cfs_rq,
>                                                       struct sched_entity *se,
>                                                       int wakeup)
>      {
>     -       /* we track migrations using entity decay_count == 0 */
>     -       if (unlikely(!se->avg.decay_count)) {
>     +       /*
>     +        * We track migrations using entity decay_count <= 0, on a wake-up
>     +        * migration we use a negative decay count to track the remote decays
>     +        * accumulated while sleeping.
>     +        */
>     +       if (unlikely(se->avg.decay_count <= 0)) {
>                     se->avg.last_runnable_update = rq_of(cfs_rq)->clock_task;
>     +               if (se->avg.decay_count) {
>     +                       /*
>     +                        * In a wake-up migration we have to approximate the
>     +                        * time sleeping.  This is because we can't synchronize
>     +                        * clock_task between the two cpus, and it is not
>     +                        * guaranteed to be read-safe.  Instead, we can
>     +                        * approximate this using our carried decays, which are
>     +                        * explicitly atomically readable.
>     +                        */
>     +                       se->avg.last_runnable_update -= (-se->avg.decay_count)
>     +                                                       << 20;
>     +                       update_entity_load_avg(se, 0);
>     +                       /* Indicate that we're now synchronized and on-rq */
>     +                       se->avg.decay_count = 0;
>     +               }
>                     wakeup = 0;
>             } else {
>                     __synchronize_entity_decay(se);
>
>  
> Should not the last_runnable_update of se get updated in __synchronize_entity_decay()?
> Because it contains the value of the runnable update before going to sleep.If not updated,when
> update_entity_load_avg() is called below during a local wakeup,it will decay the runtime load
> for the duration including the time the sched entity has slept.

If you are asking if it should be updated in the else block (local
wakeup, no migration) here, no:

* __synchronize_entity_decay will decay load_avg_contrib to match the
  decay that the cfs_rq has done, keeping those in sync, and ensuring we
  don't subtract too much when we update our current load average.
* clock_task - last_runnable_update will be the amount of time that the
  task has been blocked. update_entity_load_avg (below) and
  __update_entity_runnable_avg will account this time as non-runnable
  time into runnable_avg_sum/period, and from there onto the cfs_rq via
  __update_entity_load_avg_contrib.

Both of these are necessary, and will happen. In the case of !wakeup,
the task is being moved between groups or is migrating between cpus, and
we pretend (to the best of our ability in the case of migrating between
cpus which may have different clock_tasks) that the task has been
runnable this entire time.

In the more general case, no, it is called from migrate_task_rq_fair,
which doesn't have the necessary locks to read clock_task.

>
> This also means that during dequeue_entity_load_avg(),update_entity_load_avg() needs to be
> called to keep the runnable_avg_sum of the sched entity updated till
> before sleep.

Yes, this happens first thing in dequeue_entity_load_avg.
>
>             }
>    
>     -       if (wakeup)
>     +       /* migrated tasks did not contribute to our blocked load */
>     +       if (wakeup) {
>                     subtract_blocked_load_contrib(cfs_rq, se->avg.load_avg_contrib);
>     +               update_entity_load_avg(se, 0);
>     +       }
>    
>     -       update_entity_load_avg(se, 0);
>             cfs_rq->runnable_load_avg += se->avg.load_avg_contrib;
>     -       update_cfs_rq_blocked_load(cfs_rq);
>     +       /* we force update consideration on load-balancer moves */
>     +       update_cfs_rq_blocked_load(cfs_rq, !wakeup);
>      }
>    
>       --
>     To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>     the body of a message to majordomo@...r.kernel.org
>     More majordomo info at  http://vger.kernel.org/majordomo-info.html
>     Please read the FAQ at  http://www.tux.org/lkml/
>
> Regards
> Preeti

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ