Message-ID: <CAKfTPtBT5h1dajqPHk8rURbhfY2et592w5KiC13p4FL3=8-cgQ@mail.gmail.com>
Date:	Tue, 12 Feb 2013 14:29:06 +0100
From:	Vincent Guittot <vincent.guittot@...aro.org>
To:	rostedt@...dmis.org
Cc:	linux-kernel@...r.kernel.org, linaro-dev@...ts.linaro.org,
	peterz@...radead.org, mingo@...nel.org, fweisbec@...il.com,
	efault@....de
Subject: Re: [PATCH v2] sched: fix wrong rq's runnable_avg update with rt task

On 12 February 2013 14:23, Vincent Guittot <vincent.guittot@...aro.org> wrote:
> When an RT task is scheduled on an idle CPU, the rq's load is not updated
> because the CFS functions are not called. Then idle_balance, which is
> called just before entering the idle function, updates the rq's load and
> assumes that all of the elapsed time since the last update was running
> time.
>
> As a consequence, the rq's load of a CPU that runs only a periodic RT task
> stays close to LOAD_AVG_MAX, whatever the actual running duration of the
> RT task.
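>
> To make the numbers concrete, here is an idealized, standalone model of
> that accounting (a sketch only, not kernel code: the 1024 contribution per
> ~1ms period and the decay y with y^32 = 0.5 follow the PELT definitions,
> while the 1ms-in-10ms duty cycle and all helper names are made up for the
> example):
>
> #include <math.h>
> #include <stdio.h>
>
> /*
>  * Each ~1ms period decays both sums by y and contributes 1024 to the
>  * period sum; only time accounted as runnable also contributes 1024
>  * to the runnable sum. The rq's tracked load is, roughly, the ratio
>  * of the two sums.
>  */
> static void account(double *run, double *period, int ms, int runnable)
> {
> 	const double y = pow(0.5, 1.0 / 32.0);
>
> 	while (ms--) {
> 		*run = *run * y + (runnable ? 1024.0 : 0.0);
> 		*period = *period * y + 1024.0;
> 	}
> }
>
> int main(void)
> {
> 	double run_buggy = 0, per_buggy = 0, run_fixed = 0, per_fixed = 0;
> 	int i;
>
> 	for (i = 0; i < 1000; i++) {	/* ~10s of a 1ms-in-10ms RT task */
> 		/* buggy: idle_balance marks the whole 10ms as runnable */
> 		account(&run_buggy, &per_buggy, 10, 1);
> 		/* fixed: the 9ms of idle are accounted as non-runnable */
> 		account(&run_fixed, &per_fixed, 1, 1);
> 		account(&run_fixed, &per_fixed, 9, 0);
> 	}
>
> 	printf("buggy: %.2f  fixed: %.2f\n",
> 	       run_buggy / per_buggy, run_fixed / per_fixed);
> 	return 0;
> }
>
> Built with "cc model.c -lm", the buggy ratio comes out pinned near 1.00
> (i.e. LOAD_AVG_MAX) while the fixed one settles near the task's 10% duty
> cycle.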
>
> A new idle_exit function is called when the prev task is the idle task, so
> that the elapsed time is accounted as idle time in the rq's load.
>
> Changes since V1:
> - move the code out of the schedule function and create a pre_schedule
>   callback for the idle class instead.
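>
> For reference, the scheduler core of this era invokes that callback from
> the schedule() path roughly as follows (paraphrased from
> kernel/sched/core.c; shown only to situate the hook, not part of this
> patch):
>
> static inline void pre_schedule(struct rq *rq, struct task_struct *prev)
> {
> 	if (prev->sched_class->pre_schedule)
> 		prev->sched_class->pre_schedule(rq, prev);
> }
>
> Because prev is the task being switched out, a pre_schedule method of the
> idle class runs exactly when the CPU leaves the idle task, which is the
> point where the elapsed idle time is known.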

Hi Steve,

I have pushed a new version of my patch to gather comments on the
proposed solution, but I will rebase it on top of your work once it is
available.

Vincent

>
> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
> ---
>  kernel/sched/fair.c      |   10 ++++++++++
>  kernel/sched/idle_task.c |    7 +++++++
>  kernel/sched/sched.h     |    5 +++++
>  3 files changed, 22 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 81fa536..60951f1 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1562,6 +1562,16 @@ static inline void dequeue_entity_load_avg(struct cfs_rq *cfs_rq,
>                 se->avg.decay_count = atomic64_read(&cfs_rq->decay_counter);
>         } /* migrations, e.g. sleep=0 leave decay_count == 0 */
>  }
> +
> +/*
> + * Update the rq's load with the elapsed idle time before a task is
> + * scheduled. If the newly scheduled task is not a CFS task, idle_exit will
> + * be the only way to update the runnable statistics.
> + */
> +void idle_exit(int this_cpu, struct rq *this_rq)
> +{
> +       update_rq_runnable_avg(this_rq, 0);
> +}
>  #else
>  static inline void update_entity_load_avg(struct sched_entity *se,
>                                           int update_cfs_rq) {}
> diff --git a/kernel/sched/idle_task.c b/kernel/sched/idle_task.c
> index b6baf37..27cd379 100644
> --- a/kernel/sched/idle_task.c
> +++ b/kernel/sched/idle_task.c
> @@ -13,6 +13,12 @@ select_task_rq_idle(struct task_struct *p, int sd_flag, int flags)
>  {
>         return task_cpu(p); /* IDLE tasks are never migrated */
>  }
> +
> +static void pre_schedule_idle(struct rq *rq, struct task_struct *prev)
> +{
> +       /* Update rq's load with elapsed idle time */
> +       idle_exit(smp_processor_id(), rq);
> +}
>  #endif /* CONFIG_SMP */
>  /*
>   * Idle tasks are unconditionally rescheduled:
> @@ -86,6 +92,7 @@ const struct sched_class idle_sched_class = {
>
>  #ifdef CONFIG_SMP
>         .select_task_rq         = select_task_rq_idle,
> +       .pre_schedule           = pre_schedule_idle,
>  #endif
>
>         .set_curr_task          = set_curr_task_idle,
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index fc88644..9707092 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -877,6 +877,7 @@ extern const struct sched_class idle_sched_class;
>
>  extern void trigger_load_balance(struct rq *rq, int cpu);
>  extern void idle_balance(int this_cpu, struct rq *this_rq);
> +extern void idle_exit(int this_cpu, struct rq *this_rq);
>
>  #else  /* CONFIG_SMP */
>
> @@ -884,6 +885,10 @@ static inline void idle_balance(int cpu, struct rq *rq)
>  {
>  }
>
> +static inline void idle_exit(int this_cpu, struct rq *this_rq)
> +{
> +}
> +
>  #endif
>
>  extern void sysrq_sched_debug_show(void);
> --
> 1.7.9.5
>
