Message-ID: <CAKfTPtDz1-fR2i5x7AVd8sGZhX2VNtz2rnQC3ePrcHO9MWK3FQ@mail.gmail.com>
Date:	Tue, 15 May 2012 14:57:46 +0200
From:	Vincent Guittot <vincent.guittot@...aro.org>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	paulmck@...ux.vnet.ibm.com, smuckle@...cinc.com, khilman@...com,
	Robin.Randhawa@....com, suresh.b.siddha@...el.com,
	thebigcorporation@...il.com, venki@...gle.com,
	panto@...oniou-consulting.com, mingo@...e.hu, paul.brett@...el.com,
	pdeschrijver@...dia.com, pjt@...gle.com, efault@....de,
	fweisbec@...il.com, geoff@...radead.org, rostedt@...dmis.org,
	tglx@...utronix.de, amit.kucheria@...aro.org,
	linux-kernel <linux-kernel@...r.kernel.org>,
	linaro-sched-sig@...ts.linaro.org,
	Morten Rasmussen <Morten.Rasmussen@....com>,
	Juri Lelli <juri.lelli@...il.com>
Subject: Re: Plumbers: Tweaking scheduler policy micro-conf RFP

On 15 May 2012 14:23, Peter Zijlstra <peterz@...radead.org> wrote:
> On Tue, 2012-05-15 at 10:02 +0200, Vincent Guittot wrote:
>>
>> Would you like to present the ongoing work around the load balance
>> policy and the replacement for sched_mc during the scheduler
>> micro-conf ?
>
> Not sure there's much to say that isn't already said..
>
> As it stands nobody cares (as evident by the total lack of progress
> since the last time this all came up), so I've just queued the below
> patch.

Not sure that nobody cares; it's more that the scheduler, load_balance
and sched_mc are sensitive enough that it's difficult to ensure a
modification will not break everything for someone else.

>
>
> ---
> Subject: sched: Remove all power aware scheduling
> From: Peter Zijlstra <peterz@...radead.org>
> Date: Mon, 09 Jan 2012 11:28:35 +0100
>
> It's been broken forever and nobody cares enough to fix it properly..
> remove it.
>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> ---
>  Documentation/ABI/testing/sysfs-devices-system-cpu |   25 -
>  Documentation/scheduler/sched-domains.txt          |    4
>  arch/x86/kernel/smpboot.c                          |    3
>  drivers/base/cpu.c                                 |    4
>  include/linux/cpu.h                                |    2
>  include/linux/sched.h                              |   47 ---
>  include/linux/topology.h                           |    5
>  kernel/sched/core.c                                |   94 -------
>  kernel/sched/fair.c                                |  278 ---------------------
>  tools/power/cpupower/man/cpupower-set.1            |    9
>  tools/power/cpupower/utils/helpers/sysfs.c         |   35 --
>  11 files changed, 4 insertions(+), 502 deletions(-)
>
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -9,31 +9,6 @@ Contact:       Linux kernel mailing list <linu
>
>                /sys/devices/system/cpu/cpu#/
>
> -What:          /sys/devices/system/cpu/sched_mc_power_savings
> -               /sys/devices/system/cpu/sched_smt_power_savings
> -Date:          June 2006
> -Contact:       Linux kernel mailing list <linux-kernel@...r.kernel.org>
> -Description:   Discover and adjust the kernel's multi-core scheduler support.
> -
> -               Possible values are:
> -
> -               0 - No power saving load balance (default value)
> -               1 - Fill one thread/core/package first for long running threads
> -               2 - Also bias task wakeups to semi-idle cpu package for power
> -                   savings
> -
> -               sched_mc_power_savings is dependent upon SCHED_MC, which is
> -               itself architecture dependent.
> -
> -               sched_smt_power_savings is dependent upon SCHED_SMT, which
> -               is itself architecture dependent.
> -
> -               The two files are independent of each other. It is possible
> -               that one file may be present without the other.
> -
> -               Introduced by git commit 5c45bf27.
> -
> -
>  What:          /sys/devices/system/cpu/kernel_max
>                /sys/devices/system/cpu/offline
>                /sys/devices/system/cpu/online
> --- a/Documentation/scheduler/sched-domains.txt
> +++ b/Documentation/scheduler/sched-domains.txt
> @@ -61,10 +61,6 @@ might have just one domain covering its
>  struct sched_domain fields, SD_FLAG_*, SD_*_INIT to get an idea of
>  the specifics and what to tune.
>
> -For SMT, the architecture must define CONFIG_SCHED_SMT and provide a
> -cpumask_t cpu_sibling_map[NR_CPUS], where cpu_sibling_map[i] is the mask of
> -all "i"'s siblings as well as "i" itself.
> -
>  Architectures may retain the regular override the default SD_*_INIT flags
>  while using the generic domain builder in kernel/sched.c if they wish to
>  retain the traditional SMT->SMP->NUMA topology (or some subset of that). This
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -413,8 +413,7 @@ const struct cpumask *cpu_coregroup_mask
>         * For perf, we return last level cache shared map.
>         * And for power savings, we return cpu_core_map
>         */
> -       if ((sched_mc_power_savings || sched_smt_power_savings) &&
> -           !(cpu_has(c, X86_FEATURE_AMD_DCM)))
> +       if (!(cpu_has(c, X86_FEATURE_AMD_DCM)))
>                return cpu_core_mask(cpu);
>        else
>                return cpu_llc_shared_mask(cpu);
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -330,8 +330,4 @@ void __init cpu_dev_init(void)
>                panic("Failed to register CPU subsystem");
>
>        cpu_dev_register_generic();
> -
> -#if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
> -       sched_create_sysfs_power_savings_entries(cpu_subsys.dev_root);
> -#endif
>  }
> --- a/include/linux/cpu.h
> +++ b/include/linux/cpu.h
> @@ -36,8 +36,6 @@ extern void cpu_remove_dev_attr(struct d
>  extern int cpu_add_dev_attr_group(struct attribute_group *attrs);
>  extern void cpu_remove_dev_attr_group(struct attribute_group *attrs);
>
> -extern int sched_create_sysfs_power_savings_entries(struct device *dev);
> -
>  #ifdef CONFIG_HOTPLUG_CPU
>  extern void unregister_cpu(struct cpu *cpu);
>  extern ssize_t arch_cpu_probe(const char *, size_t);
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -855,61 +855,14 @@ enum cpu_idle_type {
>  #define SD_WAKE_AFFINE         0x0020  /* Wake task to waking CPU */
>  #define SD_PREFER_LOCAL                0x0040  /* Prefer to keep tasks local to this domain */
>  #define SD_SHARE_CPUPOWER      0x0080  /* Domain members share cpu power */
> -#define SD_POWERSAVINGS_BALANCE        0x0100  /* Balance for power savings */
>  #define SD_SHARE_PKG_RESOURCES 0x0200  /* Domain members share cpu pkg resources */
>  #define SD_SERIALIZE           0x0400  /* Only a single load balancing instance */
>  #define SD_ASYM_PACKING                0x0800  /* Place busy groups earlier in the domain */
>  #define SD_PREFER_SIBLING      0x1000  /* Prefer to place tasks in a sibling domain */
>  #define SD_OVERLAP             0x2000  /* sched_domains of this level overlap */
>
> -enum powersavings_balance_level {
> -       POWERSAVINGS_BALANCE_NONE = 0,  /* No power saving load balance */
> -       POWERSAVINGS_BALANCE_BASIC,     /* Fill one thread/core/package
> -                                        * first for long running threads
> -                                        */
> -       POWERSAVINGS_BALANCE_WAKEUP,    /* Also bias task wakeups to semi-idle
> -                                        * cpu package for power savings
> -                                        */
> -       MAX_POWERSAVINGS_BALANCE_LEVELS
> -};
> -
> -extern int sched_mc_power_savings, sched_smt_power_savings;
> -
> -static inline int sd_balance_for_mc_power(void)
> -{
> -       if (sched_smt_power_savings)
> -               return SD_POWERSAVINGS_BALANCE;
> -
> -       if (!sched_mc_power_savings)
> -               return SD_PREFER_SIBLING;
> -
> -       return 0;
> -}
> -
> -static inline int sd_balance_for_package_power(void)
> -{
> -       if (sched_mc_power_savings | sched_smt_power_savings)
> -               return SD_POWERSAVINGS_BALANCE;
> -
> -       return SD_PREFER_SIBLING;
> -}
> -
>  extern int __weak arch_sd_sibiling_asym_packing(void);
>
> -/*
> - * Optimise SD flags for power savings:
> - * SD_BALANCE_NEWIDLE helps aggressive task consolidation and power savings.
> - * Keep default SD flags if sched_{smt,mc}_power_saving=0
> - */
> -
> -static inline int sd_power_saving_flags(void)
> -{
> -       if (sched_mc_power_savings | sched_smt_power_savings)
> -               return SD_BALANCE_NEWIDLE;
> -
> -       return 0;
> -}
> -
>  struct sched_group_power {
>        atomic_t ref;
>        /*
> --- a/include/linux/topology.h
> +++ b/include/linux/topology.h
> @@ -98,7 +98,6 @@ int arch_update_cpu_topology(void);
>                                | 0*SD_BALANCE_WAKE                     \
>                                | 1*SD_WAKE_AFFINE                      \
>                                | 1*SD_SHARE_CPUPOWER                   \
> -                               | 0*SD_POWERSAVINGS_BALANCE             \
>                                | 1*SD_SHARE_PKG_RESOURCES              \
>                                | 0*SD_SERIALIZE                        \
>                                | 0*SD_PREFER_SIBLING                   \
> @@ -134,8 +133,6 @@ int arch_update_cpu_topology(void);
>                                | 0*SD_SHARE_CPUPOWER                   \
>                                | 1*SD_SHARE_PKG_RESOURCES              \
>                                | 0*SD_SERIALIZE                        \
> -                               | sd_balance_for_mc_power()             \
> -                               | sd_power_saving_flags()               \
>                                ,                                       \
>        .last_balance           = jiffies,                              \
>        .balance_interval       = 1,                                    \
> @@ -167,8 +164,6 @@ int arch_update_cpu_topology(void);
>                                | 0*SD_SHARE_CPUPOWER                   \
>                                | 0*SD_SHARE_PKG_RESOURCES              \
>                                | 0*SD_SERIALIZE                        \
> -                               | sd_balance_for_package_power()        \
> -                               | sd_power_saving_flags()               \
>                                ,                                       \
>        .last_balance           = jiffies,                              \
>        .balance_interval       = 1,                                    \
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5920,8 +5920,6 @@ static const struct cpumask *cpu_cpu_mas
>        return cpumask_of_node(cpu_to_node(cpu));
>  }
>
> -int sched_smt_power_savings = 0, sched_mc_power_savings = 0;
> -
>  struct sd_data {
>        struct sched_domain **__percpu sd;
>        struct sched_group **__percpu sg;
> @@ -6313,7 +6311,6 @@ sd_numa_init(struct sched_domain_topolog
>                                        | 0*SD_WAKE_AFFINE
>                                        | 0*SD_PREFER_LOCAL
>                                        | 0*SD_SHARE_CPUPOWER
> -                                       | 0*SD_POWERSAVINGS_BALANCE
>                                        | 0*SD_SHARE_PKG_RESOURCES
>                                        | 1*SD_SERIALIZE
>                                        | 0*SD_PREFER_SIBLING
> @@ -6810,97 +6807,6 @@ void partition_sched_domains(int ndoms_n
>        mutex_unlock(&sched_domains_mutex);
>  }
>
> -#if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
> -static void reinit_sched_domains(void)
> -{
> -       get_online_cpus();
> -
> -       /* Destroy domains first to force the rebuild */
> -       partition_sched_domains(0, NULL, NULL);
> -
> -       rebuild_sched_domains();
> -       put_online_cpus();
> -}
> -
> -static ssize_t sched_power_savings_store(const char *buf, size_t count, int smt)
> -{
> -       unsigned int level = 0;
> -
> -       if (sscanf(buf, "%u", &level) != 1)
> -               return -EINVAL;
> -
> -       /*
> -        * level is always be positive so don't check for
> -        * level < POWERSAVINGS_BALANCE_NONE which is 0
> -        * What happens on 0 or 1 byte write,
> -        * need to check for count as well?
> -        */
> -
> -       if (level >= MAX_POWERSAVINGS_BALANCE_LEVELS)
> -               return -EINVAL;
> -
> -       if (smt)
> -               sched_smt_power_savings = level;
> -       else
> -               sched_mc_power_savings = level;
> -
> -       reinit_sched_domains();
> -
> -       return count;
> -}
> -
> -#ifdef CONFIG_SCHED_MC
> -static ssize_t sched_mc_power_savings_show(struct device *dev,
> -                                          struct device_attribute *attr,
> -                                          char *buf)
> -{
> -       return sprintf(buf, "%u\n", sched_mc_power_savings);
> -}
> -static ssize_t sched_mc_power_savings_store(struct device *dev,
> -                                           struct device_attribute *attr,
> -                                           const char *buf, size_t count)
> -{
> -       return sched_power_savings_store(buf, count, 0);
> -}
> -static DEVICE_ATTR(sched_mc_power_savings, 0644,
> -                  sched_mc_power_savings_show,
> -                  sched_mc_power_savings_store);
> -#endif
> -
> -#ifdef CONFIG_SCHED_SMT
> -static ssize_t sched_smt_power_savings_show(struct device *dev,
> -                                           struct device_attribute *attr,
> -                                           char *buf)
> -{
> -       return sprintf(buf, "%u\n", sched_smt_power_savings);
> -}
> -static ssize_t sched_smt_power_savings_store(struct device *dev,
> -                                           struct device_attribute *attr,
> -                                            const char *buf, size_t count)
> -{
> -       return sched_power_savings_store(buf, count, 1);
> -}
> -static DEVICE_ATTR(sched_smt_power_savings, 0644,
> -                  sched_smt_power_savings_show,
> -                  sched_smt_power_savings_store);
> -#endif
> -
> -int __init sched_create_sysfs_power_savings_entries(struct device *dev)
> -{
> -       int err = 0;
> -
> -#ifdef CONFIG_SCHED_SMT
> -       if (smt_capable())
> -               err = device_create_file(dev, &dev_attr_sched_smt_power_savings);
> -#endif
> -#ifdef CONFIG_SCHED_MC
> -       if (!err && mc_capable())
> -               err = device_create_file(dev, &dev_attr_sched_mc_power_savings);
> -#endif
> -       return err;
> -}
> -#endif /* CONFIG_SCHED_MC || CONFIG_SCHED_SMT */
> -
>  /*
>  * Update cpusets according to cpu_active mask.  If cpusets are
>  * disabled, cpuset_update_active_cpus() becomes a simple wrapper
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2721,7 +2721,7 @@ select_task_rq_fair(struct task_struct *
>                 * If power savings logic is enabled for a domain, see if we
>                 * are not overloaded, if so, don't balance wider.
>                 */
> -               if (tmp->flags & (SD_POWERSAVINGS_BALANCE|SD_PREFER_LOCAL)) {
> +               if (tmp->flags & (SD_PREFER_LOCAL)) {
>                        unsigned long power = 0;
>                        unsigned long nr_running = 0;
>                        unsigned long capacity;
> @@ -2734,9 +2734,6 @@ select_task_rq_fair(struct task_struct *
>
>                        capacity = DIV_ROUND_CLOSEST(power, SCHED_POWER_SCALE);
>
> -                       if (tmp->flags & SD_POWERSAVINGS_BALANCE)
> -                               nr_running /= 2;
> -
>                        if (nr_running < capacity)
>                                want_sd = 0;
>                }
> @@ -3435,14 +3432,6 @@ struct sd_lb_stats {
>        unsigned int  busiest_group_weight;
>
>        int group_imb; /* Is there imbalance in this sd */
> -#if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
> -       int power_savings_balance; /* Is powersave balance needed for this sd */
> -       struct sched_group *group_min; /* Least loaded group in sd */
> -       struct sched_group *group_leader; /* Group which relieves group_min */
> -       unsigned long min_load_per_task; /* load_per_task in group_min */
> -       unsigned long leader_nr_running; /* Nr running of group_leader */
> -       unsigned long min_nr_running; /* Nr running of group_min */
> -#endif
>  };
>
>  /*
> @@ -3486,147 +3475,6 @@ static inline int get_sd_load_idx(struct
>        return load_idx;
>  }
>
> -
> -#if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
> -/**
> - * init_sd_power_savings_stats - Initialize power savings statistics for
> - * the given sched_domain, during load balancing.
> - *
> - * @sd: Sched domain whose power-savings statistics are to be initialized.
> - * @sds: Variable containing the statistics for sd.
> - * @idle: Idle status of the CPU at which we're performing load-balancing.
> - */
> -static inline void init_sd_power_savings_stats(struct sched_domain *sd,
> -       struct sd_lb_stats *sds, enum cpu_idle_type idle)
> -{
> -       /*
> -        * Busy processors will not participate in power savings
> -        * balance.
> -        */
> -       if (idle == CPU_NOT_IDLE || !(sd->flags & SD_POWERSAVINGS_BALANCE))
> -               sds->power_savings_balance = 0;
> -       else {
> -               sds->power_savings_balance = 1;
> -               sds->min_nr_running = ULONG_MAX;
> -               sds->leader_nr_running = 0;
> -       }
> -}
> -
> -/**
> - * update_sd_power_savings_stats - Update the power saving stats for a
> - * sched_domain while performing load balancing.
> - *
> - * @group: sched_group belonging to the sched_domain under consideration.
> - * @sds: Variable containing the statistics of the sched_domain
> - * @local_group: Does group contain the CPU for which we're performing
> - *             load balancing ?
> - * @sgs: Variable containing the statistics of the group.
> - */
> -static inline void update_sd_power_savings_stats(struct sched_group *group,
> -       struct sd_lb_stats *sds, int local_group, struct sg_lb_stats *sgs)
> -{
> -
> -       if (!sds->power_savings_balance)
> -               return;
> -
> -       /*
> -        * If the local group is idle or completely loaded
> -        * no need to do power savings balance at this domain
> -        */
> -       if (local_group && (sds->this_nr_running >= sgs->group_capacity ||
> -                               !sds->this_nr_running))
> -               sds->power_savings_balance = 0;
> -
> -       /*
> -        * If a group is already running at full capacity or idle,
> -        * don't include that group in power savings calculations
> -        */
> -       if (!sds->power_savings_balance ||
> -               sgs->sum_nr_running >= sgs->group_capacity ||
> -               !sgs->sum_nr_running)
> -               return;
> -
> -       /*
> -        * Calculate the group which has the least non-idle load.
> -        * This is the group from where we need to pick up the load
> -        * for saving power
> -        */
> -       if ((sgs->sum_nr_running < sds->min_nr_running) ||
> -           (sgs->sum_nr_running == sds->min_nr_running &&
> -            group_first_cpu(group) > group_first_cpu(sds->group_min))) {
> -               sds->group_min = group;
> -               sds->min_nr_running = sgs->sum_nr_running;
> -               sds->min_load_per_task = sgs->sum_weighted_load /
> -                                               sgs->sum_nr_running;
> -       }
> -
> -       /*
> -        * Calculate the group which is almost near its
> -        * capacity but still has some space to pick up some load
> -        * from other group and save more power
> -        */
> -       if (sgs->sum_nr_running + 1 > sgs->group_capacity)
> -               return;
> -
> -       if (sgs->sum_nr_running > sds->leader_nr_running ||
> -           (sgs->sum_nr_running == sds->leader_nr_running &&
> -            group_first_cpu(group) < group_first_cpu(sds->group_leader))) {
> -               sds->group_leader = group;
> -               sds->leader_nr_running = sgs->sum_nr_running;
> -       }
> -}
> -
> -/**
> - * check_power_save_busiest_group - see if there is potential for some power-savings balance
> - * @env: load balance environment
> - * @sds: Variable containing the statistics of the sched_domain
> - *     under consideration.
> - *
> - * Description:
> - * Check if we have potential to perform some power-savings balance.
> - * If yes, set the busiest group to be the least loaded group in the
> - * sched_domain, so that it's CPUs can be put to idle.
> - *
> - * Returns 1 if there is potential to perform power-savings balance.
> - * Else returns 0.
> - */
> -static inline
> -int check_power_save_busiest_group(struct lb_env *env, struct sd_lb_stats *sds)
> -{
> -       if (!sds->power_savings_balance)
> -               return 0;
> -
> -       if (sds->this != sds->group_leader ||
> -                       sds->group_leader == sds->group_min)
> -               return 0;
> -
> -       env->imbalance = sds->min_load_per_task;
> -       sds->busiest = sds->group_min;
> -
> -       return 1;
> -
> -}
> -#else /* CONFIG_SCHED_MC || CONFIG_SCHED_SMT */
> -static inline void init_sd_power_savings_stats(struct sched_domain *sd,
> -       struct sd_lb_stats *sds, enum cpu_idle_type idle)
> -{
> -       return;
> -}
> -
> -static inline void update_sd_power_savings_stats(struct sched_group *group,
> -       struct sd_lb_stats *sds, int local_group, struct sg_lb_stats *sgs)
> -{
> -       return;
> -}
> -
> -static inline
> -int check_power_save_busiest_group(struct lb_env *env, struct sd_lb_stats *sds)
> -{
> -       return 0;
> -}
> -#endif /* CONFIG_SCHED_MC || CONFIG_SCHED_SMT */
> -
> -
>  unsigned long default_scale_freq_power(struct sched_domain *sd, int cpu)
>  {
>        return SCHED_POWER_SCALE;
> @@ -3932,7 +3780,6 @@ static inline void update_sd_lb_stats(st
>        if (child && child->flags & SD_PREFER_SIBLING)
>                prefer_sibling = 1;
>
> -       init_sd_power_savings_stats(env->sd, sds, env->idle);
>        load_idx = get_sd_load_idx(env->sd, env->idle);
>
>        do {
> @@ -3981,7 +3828,6 @@ static inline void update_sd_lb_stats(st
>                        sds->group_imb = sgs.group_imb;
>                }
>
> -               update_sd_power_savings_stats(sg, sds, local_group, &sgs);
>                sg = sg->next;
>        } while (sg != env->sd->groups);
>  }
> @@ -4278,12 +4124,6 @@ find_busiest_group(struct lb_env *env, c
>        return sds.busiest;
>
>  out_balanced:
> -       /*
> -        * There is no obvious imbalance. But check if we can do some balancing
> -        * to save power.
> -        */
> -       if (check_power_save_busiest_group(env, &sds))
> -               return sds.busiest;
>  ret:
>        env->imbalance = 0;
>        return NULL;
> @@ -4361,28 +4201,6 @@ static int need_active_balance(struct lb
>                 */
>                if ((sd->flags & SD_ASYM_PACKING) && env->src_cpu > env->dst_cpu)
>                        return 1;
> -
> -               /*
> -                * The only task running in a non-idle cpu can be moved to this
> -                * cpu in an attempt to completely freeup the other CPU
> -                * package.
> -                *
> -                * The package power saving logic comes from
> -                * find_busiest_group(). If there are no imbalance, then
> -                * f_b_g() will return NULL. However when sched_mc={1,2} then
> -                * f_b_g() will select a group from which a running task may be
> -                * pulled to this cpu in order to make the other package idle.
> -                * If there is no opportunity to make a package idle and if
> -                * there are no imbalance, then f_b_g() will return NULL and no
> -                * action will be taken in load_balance_newidle().
> -                *
> -                * Under normal task pull operation due to imbalance, there
> -                * will be more than one task in the source run queue and
> -                * move_tasks() will succeed.  ld_moved will be true and this
> -                * active balance code will not be triggered.
> -                */
> -               if (sched_mc_power_savings < POWERSAVINGS_BALANCE_WAKEUP)
> -                       return 0;
>        }
>
>        return unlikely(sd->nr_balance_failed > sd->cache_nice_tries+2);
> @@ -4704,104 +4522,10 @@ static struct {
>        unsigned long next_balance;     /* in jiffy units */
>  } nohz ____cacheline_aligned;
>
> -#if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
> -/**
> - * lowest_flag_domain - Return lowest sched_domain containing flag.
> - * @cpu:       The cpu whose lowest level of sched domain is to
> - *             be returned.
> - * @flag:      The flag to check for the lowest sched_domain
> - *             for the given cpu.
> - *
> - * Returns the lowest sched_domain of a cpu which contains the given flag.
> - */
> -static inline struct sched_domain *lowest_flag_domain(int cpu, int flag)
> -{
> -       struct sched_domain *sd;
> -
> -       for_each_domain(cpu, sd)
> -               if (sd->flags & flag)
> -                       break;
> -
> -       return sd;
> -}
> -
> -/**
> - * for_each_flag_domain - Iterates over sched_domains containing the flag.
> - * @cpu:       The cpu whose domains we're iterating over.
> - * @sd:                variable holding the value of the power_savings_sd
> - *             for cpu.
> - * @flag:      The flag to filter the sched_domains to be iterated.
> - *
> - * Iterates over all the scheduler domains for a given cpu that has the 'flag'
> - * set, starting from the lowest sched_domain to the highest.
> - */
> -#define for_each_flag_domain(cpu, sd, flag) \
> -       for (sd = lowest_flag_domain(cpu, flag); \
> -               (sd && (sd->flags & flag)); sd = sd->parent)
> -
> -/**
> - * find_new_ilb - Finds the optimum idle load balancer for nomination.
> - * @cpu:       The cpu which is nominating a new idle_load_balancer.
> - *
> - * Returns:    Returns the id of the idle load balancer if it exists,
> - *             Else, returns >= nr_cpu_ids.
> - *
> - * This algorithm picks the idle load balancer such that it belongs to a
> - * semi-idle powersavings sched_domain. The idea is to try and avoid
> - * completely idle packages/cores just for the purpose of idle load balancing
> - * when there are other idle cpu's which are better suited for that job.
> - */
> -static int find_new_ilb(int cpu)
> -{
> -       int ilb = cpumask_first(nohz.idle_cpus_mask);
> -       struct sched_group *ilbg;
> -       struct sched_domain *sd;
> -
> -       /*
> -        * Have idle load balancer selection from semi-idle packages only
> -        * when power-aware load balancing is enabled
> -        */
> -       if (!(sched_smt_power_savings || sched_mc_power_savings))
> -               goto out_done;
> -
> -       /*
> -        * Optimize for the case when we have no idle CPUs or only one
> -        * idle CPU. Don't walk the sched_domain hierarchy in such cases
> -        */
> -       if (cpumask_weight(nohz.idle_cpus_mask) < 2)
> -               goto out_done;
> -
> -       rcu_read_lock();
> -       for_each_flag_domain(cpu, sd, SD_POWERSAVINGS_BALANCE) {
> -               ilbg = sd->groups;
> -
> -               do {
> -                       if (ilbg->group_weight !=
> -                               atomic_read(&ilbg->sgp->nr_busy_cpus)) {
> -                               ilb = cpumask_first_and(nohz.idle_cpus_mask,
> -                                                       sched_group_cpus(ilbg));
> -                               goto unlock;
> -                       }
> -
> -                       ilbg = ilbg->next;
> -
> -               } while (ilbg != sd->groups);
> -       }
> -unlock:
> -       rcu_read_unlock();
> -
> -out_done:
> -       if (ilb < nr_cpu_ids && idle_cpu(ilb))
> -               return ilb;
> -
> -       return nr_cpu_ids;
> -}
> -#else /*  (CONFIG_SCHED_MC || CONFIG_SCHED_SMT) */
>  static inline int find_new_ilb(int call_cpu)
>  {
>        return nr_cpu_ids;
>  }
> -#endif
>
>  /*
>  * Kick a CPU to do the nohz balancing, if it is time for it. We pick the
> --- a/tools/power/cpupower/man/cpupower-set.1
> +++ b/tools/power/cpupower/man/cpupower-set.1
> @@ -85,15 +85,6 @@ Adjust the kernel's multi-core scheduler
>  savings
>  .RE
>
> -sched_mc_power_savings is dependent upon SCHED_MC, which is
> -itself architecture dependent.
> -
> -sched_smt_power_savings is dependent upon SCHED_SMT, which
> -is itself architecture dependent.
> -
> -The two files are independent of each other. It is possible
> -that one file may be present without the other.
> -
>  .SH "SEE ALSO"
>  cpupower-info(1), cpupower-monitor(1), powertop(1)
>  .PP
> --- a/tools/power/cpupower/utils/helpers/sysfs.c
> +++ b/tools/power/cpupower/utils/helpers/sysfs.c
> @@ -362,22 +362,7 @@ char *sysfs_get_cpuidle_driver(void)
>  */
>  int sysfs_get_sched(const char *smt_mc)
>  {
> -       unsigned long value;
> -       char linebuf[MAX_LINE_LEN];
> -       char *endp;
> -       char path[SYSFS_PATH_MAX];
> -
> -       if (strcmp("mc", smt_mc) && strcmp("smt", smt_mc))
> -               return -EINVAL;
> -
> -       snprintf(path, sizeof(path),
> -               PATH_TO_CPU "sched_%s_power_savings", smt_mc);
> -       if (sysfs_read_file(path, linebuf, MAX_LINE_LEN) == 0)
> -               return -1;
> -       value = strtoul(linebuf, &endp, 0);
> -       if (endp == linebuf || errno == ERANGE)
> -               return -1;
> -       return value;
> +       return -ENODEV;
>  }
>
>  /*
> @@ -388,21 +373,5 @@ int sysfs_get_sched(const char *smt_mc)
>  */
>  int sysfs_set_sched(const char *smt_mc, int val)
>  {
> -       char linebuf[MAX_LINE_LEN];
> -       char path[SYSFS_PATH_MAX];
> -       struct stat statbuf;
> -
> -       if (strcmp("mc", smt_mc) && strcmp("smt", smt_mc))
> -               return -EINVAL;
> -
> -       snprintf(path, sizeof(path),
> -               PATH_TO_CPU "sched_%s_power_savings", smt_mc);
> -       sprintf(linebuf, "%d", val);
> -
> -       if (stat(path, &statbuf) != 0)
> -               return -ENODEV;
> -
> -       if (sysfs_write_file(path, linebuf, MAX_LINE_LEN) == 0)
> -               return -1;
> -       return 0;
> +       return -ENODEV;
>  }
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
