Message-ID: <20150519113121.GD4819@pd.tnic>
Date:	Tue, 19 May 2015 13:31:21 +0200
From:	Borislav Petkov <bp@...e.de>
To:	Huang Rui <ray.huang@....com>
Cc:	Len Brown <lenb@...nel.org>,
	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	Thomas Gleixner <tglx@...utronix.de>, x86@...nel.org,
	linux-kernel@...r.kernel.org,
	Fengguang Wu <fengguang.wu@...el.com>,
	Aaron Lu <aaron.lu@...el.com>, Tony Li <tony.li@....com>
Subject: Re: [RFC PATCH 2/4] x86, mwaitt: introduce mwaitx idle with a
 configurable timer

On Tue, May 19, 2015 at 04:01:10PM +0800, Huang Rui wrote:
> MWAITX/MWAIT does not let the cpu core go into C1 state on AMD processors.
> The cpu core still consumes less power while waiting, and has faster exit
> from waiting than "Halt". This patch implements an interface using the
> kernel parameter "idle=" to configure mwaitx type and timer value.
> 
> If "idle=mwaitx", the timeout will be set as the maximum value
> ((2^64 - 1) * TSC cycle).
> If "idle=mwaitx,100", the timeout will be set as 100ns.
> If the processor doesn't support MWAITX, then halt is used.

Ok, I see what you're trying here and I think this is not the optimal
approach.

So let me explain how I see it; correct me if I'm wrong:

We want to use MWAITX to avoid the idle entry/exit overhead we pay
with HLT, because MWAITX is reportedly faster.

Now, if we want to do that, we want to do it dynamically and adjust the
MWAITX sleep interval depending on the system, usage pattern, system
load and so on.

And for that we would need an adaptive scheme which approximates each
idle interval. Simply taking TSC before we enter idle and after we come
out would give us each idle residency duration and we can do some simple
math to approximate it.
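
To make the "simple math" concrete, here is a rough sketch of one way to
do it: an exponential moving average over the measured idle lengths. The
struct and function names are made up for illustration and are not from
the patch; the 1/8 weight is an arbitrary assumption, and in the kernel
the two timestamps would be TSC reads taken around the idle call.

```c
#include <stdint.h>

/* Hypothetical per-CPU state; none of these names come from the patch. */
struct idle_estimator {
	uint64_t avg_cycles;	/* running estimate of idle residency, in TSC cycles */
};

/*
 * Fold one measured idle period into the estimate using an exponential
 * moving average: new = old + (sample - old) / 8.
 * tsc_enter and tsc_exit are the TSC values read before entering idle
 * and after coming out of it.
 */
static inline void idle_estimator_update(struct idle_estimator *e,
					 uint64_t tsc_enter, uint64_t tsc_exit)
{
	uint64_t sample = tsc_exit - tsc_enter;

	if (!e->avg_cycles) {
		e->avg_cycles = sample;	/* first sample seeds the average */
		return;
	}

	/* Unsigned arithmetic, so handle the two directions separately. */
	if (sample >= e->avg_cycles)
		e->avg_cycles += (sample - e->avg_cycles) >> 3;
	else
		e->avg_cycles -= (e->avg_cycles - sample) >> 3;
}
```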

Now, what would that bring us: faster wakeup times.

And here comes the 10^6 $ question: why are we doing all this?

I'm thinking we want to find a cutoff duration: for idle periods shorter
than it, it is worth doing MWAITX for the faster entry/exit times, and
for longer ones we want to do HLT because it'll get into C1E and give us
higher power savings.

We don't want to do MWAITX for too long because that'll burn more power
relative to HLT, but we don't want to do HLT for shorter periods either
because then the entry/exit costs dominate.
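
In code, the decision I have in mind would look roughly like the sketch
below. The enum, the function name and the cutoff parameter are all
hypothetical; the actual cutoff would have to be measured per platform
and nothing here is taken from the patch.

```c
#include <stdint.h>

/* Hypothetical names, chosen not to clash with enum idle_boot_override. */
enum idle_method { IDLE_USE_MWAITX, IDLE_USE_HLT };

/*
 * Pick an idle method from the predicted idle length.
 *
 * predicted_cycles: expected idle residency in TSC cycles.
 * cutoff_cycles:    assumed platform-specific tunable; below it the
 *                   cheaper MWAITX entry/exit wins, at or above it the
 *                   deeper power savings of HLT (C1E) win.
 */
static inline enum idle_method pick_idle_method(uint64_t predicted_cycles,
						uint64_t cutoff_cycles)
{
	return predicted_cycles < cutoff_cycles ? IDLE_USE_MWAITX
						: IDLE_USE_HLT;
}
```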

Am I on the right track at least?

> Signed-off-by: Huang Rui <ray.huang@....com>
> ---
>  arch/x86/include/asm/mwait.h     |  2 +
>  arch/x86/include/asm/processor.h |  2 +-
>  arch/x86/kernel/process.c        | 79 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 82 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
> index b91136f..c4e51e7 100644
> --- a/arch/x86/include/asm/mwait.h
> +++ b/arch/x86/include/asm/mwait.h
> @@ -14,6 +14,8 @@
>  #define CPUID5_ECX_INTERRUPT_BREAK	0x2
>  
>  #define MWAIT_ECX_INTERRUPT_BREAK	0x1
> +#define MWAITX_ECX_TIMER_ENABLE		0x2

						Use BIT(1) here.
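
I.e., something like the following. The BIT() definition here is only a
userspace stand-in so the snippet compiles on its own; in the kernel you
would use the existing helper instead of defining it.

```c
/* Stand-in for the kernel's BIT() helper, for illustration only. */
#define BIT(nr)			(1UL << (nr))

/* Bit 1 of ECX enables the MWAITX timer; same value as the 0x2 above. */
#define MWAITX_ECX_TIMER_ENABLE	BIT(1)
```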

> +#define MWAITX_EBX_WAIT_TIMEOUT		0xffffffff
>  
>  static inline void __monitor(const void *eax, unsigned long ecx,
>  			     unsigned long edx)
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 23ba676..0f60e94 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -733,7 +733,7 @@ extern unsigned long		boot_option_idle_override;
>  extern bool			amd_e400_c1e_detected;
>  
>  enum idle_boot_override {IDLE_NO_OVERRIDE=0, IDLE_HALT, IDLE_NOMWAIT,
> -			 IDLE_POLL};
> +			 IDLE_POLL, IDLE_MWAITX};
>  
>  extern void enable_sep_cpu(void);
>  extern int sysenter_setup(void);
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index 6e338e3..9d68193 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -30,6 +30,7 @@
>  #include <asm/debugreg.h>
>  #include <asm/nmi.h>
>  #include <asm/tlbflush.h>
> +#include <asm/x86_init.h>
>  
>  /*
>   * per-CPU TSS segments. Threads are completely 'soft' on Linux,
> @@ -276,6 +277,7 @@ unsigned long boot_option_idle_override = IDLE_NO_OVERRIDE;
>  EXPORT_SYMBOL(boot_option_idle_override);
>  
>  static void (*x86_idle)(void);
> +static unsigned long idle_param;
>  
>  #ifndef CONFIG_SMP
>  static inline void play_dead(void)
> @@ -444,6 +446,17 @@ static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
>  	return 1;
>  }
>  
> +static int not_support_mwaitx(const struct cpuinfo_x86 *c)
> +{
> +	if (c->x86_vendor != X86_VENDOR_AMD)
> +		return 1;
> +
> +	if (!cpu_has(c, X86_FEATURE_MWAITT))
> +		return 1;
> +
> +	return 0;
> +}
> +
>  /*
>   * MONITOR/MWAIT with no hints, used for default default C1 state.
>   * This invokes MWAIT with interrutps enabled and no flags,
> @@ -470,12 +483,45 @@ static void mwait_idle(void)
>  	__current_clr_polling();
>  }
>  
> +/*
> + * AMD Excavator processors support the new MONITORX/MWAITX instructions.

No need for that line, especially since processors newer than XV will
support those instructions too.

> + * The function is similar to mwait but with a timer. On AMD platforms
> + * mwaitx does not let the core go into C1 state. This provides for a
> + * faster waiting exit speed. The user can configure the idle method and
> + * timer value via the idle kernel parameter.
> + */

...

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.