Date:	Wed, 18 May 2011 11:33:46 -0700
From:	"Yu, Fenghua" <fenghua.yu@...el.com>
To:	Andi Kleen <andi@...stfloor.org>
CC:	Ingo Molnar <mingo@...e.hu>, Thomas Gleixner <tglx@...utronix.de>,
	H Peter Anvin <hpa@...or.com>,
	"Mallick, Asit K" <asit.k.mallick@...el.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Avi Kivity <avi@...hat.com>,
	Arjan van de Ven <arjan@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH 9/9] x86/lib/memset_64.S: Optimize memset by enhanced
 REP MOVSB/STOSB

> -----Original Message-----
> From: Andi Kleen [mailto:andi@...stfloor.org]
> Sent: Tuesday, May 17, 2011 9:05 PM
> To: Yu, Fenghua
> Cc: Andi Kleen; Ingo Molnar; Thomas Gleixner; H Peter Anvin; Mallick,
> Asit K; Linus Torvalds; Avi Kivity; Arjan van de Ven; Andrew Morton;
> linux-kernel
> Subject: RE: [PATCH 9/9] x86/lib/memset_64.S: Optimize memset by
> enhanced REP MOVSB/STOSB
> > Only memcpy are generated by gcc when gcc version >=4.3. Other
> functions
> > are defined by kernel lib.
> 
> Are you sure? AFAIK it supports more.

I used gcc 4.3.2, as installed by FC10, to build the kernel with defconfig. Only memcpy is generated by gcc, as the builtin, inlined memcpy. All the others (i.e. memset, clear_page, memmove, and copy_user) call the kernel lib.

It's easy to check this by disassembling the kernel binary.

gcc 4.3.2 and FC10 are old, but not that old, and they already have this capability.

> 
> > I would leave gcc optimization for most memcpy cases instead of
> forcing
> > memcpy to call the kernel lib memcpy. I hope gcc will catch up and
> > implement a good enhanced rep movsb/stosb solution soon. If turns out
> gcc
> > can not generate good memcpy, it's easy to switch to the patching
> kernel
> > lib memcpy.
> 
> The problem is that gcc can only do that if you tell it to generate
> code for that. But it has no mechanism to patch in/out different
> variants for the same binary. So it would only work for a specially
> optimized kernel for that CPU.
> 
> I suspect for smaller copies it won't make too much different anyways
> and gcc's code is probably fine. But gcc won't know that you
> can do better on large copies, so using a macro would be a way
> to tell it that.
> 
> -Andi

I absolutely agree with you on that. For example, gcc builds memcpy as an inlined rep movsb for big copies. This works fine on processors with enhanced rep movsb/stosb. But it doesn't work as well as the kernel lib memcpy on processors without enhanced rep movsb/stosb, which are most of the machines currently on the market.
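
For illustration, the inlined rep movsb form mentioned above looks roughly like the following. This is only a user-space sketch, not code from the patch set and not gcc's actual output. The name copy_rep_movsb is made up, and it assumes GCC-style inline asm and a clear direction flag on entry (which the x86 ABIs require). On non-x86 targets it just falls back to memcpy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only: copy n bytes with a single
 * "rep movsb". Assumes the direction flag is clear, so the copy runs
 * forward from the lowest address. */
static void *copy_rep_movsb(void *dst, const void *src, size_t n)
{
#if defined(__x86_64__) || defined(__i386__)
    void *d = dst;

    /* rep movsb uses (R|E)DI = destination, (R|E)SI = source,
     * (R|E)CX = byte count, and updates all three as it goes. */
    __asm__ volatile("rep movsb"
                     : "+D"(d), "+S"(src), "+c"(n)
                     :
                     : "memory");
    return dst;
#else
    /* Not an x86 target: no rep movsb, use the library routine. */
    return memcpy(dst, src, n);
#endif
}
```

Whether this beats an unrolled copy loop depends on the CPU, which is the whole point above: the same inlined instruction is fast on ERMS parts and slower than the lib memcpy elsewhere.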

I discussed this issue with others before. It seems people would like to wait for a gcc with enhanced rep movsb/stosb support, and then compare performance data for the gcc version and the kernel lib version to decide which way to go.

With the patch set, at least on gcc 4.3.2, the optimization works fine for all of these functions except memcpy, which gcc inlines.

If people don't want to wait for gcc to optimize the mem functions for ERMS, it's easy to force those functions to use the lib functions. I can send a small patch to string_64/32.h to do so.
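
One possible shape for such a macro, combining this with Andi's point about small vs. large copies, is sketched below. This is not the patch itself and not the existing string_64.h code. The names sketch_memcpy, lib_memcpy, lib_memcpy_calls and SMALL_COPY_MAX are hypothetical, the 64-byte threshold is an arbitrary placeholder, and lib_memcpy is a plain byte loop standing in for the patched kernel lib routine. It relies on the GCC builtins __builtin_constant_p and __builtin_memcpy.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder threshold; a real value would have to come from measurements. */
#define SMALL_COPY_MAX 64

/* Counts out-of-line calls so the dispatch can be observed. */
static unsigned long lib_memcpy_calls;

/* Stand-in for the out-of-line lib memcpy (the one that alternatives
 * patching would switch to rep movsb on ERMS processors). */
static void *lib_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    lib_memcpy_calls++;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Small compile-time-constant sizes go to the compiler's builtin;
 * everything else is forced out of line to lib_memcpy.
 * Note: n appears more than once, so it must have no side effects. */
#define sketch_memcpy(dst, src, n)                               \
    ((__builtin_constant_p(n) && (n) <= SMALL_COPY_MAX)          \
         ? __builtin_memcpy((dst), (src), (n))                   \
         : lib_memcpy((dst), (src), (n)))
```

With this, sketch_memcpy(b, a, 16) stays with the compiler, while a copy with a run-time or large size always ends up in the lib function. Forcing every call to the lib function, as proposed above, is the same thing with the first branch removed.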

Thanks.

-Fenghua
