Date:	Tue, 24 Feb 2009 15:10:32 +1100
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	Salman Qazi <sqazi@...gle.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	davem@...emloft.net
Cc:	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>, Andi Kleen <andi@...stfloor.org>
Subject: Re: Performance regression in write() syscall

added some ccs

On Tuesday 24 February 2009 13:03:04 Salman Qazi wrote:
> While the introduction of __copy_from_user_nocache (see commit:
> 0812a579c92fefa57506821fa08e90f47cb6dbdd) may have been an improvement
> for sufficiently large writes, there is evidence to show that it is
> detrimental for small writes.  Unixbench's fstime test gives the
> following results for 256 byte writes with MAX_BLOCK of 2000:
>
>     2.6.29-rc6 ( 5 samples, each in KB/sec ):
>     283750, 295200, 294500, 293000, 293300
>
>     2.6.29-rc6 + this patch (5 samples, each in KB/sec):
>     313050, 3106750, 293350, 306300, 307900
>
>     2.6.18
>     395700, 342000, 399100, 366050, 359850
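
Since one of the patched samples above (3106750) sits an order of magnitude above the others, a median is a safer summary of these three runs than a mean. The helper below is only a sketch: the names cmp_long and median_kbps are made up for illustration and are not part of unixbench or the kernel. It assumes an odd number of samples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort callback: ascending order of long values. */
static int cmp_long(const void *a, const void *b)
{
	long x = *(const long *)a;
	long y = *(const long *)b;

	return (x > y) - (x < y);
}

/*
 * Median of n throughput samples (KB/sec). Sorts the array in place.
 * Hypothetical helper; only handles an odd sample count.
 */
static long median_kbps(long *v, size_t n)
{
	assert(n % 2 == 1);
	qsort(v, n, sizeof(*v), cmp_long);
	return v[n / 2];
}
```

Applied to the quoted figures, this gives 293300 for 2.6.29-rc6, 307900 with the patch, and 366050 for 2.6.18, so the patch recovers only part of the gap to 2.6.18.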

What does unixbench's fstime test do? If it is just writing to the
pagecache, then this would be unexpected. If it is reading and writing,
then perhaps this could be a problem, but how realistic is it for a
performance critical application to read data out of the pagecache that
it has recently written? Do you have something at Google actually doing
real work that speeds up with this patch?


>     See w_test() in src/fstime.c in unixbench version 4.1.0.  Basically,
> the above test consists of counting how much we can write in this manner:
>
>     alarm(10);
>     while (!sigalarm) {
>             for (f_blocks = 0; f_blocks < 2000; ++f_blocks) {
>                    write(f, buf, 256);
>             }
>             lseek(f, 0L, 0);
>     }
>
> I realised that there are other components to the write syscall regression
> that are not addressed here.  I will send another email shortly stating the
> source of another one.
>
> Signed-off-by: Salman Qazi <sqazi@...gle.com>
> ---
> diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> index 84210c4..efe7315 100644
> --- a/arch/x86/include/asm/uaccess_64.h
> +++ b/arch/x86/include/asm/uaccess_64.h
> @@ -192,14 +192,20 @@ static inline int __copy_from_user_nocache(void *dst,
> const void __user *src, unsigned size)
>  {
>  	might_sleep();
> -	return __copy_user_nocache(dst, src, size, 1);
> +	if (likely(size >= PAGE_SIZE))
> +		return __copy_user_nocache(dst, src, size, 1);
> +	else
> +		return __copy_from_user(dst, src, size);
>  }
>
>  static inline int __copy_from_user_inatomic_nocache(void *dst,
>  						    const void __user *src,
>  						    unsigned size)
>  {
> -	return __copy_user_nocache(dst, src, size, 0);
> +	if (likely(size >= PAGE_SIZE))
> +		return __copy_user_nocache(dst, src, size, 0);
> +	else
> +		return __copy_from_user_inatomic(dst, src, size);
>  }
>
>  unsigned long
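
For anyone who wants to try the quoted write loop without building unixbench, here is a rough standalone approximation. It is not the actual w_test() from src/fstime.c: the function name w_test_kb, the temporary file under /tmp, and the error handling are all assumptions made for this sketch. It keeps the 256-byte block size and the 2000-block pass from the loop above, and returns the number of KB written so the caller can divide by the duration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t sigalarm;

static void on_alarm(int sig)
{
	(void)sig;
	sigalarm = 1;
}

/*
 * Write 256-byte blocks in passes of 2000, seeking back to the start
 * after each pass, until the alarm fires.
 * Returns KB written, or -1 on error. Hypothetical helper.
 */
static long w_test_kb(unsigned int seconds)
{
	char path[] = "/tmp/w_test_XXXXXX";	/* assumed scratch location */
	char buf[256];
	long long bytes = 0;
	int f_blocks;
	int f = mkstemp(path);

	if (f < 0)
		return -1;
	unlink(path);		/* file goes away once f is closed */
	memset(buf, 'x', sizeof(buf));

	sigalarm = 0;
	signal(SIGALRM, on_alarm);
	alarm(seconds);
	while (!sigalarm) {
		for (f_blocks = 0; f_blocks < 2000; ++f_blocks) {
			ssize_t n = write(f, buf, sizeof(buf));

			if (n < 0 && errno != EINTR) {
				alarm(0);
				close(f);
				return -1;
			}
			if (n > 0)
				bytes += n;
		}
		lseek(f, 0L, SEEK_SET);
	}
	alarm(0);
	close(f);
	return (long)(bytes / 1024);
}
```

Calling w_test_kb(10) and dividing the result by 10 gives a KB/sec figure in the same spirit as the numbers quoted above, though the absolute values will not be comparable to unixbench output.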

