linux-kernel - Re: missing madvise functionality

Message-Id: <20070403144948.fe8eede6.akpm@linux-foundation.org>
Date:	Tue, 3 Apr 2007 14:49:48 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Jakub Jelinek <jakub@...hat.com>
Cc:	Ulrich Drepper <drepper@...hat.com>,
	Andi Kleen <andi@...stfloor.org>,
	Rik van Riel <riel@...hat.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	linux-mm@...ck.org, Hugh Dickins <hugh@...itas.com>
Subject: Re: missing madvise functionality

On Tue, 3 Apr 2007 16:29:37 -0400
Jakub Jelinek <jakub@...hat.com> wrote:

> On Tue, Apr 03, 2007 at 01:17:09PM -0700, Ulrich Drepper wrote:
> > Andrew Morton wrote:
> > > Ulrich, could you suggest a little test app which would demonstrate this
> > > behaviour?
> > 
> > It's not really reliably possible to demonstrate this with a small
> > program using malloc.  You'd need something like this mysql test case
> > which Rik said is not hard to run by yourself.
> > 
> > If somebody adds a kernel interface I can easily produce a glibc patch
> > so that the test can be run in the new environment.
> > 
> > But it's of course easy enough to simulate the specific problem in a
> > micro benchmark.  If you want that let me know.
> 
> I think something like following testcase which simulates what free
> and malloc do when trimming/growing a non-main arena.
> 
> My guess is that all the page zeroing is pretty expensive as well and
> takes significant time, but I haven't profiled it.
> 
> #include <pthread.h>
> #include <stdlib.h>
> #include <sys/mman.h>
> #include <unistd.h>
> 
> void *
> tf (void *arg)
> {
>   (void) arg;
>   size_t ps = sysconf (_SC_PAGE_SIZE);
>   void *p = mmap (NULL, 128 * ps, PROT_READ | PROT_WRITE,
>                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>   if (p == MAP_FAILED)
>     exit (1);
>   int i;
>   for (i = 0; i < 100000; i++)
>     {
>       /* Pretend to use the buffer.  */
>       char *q, *r = (char *) p + 128 * ps;
>       size_t s;
>       for (q = (char *) p; q < r; q += ps)
>         *q = 1;
>       for (s = 0, q = (char *) p; q < r; q += ps)
>         s += *q;
>       /* Free it.  Replace this mmap with
>          madvise (p, 128 * ps, MADV_THROWAWAY) when implemented.  */
>       if (mmap (p, 128 * ps, PROT_NONE,
>                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) != p)
>         exit (2);
>       /* And immediately malloc again.  This would then be deleted.  */
>       if (mprotect (p, 128 * ps, PROT_READ | PROT_WRITE))
>         exit (3);
>     }
>   return NULL;
> }
> 
> int
> main (void)
> {
>   pthread_t th[32];
>   int i;
>   for (i = 0; i < 32; i++)
>     if (pthread_create (&th[i], NULL, tf, NULL))
>       exit (4);
>   for (i = 0; i < 32; i++)
>     pthread_join (th[i], NULL);
>   return 0;
> }
> 

whee.  135,000 context switches/sec on a slow 2-way.  mmap_sem, most
likely.  That is ungood.

Did anyone monitor the context switch rate with the mysql test?
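
For anyone who wants to check this from inside a test program rather than
from vmstat, here is a minimal sketch that samples the process-wide
context switch counters with getrusage().  The struct and the helper names
(ctxsw, ctxsw_read, ctxsw_rate) are made up for illustration and are not
part of the testcase above; on Linux, RUSAGE_SELF covers all threads of
the calling process.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Snapshot of the context switch counters (hypothetical helper type).  */
struct ctxsw
{
  long voluntary;    /* ru_nvcsw: the task gave up the CPU, e.g. to block.  */
  long involuntary;  /* ru_nivcsw: the task was preempted.  */
};

/* Fill *OUT with the counters of the calling process.
   Returns 0 on success, -1 if getrusage fails.  */
int
ctxsw_read (struct ctxsw *out)
{
  struct rusage ru;
  if (getrusage (RUSAGE_SELF, &ru) != 0)
    return -1;
  out->voluntary = ru.ru_nvcsw;
  out->involuntary = ru.ru_nivcsw;
  return 0;
}

/* Total context switches per second between snapshots A (earlier) and
   B (later), taken SECONDS apart.  */
double
ctxsw_rate (const struct ctxsw *a, const struct ctxsw *b, double seconds)
{
  long total = (b->voluntary - a->voluntary)
               + (b->involuntary - a->involuntary);
  return (double) total / seconds;
}

/* Print the split between voluntary and involuntary switches.  */
void
ctxsw_print (const struct ctxsw *a, const struct ctxsw *b)
{
  printf ("voluntary: %ld  involuntary: %ld\n",
          b->voluntary - a->voluntary,
          b->involuntary - a->involuntary);
}
```

Call ctxsw_read() before pthread_create and again after the last
pthread_join, time the interval with clock_gettime(), and pass both
snapshots to ctxsw_rate().  A count dominated by voluntary switches would
be consistent with threads sleeping on a lock rather than being preempted.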

Interestingly, your test app (with s/100000/1000/) runs to completion in 13
seconds on the slow 2-way.  On a fast 8-way, it took 52 seconds and
sustained 40,000 context switches/sec.  That's a bit unexpected.

Both machines show ~8% idle time, too :(
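
The comment in the quoted test says the mmap/mprotect pair would be
replaced by a single madvise call once MADV_THROWAWAY exists.  That flag
is only a proposal in this thread, so the sketch below uses the existing
MADV_DONTNEED as a stand-in, purely to show the shape of the modified
loop body.  The helper names (alloc_region, touch_pages, discard_pages)
are hypothetical.  Note that MADV_DONTNEED still hands back zero-filled
pages on the next touch, so it does not avoid the page zeroing cost
mentioned above.

```c
#define _DEFAULT_SOURCE  /* For MAP_ANONYMOUS and madvise on glibc.  */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map LEN bytes of private anonymous memory; returns NULL on failure.  */
char *
alloc_region (size_t len)
{
  void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return p == MAP_FAILED ? NULL : (char *) p;
}

/* Pretend to use the buffer: write one byte per page, as the test does.  */
void
touch_pages (char *p, size_t len, size_t ps)
{
  char *q, *r = p + len;
  for (q = p; q < r; q += ps)
    *q = 1;
}

/* "Free" the buffer but keep the mapping and its protection, so no
   mprotect is needed before the next use.  MADV_DONTNEED is an
   assumption standing in for the proposed MADV_THROWAWAY.
   Returns 0 on success, -1 on error.  */
int
discard_pages (char *p, size_t len)
{
  return madvise (p, len, MADV_DONTNEED);
}
```

In the thread function, the inner iteration would then become
touch_pages() followed by discard_pages(), with the MAP_FIXED mmap and
the mprotect both dropped.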
