Message-ID: <Pine.LNX.4.64.0804030050290.15850@artax.karlin.mff.cuni.cz>
Date: Thu, 3 Apr 2008 00:53:14 +0200 (CEST)
From: Mikulas Patocka <mikulas@...ax.karlin.mff.cuni.cz>
To: Linus Torvalds <torvalds@...ux-foundation.org>
cc: Andrew Morton <akpm@...ux-foundation.org>, viro@...iv.linux.org.uk,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH]: Fix SMP-reordering race in mark_buffer_dirty
On Wed, 2 Apr 2008, Linus Torvalds wrote:
>
>
> On Wed, 2 Apr 2008, Andrew Morton wrote:
> >
> > But then the test-and-set of an already-set flag would newly cause the
> > cacheline to be dirtied, requiring additional bus usage to write it back?
> >
> > The CPU's test-and-set-bit operation could of course optimise that away in
> > this case. But does it?
>
> No, afaik no current x86 uarch will optimize away the write on a locked
> instruction if it turns out to be unnecessary.
No, it doesn't. Try this:
#include <string.h>
#include <pthread.h>

void *pth(void *p)
{
    int i;

    /* Bit 0 is already set (see memset in main), so btsl never changes the value. */
    for (i = 0; i < 100000000; i++)
        __asm__ volatile ("lock; btsl $0, %0" : "+m"(*(int *)p) : : "cc");
    return NULL;
}

int args[2000];

int main(void)
{
    pthread_t t1, t2, t3, t4;

    memset(args, -1, sizeof args);  /* set every bit in advance */

    /* With 4-byte ints, 16 elements = 64 bytes, so each thread has its own cacheline. */
    pthread_create(&t1, NULL, pth, &args[0]);
    pthread_create(&t2, NULL, pth, &args[16]);
    pthread_create(&t3, NULL, pth, &args[32]);
    pthread_create(&t4, NULL, pth, &args[48]);

    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_join(t3, NULL);
    pthread_join(t4, NULL);
    return 0;
}
When the &args[] indices are changed so that they fall in the same
cacheline, I get 9 times slower execution, even though every bit is already
set and the value never changes. I tried it on two dual-core Core 2 Xeons.
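
To compare the two layouts without editing the indices by hand, the test can
be wrapped in a timing function. The following is only a sketch, not the
program that was measured: the names time_stride, words_unchanged and hammer
are made up for illustration, and it uses C11 atomic_fetch_or in place of
the hand-written lock;btsl so that it also builds off x86 (on x86 it still
compiles to a locked read-modify-write). It assumes 4-byte ints and 64-byte
cachelines.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

#define NTHREADS 4
#define NWORDS   (NTHREADS * 16)

struct job { atomic_int *word; long iters; };

/* 64-byte alignment so that words[0..15] share exactly one cacheline. */
static _Alignas(64) atomic_int words[NWORDS];

static void *hammer(void *arg)
{
    struct job *j = arg;
    long i;

    /* Bit 0 is already set, so the stored value never changes. */
    for (i = 0; i < j->iters; i++)
        atomic_fetch_or(j->word, 1);
    return NULL;
}

/* Elapsed seconds with the four target words `stride` ints apart (stride <= 16). */
double time_stride(int stride, long iters)
{
    pthread_t t[NTHREADS];
    struct job jobs[NTHREADS];
    struct timespec a, b;
    int i;

    for (i = 0; i < NWORDS; i++)
        atomic_store(&words[i], -1);    /* set every bit in advance */

    clock_gettime(CLOCK_MONOTONIC, &a);
    for (i = 0; i < NTHREADS; i++) {
        jobs[i].word = &words[i * stride];
        jobs[i].iters = iters;
        pthread_create(&t[i], NULL, hammer, &jobs[i]);
    }
    for (i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &b);

    return (double)(b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
}

/* Returns 1 if no word was modified by the run, 0 otherwise. */
int words_unchanged(void)
{
    int i;

    for (i = 0; i < NWORDS; i++)
        if (atomic_load(&words[i]) != -1)
            return 0;
    return 1;
}
```

Calling time_stride(1, n) puts all four words in one cacheline and
time_stride(16, n) gives each thread its own; the ratio of the two results
is the slowdown, and it will vary with the machine. Build with -pthread.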
Mikulas
> Can somebody find a timing reason to have the ugly code?
>
> Linus
>