Message-ID: <19526.1175777338@redhat.com>
Date:	Thu, 05 Apr 2007 13:48:58 +0100
From:	David Howells <dhowells@...hat.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Jakub Jelinek <jakub@...hat.com>,
	Ulrich Drepper <drepper@...hat.com>,
	Andi Kleen <andi@...stfloor.org>,
	Rik van Riel <riel@...hat.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	linux-mm@...ck.org, Hugh Dickins <hugh@...itas.com>,
	Ingo Molnar <mingo@...e.hu>
Subject: Re: preemption and rwsems (was: Re: missing madvise functionality) 

Andrew Morton <akpm@...ux-foundation.org> wrote:

> 
> What we effectively have is 32 threads on a single CPU all doing
> 
> 	for (ever) {
> 		down_write()
> 		up_write()
> 		down_read()
> 		up_read();
> 	}

That's not quite so.  In that test program, most loops do two down/up write
pairs and then a slew of down/up read pairs with virtually no delay between
them (roughly the shape sketched below).  One of the write-locked periods can
last a relatively long time (it frees a bunch of pages), and the read-locked
periods can also last a long time (each has to allocate a page).
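
For the sake of argument, the shape of loop I mean is something like the
sketch below -- illustrative only, not the actual test program from this
thread; the region size and iteration count are made up.  mmap() and munmap()
take mmap_sem for writing, and each first touch of a page faults with
mmap_sem held for reading, so each pass is a couple of write-locked sections
plus a slew of read-locked ones:

	#define _GNU_SOURCE
	#include <stdlib.h>
	#include <sys/mman.h>

	#define REGION	(16 * 1024 * 1024)	/* made-up size */
	#define PAGE	4096

	int main(void)
	{
		for (int i = 0; i < 1000; i++) {
			/* write-locked: set up the mapping */
			char *p = mmap(NULL, REGION, PROT_READ | PROT_WRITE,
				       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
			if (p == MAP_FAILED)
				return 1;

			/* read-locked, once per page: fault everything in */
			for (size_t off = 0; off < REGION; off += PAGE)
				p[off] = 1;

			/* write-locked: tear it down and free the pages */
			munmap(p, REGION);
		}
		return 0;
	}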

Though, to be fair, as long as you've got way more than 16MB of RAM, the
memory stuff shouldn't take too long; but the locks are still held for a long
time compared to the periods when no lock is held at all.

> and rwsems are "fair".

If they weren't, you'd have to expect writer starvation in this situation.  As
it is, you're guaranteed progress on all threads.
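
The property doing the work is just that a reader arriving after a writer has
queued waits behind that writer.  The kernel's rwsem gets it by queueing
waiters strictly in order; the userspace sketch below shows the same effect in
a cruder, writer-preferring form (all names made up, and unlike the real thing
it will happily starve readers if writers keep arriving):

	#include <pthread.h>

	struct fair_rwlock {
		pthread_mutex_t	lock;
		pthread_cond_t	cond;
		int		readers;	/* readers currently holding it */
		int		writer;		/* 1 if a writer holds it */
		int		writers_waiting;
	};

	#define FAIR_RWLOCK_INIT \
		{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0, 0 }

	static void read_lock(struct fair_rwlock *rw)
	{
		pthread_mutex_lock(&rw->lock);
		/* Wait behind active *and waiting* writers: this is what
		 * stops a stream of readers from starving a writer. */
		while (rw->writer || rw->writers_waiting)
			pthread_cond_wait(&rw->cond, &rw->lock);
		rw->readers++;
		pthread_mutex_unlock(&rw->lock);
	}

	static void read_unlock(struct fair_rwlock *rw)
	{
		pthread_mutex_lock(&rw->lock);
		if (--rw->readers == 0)
			pthread_cond_broadcast(&rw->cond);
		pthread_mutex_unlock(&rw->lock);
	}

	static void write_lock(struct fair_rwlock *rw)
	{
		pthread_mutex_lock(&rw->lock);
		rw->writers_waiting++;
		while (rw->writer || rw->readers)
			pthread_cond_wait(&rw->cond, &rw->lock);
		rw->writers_waiting--;
		rw->writer = 1;
		pthread_mutex_unlock(&rw->lock);
	}

	static void write_unlock(struct fair_rwlock *rw)
	{
		pthread_mutex_lock(&rw->lock);
		rw->writer = 0;
		pthread_cond_broadcast(&rw->cond);
		pthread_mutex_unlock(&rw->lock);
	}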

> CONFIG_PREEMPT_VOLUNTARY=y

Which means the periods of lock-holding can be extended by preemption of the
lock holder(s), making the whole situation that much worse.  You have to
remember, you *can* be preempted whilst you hold a semaphore, rwsem or mutex.

> I run it all on a single CPU under `taskset -c 0' on the 8-way and it still
> causes 160,000 context switches per second and takes 9.5 seconds (after
> s/100000/1000).

How about if you have a UP kernel?  (ie: spinlocks -> nops)

> the context switch rate falls to zilch and total runtime falls to 6.4
> seconds.

I presume you don't mean literally zero.

> If that cond_resched() was not there, none of this would ever happen - each
> thread merrily chugs away doing its ups and downs until it expires its
> timeslice.  Interesting, in a sad sort of way.

The trouble is, I think, that you spend so much more time holding (or
attempting to hold) locks than not, and preemption just exacerbates things.

I suspect the reason the problem isn't so obvious when you've got 8 CPUs
crunching through it at once is that several read loops can make progress
simultaneously, quickly enough that the cost of preemption gets lost in the
stalls where everything has to stop to give each thread its write lock.

But short of recording the lock sequence, I don't think there's any way to
find out for sure.  printk probably won't cut it as a recording mechanism
because its overheads are too great.
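
If it did come to it, I'd expect something along the lines of a preallocated
ring of timestamped events, only formatted and dumped after the run, to be
cheap enough not to distort the timing the way printk does.  A rough
userspace sketch of the idea (all names and sizes made up; a real recorder
would want one ring per thread or per CPU rather than the shared counter
used here):

	#include <stdint.h>
	#include <stdio.h>
	#include <time.h>

	#define LOG_SIZE 65536			/* power of two, arbitrary */

	enum { LOG_DOWN_READ, LOG_UP_READ, LOG_DOWN_WRITE, LOG_UP_WRITE };

	struct lock_event {
		uint64_t ns;			/* when */
		uint32_t tid;			/* who */
		uint32_t op;			/* which of the four ops */
	};

	static struct lock_event log_ring[LOG_SIZE];
	static unsigned long log_head;

	static inline void log_event(uint32_t tid, uint32_t op)
	{
		struct timespec ts;
		struct lock_event *e;

		clock_gettime(CLOCK_MONOTONIC, &ts);
		e = &log_ring[__atomic_fetch_add(&log_head, 1,
						 __ATOMIC_RELAXED)
			      & (LOG_SIZE - 1)];
		e->ns = (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
		e->tid = tid;
		e->op = op;
	}

	static void log_dump(FILE *out)
	{
		/* Pay the formatting cost once, after the run; if the ring
		 * wrapped, entries are no longer in chronological order. */
		unsigned long n = log_head < LOG_SIZE ? log_head : LOG_SIZE;

		for (unsigned long i = 0; i < n; i++)
			fprintf(out, "%llu %u %u\n",
				(unsigned long long)log_ring[i].ns,
				log_ring[i].tid, log_ring[i].op);
	}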

David
