Message-ID: <20080429023520.GD14990@parisc-linux.org>
Date:	Mon, 28 Apr 2008 20:35:20 -0600
From:	Matthew Wilcox <matthew@....cx>
To:	David Chinner <dgc@....com>
Cc:	linux-kernel@...r.kernel.org,
	Stephen Rothwell <sfr@...b.auug.org.au>,
	Christoph Hellwig <hch@...radead.org>
Subject: Re: Announce: Semaphore-Removal tree

On Tue, Apr 29, 2008 at 10:09:30AM +1000, David Chinner wrote:
> On Mon, Apr 28, 2008 at 06:20:04AM -0600, Matthew Wilcox wrote:
> > Arjan, Ingo and I have been batting around something called a kcounter.
> > I appear to have misplaced the patch right now, but the basic idea is
> > that it returns you a cookie when you down(), which you then have to
> > pass to the up()-equivalent.  This gives you at least some of the
> > assurances you get from mutexes.
> 
> <sigh>
> 
> back to the days of cookies being required for locks. We only just
> removed all the remaining lock cruft left over from Irix that used
> cookies like this. i.e.:
> 
> 	DECL_LOCK_COOKIE(cookie);
> 
> 	cookie = spin_lock(&lock);
> 	.....
> 	spin_unlock(&lock, cookie);
> 
> it's an ugly, ugly API....

Perhaps you can suggest a better one?  Our thought was that you have ...

struct xfs_inode {
	struct kcounter_t i_flock;
};

struct foo {
	... other stuff you need for the io ...
	kcounter_cookie_t kct;
};

	int err = kcounter_claim(&ino->i_flock, &foo->kct);
...
	kcounter_release(&ino->i_flock, &foo->kct);
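To make the cookie idea concrete, here is a minimal userspace sketch.  It is
not the kcounter patch (which is not shown in this thread), and it is not
kernel code.  The names kcounter_cookie_t and kcounter_release come from the
fragment above; struct kcounter, its fields, kcounter_init, the non-blocking
kcounter_try_claim and all return values are assumptions made for
illustration.  The point is only that a release must present a cookie that
was filled in by a successful claim on the same counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout; the real kcounter is not shown in this thread. */
struct kcounter {
	atomic_int available;		/* units that can still be claimed */
};

/* Must be zero-initialised before its first use. */
typedef struct {
	const struct kcounter *owner;	/* counter this claim was made on */
	bool held;			/* true between claim and release */
} kcounter_cookie_t;

static void kcounter_init(struct kcounter *kc, int count)
{
	assert(count >= 0);
	atomic_init(&kc->available, count);
}

/* Non-blocking claim: 0 on success, -EBUSY if no unit is available. */
static int kcounter_try_claim(struct kcounter *kc, kcounter_cookie_t *cookie)
{
	int old = atomic_load(&kc->available);

	do {
		if (old <= 0)
			return -EBUSY;
	} while (!atomic_compare_exchange_weak(&kc->available, &old, old - 1));

	cookie->owner = kc;
	cookie->held = true;
	return 0;
}

/* Release: -EINVAL unless the cookie records a live claim on this counter. */
static int kcounter_release(struct kcounter *kc, kcounter_cookie_t *cookie)
{
	if (cookie->owner != kc || !cookie->held)
		return -EINVAL;

	cookie->owner = NULL;
	cookie->held = false;
	atomic_fetch_add(&kc->available, 1);
	return 0;
}

/* Walks through the cases; returns 0 if every step behaves as described. */
static int kcounter_demo(void)
{
	struct kcounter flock;
	kcounter_cookie_t io = { 0 }, other = { 0 };

	kcounter_init(&flock, 1);
	if (kcounter_try_claim(&flock, &io) != 0)
		return 1;	/* first claim should succeed */
	if (kcounter_try_claim(&flock, &other) != -EBUSY)
		return 2;	/* counter is exhausted */
	if (kcounter_release(&flock, &other) != -EINVAL)
		return 3;	/* release without a claim is caught */
	if (kcounter_release(&flock, &io) != 0)
		return 4;	/* matching release succeeds */
	if (kcounter_release(&flock, &io) != -EINVAL)
		return 5;	/* double release is caught */
	return 0;
}
```

Because the cookie lives in the per-io structure rather than in the claiming
task, whichever context completes the io can do the release, and an unmatched
or repeated release is caught instead of silently inflating the count.  A
sleeping claim is deliberately left out of this sketch.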

> > Though ... looking at XFS, you have 5 counting semaphores currently:
> > 
> > 1. i_flock
> > 
> > This one seems to be a mutex. 
> 
> No, it's a semaphore. It is the inode flush lock and is held over
> I/O on the inode. It is released in a different context to the
> process that holds it. We use trylock semantics on it all the time
> to determine if we can write the inode to disk.

If you're always using trylock semantics on it, then it's not really a
semaphore, is it?

> > 2. l_flushsema
> > 
> > This seems to be a completion.  ie you're using it to wait for the log
> > to be flushed.
> 
> Yes, that could probably be a completion. I'm assuming that a completion
> can handle several thousand waiting processes, right?

Up to 2 billion.

> > 3. q_flock
> > 
> > Ow.  ow.  My brain hurts.  What are these semantics?
> 
> Same semantics as the i_flock - it's held while flushing the dquot
> to disk and is released by a different thread. Trylocks are used on
> this as well...

... but not just trylocks, right?  There's a sleeping aspect to them
too.

> > 4. b_iodonesema
> > 
> > This should be a completion.  It's used to wait for the io to be
> > complete.
> 
> Yup, that could be done.
> 
> > 5. b_sema
> > 
> > This looks like a mutex, but I think it's released in a different
> > context from the one which acquires it.
> 
> Yup. held across I/O and typically released by a different thread.
> Trylock semantics used as well.

OK.

> > Possibly XFS should be using constructs like wait_on_bit instead of
> > semaphores.  See the implementation of wait_on_buffer for an example.
> 
> That sounds to me like what you are saying is "semaphores are going away
> so implement your own semaphore-like thingy using some other construct".
> Right?

I don't want to say that.  People (and I'm *not* referring to XFS here)
manage to abuse semaphores in the most hideous ways.  If we tell them to
use lower-level constructs, they'll make a mess of using those too.  I
think we need to look for patterns in the semaphore users which don't
fit the mutex pattern or the completion pattern and figure out how to
satisfy those users.

> If that's the case, then AFAICT changing to completions and then
> s/semaphore/rw_semaphore/ and using only {down,up}_write() for
> the rest should work, right? Or are rwsem's going to go away, too?

I don't think there are any plans to get rid of rwsems, though the RT
people probably hate rwsems even more than they hate regular semaphores.
The mmap rwsem is a compelling argument ;-)

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."