Message-ID: <20140121221852.GT9037@kmo>
Date:	Tue, 21 Jan 2014 14:18:52 -0800
From:	Kent Overstreet <kmo@...erainc.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	"Nicholas A. Bellinger" <nab@...ux-iscsi.org>,
	target-devel <target-devel@...r.kernel.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Ingo Molnar <mingo@...hat.com>, Jens Axboe <axboe@...nel.dk>
Subject: Re: [PATCH-v2 1/3] percpu_ida: Make percpu_ida_alloc + callers
 accept task state bitmask

On Mon, Jan 20, 2014 at 12:34:15PM +0100, Peter Zijlstra wrote:
> On Mon, Jan 20, 2014 at 03:44:44AM +0000, Nicholas A. Bellinger wrote:
> > From: Kent Overstreet <kmo@...erainc.com>
> > 
> > This patch changes percpu_ida_alloc() + callers to accept task state
> > bitmask for prepare_to_wait() for code like target/iscsi that needs
> > it for interruptible sleep, that is provided in a subsequent patch.
> > 
> > It now expects TASK_UNINTERRUPTIBLE when the caller is able to sleep
> > waiting for a new tag, or TASK_RUNNING when the caller cannot sleep,
> > and is forced to return a negative value when no tags are available.
> > 
> > v2 changes:
> >   - Include blk-mq + tcm_fc + vhost/scsi + target/iscsi changes
> >   - Drop signal_pending_state() call
> 
> Urgh, you made me look at percpu_ida... steal_tags() does a
> for_each_cpus() with IRQs disabled. This means you'll disable IRQs for
> multiple ticks on SGI class hardware. That is a _very_ long time indeed.

It's not that bad in practice - the looping is limited by the number of other
CPUs that actually have tags on their freelists - i.e. the CPUs that have
recently been using that block device or whatever the percpu_ida is for. And we
loop while cpus_have_tags is greater than some threshold (there's another debate
about that) - the intention is not to steal tags unless too many other CPUs have
tags on their local freelists.
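
To make the bound on the loop concrete, here is a rough userspace model of the idea, not the actual lib/percpu_ida.c code. Everything in it is an assumption for illustration: the struct pool_model layout, the steal_tags_model() name, the 64-CPU bitmask, and the threshold field (the real threshold is the part still under debate). It only shows that the walk visits CPUs whose bit is set in the mask, and stops as soon as too few CPUs are holding tags. It relies on the GCC/Clang builtins __builtin_popcountll() and __builtin_ctzll().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: tags are tracked as counts only, one slot per CPU. */
struct pool_model {
    uint64_t cpus_have_tags;  /* bit i set: CPU i may have tags on its freelist */
    unsigned nr_free[64];     /* number of free tags cached by each CPU */
    unsigned threshold;       /* assumed: steal only while more CPUs than this hold tags */
};

/*
 * Try to steal one remote freelist for CPU `self` (assumed to be empty).
 * Returns the number of tags moved; *visited reports how many remote CPUs
 * were looked at, i.e. how long the "IRQs disabled" walk would have been.
 */
static unsigned steal_tags_model(struct pool_model *p, unsigned self,
                                 unsigned *visited)
{
    assert(self < 64);

    /* Only CPUs flagged in the mask are candidates; never steal from self. */
    uint64_t candidates = p->cpus_have_tags & ~(UINT64_C(1) << self);
    unsigned have = (unsigned)__builtin_popcountll(p->cpus_have_tags);

    *visited = 0;
    while (candidates && have > p->threshold) {
        unsigned cpu = (unsigned)__builtin_ctzll(candidates);
        candidates &= candidates - 1;   /* drop the lowest set bit */
        (*visited)++;
        have--;

        unsigned stolen = p->nr_free[cpu];
        /* Either way this CPU no longer has tags, so clear its bit. */
        p->cpus_have_tags &= ~(UINT64_C(1) << cpu);
        if (stolen) {
            p->nr_free[cpu] = 0;
            p->nr_free[self] += stolen;
            p->cpus_have_tags |= UINT64_C(1) << self;
            return stolen;
        }
    }
    return 0;
}
```

With three remote CPUs holding tags and a threshold of 1, the model steals from the first flagged CPU and stops; with a threshold of 3 it does not walk at all.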

That said, for huge SGI class hardware I think you'd want the freelists to not
be percpu, but rather be per core or something - that's probably a reasonable
optimization for most hardware anyways.

> Then there's alloc_global_tags() vs alloc_local_tags(), one gets an
> actual tag, while the other only moves tags about -- semantic mismatch.

Yeah, kind of. It is doing allocation, but not the same sort of allocation.

> I do not get the comment near prepare to wait -- why does it matter if
> percpu_ida_free() flips a cpus_have_tags bit?

Did I write that comment? It is a crappy comment...

Ok, in userspace we'd be using condition variables here, but this is the kernel,
so we need to carefully order things on both ends. On the wait end: put
ourselves on the waitlist first, then check the condition that determines
whether we wait. On the wakeup end: change the state that affects that
condition first, then do the wakeup. steal_tags() is checking the condition
that goes with the prepare_to_wait(), that's all.
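
For comparison, the userspace version of that ordering might look roughly like the sketch below, using POSIX condition variables. This is not percpu_ida code: struct tag_pool, tag_pool_init(), tag_alloc(), tag_free() and MAX_TAGS are made-up names for a single global freelist. The point is only that the mutex plus pthread_cond_wait() gives you the "check the condition, then sleep" step atomically, which is what the prepare_to_wait() ordering has to provide by hand in the kernel.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_TAGS 16   /* hypothetical pool capacity */

struct tag_pool {
    pthread_mutex_t lock;
    pthread_cond_t  wait;       /* signalled whenever a tag is freed */
    unsigned        nr_free;
    unsigned        tags[MAX_TAGS];
};

static void tag_pool_init(struct tag_pool *p, unsigned nr)
{
    assert(nr <= MAX_TAGS);
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->wait, NULL);
    for (unsigned i = 0; i < nr; i++)
        p->tags[i] = i;
    p->nr_free = nr;
}

/* Wait end: the condition is re-checked under the lock after every wakeup. */
static unsigned tag_alloc(struct tag_pool *p)
{
    pthread_mutex_lock(&p->lock);
    while (p->nr_free == 0)
        pthread_cond_wait(&p->wait, &p->lock);  /* atomically unlocks and sleeps */
    unsigned tag = p->tags[--p->nr_free];
    pthread_mutex_unlock(&p->lock);
    return tag;
}

/* Wakeup end: change the condition first, then wake a waiter. */
static void tag_free(struct tag_pool *p, unsigned tag)
{
    pthread_mutex_lock(&p->lock);
    assert(p->nr_free < MAX_TAGS);
    p->tags[p->nr_free++] = tag;
    pthread_cond_signal(&p->wait);
    pthread_mutex_unlock(&p->lock);
}
```

A waiter in tag_alloc() cannot miss a wakeup here, because tag_free() can only change nr_free while holding the same lock the waiter holds when it checks the condition.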