Message-ID: <20130612191425.GA7098@redhat.com>
Date:	Wed, 12 Jun 2013 21:14:25 +0200
From:	Oleg Nesterov <oleg@...hat.com>
To:	Kent Overstreet <koverstreet@...gle.com>
Cc:	linux-kernel@...r.kernel.org, Tejun Heo <tj@...nel.org>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Ingo Molnar <mingo@...hat.com>,
	Andi Kleen <andi@...stfloor.org>, Jens Axboe <axboe@...nel.dk>,
	"Nicholas A. Bellinger" <nab@...ux-iscsi.org>
Subject: Re: [PATCH] Percpu tag allocator

On 06/12, Kent Overstreet wrote:
>
> So we'd need at least an atomic counter, but a bitmap isn't really any
> more trouble and it lets us skip most of the percpu lists that are empty

Yes, yes, I understand.

> - which should make a real difference in scalability to huge nr_cpus.

But this is not obvious to me. I mean, I am not sure I understand why
this all is "optimal". In particular, I do not really understand the
"while (cpus_have_tags-- * TAG_CPU_SIZE > pool->nr_tags / 2)" check in
steal_tags(), even if "the workload should not span more cpus than
nr_tags / 128" holds. I guess this connects to the "we guarantee that
nr_tags / 2 can always be allocated" invariant, and to not wanting to
call steal_tags() too often, but cpus_have_tags * TAG_CPU_SIZE can
easily overestimate the number of free tags.

But I didn't read the patch carefully, and it is not that I think I
can suggest something better.

In short: please ignore ;)

> > > +enum {
> > > +	TAG_FAIL	= -1U,
> > > +	TAG_MAX		= TAG_FAIL - 1,
> > > +};
> >
> > This can probably go to .c, and it seems that TAG_MAX is not needed.
> > tag_init() can check "nr_tags >= TAG_FAIL" instead.
>
> Users need TAG_FAIL to check for allocation failure.

Ah, indeed, !__GFP_WAIT...

Hmm, but why gfp_t? Why not a "bool atomic" argument?

Probably this is because you expect that most callers will have a gfp
anyway. It looks a bit strange, but I won't argue.

> > > +		if (nr_free) {
> > > +			memcpy(tags->freelist,
> > > +			       remote->freelist,
> > > +			       sizeof(unsigned) * nr_free);
> > > +			smp_mb();
> > > +			remote->nr_free = 0;
> > > +			tags->nr_free = nr_free;
> > > +			return true;
> > > +		} else {
> > > +			remote->nr_free = 0;
> > > +		}
> >
> > Both branches clear remote->nr_free.
>
> Yeah, but clearing it has to happen after the memcpy() and the smp_mb().

Yes, yes, we need the mb() between the memcpy() and "remote->nr_free = 0".

> I couldn't find a way to combine them that I liked, but if you've got
> any suggestions I'm all ears.

Please ignore. Somehow I missed the fact that we need to either return
or continue, so we do need the "else" (or another check).

Oleg.
