Message-ID: <20111227220753.GH17712@google.com>
Date:	Tue, 27 Dec 2011 14:07:53 -0800
From:	Tejun Heo <tj@...nel.org>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Vivek Goyal <vgoyal@...hat.com>, avi@...hat.com, nate@...nel.net,
	cl@...ux-foundation.org, oleg@...hat.com, axboe@...nel.dk,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCHSET] block, mempool, percpu: implement percpu mempool
 and fix blkcg percpu alloc deadlock

Hello,

On Tue, Dec 27, 2011 at 01:25:01PM -0800, Andrew Morton wrote:
> umm, we've already declared that it is OK to completely waste this
> memory for the users (probably a majority) who will not be using
> these stats.

We're talking about a combinatorial number of cgroup-device pairs, of
which only a small subset is usually expected to be used.  Beyond the
absolute memory usage, there's a big advantage in showing the behavior
users would expect: if 1000 cgroups are doing IOs to 1000 devices,
it's expected to consume some amount of resource.

The whole io_context / blk_cgroup - request_queue association
mechanism is based on opportunistic allocation.  It might not be the
prettiest thing in the world but given the circumstances IMHO the
approach fits the constraints defined by the problem.
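
To illustrate what "opportunistic allocation" means here, a minimal
user-space sketch follows.  It is not the kernel code: struct assoc,
try_alloc_nowait(), the shared fallback entry and the failure knob are
all hypothetical names, and falling back to a shared entry is an
assumption made for the illustration.  The point is only that an
association is created the first time a pair is actually used, and
that the IO path copes if the non-blocking allocation fails.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_CGROUPS 4   /* hypothetical table dimensions */
#define MAX_DEVICES 4

/* Hypothetical per-(cgroup, device) association with one counter. */
struct assoc {
    unsigned long ios;
};

static struct assoc *table[MAX_CGROUPS][MAX_DEVICES]; /* NULL until first use */
static struct assoc fallback;       /* shared entry used when allocation fails */
static int simulate_alloc_failure;  /* knob for the example only */

/* Stand-in for an allocation that must not block and therefore may fail. */
static struct assoc *try_alloc_nowait(void)
{
    if (simulate_alloc_failure)
        return NULL;
    return calloc(1, sizeof(struct assoc));
}

/* Create the association on first use; never fail the caller. */
static struct assoc *lookup_or_create(int cg, int dev)
{
    if (!table[cg][dev])
        table[cg][dev] = try_alloc_nowait();
    return table[cg][dev] ? table[cg][dev] : &fallback;
}

/* IO path: account against whatever entry we managed to get. */
static void account_io(int cg, int dev)
{
    lookup_or_create(cg, dev)->ios++;
}

/* Number of pairs that actually hold memory. */
static int nr_allocated(void)
{
    int n = 0;

    for (int cg = 0; cg < MAX_CGROUPS; cg++)
        for (int dev = 0; dev < MAX_DEVICES; dev++)
            if (table[cg][dev])
                n++;
    return n;
}
```

With this scheme memory is consumed only for pairs that have issued
IO, not for every possible combination.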

Given the restricted nature of percpu allocation, it would be nice to
punt it to GFP_KERNEL context *somewhere*, and for the block layer
that somewhere probably can only be userland access.  I just don't see
that fitting better here.  The suggested alternative seems much
nastier, with userland-visible side effects and the possibility of a
combinatorial increase in memory usage for something as benign as a
single cat of the stat files.

Also, such erratic userland-visible behavior is a deviation from the
current one, and at the same time we would be bound to the
idiosyncrasies later when we can improve the implementation.

I can't see how that can be a better tradeoff.  It shifts the problem
to an even more cumbersome corner.

> Also, has this stuff been tested at that scale?  I fear to think what
> 10000 allocations will do to fragmentation of the vmalloc() arena.

The percpu allocator doesn't use vmalloc directly.  It maps address
ranges (each at least 32k and usually much larger) from vmalloc space
and allocates from them using a simplistic extent allocator.
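
As a rough picture of what an extent allocator over one such range
looks like, here is a minimal first-fit sketch in user-space C.  It is
not the code in mm/percpu.c: struct chunk, struct extent, CHUNK_SIZE
and MAX_EXTENTS are hypothetical, and alignment and per-cpu
replication are ignored.  It only shows that individual allocations
are carved out of an already mapped range by splitting and merging
extents, so they do not each touch the vmalloc arena.

```c
#include <stddef.h>

#define CHUNK_SIZE  (32 * 1024) /* hypothetical; 32k is the minimum mentioned above */
#define MAX_EXTENTS 64          /* hypothetical cap on extents per chunk */

/* The extents of a chunk tile [0, CHUNK_SIZE) in address order. */
struct extent {
    size_t size;
    int    used;
};

struct chunk {
    struct extent map[MAX_EXTENTS];
    int           nr;
};

/* A fresh chunk is a single free extent covering the whole range. */
static void chunk_init(struct chunk *c)
{
    c->map[0].size = CHUNK_SIZE;
    c->map[0].used = 0;
    c->nr = 1;
}

/* Fold extent i + 1 into extent i. */
static void merge_next(struct chunk *c, int i)
{
    c->map[i].size += c->map[i + 1].size;
    for (int j = i + 1; j < c->nr - 1; j++)
        c->map[j] = c->map[j + 1];
    c->nr--;
}

/* First fit: returns the offset inside the chunk, or -1 on failure. */
static long chunk_alloc(struct chunk *c, size_t size)
{
    size_t off = 0;

    if (size == 0)
        return -1;
    for (int i = 0; i < c->nr; i++) {
        struct extent *e = &c->map[i];

        if (!e->used && e->size >= size) {
            if (e->size > size) {
                /* Split off the tail as a new free extent. */
                if (c->nr == MAX_EXTENTS)
                    return -1;  /* extent map is full */
                for (int j = c->nr; j > i + 1; j--)
                    c->map[j] = c->map[j - 1];
                c->map[i + 1].size = e->size - size;
                c->map[i + 1].used = 0;
                c->nr++;
                e->size = size;
            }
            e->used = 1;
            return (long)off;
        }
        off += e->size;
    }
    return -1;
}

/* Free the extent starting at offset and merge it with free neighbours. */
static int chunk_free(struct chunk *c, size_t offset)
{
    size_t off = 0;
    int i;

    for (i = 0; i < c->nr; i++) {
        if (off == offset)
            break;
        off += c->map[i].size;
    }
    if (i == c->nr || !c->map[i].used)
        return -1;
    c->map[i].used = 0;
    if (i + 1 < c->nr && !c->map[i + 1].used)
        merge_next(c, i);
    if (i > 0 && !c->map[i - 1].used)
        merge_next(c, i - 1);
    return 0;
}
```

Many small allocations end up packed into a few large mapped ranges,
and freed neighbours are merged back together inside the chunk.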

Thanks.

-- 
tejun