Message-ID: <20170719221313.GB92176@dennisz-mbp.dhcp.thefacebook.com>
Date:   Wed, 19 Jul 2017 18:13:14 -0400
From:   Dennis Zhou <dennisz@...com>
To:     Josef Bacik <josef@...icpanda.com>
CC:     Tejun Heo <tj@...nel.org>, Christoph Lameter <cl@...ux.com>,
        <kernel-team@...com>, <linux-kernel@...r.kernel.org>,
        <linux-mm@...ck.org>, Dennis Zhou <dennisszhou@...il.com>
Subject: Re: [PATCH 09/10] percpu: replace area map allocator with bitmap
 allocator

Hi Josef,

Thanks for taking a look at my code.

On Wed, Jul 19, 2017 at 07:16:35PM +0000, Josef Bacik wrote:
> 
> Actually I decided I do want to complain about this.  Have you considered making
> chunks statically sized, like slab does?  We could avoid this whole bound_map
> thing completely and save quite a few cycles trying to figure out how big our
> allocation was.  Thanks,

I did consider something along the lines of a slab allocator, but
ultimately utilization and fragmentation were why I decided against it.

Percpu memory works by giving each cpu its own copy of the object. This
lets cpus avoid cache-coherence traffic when accessing and manipulating
the object. To do this, the percpu allocator creates chunks to serve
each allocation out of. Because each cpu has its own copy, there is a
high cost to keeping each chunk lying around (and to this memory in
general).
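
To illustrate the "own copy per cpu" point, here is a minimal sketch of
how percpu memory is typically consumed through the standard
linux/percpu.h API (the struct and function names below are made up for
the example; this shows the consumer side, not the allocator internals):

#include <linux/percpu.h>
#include <linux/cpumask.h>
#include <linux/errno.h>

/* Example object; each possible cpu gets its own private copy of it. */
struct hitcount {
	unsigned long hits;
};

static struct hitcount __percpu *counters;

static int counters_init(void)
{
	/* One allocation hands back a cookie covering every possible cpu. */
	counters = alloc_percpu(struct hitcount);
	return counters ? 0 : -ENOMEM;
}

static void counters_bump(void)
{
	/* Only this cpu's copy is touched, so no cache lines bounce around. */
	this_cpu_inc(counters->hits);
}

static unsigned long counters_total(void)
{
	unsigned long sum = 0;
	int cpu;

	/* A reader walks every cpu's copy via per_cpu_ptr(). */
	for_each_possible_cpu(cpu)
		sum += per_cpu_ptr(counters, cpu)->hits;
	return sum;
}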

A slab allocator takes the liberty of caching commonly used sizes and
accepting internal fragmentation in exchange for performance.
Unfortunately, the percpu memory allocator does not necessarily know in
advance what sizes are going to be allocated. It would need to keep many
slabs around to serve each allocation, which can be quite expensive. In
the worst case, long-lived percpu allocations can keep entire slabs
alive, as there is no way to consolidate them once addresses have been
handed out. Additionally, any internal fragmentation caused by
ill-fitting objects is amplified by the number of possible cpus.
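
As a purely illustrative example of that amplification: with fixed
power-of-two size classes, a 48-byte percpu allocation would have to be
served from a 64-byte class, wasting 16 bytes per copy; on a machine
with 1024 possible cpus that single allocation already carries 16KB of
internal fragmentation, before counting any slabs kept alive by a few
long-lived stragglers.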

Thanks,
Dennis
