Message-ID: <5126423F.7040705@linux.vnet.ibm.com>
Date:	Thu, 21 Feb 2013 09:50:23 -0600
From:	Seth Jennings <sjenning@...ux.vnet.ibm.com>
To:	Ric Mason <ric.masonn@...il.com>
CC:	Andrew Morton <akpm@...ux-foundation.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Nitin Gupta <ngupta@...are.org>,
	Minchan Kim <minchan@...nel.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	Dan Magenheimer <dan.magenheimer@...cle.com>,
	Robert Jennings <rcj@...ux.vnet.ibm.com>,
	Jenifer Hopper <jhopper@...ibm.com>,
	Mel Gorman <mgorman@...e.de>,
	Johannes Weiner <jweiner@...hat.com>,
	Rik van Riel <riel@...hat.com>,
	Larry Woodman <lwoodman@...hat.com>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Dave Hansen <dave@...ux.vnet.ibm.com>,
	Joe Perches <joe@...ches.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, devel@...verdev.osuosl.org
Subject: Re: [PATCHv5 2/8] zsmalloc: add documentation

On 02/21/2013 02:49 AM, Ric Mason wrote:
> On 02/19/2013 03:16 AM, Seth Jennings wrote:
>> On 02/16/2013 12:21 AM, Ric Mason wrote:
>>> On 02/14/2013 02:38 AM, Seth Jennings wrote:
>>>> This patch adds a documentation file for zsmalloc at
>>>> Documentation/vm/zsmalloc.txt
>>>>
>>>> Signed-off-by: Seth Jennings <sjenning@...ux.vnet.ibm.com>
>>>> ---
>>>>    Documentation/vm/zsmalloc.txt |   68
>>>> +++++++++++++++++++++++++++++++++++++++++
>>>>    1 file changed, 68 insertions(+)
>>>>    create mode 100644 Documentation/vm/zsmalloc.txt
>>>>
>>>> diff --git a/Documentation/vm/zsmalloc.txt
>>>> b/Documentation/vm/zsmalloc.txt
>>>> new file mode 100644
>>>> index 0000000..85aa617
>>>> --- /dev/null
>>>> +++ b/Documentation/vm/zsmalloc.txt
>>>> @@ -0,0 +1,68 @@
>>>> +zsmalloc Memory Allocator
>>>> +
>>>> +Overview
>>>> +
>>>> +zsmalloc is a new slab-based memory allocator for storing
>>>> +compressed pages.  It is designed for low fragmentation and
>>>> +a high allocation success rate on large, but <= PAGE_SIZE,
>>>> +allocations.
>>>> +
>>>> +zsmalloc differs from the kernel slab allocator in two primary
>>>> +ways to achieve these design goals.
>>>> +
>>>> +zsmalloc never requires high order page allocations to back
>>>> +slabs, or "size classes" in zsmalloc terms. Instead it allows
>>>> +multiple single-order pages to be stitched together into a
>>>> +"zspage" which backs the slab.  This allows for higher allocation
>>>> +success rate under memory pressure.
>>>> +
>>>> +Also, zsmalloc allows objects to span page boundaries within the
>>>> +zspage.  This allows for lower fragmentation than could be had
>>>> +with the kernel slab allocator for objects between PAGE_SIZE/2
>>>> +and PAGE_SIZE.  With the kernel slab allocator, if a page compresses
>>>> +to 60% of its original size, the memory savings gained through
>>>> +compression are lost to fragmentation because another object of
>>>> +the same size can't be stored in the leftover space.
>>>> +
>>>> +This ability to span pages results in zsmalloc allocations not being
>>>> +directly addressable by the user.  The user is given a
>>>> +non-dereferenceable handle in response to an allocation request.
>>>> +That handle must be mapped, using zs_map_object(), which returns
>>>> +a pointer to the mapped region that can be used.  The mapping is
>>>> +necessary since the object data may reside in two different
>>>> +noncontiguous pages.
>>> Do you mean that a zsmalloc object must be mapped after allocation
>>> because the object data may reside in two different noncontiguous pages?
>> Yes, that is one reason for the mapping.  The other reason (more of an
>> added bonus) is below.
>>
>>>> +
>>>> +For 32-bit systems, zsmalloc has the added benefit of being
>>>> +able to back slabs with HIGHMEM pages, something not possible
>>> What's the meaning of "back slabs with HIGHMEM pages"?
>> By HIGHMEM, I'm referring to the HIGHMEM memory zone on 32-bit systems
>> with more than 1GB (actually a little less) of RAM.  The upper 3GB
>> of the 4GB address space, depending on kernel build options, is not
>> directly addressable by the kernel, but can be mapped into the kernel
>> address space with functions like kmap() or kmap_atomic().
>>
>> These pages can't be used by slab/slub because they are not
>> continuously mapped into the kernel address space.  However, since
>> zsmalloc requires a mapping anyway to handle objects that span
>> non-contiguous page boundaries, we do the kernel mapping as part of
>> the process.
>>
>> So zspages, the conceptual slab in zsmalloc backed by single-order
>> pages, can include pages from the HIGHMEM zone as well.
> 
> Thanks for the clarification.
> In http://lwn.net/Articles/537422/, your article about zswap on LWN, you wrote:
>  "Additionally, the kernel slab allocator does not allow objects that are
> less than a page in size to span a page boundary. This means that if an
> object is PAGE_SIZE/2 + 1 bytes in size, it effectively uses an entire
> page, resulting in ~50% waste. Hence there are *no kmalloc() cache sizes*
> between PAGE_SIZE/2 and PAGE_SIZE."
> Are you sure? It seems that the kmalloc caches support large sizes; you can
> check in include/linux/kmalloc_sizes.h

Yes, kmalloc can allocate large objects > PAGE_SIZE, but there are no
cache sizes _between_ PAGE_SIZE/2 and PAGE_SIZE.  For example, on a
system with 4k pages, there are no caches between kmalloc-2048 and
kmalloc-4096.

Seth
