Date:   Wed, 29 Jul 2020 11:34:23 +0800
From:   Jia He <justin.he@....com>
To:     Dan Williams <dan.j.williams@...el.com>,
        Vishal Verma <vishal.l.verma@...el.com>,
        Mike Rapoport <rppt@...ux.ibm.com>,
        David Hildenbrand <david@...hat.com>
Cc:     Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Dave Jiang <dave.jiang@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Steve Capper <steve.capper@....com>,
        Mark Rutland <mark.rutland@....com>,
        Logan Gunthorpe <logang@...tatee.com>,
        Anshuman Khandual <anshuman.khandual@....com>,
        Hsin-Yi Wang <hsinyi@...omium.org>,
        Jason Gunthorpe <jgg@...pe.ca>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Kees Cook <keescook@...omium.org>,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        linux-nvdimm@...ts.01.org, linux-mm@...ck.org,
        Wei Yang <richardw.yang@...ux.intel.com>,
        Pankaj Gupta <pankaj.gupta.linux@...il.com>,
        Ira Weiny <ira.weiny@...el.com>, Kaly Xin <Kaly.Xin@....com>,
        Jia He <justin.he@....com>
Subject: [RFC PATCH 5/6] device-dax: relax the memblock size alignment for kmem_start

Previously, kmem_start in dev_dax_kmem_probe() had to be aligned to the
memory block size, which is 1G on arm64 (SECTION_SIZE_BITS is 30). Even
with Dan Williams' sub-section patch series, this did not help when adding
dax pmem as kmem memory blocks:
$ ndctl create-namespace -e namespace0.0 --mode=devdax --map=dev -s 2g -f -a 2M
$ echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind
$ echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id
$ cat /proc/iomem
...
23c000000-23fffffff : System RAM
  23dd40000-23fecffff : reserved
  23fed0000-23fffffff : reserved
240000000-33fdfffff : Persistent Memory
  240000000-2403fffff : namespace0.0
  280000000-2bfffffff : dax0.0          <- boundaries are aligned to 1G
    280000000-2bfffffff : System RAM (kmem)
$ lsmem
RANGE                                 SIZE  STATE REMOVABLE BLOCK
0x0000000040000000-0x000000023fffffff   8G online       yes   1-8
0x0000000280000000-0x00000002bfffffff   1G online       yes    10

Memory block size:         1G
Total online memory:       9G
Total offline memory:      0B
...
Hence there is a big gap between 0x2403fffff and 0x280000000 due to the 1G
alignment on arm64. Moreover, only 1G of memory is added although 2G was
requested.

On x86, the gap is relatively small because SECTION_SIZE_BITS is 27 (128M).

Besides decreasing SECTION_SIZE_BITS on arm64, we can relax the alignment
when adding the kmem range.
After this patch:
240000000-33fdfffff : Persistent Memory
  240000000-2421fffff : namespace0.0
  242400000-2bfffffff : dax0.0
    242400000-2bfffffff : System RAM (kmem)
$ lsmem
RANGE                                 SIZE  STATE REMOVABLE BLOCK
0x0000000040000000-0x00000002bfffffff  10G online       yes  1-10

Memory block size:         1G
Total online memory:      10G
Total offline memory:      0B

Note: blocks 9-10 are the newly hotplugged ones.

This patch removes the strict alignment to memory_block_size_bytes(), but
still keeps the alignment constraint from online_pages_range().

Signed-off-by: Jia He <justin.he@....com>
---
 drivers/dax/kmem.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)
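
For reviewers, the old and new alignment arithmetic can be compared in a
small userspace sketch. This is not kernel code. It assumes arm64 with 4K
pages (PAGE_SHIFT 12), MAX_ORDER 11 and a 1G memory block size. The names
struct range, align_up(), get_order_sketch(), trim(), kmem_range_old() and
kmem_range_new() are hypothetical stand-ins, and unlike the patch the
sketch also guards against the size underflowing.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed configuration: arm64, 4K pages, MAX_ORDER 11, 1G memory blocks. */
#define PAGE_SHIFT	12
#define MAX_ORDER	11
#define MEMBLOCK_SZ	(1ULL << 30)

struct range {
	uint64_t start;
	uint64_t size;
};

/* Round x up to a multiple of a; a must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	assert(a && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/*
 * Userspace stand-in for the kernel's get_order(): the smallest order
 * such that (1 << (order + PAGE_SHIFT)) >= size.
 */
static int get_order_sketch(uint64_t size)
{
	int order = 0;

	while ((1ULL << (order + PAGE_SHIFT)) < size)
		order++;
	return order;
}

/* Move start up to 'align' and trim the size down to a multiple of it. */
static struct range trim(uint64_t start, uint64_t size, uint64_t align)
{
	struct range r;
	uint64_t skipped;

	r.start = align_up(start, align);
	skipped = r.start - start;
	/* Extra guard (not in the patch): nothing left if we skipped it all. */
	r.size = skipped < size ? (size - skipped) & ~(align - 1) : 0;
	return r;
}

/* Before the patch: align to the memory block size. */
static struct range kmem_range_old(uint64_t start, uint64_t size)
{
	return trim(start, size, MEMBLOCK_SZ);
}

/* After the patch: align to min(MAX_ORDER - 1, get_order(size)) pages. */
static struct range kmem_range_new(uint64_t start, uint64_t size)
{
	int order = get_order_sketch(size);

	if (order > MAX_ORDER - 1)
		order = MAX_ORDER - 1;
	return trim(start, size, 1ULL << (order + PAGE_SHIFT));
}
```

Assuming a resource that starts at 0x242200000 with size 0x7de00000
(values inferred from the "after" /proc/iomem listing, not taken from the
driver), kmem_range_old() yields start 0x280000000 and size 0x40000000
(1G), while kmem_range_new() yields start 0x242400000 and size 0x7dc00000,
because the alignment drops from 1G to 4M.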

diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index d77786dc0d92..849d0706dfe0 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -30,9 +30,20 @@ int dev_dax_kmem_probe(struct device *dev)
 	const char *new_res_name;
 	int numa_node;
 	int rc;
+	int order;
 
-	/* Hotplug starting at the beginning of the next block: */
-	kmem_start = ALIGN(res->start, memory_block_size_bytes());
+	/*
+	 * kmem_start needn't be aligned to memory_block_size_bytes(), but
+	 * online_pages_range() still requires aligning it and kmem_size.
+	 */
+	kmem_size = resource_size(res);
+	order = min_t(int, MAX_ORDER - 1, get_order(kmem_size));
+	kmem_start = ALIGN(res->start, 1ul << (order + PAGE_SHIFT));
+	/* Adjust the size down to compensate for moving up kmem_start: */
+	kmem_size -= kmem_start - res->start;
+	/* Align the size down to cover only complete blocks: */
+	kmem_size &= ~((1ul << (order + PAGE_SHIFT)) - 1);
+	kmem_end = kmem_start + kmem_size;
 
 	/*
 	 * Ensure good NUMA information for the persistent memory.
@@ -48,13 +59,6 @@ int dev_dax_kmem_probe(struct device *dev)
 			numa_node, res);
 	}
 
-	kmem_size = resource_size(res);
-	/* Adjust the size down to compensate for moving up kmem_start: */
-	kmem_size -= kmem_start - res->start;
-	/* Align the size down to cover only complete blocks: */
-	kmem_size &= ~(memory_block_size_bytes() - 1);
-	kmem_end = kmem_start + kmem_size;
-
 	new_res_name = kstrdup(dev_name(dev), GFP_KERNEL);
 	if (!new_res_name)
 		return -ENOMEM;
-- 
2.17.1
