lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <CAOm6qn=FVeTpH54wGDFMHuCOeYtvoTx30ktnv9-w3Nh8RMofEA@mail.gmail.com>
Date: Wed, 25 Dec 2024 16:09:12 +0800
From: Weikang Guo <guoweikang.kernel@...il.com>
To: Catalin Marinas <catalin.marinas@....com>, Andrew Morton <akpm@...ux-foundation.org>
Cc: Linux Memory Management List <linux-mm@...ck.org>, linux-kernel@...r.kernel.org
Subject: Potential Double Scanning of struct page Physical Memory by kmemleak

Dear Catalin and Andrew,

I hope this message finds you well.

While reviewing the implementation of kmemleak, I noticed a potential
issue where the physical memory occupied by struct page may be scanned
twice by kmemleak. To verify this hypothesis, I added some logging to
the code, and I would like to discuss my findings with you. Below are
the details of my environment, the logic I used, and the relevant log
output.

Test Platform:
Architecture: ARM64
Platform: qemu
branch: local branch  based on V6.12
defconfig: use defconfig with CONFIG_KMEMLEAK open

Problem Description:

When CONFIG_SPARSEMEM is enabled, the memory for `struct page`objects
in the sections is allocated via `sparse_buffer` using the
`memblock_alloc_xxx` interface. Since memblock does not explicitly
specify `MEMBLOCK_ALLOC_NOLEAKTRACE`, kmemleak will treat the memory
as a gray object with mincount=0. In other words, the physical memory
occupied by `struct page` will be scanned by kmemleak as a gray object
during the scan thread.

Additionally, kmemleak also traverses and scans valid struct page
objects in the zone . As a result, the physical memory occupied by
struct page may end up being scanned twice by scan thread.

Logging Added:

To investigate further, I added the following logs:

Logs for the memory allocated for `struct page` objects through sparse memblock.

diff --git a/mm/sparse.c b/mm/sparse.c
index 3174b3ca3e9e..f2465b394344 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -362,6 +362,9 @@ static void __init sparse_buffer_init(unsigned
long size, int nid)
         */
        sparsemap_buf = memmap_alloc(size, section_map_size(), addr, nid, true);
        sparsemap_buf_end = sparsemap_buf + size;
+       pr_info("sparse buf alloc VA: [0x%08lx: 0x%08lx] to PA: \
+               [0x%08lx: 0x%08lx]\n",sparsemap_buf, sparsemap_buf_end,
+               virt_to_phys(sparsemap_buf), virt_to_phys(sparsemap_buf_end));
 #ifndef CONFIG_SPARSEMEM_VMEMMAP
        memmap_boot_pages_add(DIV_ROUND_UP(size, PAGE_SIZE));
 #endif

Logs to show the mapping between VMEMAP VA and PA.

diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index c0388b2e959d..f1a312f2a281 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -342,6 +342,8 @@ int __meminit vmemmap_populate_hugepages(unsigned
long start, unsigned long end,

                        p = vmemmap_alloc_block_buf(PMD_SIZE, node, altmap);
                        if (p) {
+                               pr_info("==== vmemap 2M: VA:0x%08lx to
 PA:0x%08lx \n",
+                                       start, virt_to_phys(p));
                                vmemmap_set_pmd(pmd, p, node, addr, next);
                                continue;
                        } else if (altmap) {

Logs showing the memblock and struct page objects scanned by kmemleak.

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 0efa41d77bcc..a803825dcb18 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1529,7 +1529,10 @@ static void scan_object(struct kmemleak_object *object)
                                (void *)object->pointer;
                void *end = start + object->size;
                void *next;
-
+                if (object->flags & OBJECT_PHYS) {
+                 pr_info("======= scan memblock
OBJECT_PHYS:[0x%08lx:0x%08lx]\n",
+                        start, end);
+               }
                do {
                        next = min(start + MAX_SCAN_SIZE, end);
                        scan_block(start, next, object);
@@ -1677,6 +1680,8 @@ static void kmemleak_scan(void)
         * Struct page scanning for each node.
         */
        for_each_populated_zone(zone) {
+               static int print_scan_page = 0;
+
                unsigned long start_pfn = zone->zone_start_pfn;
                unsigned long end_pfn = zone_end_pfn(zone);
                unsigned long pfn;
@@ -1696,6 +1701,14 @@ static void kmemleak_scan(void)
                        /* only scan if page is in use */
                        if (page_count(page) == 0)
                                continue;
+
+
+                       if (print_scan_page == 0) {
+                               pr_info("======= scan page
[0x%08lx:0x%08lx] \n",
+                       page, page+1);
+                               print_scan_page++;
+                       }
+
                        scan_block(page, page + 1, NULL);
                }
        }

The log output is as follows:

[    0.000000] sparse buf alloc VA: [0xffff39253b600000:
0xffff39253f600000] to PA: [0x13b600000: 0x13f600000]
[    0.000000] ==== vmemap 2M: VA:0xfffffee451000000 to  PA:0x13b600000
[    0.000000] ==== vmemap 2M: VA:0xfffffee451200000 to  PA:0x13b800000
[    0.000000] ==== vmemap 2M: VA:0xfffffee451400000 to  PA:0x13ba00000
[    0.000000] ==== vmemap 2M: VA:0xfffffee451600000 to  PA:0x13bc00000
[    0.000000] ==== vmemap 2M: VA:0xfffffee451800000 to  PA:0x13be00000
[    0.000000] ==== vmemap 2M: VA:0xfffffee451a00000 to  PA:0x13c000000
[    0.000000] ==== vmemap 2M: VA:0xfffffee451c00000 to  PA:0x13c200000
[    0.000000] ==== vmemap 2M: VA:0xfffffee451e00000 to  PA:0x13c400000
[    0.000000] ==== vmemap 2M: VA:0xfffffee452000000 to  PA:0x13c600000
[    0.000000] ==== vmemap 2M: VA:0xfffffee452200000 to  PA:0x13c800000
[    0.000000] ==== vmemap 2M: VA:0xfffffee452400000 to  PA:0x13ca00000
[    0.000000] ==== vmemap 2M: VA:0xfffffee452600000 to  PA:0x13cc00000
[    0.000000] ==== vmemap 2M: VA:0xfffffee452800000 to  PA:0x13ce00000
[    0.000000] ==== vmemap 2M: VA:0xfffffee452a00000 to  PA:0x13d000000
[    0.000000] ==== vmemap 2M: VA:0xfffffee452c00000 to  PA:0x13d200000
[    0.000000] ==== vmemap 2M: VA:0xfffffee452e00000 to  PA:0x13d400000
[    0.000000] ==== vmemap 2M: VA:0xfffffee453000000 to  PA:0x13d600000
[    0.000000] ==== vmemap 2M: VA:0xfffffee453200000 to  PA:0x13d800000
[    0.000000] ==== vmemap 2M: VA:0xfffffee453400000 to  PA:0x13da00000
[    0.000000] ==== vmemap 2M: VA:0xfffffee453600000 to  PA:0x13dc00000
[    0.000000] ==== vmemap 2M: VA:0xfffffee453800000 to  PA:0x13de00000
[    0.000000] ==== vmemap 2M: VA:0xfffffee453a00000 to  PA:0x13e000000
[    0.000000] ==== vmemap 2M: VA:0xfffffee453c00000 to  PA:0x13e200000
[    0.000000] ==== vmemap 2M: VA:0xfffffee453e00000 to  PA:0x13e400000
[    0.000000] ==== vmemap 2M: VA:0xfffffee454000000 to  PA:0x13e600000
[    0.000000] ==== vmemap 2M: VA:0xfffffee454200000 to  PA:0x13e800000
[    0.000000] ==== vmemap 2M: VA:0xfffffee454400000 to  PA:0x13ea00000
[    0.000000] ==== vmemap 2M: VA:0xfffffee454600000 to  PA:0x13ec00000
[    0.000000] ==== vmemap 2M: VA:0xfffffee454800000 to  PA:0x13ee00000
[    0.000000] ==== vmemap 2M: VA:0xfffffee454a00000 to  PA:0x13f000000
[    0.000000] ==== vmemap 2M: VA:0xfffffee454c00000 to  PA:0x13f200000
[    0.000000] ==== vmemap 2M: VA:0xfffffee454e00000 to  PA:0x13f400000


[   62.480979] kmemleak: ======= scan page
[0xfffffee451008400:0xfffffee451008440]
[   62.705876] kmemleak: ======= scan memblock
OBJECT_PHYS:[0xffff39253b600000:0xffff39253f600000]

scan_page: VMEMAP VA is [ 0xfffffee451008400 -  0xfffffee451008440]
scan_page:Real   PA is      [0x13b608400 - 0x13b608440]
scan OBJECT_PHYS: VA is [0xffff39253b600000:0xffff39253f600000]
scan OBJECT_PHYS: PA is [0x13b600000:0x13f600000]

The first record page  `0x13b608400`  was scanned twice

Could you please help confirm whether this issue exists and if it
could lead to any unintended consequences? I would appreciate any
insights or suggestions you may have.

Possible solution: Specify `MEMBLOCK_ALLOC_NOLEAKTRACE` when alloc
`struct page memory`

Thank you for your time and assistance.

Best regards,
Guo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ