linux-kernel - Re: [PATCH] mm: hugetlb: independent PMD page table shared count

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <8e59d2bd-77d3-41bc-83b7-532b018db4e2@redhat.com>
Date: Mon, 16 Dec 2024 16:34:21 +0100
From: David Hildenbrand <david@...hat.com>
To: Liu Shixin <liushixin2@...wei.com>,
 Andrew Morton <akpm@...ux-foundation.org>,
 Muchun Song <muchun.song@...ux.dev>,
 Kenneth W Chen <kenneth.w.chen@...el.com>,
 Kefeng Wang <wangkefeng.wang@...wei.com>, Nanyong Sun <sunnanyong@...wei.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: hugetlb: independent PMD page table shared count

On 14.12.24 11:44, Liu Shixin wrote:
> The folio refcount may be increased unexpectly through try_get_folio() by
> caller such as split_huge_pages. In huge_pmd_unshare(), we use refcount to
> check whether a pmd page table is shared. The check is incorrect if the
> refcount is increased by the above caller, and this can cause the page
> table leaked:

Are you sure it is "leaked" ?

I assume what happens is that we end up freeing a page table without 
calling its constructor. That's why page freeing code complains about 
"nonzero mapcount" (overlayed by something else).

 > >   BUG: Bad page state in process sh  pfn:109324
>   page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x66 pfn:0x109324
>   flags: 0x17ffff800000000(node=0|zone=2|lastcpupid=0xfffff)
>   page_type: f2(table)
>   raw: 017ffff800000000 0000000000000000 0000000000000000 0000000000000000
>   raw: 0000000000000066 0000000000000000 00000000f2000000 0000000000000000
>   page dumped because: nonzero mapcount
>   ...
>   CPU: 31 UID: 0 PID: 7515 Comm: sh Kdump: loaded Tainted: G    B              6.13.0-rc2master+ #7
>   Tainted: [B]=BAD_PAGE
>   Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
>   Call trace:
>    show_stack+0x20/0x38 (C)
>    dump_stack_lvl+0x80/0xf8
>    dump_stack+0x18/0x28
>    bad_page+0x8c/0x130
>    free_page_is_bad_report+0xa4/0xb0
>    free_unref_page+0x3cc/0x620
>    __folio_put+0xf4/0x158
>    split_huge_pages_all+0x1e0/0x3e8
>    split_huge_pages_write+0x25c/0x2d8
>    full_proxy_write+0x64/0xd8
>    vfs_write+0xcc/0x280
>    ksys_write+0x70/0x110
>    __arm64_sys_write+0x24/0x38
>    invoke_syscall+0x50/0x120
>    el0_svc_common.constprop.0+0xc8/0xf0
>    do_el0_svc+0x24/0x38
>    el0_svc+0x34/0x128
>    el0t_64_sync_handler+0xc8/0xd0
>    el0t_64_sync+0x190/0x198
> 
> The issue may be triggered by damon, offline_page, page_idle etc. which
> will increase the refcount of page table.

Right, many do have a racy folio_test_lru() check in there, that 
prevents "most of the harm", but not all of them.


-- 
Cheers,

David / dhildenb