Message-ID: <71030aef-7714-ed6d-f537-9141c7501002@huawei.com>
Date: Sun, 28 Nov 2021 17:45:40 +0800
From: Nanyong Sun <sunnanyong@...wei.com>
To: <hughd@...gle.com>, <linux-mm@...ck.org>,
<linux-kernel@...r.kernel.org>
CC: <akpm@...ux-foundation.org>, <willy@...radead.org>,
<liam.howlett@...cle.com>, <linmiaohe@...wei.com>,
<songmuchun@...edance.com>, <chenli@...ontech.com>,
<bharata@...ux.ibm.com>, <aarcange@...hat.com>
Subject: Re: [BUG] use-after-free in ksm_might_need_to_copy with KSM and swap
Hi Hugh,
Maybe this is a normal phenomenon and the KASAN error can be ignored?
After analyzing the vmcore, I found that when this happens,
page->index is not equal to linear_page_index(vma, address)
and the page is uptodate, so ksm_might_need_to_copy would
go on to allocate a new page and copy into it.
So, although the anon_vma was freed, maybe this is a normal situation in which
a KSM-related page needs to be copied on a swap-in fault?
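To make the index comparison concrete, here is a small userspace sketch of how I understand the non-hugetlb case of linear_page_index(). It is only a model: struct vma_model and model_linear_page_index are hypothetical names, PAGE_SHIFT is assumed to be 12 (4 KiB pages), and the hugetlb path is left out.

```c
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical stand-in for the two vm_area_struct fields used here. */
struct vma_model {
    uint64_t vm_start; /* first virtual address covered by the mapping */
    uint64_t vm_pgoff; /* offset of the mapping, counted in pages */
};

/*
 * Simplified model of linear_page_index() for a non-hugetlb VMA:
 * page offset of the address inside the VMA, plus the VMA's own offset.
 */
static uint64_t model_linear_page_index(const struct vma_model *vma,
                                        uint64_t address)
{
    return ((address - vma->vm_start) >> MODEL_PAGE_SHIFT) + vma->vm_pgoff;
}
```

With made-up values vm_start = 0x7f0000000000, vm_pgoff = 0x100 and a fault address of 0x7f0000003000, the model gives index 0x103. A swapcache page whose page->index holds any other value would not match.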
I have reviewed the commit history of ksm_might_need_to_copy, but I
cannot fully understand the code logic. What
does it mean when:
page is from swapcache but not a ksm page
&& page's anon_vma is not null
&& (anon_vma->root != vma->anon_vma->root || page->index !=
linear_page_index(vma, address))
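The combination above can be written down as a userspace sketch of the checks as I read them in the v5.15 source. All type and function names below (anon_vma_model, page_model, fault_model, model_needs_copy) are hypothetical stand-ins and not kernel APIs. The uptodate check reflects my reading of the code, not something confirmed in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct anon_vma_model {
    struct anon_vma_model *root;
};

struct page_model {
    bool is_ksm;                     /* models PageKsm(page) */
    bool has_stable_node;            /* models page_stable_node(page) != NULL */
    bool uptodate;                   /* models PageUptodate(page) */
    struct anon_vma_model *anon_vma; /* models page_anon_vma(page) */
    unsigned long index;             /* models page->index */
};

struct fault_model {
    struct anon_vma_model *vma_anon_vma; /* models vma->anon_vma */
    unsigned long linear_index;  /* models linear_page_index(vma, address) */
    bool ksm_unmerging;          /* models ksm_run & KSM_RUN_UNMERGE */
};

/* Returns true when the model would go on to allocate and copy a new page. */
static bool model_needs_copy(const struct page_model *page,
                             const struct fault_model *fault)
{
    if (page->is_ksm) {
        if (page->has_stable_node && !fault->ksm_unmerging)
            return false; /* no need to copy it */
    } else if (!page->anon_vma) {
        return false;     /* no need to copy it */
    } else if (page->anon_vma->root == fault->vma_anon_vma->root &&
               page->index == fault->linear_index) {
        /* The ->root read above corresponds to the access KASAN flags. */
        return false;     /* still no need to copy it */
    }
    if (!page->uptodate)
        return false;     /* assumption: the caller reports the error */
    return true;
}
```

In this model the case I am asking about is the last branch failing: the page has an anon_vma, but either the roots differ or the indexes differ, so the function falls through to the copy.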
Thanks.
On 2021/11/27 20:52, Nanyong Sun wrote:
> The latest release kernel v5.15.5 also reproduces this problem. It
> seems related to KSM because
>
> we can't reproduce it when KSM is disabled with "echo 0 >
> /sys/kernel/mm/ksm/run".
>
> I have analysed the vmcore and it shows that the page is in the swap
> cache and its _mapcount is -1 (0xffffffff).
>
> Kasan report on v5.15.5:
>
> [ 2921.508794]
> ==================================================================
> [ 2921.508799] BUG: KASAN: use-after-free in
> ksm_might_need_to_copy+0x65/0x390
> [ 2921.508809] Read of size 8 at addr ffff888bd2380690 by task CPU
> 1/KVM/101903
>
> [ 2921.508816] CPU: 12 PID: 101903 Comm: CPU 1/KVM Tainted: G S
> I 5.15.5 #1
> [ 2921.508821] Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 1.09
> 01/31/2019
> [ 2921.508825] Call Trace:
> [ 2921.508828] <TASK>
> [ 2921.508830] dump_stack_lvl+0x34/0x44
> [ 2921.508839] print_address_description.constprop.0+0x1d/0xa0
> [ 2921.508852] __kasan_report.cold+0x37/0x87
> [ 2921.508870] kasan_report+0x38/0x50
> [ 2921.508876] ksm_might_need_to_copy+0x65/0x390
> [ 2921.508885] do_swap_page+0x37a/0xd40
> [ 2921.508891] __handle_mm_fault+0x8fd/0xac0
> [ 2921.508915] handle_mm_fault+0x103/0x380
> [ 2921.508920] __get_user_pages+0x2eb/0x5d0
> [ 2921.508932] get_user_pages_unlocked+0x129/0x400
> [ 2921.508950] hva_to_pfn+0x196/0x480 [kvm]
> [ 2921.509631] kvm_faultin_pfn+0x10e/0x470 [kvm]
> [ 2921.510524] direct_page_fault+0x243/0x500 [kvm]
> [ 2921.510931] kvm_mmu_page_fault+0x9c/0x260 [kvm]
> [ 2921.511153] vmx_handle_exit+0x11/0x80 [kvm_intel]
> [ 2921.511193] vcpu_enter_guest+0x1054/0x1c30 [kvm]
> [ 2921.512289] vcpu_run+0xa6/0x3a0 [kvm]
> [ 2921.512464] kvm_arch_vcpu_ioctl_run+0x112/0x390 [kvm]
> [ 2921.512638] kvm_vcpu_ioctl+0x3c6/0x860 [kvm]
> [ 2921.513180] __x64_sys_ioctl+0xb9/0xf0
> [ 2921.513185] do_syscall_64+0x5c/0x80
> [ 2921.513249] entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 2921.513256] RIP: 0033:0x7f1098993527
> [ 2921.513260] Code: b3 66 90 48 8b 05 79 19 0c 00 64 c7 00 26 00 00
> 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00
> 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 49 19 0c 00 f7 d8 64 89
> 01 48
> [ 2921.513265] RSP: 002b:00007f1096223de8 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000010
> [ 2921.513271] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX:
> 00007f1098993527
> [ 2921.513275] RDX: 0000000000000000 RSI: 000000000000ae80 RDI:
> 0000000000000019
> [ 2921.513278] RBP: 0000000000000000 R08: 00007f1098750ab0 R09:
> 00007f10987fb300
> [ 2921.513281] R10: 0000000000000000 R11: 0000000000000246 R12:
> 000055db8febd780
> [ 2921.513284] R13: 000055db8febd81e R14: 000055db8fed47e0 R15:
> 00007ffeb0b3c140
> [ 2921.513288] </TASK>
>
> [ 2921.513292] Allocated by task 91947:
> [ 2921.513294] kasan_save_stack+0x1b/0x40
> [ 2921.513300] __kasan_slab_alloc+0x61/0x80
> [ 2921.513304] kmem_cache_alloc+0x133/0x2b0
> [ 2921.513309] __anon_vma_prepare+0x191/0x260
> [ 2921.513313] do_huge_pmd_anonymous_page+0x514/0x750
> [ 2921.513318] __handle_mm_fault+0xab7/0xac0
> [ 2921.513322] handle_mm_fault+0x103/0x380
> [ 2921.513326] __get_user_pages+0x2eb/0x5d0
> [ 2921.513331] get_user_pages_unlocked+0x129/0x400
> [ 2921.513335] hva_to_pfn+0x196/0x480 [kvm]
> [ 2921.513501] kvm_faultin_pfn+0x10e/0x470 [kvm]
> [ 2921.513682] direct_page_fault+0x243/0x500 [kvm]
> [ 2921.513864] kvm_mmu_page_fault+0x9c/0x260 [kvm]
> [ 2921.514049] vmx_handle_exit+0x11/0x80 [kvm_intel]
> [ 2921.514087] vcpu_enter_guest+0x1054/0x1c30 [kvm]
> [ 2921.514260] vcpu_run+0xa6/0x3a0 [kvm]
> [ 2921.514433] kvm_arch_vcpu_ioctl_run+0x112/0x390 [kvm]
> [ 2921.514606] kvm_vcpu_ioctl+0x3c6/0x860 [kvm]
> [ 2921.514771] __x64_sys_ioctl+0xb9/0xf0
> [ 2921.514774] do_syscall_64+0x5c/0x80
> [ 2921.514778] entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> [ 2921.514785] Freed by task 504:
> [ 2921.514788] kasan_save_stack+0x1b/0x40
> [ 2921.514792] kasan_set_track+0x1c/0x30
> [ 2921.514797] kasan_set_free_info+0x20/0x30
> [ 2921.514802] __kasan_slab_free+0xeb/0x120
> [ 2921.514806] kmem_cache_free+0x8b/0x2d0
> [ 2921.514811] __put_anon_vma+0x59/0x120
> [ 2921.514814] remove_rmap_item_from_tree+0x237/0x260
> [ 2921.514818] scan_get_next_rmap_item+0x104/0x7d0
> [ 2921.514822] ksm_scan_thread+0x12a/0x480
> [ 2921.514826] kthread+0x1a7/0x1d0
> [ 2921.514832] ret_from_fork+0x22/0x30
>
> [ 2921.514839] The buggy address belongs to the object at
> ffff888bd2380690
> which belongs to the cache anon_vma of size 80
> [ 2921.514843] The buggy address is located 0 bytes inside of
> 80-byte region [ffff888bd2380690, ffff888bd23806e0)
> [ 2921.514847] The buggy address belongs to the page:
> [ 2921.514849] page:00000000fb434e9d refcount:1 mapcount:0
> mapping:0000000000000000 index:0x0 pfn:0xbd2380
> [ 2921.514854] memcg:ffff8890a291f001
> [ 2921.514856] flags:
> 0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff)
> [ 2921.514864] raw: 0017ffffc0000200 0000000000000000 0000000500000001
> ffff888100061680
> [ 2921.514868] raw: 0000000000000000 0000000000220022 00000001ffffffff
> ffff8890a291f001
> [ 2921.514870] page dumped because: kasan: bad access detected
>
> [ 2921.514874] Memory state around the buggy address:
> [ 2921.514877] ffff888bd2380580: fc fc fc fc fa fb fb fb fb fb fb fb
> fb fb fc fc
> [ 2921.514882] ffff888bd2380600: fc fc fc fa fb fb fb fb fb fb fb fb
> fb fc fc fc
> [ 2921.514885] >ffff888bd2380680: fc fc fa fb fb fb fb fb fb fb fb fb
> fc fc fc fc
> [ 2921.514888] ^
> [ 2921.514890] ffff888bd2380700: fc fa fb fb fb fb fb fb fb fb fb fc
> fc fc fc fc
> [ 2921.514893] ffff888bd2380780: fa fb fb fb fb fb fb fb fb fb fc fc
> fc fc fc fa
> [ 2921.514896]
> ==================================================================
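As a reading aid for the "Memory state" dump above, here is a small sketch of how a buggy address maps to a byte in a dumped row. It assumes generic KASAN, where one shadow byte covers 8 bytes of memory, and the shadow values as I recall them from mm/kasan/kasan.h in v5.15 (older kernels may differ). The names model_shadow_index and model_shadow_meaning are hypothetical helpers, not kernel functions.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_GRANULE_SHIFT 3 /* assumption: generic KASAN, 8 bytes per shadow byte */

/* Position of addr's shadow byte within the dump row starting at row_base. */
static size_t model_shadow_index(uint64_t row_base, uint64_t addr)
{
    return (size_t)((addr - row_base) >> MODEL_GRANULE_SHIFT);
}

/* Rough meaning of the shadow values that appear in these reports. */
static const char *model_shadow_meaning(uint8_t shadow)
{
    switch (shadow) {
    case 0x00: return "fully accessible";
    case 0xfa: return "freed slab object (free track recorded)";
    case 0xfb: return "freed slab object";
    case 0xfc: return "slab redzone";
    default:   return "other";
    }
}
```

For the v5.15.5 report, the row starts at ffff888bd2380680 and the buggy address is ffff888bd2380690, which gives index 2. The byte at that position in the marked row is fa, consistent with a read from a freed anon_vma object.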
>
> On 2021/11/25 15:32, Nanyong Sun wrote:
>> Hi hughd and mm experts,
>>
>> We have a problem where KASAN has caught a use-after-free
>> in ksm_might_need_to_copy+0x12e/0x5b0 several times.
>>
>> The code path is do_swap_page -> ksm_might_need_to_copy:
>>
>> struct page *ksm_might_need_to_copy(struct page *page,
>>                 struct vm_area_struct *vma, unsigned long address)
>> {
>>         struct anon_vma *anon_vma = page_anon_vma(page);
>>         struct page *new_page;
>>
>>         if (PageKsm(page)) {
>>                 if (page_stable_node(page) &&
>>                     !(ksm_run & KSM_RUN_UNMERGE))
>>                         return page;    /* no need to copy it */
>>         } else if (!anon_vma) {
>>                 return page;            /* no need to copy it */
>>         } else if (anon_vma->root ======> reading this pointer triggers the
>> use-after-free when this line runs
>>
>> The anon_vma obtained from page->mapping had already been freed.
>>
>>
>> Reproduction scenario:
>>
>> On an Intel platform server with KSM and swap enabled, 7 virtual machines
>> repeatedly suspend and resume so that
>>
>> the host swaps pages out and back in. The VMs use pages with identical
>> content, so the host performs KSM merging.
>>
>>
>> KASAN report:
>>
>> Log1:
>>
>> [1023457.339223]
>> ==================================================================
>> [1023457.339236] BUG: KASAN: use-after-free in
>> ksm_might_need_to_copy+0x12e/0x5b0
>> [1023457.339238] Read of size 8 at addr ffff88be9977dbd0 by task
>> khugepaged/694
>> [1023457.339239]
>> [1023457.339243] CPU: 8 PID: 694 Comm: khugepaged Kdump: loaded
>> Tainted: G OE --------- - - 4.18.0.x86_64
>> [1023457.339245] Hardware name: Huawei 1288H V5/BC11SPSC0, BIOS 7.93
>> 01/14/2021
>> [1023457.339246] Call Trace:
>> [1023457.339254] dump_stack+0xf1/0x19b
>> [1023457.339272] print_address_description+0x70/0x360
>> [1023457.339276] kasan_report+0x1b2/0x330
>> [1023457.339285] ksm_might_need_to_copy+0x12e/0x5b0
>> [1023457.339327] do_swap_page+0x452/0xe70
>> [1023457.339380] __collapse_huge_page_swapin+0x24b/0x720
>> [1023457.339410] khugepaged_scan_pmd+0xcae/0x1ff0
>> [1023457.339464] khugepaged+0x8ee/0xd70
>> [1023457.339506] kthread+0x1a2/0x1d0
>> [1023457.339511] ret_from_fork+0x1f/0x40
>> [1023457.339513]
>> [1023457.339515] Allocated by task 2306153:
>> [1023457.339517] kasan_kmalloc+0xa0/0xd0
>> [1023457.339519] kmem_cache_alloc+0xc0/0x1c0
>> [1023457.339521] anon_vma_clone+0xf7/0x380
>> [1023457.339522] anon_vma_fork+0xc0/0x390
>> [1023457.339526] copy_process+0x447b/0x4810
>> [1023457.339527] _do_fork+0x118/0x620
>> [1023457.339531] do_syscall_64+0x112/0x360
>> [1023457.339533] entry_SYSCALL_64_after_hwframe+0x65/0xca
>> [1023457.339534]
>> [1023457.339535] Freed by task 2306242:
>> [1023457.339537] __kasan_slab_free+0x130/0x180
>> [1023457.339538] kmem_cache_free+0x78/0x1d0
>> [1023457.339540] unlink_anon_vmas+0x19c/0x4a0
>> [1023457.339542] free_pgtables+0x137/0x1b0
>> [1023457.339544] exit_mmap+0x133/0x320
>> [1023457.339546] mmput+0x15e/0x390
>> [1023457.339547] do_exit+0x8c5/0x1210
>> [1023457.339549] do_group_exit+0xb5/0x1b0
>> [1023457.339550] __x64_sys_exit_group+0x21/0x30
>> [1023457.339552] do_syscall_64+0x112/0x360
>> [1023457.339554] entry_SYSCALL_64_after_hwframe+0x65/0xca
>> [1023457.339555]
>> [1023457.339557] The buggy address belongs to the object at
>> ffff88be9977dba0
>> which belongs to the cache anon_vma_chain of size 64
>> [1023457.339559] The buggy address is located 48 bytes inside of
>> 64-byte region [ffff88be9977dba0, ffff88be9977dbe0)
>> [1023457.339560] The buggy address belongs to the page:
>> [1023457.339562] page:ffffea00fa65df40 count:1 mapcount:0
>> mapping:ffff888107717800 index:0x0
>> [1023457.347802] flags: 0x17ffffc0000100(slab)
>>
>>
>> Log2:
>>
>> ==================================================================
>> BUG: KASAN: slab-out-of-bounds in ksm_might_need_to_copy+0x12e/0x5b0
>> Read of size 8 at addr ffff889e042facb0 by task CPU 1/KVM/93525
>> CPU: 8 PID: 93525 Comm: CPU 1/KVM Kdump: loaded Tainted: G O -----
>> ---- - - 4.18.0.x86_64 #1
>> Hardware name: Suma H620-G30/65N32-US, BIOS CQL1051209 05/12/2021
>> Call Trace:
>> dump_stack+0xf1/0x19b
>> print_address_description+0x70/0x360
>> kasan_report+0x1b2/0x330
>> ksm_might_need_to_copy+0x12e/0x5b0
>> do_swap_page+0x452/0xe70
>> __handle_mm_fault+0x96b/0xa20
>> handle_mm_fault+0x1bd/0x450
>> __get_user_pages+0x476/0x10e0
>> get_user_pages_unlocked+0x1e0/0x380
>> __gfn_to_pfn_memslot+0x728/0xb20 [kvm]
>> try_async_pf+0x138/0x5d0 [kvm]
>> tdp_page_fault+0x336/0x730 [kvm]
>> kvm_mmu_page_fault+0x17c/0xcd0 [kvm]
>> npf_interception+0xf4/0x200 [kvm_amd]
>> handle_exit+0x7a9/0x9a0 [kvm_amd]
>> vcpu_enter_guest+0x8eb/0x2950 [kvm]
>> kvm_arch_vcpu_ioctl_run+0x4d4/0xa30 [kvm]
>> kvm_vcpu_ioctl+0x675/0xb50 [kvm]
>> do_vfs_ioctl+0x134/0xa10
>> ksys_ioctl+0x70/0x80
>> __x64_sys_ioctl+0x3d/0x50
>> do_syscall_64+0x112/0x360
>> entry_SYSCALL_64_after_hwframe+0x65/0xca
>> RIP: 0033:0x7fa429acb527
>> Code: b3 66 90 48 8b 05 79 19 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff
>> ff ff ff c3
>> 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff
>> ff 73 01 c3
>> 48 8b 0d 49 19 0c 00 f7 d8 64 89 01 48
>> RSP: 002b:00007fa4232ecde8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>> RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007fa429acb527
>> RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000019
>> RBP: 0000000000000000 R08: 00007fa429888af0 R09: 0000000000000001
>> R10: 0000000000000002 R11: 0000000000000246 R12: 000055d23727b240
>> R13: 000055d23727b2de R14: 0000000000000000 R15: 0000000000000000
>>
>> Allocated by task 99792:
>> kasan_kmalloc+0xa0/0xd0
>> kmem_cache_alloc_trace+0xf3/0x1e0
>> single_open+0x36/0xe0
>> do_dentry_open+0x373/0x680
>> path_openat+0xca2/0x29d0
>> do_filp_open+0x177/0x220
>> do_sys_open+0x2d0/0x3a0
>> do_syscall_64+0x112/0x360
>> entry_SYSCALL_64_after_hwframe+0x65/0xca
>>
>> Freed by task 99792:
>> __kasan_slab_free+0x130/0x180
>> kfree+0x90/0x1b0
>> single_release+0x51/0x60
>> __fput+0x1df/0x490
>> task_work_run+0x13f/0x190
>> exit_to_usermode_loop+0x1a2/0x1b0
>> do_syscall_64+0x326/0x360
>> entry_SYSCALL_64_after_hwframe+0x65/0xca
>>
>> The buggy address belongs to the object at ffff889e042fac90
>> which belongs to the cache kmalloc-32 of size 32
>> The buggy address is located 0 bytes to the right of
>> 32-byte region [ffff889e042fac90, ffff889e042facb0)
>> The buggy address belongs to the page:
>> page:ffffea007810be80 count:1 mapcount:0 mapping:ffff888107c10580
>> index:0x0
>> flags: 0x57ffffc0000100(slab)
>> raw: 0057ffffc0000100 ffffea0077a2bd88 ffffea007857dc08 ffff888107c10580
>> raw: 0000000000000000 0000000000550055 00000001ffffffff 0000000000000000
>> page dumped because: kasan: bad access detected
>> Memory state around the buggy address:
>> ffff889e042fab80: fb fb fc fc fb fb fb fb fc fc fb fb fb fb fc fc
>> ffff889e042fac00: fb fb fb fb fc fc fb fb fb fb fc fc fb fb fb fb
>> >ffff889e042fac80: fc fc fb fb fb fb fc fc fb fb fb fb fc fc fb fb
>> ^
>> ffff889e042fad00: fb fb fc fc fb fb fb fb fc fc fb fb fb fb fc fc
>> ffff889e042fad80: fb fb fb fb fc fc fb fb fb fb fc fc fb fb fb fb
>> ==================================================================
>> Disabling lock debugging due to kernel taint