linux-kernel - Re: 【BUG】NULL pointer dereference at __lookup_swap

Message-ID: <2e132246-96be-a281-78f4-8310f75a0ed8@huawei.com>
Date:   Wed, 30 Nov 2022 09:31:56 +0800
From:   xialonglong <xialonglong1@...wei.com>
To:     "Huang, Ying" <ying.huang@...el.com>
CC:     <linux-kernel@...r.kernel.org>, <hannes@...xchg.org>,
        <linux-mm@...ck.org>, <mhocko@...nel.org>,
        <roman.gushchin@...ux.dev>, <shakeelb@...gle.com>,
        "Wangkefeng (OS Kernel Lab)" <wangkefeng.wang@...wei.com>,
        chenwandun <chenwandun@...wei.com>, <songmuchun@...edance.com>,
        <gregkh@...uxfoundation.org>
Subject: Re: 【BUG】NULL pointer dereference at __lookup_swap_cgroup

Thank you very much for your reply :)
Inspired by your reply, we successfully reproduced the bug.

The test steps:
1. swapon /dev/zram0
2. add some memory pressure with stress-ng
3. call swapoff /dev/zram0 from inside the do_swap_page function (this
required modifying the kernel source)
4. the bug occurred in the same place.

Our testing shows that this patch fixes the bug.
One small question remains: why was this patch (2799e77529c2) reverted
in Linux 5.10?

We found that the following patches may be required to fix this bug:
efa33fc7f6e mm/shmem: fix shmem_swapin() race with swapoff
5c046235a826 mm/swap: remove confusing checking for non_swap_entry() in swap_ra_info()
2799e77529c2 swap: fix do_swap_page() race with swapoff
63d8620ecf93 mm/swapfile: use percpu_ref to serialize against concurrent swapoff
It seems that the whole patchset is needed, except for commit 5c046235a826
("mm/swap: remove confusing checking for non_swap_entry() in
swap_ra_info()").
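
For readers following the thread, the idea behind the get/put_swap_device()
guard used by these patches can be pictured with a small user-space model.
This is only a sketch under stated assumptions: every name below
(model_swap_device, model_get, model_put, model_swapoff) is hypothetical, the
code is single-threaded and not thread-safe, and the real kernel code uses a
percpu_ref and waits for users to finish instead of returning early.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a swap device; not kernel code. */
struct model_swap_device {
    bool live;             /* cleared once swapoff has started */
    int users;             /* readers currently inside a guarded section */
    const char *swap_file; /* stands in for si->swap_file; NULL after teardown */
};

/* Rough analogue of get_swap_device(): refuse new users once swapoff began. */
struct model_swap_device *model_get(struct model_swap_device *dev)
{
    if (dev == NULL || !dev->live)
        return NULL;
    dev->users++;
    return dev;
}

/* Rough analogue of put_swap_device(): leave the guarded section. */
void model_put(struct model_swap_device *dev)
{
    dev->users--;
}

/*
 * Rough analogue of swapoff: stop new users first, then tear down only
 * when nobody is left inside a guarded section. Returns true if the
 * teardown happened. The real code would wait here instead of returning.
 */
bool model_swapoff(struct model_swap_device *dev)
{
    dev->live = false;
    if (dev->users > 0)
        return false;
    dev->swap_file = NULL;
    return true;
}
```

In this model a fault path that holds a reference from model_get() never sees
swap_file become NULL, and a fault path that arrives after swapoff has started
gets NULL back and can bail out instead of dereferencing stale state.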

Best Regards,
Xia, longlong

On 2022/11/28 9:08, Huang, Ying wrote:
> Hi,
>
> xialonglong <xialonglong1@...wei.com> writes:
>
>> A panic occurred on Linux 5.10. We have hit it only once. It seems that
>> there are no significant changes to swap_cgroup between 5.10 and upstream.
>>
>> The test is based on QEMU with 64GB memory, one 2GB zram device as
>> swap area.
>> The test steps:
>> 1.swapoff -a
>> 2.add some memory pressure by stress-ng
>> 3.while (2 minutes) {
>>   swapoff /dev/zram0
>>   swapon /dev/zram0
>>   sleep 3
>> }
>> 4. swapon -a
>>
>> Preliminary analysis showed that the swap entry points to a swap area
>> that has already been swapped off. There are no other obvious clues, and
>> we are still trying to reproduce it.
> We have a patch as follows to fix a similar issue,
>
> 2799e77529c2a25492a4395db93996e3dacd762d
> Author:     Miaohe Lin <linmiaohe@...wei.com>
> AuthorDate: Mon Jun 28 19:36:50 2021 -0700
> Commit:     Linus Torvalds <torvalds@...ux-foundation.org>
> CommitDate: Tue Jun 29 10:53:49 2021 -0700
>
> swap: fix do_swap_page() race with swapoff
>
> When I was investigating the swap code, I found the below possible race
> window:
>
> CPU 1                                   	CPU 2
> -----                                   	-----
> do_swap_page
>    if (data_race(si->flags & SWP_SYNCHRONOUS_IO)
>    swap_readpage
>      if (data_race(sis->flags & SWP_FS_OPS)) {
>                                          	swapoff
> 					  	  ..
> 					  	  p->swap_file = NULL;
> 					  	  ..
>      struct file *swap_file = sis->swap_file;
>      struct address_space *mapping = swap_file->f_mapping;[oops!]
>
> Note that for the pages that are swapped in through swap cache, this isn't
> an issue. Because the page is locked, and the swap entry will be marked
> with SWAP_HAS_CACHE, so swapoff() can not proceed until the page has been
> unlocked.
>
> Fix this race by using get/put_swap_device() to guard against concurrent
> swapoff.
>
> Can you check whether that can fix your issue?
>
> Best Regards,
> Huang, Ying
>
>> Are there any known issues with this feature? Any advice will be appreciated.
>>
>> Here is the panic log:
>>
>> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000740
>> Mem abort info:
>>   ESR = 0x96000004
>>   EC = 0x25: DABT (current EL), IL = 32 bits
>>   SET = 0, FnV = 0
>>   EA = 0, S1PTW = 0
>> Data abort info:
>>   ISV = 0, ISS = 0x00000004
>>   CM = 0, WnR = 0
>> user pgtable: 4k pages, 48-bit VAs, pgdp=000000010ae6e000
>> pgd=0000000000000000, p4d=0000000000000000
>> Internal error: Oops: 96000004 [#1] SMP
>> Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
>> pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--)
>> pc : lookup_swap_cgroup_id+0x38/0x50
>> lr : mem_cgroup_charge+0x9c/0x424
>> sp : ffff800102f63bc0
>> x29: ffff800102f63bc0 x28: ffff0000d0d64d00
>> x27: 0000000000000000 x26: 0000000000000007
>> x25: ffff0000018c86a8 x24: ffff0000018c8640
>> x23: 0000000000000cc0 x22: 0000000000000001
>> x21: 0000000000000001 x20: ffff800102f63d28
>> x19: fffffe000373cb40 x18: 0000000000000000
>> x17: 0000000000000000 x16: ffff8001004715a4
>> x15: 00000000ffffffff x14: 0000000000003000
>> x13: 00000000ffffffff x12: 0000000000000040
>> x11: ffff0000c0403478 x10: ffff0000c040347a
>> x9 : ffff8001003e957c x8 : 000000000009dddd
>> x7 : 0000000000000600 x6 : 00000000000000e8
>> x5 : 0000020000200000 x4 : ffff000000000000
>> x3 : ffff800101f4c030 x2 : 0000000000000000
>> x1 : 00000000000001e4 x0 : 0000000000000000
>>
>> Call trace:
>> lookup_swap_cgroup_id+0x38/0x50
>> do_swap_page+0xa64/0xc04
>> handle_pte_fault+0x1c8/0x214
>> __handle_mm_fault+0x1b0/0x380
>> handle_mm_fault+0xf4/0x284
>> do_page_fault+0x188/0x474
>> do_translation_fault+0xb8/0xe4
>> do_mem_abort+0x48/0xb0
>> el0_da+0x44/0x80
>> el0_sync_handler+0x88/0xb4
>> el0_sync+0x160/0x180
>>
>> <lookup_swap_cgroup_id>:      mov   x9, x30
>> <lookup_swap_cgroup_id+0x4>:  nop
>> <lookup_swap_cgroup_id+0x8>:  lsr   x2, x0, #58              // SWP_TYPE_SHIFT == 58; x2 = swp_type
>> <lookup_swap_cgroup_id+0xc>:  adrp  x1, 0xffff800101f4c000 <memcg_sockets_enabled_key+0x8>
>> <lookup_swap_cgroup_id+0x10>: add   x3, x1, #0x30            // x3 == swap_cgroup_ctrl
>> <lookup_swap_cgroup_id+0x14>: ubfx  x6, x0, #11, #47
>> <lookup_swap_cgroup_id+0x18>: add   x2, x2, x2, lsl #1
>> <lookup_swap_cgroup_id+0x1c>: ubfiz x1, x0, #1, #11
>> <lookup_swap_cgroup_id+0x20>: mov   x5, #0x200000            // #2097152
>> <lookup_swap_cgroup_id+0x24>: mov   x4, #0xffff000000000000  // #-281474976710656
>> <lookup_swap_cgroup_id+0x28>: movk  x5, #0x200, lsl #32
>> <lookup_swap_cgroup_id+0x2c>: hint  #0x19
>> <lookup_swap_cgroup_id+0x30>: ldr   x0, [x3,x2,lsl #3]       // x3=ffff800101f4c030, x0 = 0
>> <lookup_swap_cgroup_id+0x34>: hint  #0x1d
>> <lookup_swap_cgroup_id+0x38>: ldr   x0, [x0,x6,lsl #3]       // x0 = 0 + 0xe8 * 8 == 0x740
>> <lookup_swap_cgroup_id+0x3c>: add   x0, x0, x5
>> <lookup_swap_cgroup_id+0x40>: lsr   x0, x0, #6
>> <lookup_swap_cgroup_id+0x44>: add   x0, x1, x0, lsl #12
>> <lookup_swap_cgroup_id+0x48>: ldrh  w0, [x0,x4]
>> <lookup_swap_cgroup_id+0x4c>: ret
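
The address arithmetic in the annotated disassembly above can be checked with
a few lines of C. This is a sketch, not kernel code: the shift and width
constants are read off the quoted instructions and not taken from kernel
headers, the function names are hypothetical, and the entry value 0x740f2
used in the usage note is only inferred from the register dump (x6 = 0xe8,
x1 = 0x1e4, x2 = 0), since x0 had already been overwritten when the fault hit.

```c
#include <stdint.h>

/* Constants read off the quoted disassembly (assumptions, see lead-in). */
#define TYPE_SHIFT     58 /* lsr   x2, x0, #58      */
#define PAGE_IDX_SHIFT 11 /* ubfx  x6, x0, #11, #47 */
#define PAGE_IDX_BITS  47
#define IN_PAGE_BITS   11 /* ubfiz x1, x0, #1, #11  */

/* Swap type: the top bits of the entry value (ends up in x2). */
uint64_t entry_type(uint64_t entry)
{
    return entry >> TYPE_SHIFT;
}

/* Index into the per-device page array (ends up in x6). */
uint64_t entry_page_index(uint64_t entry)
{
    return (entry >> PAGE_IDX_SHIFT) & ((1ULL << PAGE_IDX_BITS) - 1);
}

/* Byte offset of the 16-bit id inside that page (ends up in x1). */
uint64_t entry_byte_in_page(uint64_t entry)
{
    return (entry & ((1ULL << IN_PAGE_BITS) - 1)) << 1;
}

/* Address read by the faulting "ldr x0, [x0,x6,lsl #3]" at +0x38. */
uint64_t second_load_address(uint64_t map_base, uint64_t entry)
{
    return map_base + (entry_page_index(entry) << 3);
}
```

With map_base = 0 (the value loaded at +0x30) and the inferred entry 0x740f2,
second_load_address() gives 0x740, which is the faulting virtual address in
the panic log, and entry_byte_in_page() gives 0x1e4, which matches x1.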