linux-kernel - Re: [PATCHv2] zram: free secondary algorithms names

Message-ID: <CAF8kJuMQqib-2qhjNTFzKdUdJoUhTOuhv5GtgLnSsUStMHbwdw@mail.gmail.com>
Date: Tue, 24 Sep 2024 08:52:32 -0700
From: Chris Li <chriscli@...gle.com>
To: Sergey Senozhatsky <senozhatsky@...omium.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Minchan Kim <minchan@...nel.org>, 
	linux-kernel@...r.kernel.org, linux-mm <linux-mm@...ck.org>, 
	Kairui Song <ryncsn@...il.com>
Subject: Re: [PATCHv2] zram: free secondary algorithms names

Hi Sergey,

On Tue, Sep 24, 2024 at 8:36 AM Chris Li <chrisl@...nel.org> wrote:
>
> Hi Sergey,
>
> The current mm-unstable is breaking my swap stress test again. There seem to be multiple bad commits causing it, but I have bisected one failure to this commit, which triggers a kernel warning followed by a BUG().
>
> [   56.630032] zswap: loaded using pool lzo/zsmalloc
> [   56.718027] zram0: detected capacity change from 16777216 to 0
> [   56.725492] zram: Removed device: zram0
> [   56.740125] ------------[ cut here ]------------
> [   56.744616] WARNING: CPU: 2 PID: 1894 at mm/slub.c:4556 free_large_kmalloc+0x4d/0x80
> [   56.745119] Modules linked in:
> [   56.749551] CPU: 2 UID: 0 PID: 1894 Comm: zram-generator Tainted: G S                 6.11.0-rc6+ #33
> [   56.750129] Tainted: [S]=CPU_OUT_OF_SPEC
> [   56.750908] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 09/21/2023
> [   56.751354] RIP: 0010:free_large_kmalloc+0x4d/0x80
> [   56.756120] Code: 00 10 00 00 48 d3 e0 f7 d8 81 e2 c0 00 00 00 75 2f 89 c6 48 89 df e8 82 ff ff ff f0 ff 4b 34 0f 85 e9 7d f5 00 e9 eb 7d f5 00 <0f> 0b 80 3d a8 f3 9b 02 00 0f 84 bd 7d f5 00 b8 00 f0 ff ff eb d1
> [   56.761370] RSP: 0018:ffffaeaaa3657b20 EFLAGS: 00010246
> [   56.761676] RAX: 0057ffffc0002000 RBX: ffffece0c1f40e80 RCX: 000000008040003f
> [   56.766293] RDX: ffffece0c1f40e88 RSI: ffffffff9a03a131 RDI: ffffece0c1f40e80
> [   56.770931] RBP: 0000000000200000 R08: ffff95571d256480 R09: 000000008040003f
> [   56.775540] R10: 000000008040003f R11: 000000000000032c R12: 0000000000200000
> [   56.780212] R13: ffff953787c71e40 R14: 0000000000000047 R15: ffff95379b2e3e20
> [   56.784943] FS:  00007fb0f1d58bc0(0000) GS:ffff95567ed00000(0000) knlGS:0000000000000000
> [   56.785403] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   56.789937] CR2: 00007f35b6449050 CR3: 00000001112ac006 CR4: 00000000001706f0
> [   56.794784] Call Trace:
> [   56.794941]  <TASK>
> [   56.799377]  ? free_large_kmalloc+0x4d/0x80
> [   56.799598]  ? __warn.cold+0x8e/0xe8
> [   56.799842]  ? free_large_kmalloc+0x4d/0x80
> [   56.800065]  ? report_bug+0xff/0x140
> [   56.800296]  ? handle_bug+0x3c/0x80
> [   56.804703]  ? exc_invalid_op+0x17/0x70
> [   56.804912]  ? asm_exc_invalid_op+0x1a/0x20
> [   56.805132]  ? free_large_kmalloc+0x4d/0x80
> [   56.805344]  zram_destroy_comps+0x32/0x70
> [   56.805568]  zram_reset_device+0x102/0x190
> [   56.805812]  reset_store+0xa6/0x110
> [   56.810207]  kernfs_fop_write_iter+0x141/0x1f0
> [   56.814689]  vfs_write+0x294/0x460
> [   56.819106]  ksys_write+0x6d/0xf0
> [   56.823550]  do_syscall_64+0x82/0x160
> [   56.823827]  ? __pfx_kfree_link+0x10/0x10
> [   56.824051]  ? do_sys_openat2+0x9c/0xe0
> [   56.824263]  ? __handle_mm_fault+0xb34/0xfb0
> [   56.828752]  ? syscall_exit_to_user_mode+0x10/0x220
> [   56.833220]  ? do_syscall_64+0x8e/0x160
> [   56.833429]  ? __count_memcg_events+0x77/0x130
> [   56.838021]  ? count_memcg_events.constprop.0+0x1a/0x30
> [   56.838318]  ? handle_mm_fault+0x1bb/0x2c0
> [   56.838542]  ? do_user_addr_fault+0x55a/0x7b0
> [   56.843014]  ? exc_page_fault+0x7e/0x180
> [   56.843228]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [   56.843831] RIP: 0033:0x7fb0f1f7a984
> [   56.844045] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d c5 06 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
> [   56.849247] RSP: 002b:00007ffc7db8fde8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
> [   56.853889] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fb0f1f7a984
> [   56.858482] RDX: 0000000000000001 RSI: 0000560df4e4ea65 RDI: 0000000000000004
> [   56.863154] RBP: 0000000000000004 R08: 0000560e0e417010 R09: 0000000000000007
> [   56.867794] R10: 00000000000001b6 R11: 0000000000000202 R12: 7fffffffffffffff
> [   56.872980] R13: 00007fb0f1f7a970 R14: 0000560df4e4ea65 R15: 0000560df4e71bd0
> [   56.878043]  </TASK>
> [   56.878555] ---[ end trace 0000000000000000 ]---
> [   56.883420] object pointer: 0x00000000f38e5ae7
> [   56.888235] BUG: Bad page state in process zram-generator  pfn:407d03a
> [   56.889026] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x407d03a
> [   56.889877] flags: 0x57ffffc0002000(reserved|node=1|zone=2|lastcpupid=0x1fffff)
> [   56.894915] raw: 0057ffffc0002000 ffffece0c1f40e88 ffffece0c1f40e88 0000000000000000
> [   56.895771] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> [   56.896562] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> [   56.897332] Modules linked in:
> [   56.902165] CPU: 2 UID: 0 PID: 1894 Comm: zram-generator Tainted: G S      W          6.11.0-rc6+ #33
> [   56.903155] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
> [   56.908082] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 09/21/2023
> [   56.908918] Call Trace:
> [   56.909484]  <TASK>
> [   56.914148]  dump_stack_lvl+0x5d/0x80
> [   56.914747]  bad_page.cold+0x7a/0x91
> [   56.915318]  free_unref_page+0x344/0x520
> [   56.915975]  zram_destroy_comps+0x32/0x70
> [   56.916452]  zram_reset_device+0x102/0x190
> [   56.917057]  reset_store+0xa6/0x110
> [   56.921874]  kernfs_fop_write_iter+0x141/0x1f0
> [   56.926685]  vfs_write+0x294/0x460
> [   56.931385]  ksys_write+0x6d/0xf0
> [   56.936087]  do_syscall_64+0x82/0x160
> [   56.936656]  ? __pfx_kfree_link+0x10/0x10
> [   56.937257]  ? do_sys_openat2+0x9c/0xe0
> [   56.937810]  ? __handle_mm_fault+0xb34/0xfb0
> [   56.942593]  ? syscall_exit_to_user_mode+0x10/0x220
> [   56.947362]  ? do_syscall_64+0x8e/0x160
> [   56.947974]  ? __count_memcg_events+0x77/0x130
> [   56.952762]  ? count_memcg_events.constprop.0+0x1a/0x30
> [   56.953356]  ? handle_mm_fault+0x1bb/0x2c0
> [   56.953937]  ? do_user_addr_fault+0x55a/0x7b0
> [   56.958999]  ? exc_page_fault+0x7e/0x180
> [   56.959523]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [   56.960163] RIP: 0033:0x7fb0f1f7a984
> [   56.960731] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d c5 06 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
> [   56.966840] RSP: 002b:00007ffc7db8fde8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
> [   56.971903] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fb0f1f7a984
> [   56.976953] RDX: 0000000000000001 RSI: 0000560df4e4ea65 RDI: 0000000000000004
> [   56.981946] RBP: 0000000000000004 R08: 0000560e0e417010 R09: 0000000000000007
> [   56.986980] R10: 00000000000001b6 R11: 0000000000000202 R12: 7fffffffffffffff
> [   56.991985] R13: 00007fb0f1f7a970 R14: 0000560df4e4ea65 R15: 0000560df4e71bd0
> [   56.996963]  </TASK>
> [   56.997533] Disabling lock debugging due to kernel taint
> [   57.037759] zram: Added device: zram0
> [   57.088669] zram: Added device: zram1
> [   57.249105] zram0: detected capacity change from 0 to 6553600
> [   57.320547] zram1: detected capacity change from 0 to 40960000
> [   57.443012] Adding 3276796k swap on /dev/zram0.  Priority:100 extents:1 across:3276796k SS
> [   57.470295] Adding 20479996k swap on /dev/zram1.  Priority:0 extents:1 across:20479996k SS
>
> Here is the bisect log:
>
> $ git bisect log
> # bad: [684826f8271ad97580b138b9ffd462005e470b99] zram: free secondary algorithms names
> # good: [2cacbdfdee65b18f9952620e762eab043d71b564] mm: swap: add a adaptive full cluster cache reclaim
> git bisect start 'mm-stable' 'HEAD'
> # good: [9bfbaa5e44c52422a046ce291469c8ebeb6c475d] mm/damon: move kunit tests to tests/ subdirectory with _kunit suffix
> git bisect good 9bfbaa5e44c52422a046ce291469c8ebeb6c475d
> # good: [1e673c8cf7f9c1156f615b7c00f224a8110070da] zram: add dictionary support to lz4hc
> git bisect good 1e673c8cf7f9c1156f615b7c00f224a8110070da
> # good: [3c8e44c9b369b3d422516b3f2bf47a6e3c61d1ea] mm: mark special bits for huge pfn mappings when inject
> git bisect good 3c8e44c9b369b3d422516b3f2bf47a6e3c61d1ea
> # good: [f9e54c3a2f5b79ecc57c7bc7d0d3521e461a2101] vfio/pci: implement huge_fault support
> git bisect good f9e54c3a2f5b79ecc57c7bc7d0d3521e461a2101
> # good: [659c55ef981bb63355a65ffc3b3b5cad562b806a] mm/vma: return the exact errno in vms_gather_munmap_vmas()
> git bisect good 659c55ef981bb63355a65ffc3b3b5cad562b806a
> # good: [325efb16da2c840e165d9b620fec8049d4d664cc] mm: add nr argument in mem_cgroup_swapin_uncharge_swap() helper to support large folios
> git bisect good 325efb16da2c840e165d9b620fec8049d4d664cc
> # good: [ed8d5b0ce1d738e13c60d6b1a901a56d832e5070] Revert "uprobes: use vm_special_mapping close() functionality"
> git bisect good ed8d5b0ce1d738e13c60d6b1a901a56d832e5070
> # good: [2abbcc099ec60844ca7c15214ab12955d3c11e68] uprobes: turn xol_area->pages[2] into xol_area->page
> git bisect good 2abbcc099ec60844ca7c15214ab12955d3c11e68
> # first bad commit: [684826f8271ad97580b138b9ffd462005e470b99] zram: free secondary algorithms names
>
> Sergey told me there is a fix on the way:
> https://lore.kernel.org/all/20240923164843.1117010-1-andrej.skvortzov@gmail.com/

I can confirm that the fix in the above thread resolves the kernel oops for me.

Tested-by: Chris Li <chrisl@...nel.org>

Chris


> On Mon, Sep 16, 2024 at 6:30 PM Sergey Senozhatsky <senozhatsky@...omium.org> wrote:
>>
>> We need to kfree() the secondary algorithm names when resetting a
>> zram device that had multi-streams, otherwise we leak memory.
>>
>> Fixes: 001d92735701 ("zram: add recompression algorithm sysfs knob")
>> Signed-off-by: Sergey Senozhatsky <senozhatsky@...omium.org>
>> ---
>>  drivers/block/zram/zram_drv.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>> index f8206ba6cbbb..c3d245617083 100644
>> --- a/drivers/block/zram/zram_drv.c
>> +++ b/drivers/block/zram/zram_drv.c
>> @@ -2115,6 +2115,11 @@ static void zram_destroy_comps(struct zram *zram)
>>                 zram->num_active_comps--;
>>         }
>>
>> +       for (prio = ZRAM_SECONDARY_COMP; prio < ZRAM_MAX_COMPS; prio++) {
>> +               kfree(zram->comp_algs[prio]);
>> +               zram->comp_algs[prio] = NULL;
>> +       }
>> +
>>         zram_comp_params_reset(zram);
>>  }
>>
>> --
>> 2.46.0.662.g92d0881bb0-goog
>>
>>
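
For readers following the thread, here is a rough userspace sketch of the cleanup loop in the patch above, with one extra guard. It is not the kernel code and not the fix from the linked thread. The names struct dev, set_alg(), destroy_algs(), default_alg and the constant values are all hypothetical, and malloc()/free() stand in for the kernel allocator. The guard rests on an assumption this message does not confirm: that a name slot can point at a statically allocated default string, which must never be handed to the allocator's free routine. The "reserved" page flag in the trace would be consistent with that reading, but treat it as a guess.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the zram priority constants (values assumed). */
#define PRIMARY_COMP   0
#define SECONDARY_COMP 1
#define MAX_COMPS      4

/* A name with static storage duration: it must never be passed to free(). */
static const char default_alg[] = "lzo-rle";

struct dev {
	const char *comp_algs[MAX_COMPS];
};

/* Store a heap copy of 'name' in slot 'prio', releasing any previous heap copy. */
static int set_alg(struct dev *d, int prio, const char *name)
{
	char *copy = malloc(strlen(name) + 1);

	if (!copy)
		return -1;
	strcpy(copy, name);

	/* free(NULL) is a no-op, so only the static default needs skipping. */
	if (d->comp_algs[prio] != default_alg)
		free((void *)d->comp_algs[prio]);
	d->comp_algs[prio] = copy;
	return 0;
}

/*
 * Release the secondary names, mirroring the loop bounds in the patch.
 * Returns how many heap-allocated names were actually freed.
 */
static int destroy_algs(struct dev *d)
{
	int freed = 0;

	for (int prio = SECONDARY_COMP; prio < MAX_COMPS; prio++) {
		const char *name = d->comp_algs[prio];

		/* Skip empty slots and slots that still hold the static default. */
		if (name && name != default_alg) {
			free((void *)name);
			freed++;
		}
		d->comp_algs[prio] = NULL;
	}
	return freed;
}
```

In this sketch, a device whose slot 1 still holds default_alg and whose slot 2 was set through set_alg() would have exactly one name freed by destroy_algs(), and the primary slot would be left untouched. Whether the real driver needs such a guard, and in what form, is for the linked fix to settle.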