Message-ID: <aKA7180s0HdLfOKc@harry>
Date: Sat, 16 Aug 2025 17:05:43 +0900
From: Harry Yoo <harry.yoo@...cle.com>
To: Sudarsan Mahendran <sudarsanm@...gle.com>
Cc: vbabka@...e.cz, Liam.Howlett@...cle.com, cl@...two.org, howlett@...il.com,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        maple-tree@...ts.infradead.org, rcu@...r.kernel.org,
        rientjes@...gle.com, roman.gushchin@...ux.dev, surenb@...gle.com,
        urezki@...il.com
Subject: Re: [PATCH v5 00/14] SLUB percpu sheaves

On Fri, Aug 15, 2025 at 03:53:00PM -0700, Sudarsan Mahendran wrote:
> Hi Vlastimil,
> 
> I ported this patch series on top of v6.17.
> I had to resolve some merge conflicts because of 
> fba46a5d83ca8decb338722fb4899026d8d9ead2
> 
> The conflict resolution looks like:
> 
> @@ -5524,20 +5335,19 @@ EXPORT_SYMBOL_GPL(mas_store_prealloc);
>  int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp)
>  {
>         MA_WR_STATE(wr_mas, mas, entry);
> -       int ret = 0;
> -       int request;
> 
>         mas_wr_prealloc_setup(&wr_mas);
>         mas->store_type = mas_wr_store_type(&wr_mas);
> -       request = mas_prealloc_calc(&wr_mas, entry);
> -       if (!request)
> +       mas_prealloc_calc(&wr_mas, entry);
> +       if (!mas->node_request)
>                 goto set_flag;
> 
>         mas->mas_flags &= ~MA_STATE_PREALLOC;
> -       mas_node_count_gfp(mas, request, gfp);
> +       mas_alloc_nodes(mas, gfp);
>         if (mas_is_err(mas)) {
> -               mas_set_alloc_req(mas, 0);
> -               ret = xa_err(mas->node);
> +               int ret = xa_err(mas->node);
> +
> +               mas->node_request = 0;
>                 mas_destroy(mas);
>                 mas_reset(mas);
>                 return ret;
> @@ -5545,7 +5355,7 @@ int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp)
> 
>  set_flag:
>         mas->mas_flags |= MA_STATE_PREALLOC;
> -       return ret;
> +       return 0;
>  }
>  EXPORT_SYMBOL_GPL(mas_preallocate);
> 
> 
> 
> When I try to boot this kernel, I see kernel panic
> with rcu_free_sheaf() doing recursion into __kmem_cache_free_bulk()
> 
> Stack trace:
> 
> [    1.583673] Oops: stack guard page: 0000 [#1] SMP NOPTI
> [    1.583676] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.17.0-smp-sheaves2 #1 NONE
> [    1.583679] RIP: 0010:__kmem_cache_free_bulk+0x57/0x540
> [    1.583684] Code: 48 85 f6 0f 84 b8 04 00 00 49 89 d6 49 89 ff 48 85 ff 0f 84 fe 03 00 00 49 83 7f 08 00 0f 84 f3 03 00 00 0f 1f 44 00 00 31 c0 <48> 89 44 24 18 65 8b 05 6d 26 dc 02 89 44 24 2c 31 ff 89 f8 c7 44
> [    1.583685] RSP: 0018:ff40dbc49b048fc0 EFLAGS: 00010246
> [    1.583687] RAX: 0000000000000000 RBX: 0000000000000012 RCX: ffffffff939e8640
> [    1.583687] RDX: ff2afe75213e6c90 RSI: 0000000000000012 RDI: ff2afe750004ad00
> [    1.583688] RBP: ff40dbc49b049130 R08: ff2afe75368c2500 R09: ff2afe75368c3b00
> [    1.583689] R10: ff2afe75368c2500 R11: ff2afe75368c3b00 R12: ff2aff31ba00b000
> [    1.583690] R13: ffffffff939e8640 R14: ff2afe75213e6c90 R15: ff2afe750004ad00
> [    1.583690] FS:  0000000000000000(0000) GS:ff2aff31ba00b000(0000) knlGS:0000000000000000
> [    1.583691] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    1.583692] CR2: ff40dbc49b048fb8 CR3: 0000000017c3e001 CR4: 0000000000771ef0
> [    1.583692] PKRU: 55555554
> [    1.583693] Call Trace:
> [    1.583694]  <IRQ>
> [    1.583696]  __kmem_cache_free_bulk+0x2c7/0x540

[..]

> [    1.583759]  __kmem_cache_free_bulk+0x2c7/0x540

Hi Sudarsan, thanks for the report.

I'm not really sure how __kmem_cache_free_bulk() can call itself.
There's no recursion of __kmem_cache_free_bulk() in the code.

As v6.17-rc1 is known to cause a few surprising bugs, could you please
rebase onto mm-hotfixes-unstable and check if it still reproduces?

> [    1.583761]  ? update_group_capacity+0xad/0x1f0
> [    1.583763]  ? sched_balance_rq+0x4f6/0x1e80
> [    1.583765]  __kmem_cache_free_bulk+0x2c7/0x540
> [    1.583767]  ? update_irq_load_avg+0x35/0x480
> [    1.583768]  ? __pfx_rcu_free_sheaf+0x10/0x10
> [    1.583769]  rcu_free_sheaf+0x86/0x110
> [    1.583771]  rcu_do_batch+0x245/0x750
> [    1.583772]  rcu_core+0x13a/0x260
> [    1.583773]  handle_softirqs+0xcb/0x270
> [    1.583775]  __irq_exit_rcu+0x48/0xf0
> [    1.583776]  sysvec_apic_timer_interrupt+0x74/0x80
> [    1.583778]  </IRQ>
> [    1.583778]  <TASK>
> [    1.583779]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
> [    1.583780] RIP: 0010:cpuidle_enter_state+0x101/0x290
> [    1.583781] Code: 85 f4 ff ff 49 89 c4 8b 73 04 bf ff ff ff ff e8 d5 44 d4 ff 31 ff e8 9e c7 37 ff 80 7c 24 04 00 74 05 e8 12 45 d4 ff fb 85 ed <0f> 88 ba 00 00 00 89 e9 48 6b f9 68 4c 8b 44 24 08 49 8b 54 38 30
> [    1.583782] RSP: 0018:ff40dbc4809afe80 EFLAGS: 00000202
> [    1.583782] RAX: ff2aff31ba00b000 RBX: ff2afe75614b0800 RCX: 000000005e64b52b
> [    1.583783] RDX: 000000005e73f761 RSI: 0000000000000067 RDI: 0000000000000000
> [    1.583783] RBP: 0000000000000002 R08: fffffffffffffff6 R09: 0000000000000000
> [    1.583784] R10: 0000000000000380 R11: ffffffff908c38d0 R12: 000000005e64b535
> [    1.583784] R13: 000000005e5580da R14: ffffffff92890b10 R15: 0000000000000002
> [    1.583784]  ? __pfx_read_tsc+0x10/0x10
> [    1.583787]  cpuidle_enter+0x2c/0x40
> [    1.583788]  do_idle+0x1a7/0x240
> [    1.583790]  cpu_startup_entry+0x2a/0x30
> [    1.583791]  start_secondary+0x95/0xa0
> [    1.583794]  common_startup_64+0x13e/0x140
> [    1.583796]  </TASK>
> [    1.583796] Modules linked in:
> [    1.583798] ---[ end trace 0000000000000000 ]---
> [    1.583798] RIP: 0010:__kmem_cache_free_bulk+0x57/0x540
> [    1.583800] Code: 48 85 f6 0f 84 b8 04 00 00 49 89 d6 49 89 ff 48 85 ff 0f 84 fe 03 00 00 49 83 7f 08 00 0f 84 f3 03 00 00 0f 1f 44 00 00 31 c0 <48> 89 44 24 18 65 8b 05 6d 26 dc 02 89 44 24 2c 31 ff 89 f8 c7 44
> [    1.583800] RSP: 0018:ff40dbc49b048fc0 EFLAGS: 00010246
> [    1.583801] RAX: 0000000000000000 RBX: 0000000000000012 RCX: ffffffff939e8640
> [    1.583801] RDX: ff2afe75213e6c90 RSI: 0000000000000012 RDI: ff2afe750004ad00
> [    1.583801] RBP: ff40dbc49b049130 R08: ff2afe75368c2500 R09: ff2afe75368c3b00
> [    1.583802] R10: ff2afe75368c2500 R11: ff2afe75368c3b00 R12: ff2aff31ba00b000
> [    1.583802] R13: ffffffff939e8640 R14: ff2afe75213e6c90 R15: ff2afe750004ad00
> [    1.583802] FS:  0000000000000000(0000) GS:ff2aff31ba00b000(0000) knlGS:0000000000000000
> [    1.583803] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    1.583803] CR2: ff40dbc49b048fb8 CR3: 0000000017c3e001 CR4: 0000000000771ef0
> [    1.583803] PKRU: 55555554
> [    1.583804] Kernel panic - not syncing: Fatal exception in interrupt
> [    1.584659] Kernel Offset: 0xf600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> 
> 
