Message-ID: <13a82dab-94c9-4616-90ff-17a8aa7bff81@gmail.com>
Date: Tue, 10 Jun 2025 12:51:04 +0800
From: Wang Jinchao <wangjinchao600@...il.com>
To: Yu Kuai <yukuai1@...weicloud.com>, Song Liu <song@...nel.org>
Cc: linux-raid@...r.kernel.org, linux-kernel@...r.kernel.org,
"yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH] md/raid1: Fix use-after-free in reshape pool wait queue
On 2025/6/10 10:52, Yu Kuai wrote:
> Hi,
>
> On 2025/06/09 20:01, Wang Jinchao wrote:
>> During raid1 reshape operations, a use-after-free can occur in the mempool
>> wait queue when r1bio_pool->curr_nr drops below min_nr. This happens because:
>
> Can you attach the UAF log?
[ 921.784898] [ C2] BUG: kernel NULL pointer dereference, address: 0000000000000002
[ 921.784907] [ C2] #PF: supervisor instruction fetch in kernel mode
[ 921.784910] [ C2] #PF: error_code(0x0010) - not-present page
[ 921.784912] [ C2] PGD 0 P4D 0
[ 921.784915] [ C2] Oops: 0010 [#1] PREEMPT SMP NOPTI
[ 921.784919] [ C2] CPU: 2 PID: 1659 Comm: zds Kdump: loaded Tainted: G U W E 6.8.1-debug-0519 #49
[ 921.784922] [ C2] Hardware name: Default string Default string/Default string, BIOS DNS9V011 12/24/2024
[ 921.784923] [ C2] RIP: 0010:0x2
[ 921.784929] [ C2] Code: Unable to access opcode bytes at 0xffffffffffffffd8.
[ 921.784931] [ C2] RSP: 0000:ffffa3fac0220c70 EFLAGS: 00010087
[ 921.784933] [ C2] RAX: 0000000000000002 RBX: ffff8890539070d8 RCX: 0000000000000000
[ 921.784935] [ C2] RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffffa3fac07dfc90
[ 921.784936] [ C2] RBP: ffffa3fac0220ca8 R08: 2557c7cc905cff00 R09: 0000000000000000
[ 921.784938] [ C2] R10: 0000000000000000 R11: 0000000000000000 R12: 000000008fa158a0
[ 921.784939] [ C2] R13: 2557c7cc905cfee8 R14: 0000000000000000 R15: 0000000000000000
[ 921.784941] [ C2] FS: 00007d8b034006c0(0000) GS:ffff8891bf900000(0000) knlGS:0000000000000000
[ 921.784943] [ C2] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 921.784945] [ C2] CR2: ffffffffffffffd8 CR3: 00000001097be000 CR4: 0000000000f50ef0
[ 921.784946] [ C2] PKRU: 55555554
[ 921.784948] [ C2] Call Trace:
[ 921.784949] [ C2] <IRQ>
[ 921.784950] [ C2] ? show_regs+0x6d/0x80
[ 921.784957] [ C2] ? __die+0x24/0x80
[ 921.784960] [ C2] ? page_fault_oops+0x156/0x4b0
[ 921.784964] [ C2] ? mempool_free_slab+0x17/0x30
[ 921.784968] [ C2] ? __slab_free+0x15d/0x2e0
[ 921.784971] [ C2] ? do_user_addr_fault+0x2ee/0x6b0
[ 921.784975] [ C2] ? exc_page_fault+0x83/0x1b0
[ 921.784979] [ C2] ? asm_exc_page_fault+0x27/0x30
[ 921.784984] [ C2] ? __wake_up_common+0x76/0xb0
[ 921.784987] [ C2] __wake_up+0x37/0x70
[ 921.784990] [ C2] mempool_free+0xaa/0xc0
[ 921.784993] [ C2] raid_end_bio_io+0x97/0x130 [raid1]
[ 921.784999] [ C2] r1_bio_write_done+0x43/0x60 [raid1]
[ 921.785005] [ C2] raid1_end_write_request+0x10e/0x390 [raid1]
[ 921.785011] [ C2] ? scsi_done_internal+0x6f/0xd0
[ 921.785014] [ C2] ? scsi_done+0x10/0x20
[ 921.785016] [ C2] bio_endio+0xeb/0x180
[ 921.785021] [ C2] blk_update_request+0x175/0x540
[ 921.785024] [ C2] ? ata_qc_complete+0xc9/0x340
[ 921.785027] [ C2] scsi_end_request+0x2c/0x1c0
[ 921.785029] [ C2] scsi_io_completion+0x5a/0x700
[ 921.785031] [ C2] scsi_finish_command+0xc3/0x110
[ 921.785033] [ C2] scsi_complete+0x7a/0x180
[ 921.785035] [ C2] blk_complete_reqs+0x41/0x60
[ 921.785038] [ C2] blk_done_softirq+0x1d/0x30
[ 921.785041] [ C2] __do_softirq+0xde/0x35b
[ 921.785045] [ C2] __irq_exit_rcu+0xd7/0x100
[ 921.785048] [ C2] irq_exit_rcu+0xe/0x20
[ 921.785050] [ C2] common_interrupt+0xa4/0xb0
[ 921.785052] [ C2] </IRQ>
[ 921.785053] [ C2] <TASK>
[ 921.785055] [ C2] asm_common_interrupt+0x27/0x40
[ 921.785058] [ C2] RIP: 0010:vma_interval_tree_subtree_search+0x16/0x80
[ 921.785061] [ C2] Code: cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 8b 47 50 48 89 e5 48 85 c0 74 06 48 39 70 18 <73> 30 48 8b 8f 80 00 00 00 48 39 ca 72 33 48 8b 47 08 48 2b 07 48
[ 921.785064] [ C2] RSP: 0000:ffffa3fac193b200 EFLAGS: 00000206
[ 921.785066] [ C2] RAX: ffff889127715368 RBX: ffff889103bde488 RCX: 0000000000000000
[ 921.785068] [ C2] RDX: 00000000000002a8 RSI: 00000000000002a8 RDI: ffff88910a4d70e8
[ 921.785069] [ C2] RBP: ffffa3fac193b200 R08: 00000000000002a8 R09: 0000000000000000
[ 921.785071] [ C2] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa3fac193b288
[ 921.785072] [ C2] R13: 0000799e22def000 R14: ffffde4f0455d240 R15: 00000000000002a8
[ 921.785075] [ C2] vma_interval_tree_iter_next+0x98/0xb0
[ 921.785077] [ C2] rmap_walk_file+0xe4/0x1a0
[ 921.785081] [ C2] folio_referenced+0x14c/0x1f0
[ 921.785083] [ C2] ? __pfx_folio_referenced_one+0x10/0x10
[ 921.785086] [ C2] ? __pfx_folio_lock_anon_vma_read+0x10/0x10
[ 921.785089] [ C2] ? __pfx_invalid_folio_referenced_vma+0x10/0x10
[ 921.785092] [ C2] shrink_folio_list+0x7c4/0xc80
[ 921.785095] [ C2] evict_folios+0x32c/0x9f0
[ 921.785098] [ C2] ? xa_load+0x86/0xf0
[ 921.785103] [ C2] try_to_shrink_lruvec+0x1f3/0x390
[ 921.785105] [ C2] ? get_swappiness+0x56/0x90
[ 921.785109] [ C2] shrink_one+0x123/0x1e0
[ 921.785111] [ C2] shrink_node+0x9f1/0xcb0
[ 921.785114] [ C2] ? vmpressure+0x37/0x180
[ 921.785118] [ C2] do_try_to_free_pages+0xdf/0x600
[ 921.785121] [ C2] ? throttle_direct_reclaim+0x105/0x280
[ 921.785124] [ C2] try_to_free_pages+0xe6/0x210
[ 921.785128] [ C2] __alloc_pages+0x70c/0x1360
[ 921.785131] [ C2] ? __do_fault+0x38/0x140
[ 921.785133] [ C2] ? do_fault+0x295/0x4c0
[ 921.785135] [ C2] ? do_user_addr_fault+0x169/0x6b0
[ 921.785138] [ C2] ? exc_page_fault+0x83/0x1b0
[ 921.785141] [ C2] ? stack_depot_save+0xe/0x20
[ 921.785145] [ C2] ? set_track_prepare+0x48/0x70
[ 921.785148] [ C2] ? squashfs_page_actor_init_special+0x16e/0x190
[ 921.785152] [ C2] ? squashfs_readahead+0x562/0x8b0
[ 921.785155] [ C2] ? read_pages+0x6a/0x270
[ 921.785158] [ C2] ? policy_nodemask+0xe1/0x150
[ 921.785161] [ C2] alloc_pages_mpol+0x91/0x210
[ 921.785164] [ C2] ? __kmalloc+0x448/0x4e0
[ 921.785166] [ C2] alloc_pages+0x5b/0xd0
[ 921.785169] [ C2] squashfs_bio_read+0x1b4/0x5e0
[ 921.785172] [ C2] squashfs_read_data+0xba/0x650
[ 921.785175] [ C2] squashfs_readahead+0x599/0x8b0
[ 921.785178] [ C2] ? __lruvec_stat_mod_folio+0x70/0xc0
[ 921.785180] [ C2] read_pages+0x6a/0x270
[ 921.785184] [ C2] page_cache_ra_unbounded+0x167/0x1c0
[ 921.785187] [ C2] page_cache_ra_order+0x2cf/0x350
[ 921.785191] [ C2] filemap_fault+0x5f9/0xc10
[ 921.785195] [ C2] __do_fault+0x38/0x140
[ 921.785197] [ C2] do_fault+0x295/0x4c0
[ 921.785199] [ C2] __handle_mm_fault+0x896/0xee0
[ 921.785202] [ C2] ? __fdget+0xc7/0xf0
[ 921.785206] [ C2] handle_mm_fault+0x18a/0x380
[ 921.785209] [ C2] do_user_addr_fault+0x169/0x6b0
[ 921.785212] [ C2] exc_page_fault+0x83/0x1b0
[ 921.785215] [ C2] asm_exc_page_fault+0x27/0x30
[ 921.785218] [ C2] RIP: 0033:0x7d8b0391ad50
[ 921.785221] [ C2] Code: Unable to access opcode bytes at 0x7d8b0391ad26.
[ 921.785223] [ C2] RSP: 002b:00007d8b033ff9f8 EFLAGS: 00010293
[ 921.785225] [ C2] RAX: 000000000000000d RBX: 00007d8b033ffab8 RCX: 000000000000000e
[ 921.785226] [ C2] RDX: 00007d8b033ffa56 RSI: 00007d8b033ffa49 RDI: 00007d8b033ffac8
[ 921.785228] [ C2] RBP: 00007d8b033ffa49 R08: 00007d8b033ffab8 R09: 0000000000000000
[ 921.785229] [ C2] R10: 0000000000000000 R11: 0000000000000000 R12: 00007d8b033ffa56
[ 921.785230] [ C2] R13: 0000000000001388 R14: 0000000000000016 R15: 00006306da6e18d0
[ 921.785233] [ C2] </TASK>
[ 921.785234] [ C2] Modules linked in: raid1(E) xt_geoip(E)
bcache(E) binfmt_misc(E) cfg80211(E) sch_fq_codel(E) nf_tables(E)
zfuse(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E)
nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) autofs4(E)
nls_iso8859_1(E) btrfs(E) blake2b_generic(E) xor(E) raid6_pq(E)
libcrc32c(E) input_leds(E) joydev(E) hid_generic(E) xe(E)
snd_hda_codec_hdmi(E) drm_ttm_helper(E) gpu_sched(E)
drm_suballoc_helper(E) drm_gpuvm(E) drm_exec(E) snd_sof_pci_intel_tgl(E)
snd_sof_intel_hda_common(E) snd_soc_hdac_hda(E) snd_sof_pci(E)
snd_sof_xtensa_dsp(E) snd_sof_intel_hda(E) snd_sof(E) snd_sof_utils(E)
snd_soc_acpi_intel_match(E) x86_pkg_temp_thermal(E) snd_soc_acpi(E)
intel_powerclamp(E) coretemp(E) snd_soc_core(E) snd_compress(E)
snd_sof_intel_hda_mlink(E) snd_hda_ext_core(E) kvm_intel(E)
snd_hda_intel(E) mmc_block(E) snd_intel_dspcfg(E) kvm(E) irqbypass(E)
crct10dif_pclmul(E) crc32_pclmul(E) polyval_clmulni(E)
polyval_generic(E) ghash_clmulni_intel(E) snd_hda_codec(E) snd_hwdep(E)
[ 921.785278] [ C2] sha256_ssse3(E) snd_hda_core(E) mei_hdcp(E)
i915(E) sha1_ssse3(E) mei_pxp(E) processor_thermal_device_pci(E)
snd_pcm(E) processor_thermal_device(E) i2c_algo_bit(E) ee1004(E) rapl(E)
snd_timer(E) nvme(E) sdhci_pci(E) cqhci(E) drm_buddy(E) cmdlinepart(E)
ttm(E) spi_nor(E) sdhci(E) mei_me(E) processor_thermal_wt_hint(E)
drm_display_helper(E) nvme_core(E) snd(E) mtd(E) intel_rapl_msr(E)
intel_cstate(E) efi_pstore(E) nvme_auth(E) soundcore(E) xhci_pci(E)
xhci_pci_renesas(E) processor_thermal_rfim(E) processor_thermal_rapl(E)
i2c_i801(E) mei(E) spi_intel_pci(E) spi_intel(E) ahci(E) r8125(E)
intel_rapl_common(E) processor_thermal_wt_req(E) libahci(E) i2c_smbus(E)
mmc_core(E) processor_thermal_power_floor(E) cec(E) rc_core(E)
processor_thermal_mbox(E) int340x_thermal_zone(E) wmi_bmof(E) video(E)
intel_pmc_core(E) int3400_thermal(E) intel_hid(E) intel_vsec(E)
pmt_telemetry(E) wmi(E) pmt_class(E) acpi_thermal_rel(E)
sparse_keymap(E) acpi_pad(E) acpi_tad(E) mac_hid(E) aesni_intel(E)
crypto_simd(E) cryptd(E)
[ 921.785326] [ C2] CR2: 0000000000000002
>>
>> 1. mempool_init() initializes wait queue head on stack
>> 2. The stack-allocated wait queue is copied to conf->r1bio_pool through
>> structure assignment
>> 3. wake_up() on this invalid wait queue causes panic when accessing the
>> stack memory that no longer exists
>
> The list_head inside wait_queue_head?
newpool.wait.head
>>
>> Fix this by properly reinitializing the mempool's wait queue using
>> init_waitqueue_head(), ensuring the wait queue structure remains valid
>> throughout the reshape operation.
>>
>> Signed-off-by: Wang Jinchao <wangjinchao600@...il.com>
>> ---
>> drivers/md/raid1.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
>> index 19c5a0ce5a40..fd4ce2a4136f 100644
>> --- a/drivers/md/raid1.c
>> +++ b/drivers/md/raid1.c
>> @@ -3428,6 +3428,7 @@ static int raid1_reshape(struct mddev *mddev)
>> /* ok, everything is stopped */
>> oldpool = conf->r1bio_pool;
>> conf->r1bio_pool = newpool;
>> + init_waitqueue_head(&conf->r1bio_pool.wait);
>
> I think the real problem here is the above assignment; it's better to
> fix that instead of reinitializing the list.
>
I believe you're aware of the issue, but just to be sure, I'll restate it:
1) mempool_t newpool, oldpool; — newpool is allocated on raid1_reshape()'s stack
2) mempool_init(&newpool, ...) initializes newpool.wait, so its list head points
to a stack address
3) conf->r1bio_pool = newpool copies those pointers unchanged, and the stack
frame is later freed or reused by another function
4) mempool_free() calls wake_up(&pool->wait), which walks the wait queue
through the now-invalid stack address → kernel panic
This fix is simple enough.
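
For completeness, here is a minimal user-space sketch of the pattern (stand-in
types only, not the kernel's wait_queue_head_t/mempool_t/raid1 code) showing
why the structure assignment leaves a dangling list head:

#include <stdio.h>

/* stand-ins for list_head / wait_queue_head_t / mempool_t */
struct list_head { struct list_head *next, *prev; };
struct waitqueue { struct list_head head; };
struct pool      { struct waitqueue wait; };

static void waitqueue_init(struct waitqueue *wq)
{
	/* like init_waitqueue_head(): the list points back into this object */
	wq->head.next = wq->head.prev = &wq->head;
}

static struct pool global_pool;			/* stand-in for conf->r1bio_pool */

static void reshape(void)
{
	struct pool newpool;			/* lives in this stack frame */

	waitqueue_init(&newpool.wait);		/* head points at newpool (stack) */
	global_pool = newpool;			/* struct copy keeps the stack address */
	/* the patch re-initializes the copy so it points at itself again:
	 * waitqueue_init(&global_pool.wait);
	 */
}

int main(void)
{
	reshape();
	/* still points into reshape()'s dead stack frame */
	printf("head.next=%p  &global_pool.wait.head=%p\n",
	       (void *)global_pool.wait.head.next,
	       (void *)&global_pool.wait.head);
	return 0;
}

After reshape() returns, global_pool.wait.head.next still points into the dead
stack frame, which is exactly the pointer wake_up() would follow.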
Alternatively, we could initialize conf->r1bio_pool directly, but that would
also require handling rollback in case the initialization fails.
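
If we go the in-place route, I imagine something roughly like the sketch
below. mempool_exit()/mempool_init() are the real mempool helpers, but
min_nr, alloc_fn, free_fn, pool_data and the abort label are placeholders
for whatever raid1_reshape() already passes, not the actual raid1.c
identifiers, and the rollback is exactly the awkward part:

	/* hedged sketch only, not a tested patch */
	mempool_exit(&conf->r1bio_pool);	/* tear down the old pool in place */
	ret = mempool_init(&conf->r1bio_pool, min_nr,
			   alloc_fn, free_fn, pool_data);
	if (ret) {
		/*
		 * Rollback is the hard part: the old pool is already gone
		 * here, so we would have to re-create it (or fail the
		 * reshape before touching conf->r1bio_pool at all).
		 */
		goto abort;
	}
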
What would you suggest?
> Thanks,
> Kuai
>
>> for (d = d2 = 0; d < conf->raid_disks; d++) {
>> struct md_rdev *rdev = conf->mirrors[d].rdev;
>>
>