lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <8C4EFDA4-A286-40C9-8F96-BD3EE07D6C45@gmail.com>
Date: Mon, 14 Oct 2024 13:08:29 +0900
From: Jeongjun Park <aha310510@...il.com>
To: Kairui Song <ryncsn@...il.com>, akpm@...ux-foundation.org
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
 syzbot+fa43f1b63e3aa6f66329@...kaller.appspotmail.com
Subject: Re: [PATCH v2] mm: swap: prevent possible data-race in __try_to_reclaim_swap



> Kairui Song <ryncsn@...il.com> wrote:
> 
> On Mon, Oct 14, 2024 at 10:17 AM Jeongjun Park <aha310510@...il.com> wrote:
>>> Kairui Song <ryncsn@...il.com> wrote:
>>> 
>>> On Mon, Oct 7, 2024 at 3:06 PM Jeongjun Park <aha310510@...il.com> wrote:
>>>> 
>>>> A report [1] was uploaded from syzbot.
>>>> 
>>>> In the previous commit 862590ac3708 ("mm: swap: allow cache reclaim to skip
>>>> slot cache"), the __try_to_reclaim_swap() function reads offset and folio->entry
>>>> from folio without folio_lock protection.
>>>> 
>>>> In the currently reported KCSAN log, it is assumed that the actual data-race
>>>> will not occur because the calltrace that does WRITE already obtains the
>>>> folio_lock and then writes.
>>>> 
>>>> However, the existing __try_to_reclaim_swap() function was already implemented
>>>> to perform reads under folio_lock protection [1], and there is a risk of a
>>>> data-race occurring through a function other than the one shown in the KCSAN
>>>> log.
>>>> 
>>>> Therefore, I think it is appropriate to change read operations for
>>>> folio to be performed under folio_lock.
>>>> 
>>>> [1]
>>>> 
>>>> ==================================================================
>>>> BUG: KCSAN: data-race in __delete_from_swap_cache / __try_to_reclaim_swap
>>>> 
>>>> write to 0xffffea0004c90328 of 8 bytes by task 5186 on cpu 0:
>>>> __delete_from_swap_cache+0x1f0/0x290 mm/swap_state.c:163
>>>> delete_from_swap_cache+0x72/0xe0 mm/swap_state.c:243
>>>> folio_free_swap+0x1d8/0x1f0 mm/swapfile.c:1850
>>>> free_swap_cache mm/swap_state.c:293 [inline]
>>>> free_pages_and_swap_cache+0x1fc/0x410 mm/swap_state.c:325
>>>> __tlb_batch_free_encoded_pages mm/mmu_gather.c:136 [inline]
>>>> tlb_batch_pages_flush mm/mmu_gather.c:149 [inline]
>>>> tlb_flush_mmu_free mm/mmu_gather.c:366 [inline]
>>>> tlb_flush_mmu+0x2cf/0x440 mm/mmu_gather.c:373
>>>> zap_pte_range mm/memory.c:1700 [inline]
>>>> zap_pmd_range mm/memory.c:1739 [inline]
>>>> zap_pud_range mm/memory.c:1768 [inline]
>>>> zap_p4d_range mm/memory.c:1789 [inline]
>>>> unmap_page_range+0x1f3c/0x22d0 mm/memory.c:1810
>>>> unmap_single_vma+0x142/0x1d0 mm/memory.c:1856
>>>> unmap_vmas+0x18d/0x2b0 mm/memory.c:1900
>>>> exit_mmap+0x18a/0x690 mm/mmap.c:1864
>>>> __mmput+0x28/0x1b0 kernel/fork.c:1347
>>>> mmput+0x4c/0x60 kernel/fork.c:1369
>>>> exit_mm+0xe4/0x190 kernel/exit.c:571
>>>> do_exit+0x55e/0x17f0 kernel/exit.c:926
>>>> do_group_exit+0x102/0x150 kernel/exit.c:1088
>>>> get_signal+0xf2a/0x1070 kernel/signal.c:2917
>>>> arch_do_signal_or_restart+0x95/0x4b0 arch/x86/kernel/signal.c:337
>>>> exit_to_user_mode_loop kernel/entry/common.c:111 [inline]
>>>> exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
>>>> __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
>>>> syscall_exit_to_user_mode+0x59/0x130 kernel/entry/common.c:218
>>>> do_syscall_64+0xd6/0x1c0 arch/x86/entry/common.c:89
>>>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>> 
>>>> read to 0xffffea0004c90328 of 8 bytes by task 5189 on cpu 1:
>>>> __try_to_reclaim_swap+0x9d/0x510 mm/swapfile.c:198
>>>> free_swap_and_cache_nr+0x45d/0x8a0 mm/swapfile.c:1915
>>>> zap_pte_range mm/memory.c:1656 [inline]
>>>> zap_pmd_range mm/memory.c:1739 [inline]
>>>> zap_pud_range mm/memory.c:1768 [inline]
>>>> zap_p4d_range mm/memory.c:1789 [inline]
>>>> unmap_page_range+0xcf8/0x22d0 mm/memory.c:1810
>>>> unmap_single_vma+0x142/0x1d0 mm/memory.c:1856
>>>> unmap_vmas+0x18d/0x2b0 mm/memory.c:1900
>>>> exit_mmap+0x18a/0x690 mm/mmap.c:1864
>>>> __mmput+0x28/0x1b0 kernel/fork.c:1347
>>>> mmput+0x4c/0x60 kernel/fork.c:1369
>>>> exit_mm+0xe4/0x190 kernel/exit.c:571
>>>> do_exit+0x55e/0x17f0 kernel/exit.c:926
>>>> __do_sys_exit kernel/exit.c:1055 [inline]
>>>> __se_sys_exit kernel/exit.c:1053 [inline]
>>>> __x64_sys_exit+0x1f/0x20 kernel/exit.c:1053
>>>> x64_sys_call+0x2d46/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:61
>>>> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>>>> do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83
>>>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>> 
>>>> value changed: 0x0000000000000242 -> 0x0000000000000000
>>>> 
>>>> Reported-by: syzbot+fa43f1b63e3aa6f66329@...kaller.appspotmail.com
>>>> Fixes: 862590ac3708 ("mm: swap: allow cache reclaim to skip slot cache")
>>>> Signed-off-by: Jeongjun Park <aha310510@...il.com>
>>>> ---
>>>> mm/swapfile.c | 7 ++++---
>>>> 1 file changed, 4 insertions(+), 3 deletions(-)
>>>> 
>>>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>>>> index 0cded32414a1..eb782fcd5627 100644
>>>> --- a/mm/swapfile.c
>>>> +++ b/mm/swapfile.c
>>>> @@ -194,9 +194,6 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si,
>>>>       if (IS_ERR(folio))
>>>>               return 0;
>>>> 
>>>> -       /* offset could point to the middle of a large folio */
>>>> -       entry = folio->swap;
>>>> -       offset = swp_offset(entry);
>>>>       nr_pages = folio_nr_pages(folio);
>>>>       ret = -nr_pages;
>>>> 
>>>> @@ -210,6 +207,10 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si,
>>>>       if (!folio_trylock(folio))
>>>>               goto out;
>>>> 
>>>> +       /* offset could point to the middle of a large folio */
>>>> +       entry = folio->swap;
>>>> +       offset = swp_offset(entry);
>>>> +
>>>>       need_reclaim = ((flags & TTRS_ANYWAY) ||
>>>>                       ((flags & TTRS_UNMAPPED) && !folio_mapped(folio)) ||
>>>>                       ((flags & TTRS_FULL) && mem_cgroup_swap_full(folio)));
>>>> --
>>> 
>>> Reviewed-by: Kairui Song <kasong@...cent.com>
>>> 
>>> Hi Andrew,
>>> 
>>> Will this be added to stable and 6.12? 862590ac3708 is already in 6.12
>>> and this fixes a potential issue of it.
>> 
>> As far as I can see, commit 862590ac3708 was applied starting
>> from 6.12-rc1, so it looks like no additional commits are needed
>> for the stable version.
> 
> Hi, sorry for the confusion, I meant mm-stable, not the stable branch.
> It's better to merge this in 6.12.

I agree with you. I think this vulnerability should be fixed quickly,
so it should be applied directly to the next rc version, not the
next tree. However, this vulnerability does not affect the stable 
version, so I think it is appropriate to move this patch to the
mm-hotfixes-unstable tree.

What do you think, Andrew?

Regards,

Jeongjun Park

> 
>> Regards,
>> 
>> Jeongjun Park

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ