linux-kernel - Re: [PATCH 2/5] swap: fix do_swap

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <2268e7f9-cfdf-92f3-9a32-04e103e132fc@huawei.com>
Date:   Mon, 12 Apr 2021 11:24:58 +0800
From:   Miaohe Lin <linmiaohe@...wei.com>
To:     "Huang, Ying" <ying.huang@...el.com>
CC:     Tim Chen <tim.c.chen@...ux.intel.com>, <akpm@...ux-foundation.org>,
        <hannes@...xchg.org>, <mhocko@...e.com>, <iamjoonsoo.kim@....com>,
        <vbabka@...e.cz>, <alex.shi@...ux.alibaba.com>,
        <willy@...radead.org>, <minchan@...nel.org>,
        <richard.weiyang@...il.com>, <hughd@...gle.com>,
        <linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>
Subject: Re: [PATCH 2/5] swap: fix do_swap_page() race with swapoff

On 2021/4/12 9:44, Huang, Ying wrote:
> Miaohe Lin <linmiaohe@...wei.com> writes:
> 
>> On 2021/4/10 1:17, Tim Chen wrote:
>>>
>>>
>>> On 4/9/21 1:42 AM, Miaohe Lin wrote:
>>>> On 2021/4/9 5:34, Tim Chen wrote:
>>>>>
>>>>>
>>>>> On 4/8/21 6:08 AM, Miaohe Lin wrote:
>>>>>> When I was investigating the swap code, I found the below possible race
>>>>>> window:
>>>>>>
>>>>>> CPU 1					CPU 2
>>>>>> -----					-----
>>>>>> do_swap_page
>>>>>>   synchronous swap_readpage
>>>>>>     alloc_page_vma
>>>>>> 					swapoff
>>>>>> 					  release swap_file, bdev, or ...
>>>>>
>>>>
>>>> Many thanks for quick review and reply!
>>>>
>>>>> Perhaps I'm missing something.  The release of swap_file, bdev etc
>>>>> happens after we have cleared the SWP_VALID bit in si->flags in destroy_swap_extents
>>>>> if I read the swapoff code correctly.
>>>> Agree. Let's look this more close:
>>>> CPU1								CPU2
>>>> -----								-----
>>>> swap_readpage
>>>>   if (data_race(sis->flags & SWP_FS_OPS)) {
>>>> 								swapoff
>>>> 								  p->swap_file = NULL;
>>>>     struct file *swap_file = sis->swap_file;
>>>>     struct address_space *mapping = swap_file->f_mapping;[oops!]
>>>> 								  ...
>>>> 								  p->flags = 0;
>>>>     ...
>>>>
>>>> Does this make sense for you?
>>>
>>> p->swapfile = NULL happens after the 
>>> p->flags &= ~SWP_VALID, synchronize_rcu(), destroy_swap_extents() sequence in swapoff().
>>>
>>> So I don't think the sequence you illustrated on CPU2 is in the right order.
>>> That said, without get_swap_device/put_swap_device in swap_readpage, you could
>>> potentially blow pass synchronize_rcu() on CPU2 and causes a problem.  so I think
>>> the problematic race looks something like the following:
>>>
>>>
>>> CPU1								CPU2
>>> -----								-----
>>> swap_readpage
>>>   if (data_race(sis->flags & SWP_FS_OPS)) {
>>> 								swapoff
>>> 								  p->flags = &= ~SWP_VALID;
>>> 								  ..
>>> 								  synchronize_rcu();
>>> 								  ..
>>> 								  p->swap_file = NULL;
>>>     struct file *swap_file = sis->swap_file;
>>>     struct address_space *mapping = swap_file->f_mapping;[oops!]
>>> 								  ...
>>>     ...
>>>
>>
>> Agree. This is also what I meant to illustrate. And you provide a better one. Many thanks!
> 
> For the pages that are swapped in through swap cache.  That isn't an
> issue.  Because the page is locked, the swap entry will be marked with
> SWAP_HAS_CACHE, so swapoff() cannot proceed until the page has been
> unlocked.
> 
> So the race is for the fast path as follows,
> 
> 		if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
> 		    __swap_count(entry) == 1)
> 
> I found it in your original patch description.  But please make it more
> explicit to reduce the potential confusing.

Sure. Should I rephrase the commit log to clarify this or add a comment in the code?

Thanks.

> 
> Best Regards,
> Huang, Ying
> .
>