Message-ID: <e1903d27-ff8e-adb2-ac64-5af662b99d1f@kernel.org>
Date:   Mon, 1 Nov 2021 16:11:06 +0800
From:   Chao Yu <chao@...nel.org>
To:     Hyeong-Jun Kim <hj514.kim@...sung.com>,
        Jaegeuk Kim <jaegeuk@...nel.org>
Cc:     linux-f2fs-devel@...ts.sourceforge.net,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] F2FS: invalidate META_MAPPING before IPU/DIO write

On 2021/11/1 15:23, Hyeong-Jun Kim wrote:
> On Mon, 2021-11-01 at 15:12 +0800, Chao Yu wrote:
>> On 2021/11/1 15:09, Hyeong-Jun Kim wrote:
>>> On Mon, 2021-11-01 at 14:28 +0800, Chao Yu wrote:
>>>> On 2021/11/1 13:42, Hyeong-Jun Kim wrote:
>>>>> Encrypted pages during GC are read and cached in META_MAPPING.
>>>>> However, due to cached pages in META_MAPPING, there is an issue
>>>>> where newly written pages are lost by IPU or DIO writes.
>>>>>
>>>>> Thread A                              Thread B
>>>>> - f2fs_gc(): blk 0x10 -> 0x20 (a)
>>>>>                                       - IPU or DIO write on blk 0x20 (b)
>>>>> - f2fs_gc(): blk 0x20 -> 0x30 (c)
>>>>>
>>>>> (a) The page for blk 0x20 is cached in META_MAPPING and the page
>>>>>     for blk 0x10 is invalidated from META_MAPPING.
>>>>> (b) New data is written to blk 0x20 using IPU or DIO, but the
>>>>>     outdated data still remains in META_MAPPING.
>>>>> (c) f2fs_gc() tries to move the block from 0x20 to 0x30 using the
>>>>>     cached page in META_MAPPING. In conclusion, the newly written
>>>>>     data in (b) is lost.
>>>>
>>>> In (c), f2fs_gc() will read ahead the encrypted block from disk via
>>>> ra_data_block() anyway, no matter whether the cached encrypted page
>>>> of the meta inode is uptodate or not, so it's safe, right?
>>>
>>> Right.
>>> However, if a DIO write is performed between phase 3 and phase 4 of
>>> f2fs_gc(), the cached page in META_MAPPING will be outdated, even
>>> though it was read from disk via ra_data_block() in phase 3.
>>>
>>> What do you think?
>>
>> Due to i_gc_rwsem lock coverage, the race condition should not happen
>> right now?
>>
> - Thread A                        - Thread B
> /* phase 3 */
> down_write(i_gc_rwsem)
> ra_data_block()
> up_write(i_gc_rwsem)
>                                   f2fs_direct_IO() :
>                                     down_read(i_gc_rwsem)
>                                     __blockdev_direct_IO()
>                                       ...
>                                       get_data_block_dio_write()
>                                       ...
>                                       f2fs_dio_submit_bio()
>                                     up_read(i_gc_rwsem)
> /* phase 4 */
> down_write(i_gc_rwsem)
> move_data_block()
> up_write(i_gc_rwsem)
> 
> It looks like i_gc_rwsem cannot protect the page from being updated
> between phase 3 and phase 4.
> 
> Am I missing anything?

It looks like you're right: there is a hole between the readahead and move-page functions...
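
To make the hole concrete, here is a minimal user-space sketch of the sequence above. It is a toy model under stated assumptions, not kernel code: `struct toy_fs`, the `toy_*` helpers and `run_scenario()` are hypothetical names, `cache[]` merely stands in for META_MAPPING, and the locking is omitted because the three steps already run in the order shown in the diagram.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_BLOCKS 0x40

/* Hypothetical toy model: disk[] is the storage, cache[] stands in for META_MAPPING. */
struct toy_fs {
	int  disk[NR_BLOCKS];
	int  cache[NR_BLOCKS];
	bool cached[NR_BLOCKS];
};

/* Phase 3 analogue: read the block from disk and keep a cached copy. */
static void toy_ra_data_block(struct toy_fs *fs, unsigned int blk)
{
	assert(blk < NR_BLOCKS);
	fs->cache[blk] = fs->disk[blk];
	fs->cached[blk] = true;
}

/* IPU/DIO analogue: write straight to disk, bypassing the cache. */
static void toy_direct_write(struct toy_fs *fs, unsigned int blk, int data,
			     bool invalidate_first)
{
	assert(blk < NR_BLOCKS);
	if (invalidate_first)
		fs->cached[blk] = false;	/* the idea behind the patch */
	fs->disk[blk] = data;
}

/* Phase 4 analogue: move the block, preferring the cached copy if one exists. */
static void toy_move_data_block(struct toy_fs *fs, unsigned int src,
				unsigned int dst)
{
	assert(src < NR_BLOCKS && dst < NR_BLOCKS);
	fs->disk[dst] = fs->cached[src] ? fs->cache[src] : fs->disk[src];
	fs->cached[src] = false;
}

/* Replays phase 3 -> direct write -> phase 4 and returns what lands at 0x30. */
int run_scenario(bool invalidate_first)
{
	struct toy_fs fs;

	memset(&fs, 0, sizeof(fs));
	fs.disk[0x20] = 1;				/* old data */
	toy_ra_data_block(&fs, 0x20);			/* Thread A, phase 3 */
	toy_direct_write(&fs, 0x20, 2, invalidate_first); /* Thread B, new data */
	toy_move_data_block(&fs, 0x20, 0x30);		/* Thread A, phase 4 */
	return fs.disk[0x30];
}
```

In this model, `run_scenario(false)` moves the stale cached copy, so the new data is lost, while `run_scenario(true)` drops the cached copy before the write, so the move picks up the new data from disk.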

Could you please update the race condition description, and add the tag below so that
stable kernels get the fix as well:

Fixes: 6aa58d8ad20a ("f2fs: readahead encrypted block during GC")

Thanks,

> 
> Thanks
> 
>> Thanks,
>>
>>> Thanks,
>>>> Am I missing anything?
>>>>
>>>> Thanks,
>>>>
>>>>> To address this issue, invalidate pages in META_MAPPING before
>>>>> an IPU or DIO write.
>>>>>
>>>>> Signed-off-by: Hyeong-Jun Kim <hj514.kim@...sung.com>
>>>>>
>>>>> ---
>>>>>     fs/f2fs/data.c    | 2 ++
>>>>>     fs/f2fs/segment.c | 3 +++
>>>>>     2 files changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>>>> index 74e1a350c1d8..9f754aaef558 100644
>>>>> --- a/fs/f2fs/data.c
>>>>> +++ b/fs/f2fs/data.c
>>>>> @@ -1708,6 +1708,8 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>>>>>     		 */
>>>>>     		f2fs_wait_on_block_writeback_range(inode,
>>>>>     						map->m_pblk, map->m_len);
>>>>> +		invalidate_mapping_pages(META_MAPPING(sbi),
>>>>> +						map->m_pblk, map->m_pblk);
>>>>>     
>>>>>     		if (map->m_multidev_dio) {
>>>>>     			block_t blk_addr = map->m_pblk;
>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
>>>>> index 526423fe84ce..f57c55190f9e 100644
>>>>> --- a/fs/f2fs/segment.c
>>>>> +++ b/fs/f2fs/segment.c
>>>>> @@ -3652,6 +3652,9 @@ int f2fs_inplace_write_data(struct f2fs_io_info *fio)
>>>>>     		goto drop_bio;
>>>>>     	}
>>>>>     
>>>>> +	invalidate_mapping_pages(META_MAPPING(fio->sbi),
>>>>> +				fio->new_blkaddr, fio->new_blkaddr);
>>>>> +
>>>>>     	stat_inc_inplace_blocks(fio->sbi);
>>>>>     
>>>>>     	if (fio->bio && !(SM_I(sbi)->ipu_policy & (1 << F2FS_IPU_NOCACHE)))
>>>>>
>>>>
>>>>
>>
>>
> 
