Message-ID: <0fcdb6bc-68e6-639b-4710-7aaadda62ae1@linux.intel.com>
Date: Sat, 9 Mar 2024 21:59:33 -0500
From: Lei Huang <lei.huang@...ux.intel.com>
To: Miklos Szeredi <miklos@...redi.hu>,
 Bernd Schubert <bernd.schubert@...tmail.fm>
Cc: linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH v1] fs/fuse: Fix missing FOLL_PIN for direct-io

Thank you very much, Miklos!

Yes. It is not easy to reproduce the issue in real applications. We
have only observed it in our own testing tool, which runs multiple
tests concurrently. We have not yet been able to reproduce it with simple code.

-lei

On 3/6/24 07:05, Miklos Szeredi wrote:
> On Wed, 6 Mar 2024 at 12:16, Bernd Schubert <bernd.schubert@...tmail.fm> wrote:
>>
>>
>>
>> On 3/6/24 11:01, Miklos Szeredi wrote:
>>> On Tue, 29 Aug 2023 at 20:37, Lei Huang <lei.huang@...ux.intel.com> wrote:
>>>>
>>>> Our user space filesystem relies on fuse to provide POSIX interface.
>>>> In our test, a known string is written into a file and the content
>>>> is read back later to verify correct data returned. We observed wrong
>>>> data returned in read buffer in rare cases although correct data are
>>>> stored in our filesystem.
>>>>
>>>> Fuse kernel module calls iov_iter_get_pages2() to get the physical
>>>> pages of the user-space read buffer passed in read(). The pages are
>>>> not pinned, so page migration is not prevented. When page migration
>>>> occurs, the consequences are twofold.
>>>>
>>>> 1) Applications do not receive correct data in read buffer.
>>>> 2) fuse kernel writes data into a wrong place.
>>>>
>>>> Using iov_iter_extract_pages() to pin pages fixes the issue in our
>>>> test.
>>>>
>>>> An auxiliary variable "struct page **pt_pages" is used in the patch
>>>> to prepare the 2nd parameter for iov_iter_extract_pages() since
>>>> iov_iter_get_pages2() uses a different type for the 2nd parameter.
>>>>
>>>> Signed-off-by: Lei Huang <lei.huang@...ux.intel.com>
>>>
>>> Applied, with a modification to only unpin if
>>> iov_iter_extract_will_pin() returns true.
>>
>> Hi Miklos,
>>
>> do you have an idea if this needs to be back ported and to which kernel
>> version?
>> I had tried to reproduce data corruption with 4.18 - Lei wrote that he
>> could see issues with older kernels as well, but I never managed to
>> trigger anything on 4.18-RHEL. Typically I use ql-fstest
>> (https://github.com/bsbernd/ql-fstest) and even added random DIO as an
>> option - nothing reported after weeks of run time. I could try again with
>> more recent kernels that have folios.
> 
> I don't think that corruption will happen in real life.  So I'm not
> sure we need to bother with backporting, and definitely not before
> when the infrastructure was introduced.
> 
> Thanks,
> Miklos
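
For reference, the write-then-read-back check described in the quoted patch description can be sketched in user space roughly as follows. This is only an illustration of the verification idea, not our actual testing tool: the function names, the 4096-byte buffer size, the byte pattern and the use_direct flag are all assumptions made for the sketch. A single-threaded run like this is not expected to trigger the problem, since we only saw it with many tests running concurrently.

```c
#define _GNU_SOURCE /* needed for O_DIRECT on Linux */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BUF_SIZE 4096 /* one aligned block; the size is an assumption */

/* Fill buf with a known, position-dependent byte pattern. */
void fill_pattern(unsigned char *buf, size_t len, unsigned int seed)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)((seed + i) & 0xff);
}

/*
 * Write a known pattern to path, read it back and compare.
 * Returns 0 if the data matches, 1 on mismatch, -1 on I/O error.
 * use_direct selects O_DIRECT (hypothetical knob for this sketch).
 */
int write_then_verify(const char *path, unsigned int seed, int use_direct)
{
    void *wbuf = NULL, *rbuf = NULL;
    int fd = -1, ret = -1;
    int flags = O_CREAT | O_RDWR | O_TRUNC | (use_direct ? O_DIRECT : 0);

    /* O_DIRECT generally requires aligned buffers, sizes and offsets. */
    if (posix_memalign(&wbuf, BUF_SIZE, BUF_SIZE) != 0)
        return -1;
    if (posix_memalign(&rbuf, BUF_SIZE, BUF_SIZE) != 0) {
        free(wbuf);
        return -1;
    }
    fill_pattern(wbuf, BUF_SIZE, seed);
    memset(rbuf, 0, BUF_SIZE);

    fd = open(path, flags, 0600);
    if (fd < 0)
        goto out;
    if (pwrite(fd, wbuf, BUF_SIZE, 0) != BUF_SIZE)
        goto out;
    if (pread(fd, rbuf, BUF_SIZE, 0) != BUF_SIZE)
        goto out;

    /* A mismatch here is the symptom described above. */
    ret = memcmp(wbuf, rbuf, BUF_SIZE) == 0 ? 0 : 1;
out:
    if (fd >= 0) {
        close(fd);
        unlink(path);
    }
    free(rbuf);
    free(wbuf);
    return ret;
}
```

To exercise the path discussed in this thread, something like write_then_verify() would have to be called with use_direct set, on a file inside a fuse mount, from many threads or processes at once.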
