lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 25 Aug 2023 09:59:23 +0200
From:   David Hildenbrand <david@...hat.com>
To:     Saurabh Singh Sengar <ssengar@...rosoft.com>,
        Zach O'Keefe <zokeefe@...gle.com>
Cc:     "linux-mm@...ck.org" <linux-mm@...ck.org>,
        Yang Shi <shy828301@...il.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Greg KH <gregkh@...uxfoundation.org>,
        Saurabh Sengar <ssengar@...ux.microsoft.com>,
        Matthew Wilcox <willy@...radead.org>
Subject: Re: [EXTERNAL] Re: [PATCH v3] mm/thp: fix "mm: thp: kill
 __transhuge_page_enabled()"

On 24.08.23 17:39, Saurabh Singh Sengar wrote:
>> On Thu, Aug 24, 2023 at 7:05 AM David Hildenbrand <david@...hat.com>
>> wrote:
>>>
>>> On 24.08.23 15:59, Zach O'Keefe wrote:
>>>> On Thu, Aug 24, 2023 at 12:39 AM David Hildenbrand
>> <david@...hat.com> wrote:
>>>>>
>>>>> On 22.08.23 01:48, Zach O'Keefe wrote:
>>>>>> The 6.0 commits:
>>>>>>
>>>>>> commit 9fec51689ff6 ("mm: thp: kill
>>>>>> transparent_hugepage_active()") commit 7da4e2cb8b1f ("mm: thp:
>>>>>> kill __transhuge_page_enabled()")
>>>>>>
>>>>>> merged "can we have THPs in this VMA?" logic that was previously
>>>>>> done separately by fault-path, khugepaged, and smaps "THPeligible"
>> checks.
>>>>>>
>>>>>> During the process, the semantics of the fault path check changed
>>>>>> in two
>>>>>> ways:
>>>>>>
>>>>>> 1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps
>> path).
>>>>>> 2) We no longer checked if non-anonymous memory had a vm_ops-
>>> huge_fault
>>>>>>       handler that could satisfy the fault.  Previously, this check had been
>>>>>>       done in create_huge_pud() and create_huge_pmd() routines, but
>> after
>>>>>>       the changes, we never reach those routines.
>>>>>>
>>>>>> During the review of the above commits, it was determined that
>>>>>> in-tree users weren't affected by the change; most notably, since
>>>>>> the only relevant user (in terms of THP) of VM_MIXEDMAP or
>>>>>> ->huge_fault is DAX, which is explicitly approved early in
>>>>>> approval logic.  However, there is at least one occurrence where
>>>>>> an out-of-tree driver that used VM_HUGEPAGE|VM_MIXEDMAP with a
>> vm_ops->huge_fault handler, was broken.
>>>>>
>>>>> ... so all we did is break an arbitrary out-of-tree driver? Sorry
>>>>> to say, but why should we care?
>>>>>
>>>>> Is there any in-tree code affected and needs a "Fixes:" ?
>>>>
>>>> The in-tree code was taken care of during the rework .. but I didn't
>>>> know about the possibility of a driver hooking in here.
>>>
>>> And that's the problem of the driver, no? It's not the job of the
>>> kernel developers to be aware of each out-of-tree driver to not
>>> accidentally break something in there.
>>>
>>>>
>>>> I don't know what the normal policy / stance here is, but I figured
>>>> the change was simple enough that it was worth helping out.
>>>
>>> If you decide to be out-of-tree, then you have be prepared to only
>>> support tested kernels and fix your driver when something changes
>>> upstream -- like upstream developers would do for you when it would be
>>> in-tree.
>>>
>>> So why can't the out-of-tree driver be fixed, similarly to how we
>>> would have fixed it if it would be in-tree?
>>
>> I don't know much about driver development, but perhaps they are / need to
>> use a pristine upstream kernel, with their driver as a loadable kernel module.
>> Saurabh can comment on this, I don't know.
> 

Hi!

> You are correct Zach. The "out-of-tree" driver had been seamlessly operational
> on version 5.19, leveraging the kernel's capability to handle huge faults for
> non-anonymous vma. However, the transition to kernel version 6.1 inadvertently
> led to the removal of this feature. It's important to note that this removal wasn't
> intentional, and I am optimistic about the potential restoration of this feature.

We are currently creating 6.5, and are being told that a patch in 6.0 
(released almost one year ago!) broke an out-of-tree driver.

Being that back-level, you cannot possibly expect that the upstream 
community can seriously care about not breaking your OOT driver in each 
release.

Especially, we do have bigger ->huge_fault changes coming up:

https://lkml.kernel.org/r/20230818202335.2739663-1-willy@infradead.org

If the driver is not in the tree, people don't care.

You really should try upstreaming that driver.


So this patch here adds complexity (which I don't like) in order to keep 
an OOT driver working -- possibly for a short time. I'm tempted to say 
"please fix your driver to not use huge faults in that scenario, it is 
no longer supported".

But I'm just about to vanish for 1.5 week into vacation :)

@Willy, what are your thoughts?

In any case, I think we should drop the "Fixes" tag. This does not fix 
any kernel BUG -- it restores compatibility with an OOT driver -- and 
already confused people building distributions and asking about this fix ;)


> 
> Hello David,
> 
> Given the context, I am currently exploring potential ways to address the issue
> with the "out-of-tree" driver. I have recognized a challenge posed by the kernel's
> memory management framework, which now imposes restrictions on huge faults
> for non-anonymous vma.

You should try upstreaming your driver possibly without huge fault 
support, and then separately try re-adding huge fault support and see if 
kernel people want to support that or not.

And if your driver *really* requires huge faults, then supporting that 
would be part of your series when upstreaming that driver.

-- 
Cheers,

David / dhildenb

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ