Message-ID: <5c493bd5-e657-4241-81d7-19ccd380b379@amd.com>
Date: Mon, 2 Sep 2024 12:33:52 +0200
From: Christian König <christian.koenig@....com>
To: Thomas Hellström <thomas.hellstrom@...ux.intel.com>,
Dave Airlie <airlied@...il.com>,
Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Daniel Vetter <daniel.vetter@...ll.ch>,
dri-devel <dri-devel@...ts.freedesktop.org>,
LKML <linux-kernel@...r.kernel.org>,
Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@....com>,
Alex Deucher <alexdeucher@...il.com>, lingshan.zhu@....com,
Matthew Brost <matthew.brost@...el.com>
Subject: Re: [git pull] drm fixes for 6.11-rc6
On 02.09.24 at 11:32, Thomas Hellström wrote:
> On Mon, 2024-09-02 at 08:13 +1000, Dave Airlie wrote:
>> On Fri, 30 Aug 2024 at 12:32, Linus Torvalds
>> <torvalds@...ux-foundation.org> wrote:
>>> On Fri, 30 Aug 2024 at 14:08, Dave Airlie <airlied@...il.com>
>>> wrote:
>>>> The TTM revert is due to some stuttering graphical apps probably
>>>> due to longer stalls while prefaulting.
>>> Yeah, trying to pre-fault a PMD worth of pages in one go is just
>>> crazy talk.
>>>
>>> Now, if it was PMD-aligned and you faulted in a single PMD, that
>>> would be different. But just doing vmf_insert_pfn_prot() in a loop is
>>> insane.
>>>
>>> The code doesn't even stop when it hits a page that already existed,
>>> and it keeps locking and unlocking the last-level page table over and
>>> over again.
>>>
>>> Honestly, that code is questionable even for the *small* value, much
>>> less the "a PMD size" case.
>>>
>>> Now, if you have an array of "struct page *", you can use
>>> vm_insert_pages(), and that's reasonably efficient.
>>>
>>> And if you have a *contiguous* area of pfns, you can use
>>> remap_pfn_range().
>>>
>>> But that "insert one pfn at a time" that the drm layer does is
>>> complete garbage. You're not speeding anything up, you're just
>>> digging deeper.
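
For reference, the pattern being criticized looks roughly like the per-PFN
loop below, contrasted with the batched struct-page path. This is only a
simplified sketch, not the actual TTM fault handler; the helper names
prefault_one_by_one() and prefault_batched() are made up for illustration.

#include <linux/mm.h>

/*
 * Rough sketch of the per-PFN prefault loop pattern: every iteration
 * walks the page tables, takes and drops the last-level page table
 * lock, and doesn't skip PTEs that are already populated.
 */
static vm_fault_t prefault_one_by_one(struct vm_fault *vmf,
                                      unsigned long first_pfn,
                                      unsigned long nr)
{
        unsigned long addr = vmf->address;
        unsigned long i;

        for (i = 0; i < nr; i++, addr += PAGE_SIZE) {
                vm_fault_t ret = vmf_insert_pfn(vmf->vma, addr,
                                                first_pfn + i);

                if (ret != VM_FAULT_NOPAGE)
                        return ret;
        }
        return VM_FAULT_NOPAGE;
}

/*
 * With an array of struct page the walk and locking can be batched via
 * vm_insert_pages(). TTM mappings are VM_PFNMAP without struct pages,
 * so this is purely illustrative of the batched alternative.
 */
static vm_fault_t prefault_batched(struct vm_area_struct *vma,
                                   unsigned long addr,
                                   struct page **pages, unsigned long nr)
{
        int err = vm_insert_pages(vma, addr, pages, &nr);

        return err ? VM_FAULT_SIGBUS : VM_FAULT_NOPAGE;
}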
>
>> I wonder if there is functionality that could be provided in a common
>> helper by the mm layers, or if there would be too many locking
>> interactions to make it sane.
>>
>> It seems too fraught with danger for drivers or subsystems to be just
>> doing this in the simplest way that isn't actually that smart.
> Hmm. I see even the "Don't error on prefaults" check was broken at some
> point :/.
>
> There have been numerous attempts to address this:
>
> remap_pfn_range() was tried most recently, at least in the context of
> the i915 driver, IIRC by Christoph Hellwig, but had to be ripped out
> since it requires the mmap_lock in write mode. Here we hold it only in
> read mode.
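
For comparison, remap_pfn_range() is the classic ->mmap()-time approach,
where the caller already holds mmap_lock for write. A minimal sketch with
hypothetical names (foo_mmap, and using vm_pgoff as the base PFN), not the
old i915 code:

#include <linux/fs.h>
#include <linux/mm.h>

/*
 * Minimal sketch: map a physically contiguous region at mmap() time.
 * In ->mmap() the mmap_lock is held for write, which is what allows
 * remap_pfn_range() to set up the VMA; in the fault path it is only
 * held for read, hence the revert Thomas mentions.
 */
static int foo_mmap(struct file *file, struct vm_area_struct *vma)
{
        /* hypothetical: base PFN of the contiguous backing store */
        unsigned long pfn = vma->vm_pgoff;
        unsigned long size = vma->vm_end - vma->vm_start;

        return remap_pfn_range(vma, vma->vm_start, pfn, size,
                               vma->vm_page_prot);
}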
>
> Then there's the apply_to_page_range() used by the igfx functionality
> of the i915 driver. I don't think we should go that route without
> turning it into something like vm_insert_pfns() with proper checking.
> This approach populates all entries of a buffer object.
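
A rough sketch of that apply_to_page_range() style, loosely modeled on the
i915 io-mapping remap but with made-up names (remap_ctx, remap_one_pfn,
remap_whole_object); the callback runs once per PTE slot with the page
tables already allocated, so the whole object is populated in one pass:

#include <linux/mm.h>
#include <linux/pgtable.h>

struct remap_ctx {
        struct mm_struct *mm;
        unsigned long pfn;
        pgprot_t prot;
};

/* Called once per PTE slot by apply_to_page_range(). */
static int remap_one_pfn(pte_t *pte, unsigned long addr, void *data)
{
        struct remap_ctx *ctx = data;

        set_pte_at(ctx->mm, addr, pte,
                   pte_mkspecial(pfn_pte(ctx->pfn++, ctx->prot)));
        return 0;
}

static int remap_whole_object(struct vm_area_struct *vma, unsigned long pfn)
{
        struct remap_ctx ctx = {
                .mm = vma->vm_mm,
                .pfn = pfn,
                .prot = vma->vm_page_prot,
        };

        return apply_to_page_range(vma->vm_mm, vma->vm_start,
                                   vma->vm_end - vma->vm_start,
                                   remap_one_pfn, &ctx);
}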
>
> Finally there's the huge fault attempt that had to be ripped out due to
> lack of pmd_special and pud_special flags and resulting clashes with
> gup_fast.
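
For completeness, the huge-fault variant would look roughly like the
->huge_fault sketch below, built on vmf_insert_pfn_pmd(); the handler name
and the way the PFN is derived are hypothetical, and as noted this needs
pmd_special/pud_special support to coexist with gup_fast:

#include <linux/mm.h>
#include <linux/huge_mm.h>
#include <linux/pfn_t.h>

static vm_fault_t foo_huge_fault(struct vm_fault *vmf, unsigned int order)
{
        /* hypothetical: PMD-aligned base PFN of a contiguous BO region */
        unsigned long pfn = vmf->vma->vm_pgoff +
                ((vmf->address - vmf->vma->vm_start) >> PAGE_SHIFT);

        /* Fall back to 4K faults unless we can really fill a PMD. */
        if (order != PMD_SHIFT - PAGE_SHIFT ||
            !IS_ALIGNED(pfn, 1UL << order))
                return VM_FAULT_FALLBACK;

        return vmf_insert_pfn_pmd(vmf, __pfn_to_pfn_t(pfn, PFN_DEV),
                                  vmf->flags & FAULT_FLAG_WRITE);
}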
>
> Perhaps a combination of the latter two, if properly implemented, would
> be the best choice.
I'm not deep enough into the memory management background to judge which
approach is best, so just one more data point to provide:
The pre-faulting was increased because of virtualization. When KVM/XEN
maps a BO into a guest, the switching overhead for each fault is so high
that mapping a lot of PFNs at the same time becomes beneficial.
Regards,
Christian.
>
> /Thomas
>
>> Dave.