Message-ID: <6c952be3-8be8-c4c9-a1f9-ddec027645bf@shipmail.org>
Date: Thu, 25 Mar 2021 19:42:13 +0100
From: Thomas Hellström (Intel) <thomas_os@...pmail.org>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Dave Hansen <dave.hansen@...el.com>,
"Williams, Dan J" <dan.j.williams@...el.com>,
"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
"christian.koenig@....com" <christian.koenig@....com>,
"airlied@...ux.ie" <airlied@...ux.ie>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>
Subject: Re: [RFC PATCH 1/2] mm,drm/ttm: Block fast GUP to TTM huge pages
On 3/25/21 7:24 PM, Jason Gunthorpe wrote:
> On Thu, Mar 25, 2021 at 07:13:33PM +0100, Thomas Hellström (Intel) wrote:
>> On 3/25/21 6:55 PM, Jason Gunthorpe wrote:
>>> On Thu, Mar 25, 2021 at 06:51:26PM +0100, Thomas Hellström (Intel) wrote:
>>>> On 3/24/21 9:25 PM, Dave Hansen wrote:
>>>>> On 3/24/21 1:22 PM, Thomas Hellström (Intel) wrote:
>>>>>>> We also have not been careful at *all* about how _PAGE_BIT_SOFTW* are
>>>>>>> used. It's quite possible we can encode another use even in the
>>>>>>> existing bits.
>>>>>>>
>>>>>>> Personally, I'd just try:
>>>>>>>
>>>>>>> #define _PAGE_BIT_SOFTW5 57 /* available for programmer */
>>>>>>>
>>>>>> OK, I'll follow your advice here. FWIW I grepped for SW1 and it seems
>>>>>> used in a selftest, but only for PTEs AFAICT.
>>>>>>
>>>>>> Oh, and we don't care about 32-bit much anymore?
>>>>> On x86, we have 64-bit PTEs when running 32-bit kernels if PAE is
>>>>> enabled. IOW, we can handle the majority of 32-bit CPUs out there.
>>>>>
>>>>> But, yeah, we don't care about 32-bit. :)
>>>> Hmm,
>>>>
>>>> Actually it makes some sense to use SW1, to make it end up in the same dword
>>>> as the PSE bit, as from what I can tell, reading of a 64-bit pmd_t on 32-bit
>>>> PAE is not atomic, so in theory a huge pmd could be modified while reading
>>>> the pmd_t making the dwords inconsistent.... How does that work with fast
>>>> gup anyway?
>>> It loops to get an atomic 64 bit value if the arch can't provide an
>>> atomic 64 bit load
>> Hmm, ok, I see a READ_ONCE() in gup_pmd_range(), and then the resulting pmd
>> is dereferenced either in try_grab_compound_head() or __gup_device_huge(),
>> before the pmd is compared to the value the pointer is currently pointing
>> to. Couldn't those dereferences be on invalid pointers?
> Uhhhhh.. That does look questionable, yes. Unless there is some tricky
> reason why a 64 bit pmd entry on a 32 bit arch either can't exist or
> has a stable upper 32 bits..
>
> The pte does it with ptep_get_lockless(), we probably need the same
> for the other levels too instead of open coding a READ_ONCE?
>
> Jason
Yes, unless the comment before local_irq_disable() means some magic is
done to prevent bad things from happening. But I guess if it's needed
for ptes, it's probably needed for pmds and puds as well.
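
To make the concern concrete, here is a minimal userspace sketch of the
retry-style read discussed above, for a 64-bit entry that can only be
loaded as two 32-bit halves. It is an illustration, not the kernel code:
struct split_entry and both helpers are hypothetical names, C11 atomics
stand in for the kernel's barriers, and the sketch assumes the writer
always clears the low half (the one holding the present/PSE-style bits)
before touching the high half.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit pmd_t/pte_t on a 32-bit PAE kernel. */
struct split_entry {
    _Atomic uint32_t low;   /* assumed to hold the present/PSE-style bits */
    _Atomic uint32_t high;
};

/*
 * Writer side (assumed protocol): mark the entry not-present by clearing
 * the low half, then update the high half, then publish the new low half.
 */
void split_entry_store(struct split_entry *e, uint64_t val)
{
    atomic_store(&e->low, 0);
    atomic_store(&e->high, (uint32_t)(val >> 32));
    atomic_store(&e->low, (uint32_t)val);
}

/*
 * Lockless reader: load low, then high, then re-check low. If the low
 * half changed in between, the two halves may belong to different
 * entries, so retry. Only the returned snapshot should be dereferenced,
 * never a value assembled from two unchecked loads.
 */
uint64_t split_entry_read_lockless(struct split_entry *e)
{
    uint32_t low, high;

    do {
        low = atomic_load(&e->low);
        high = atomic_load(&e->high);
    } while (low != atomic_load(&e->low));

    return ((uint64_t)high << 32) | low;
}
```

A plain READ_ONCE() of the 64-bit value gives no such guarantee on an
arch without an atomic 64-bit load, which is why using the snapshot
before re-checking it looks unsafe for huge pmds and puds.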
/Thomas