Message-ID: <upvlv4b2zly56trmoaocs5gl34ykd7tjz2grzqtwkfy45gbm7l@uxsmqdjgyo5n>
Date: Fri, 31 Jan 2025 11:14:06 +1100
From: Alistair Popple <apopple@...dia.com>
To: David Hildenbrand <david@...hat.com>
Cc: linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org, 
	dri-devel@...ts.freedesktop.org, linux-mm@...ck.org, nouveau@...ts.freedesktop.org, 
	Andrew Morton <akpm@...ux-foundation.org>, Jérôme Glisse <jglisse@...hat.com>, 
	Jonathan Corbet <corbet@....net>, Alex Shi <alexs@...nel.org>, Yanteng Si <si.yanteng@...ux.dev>, 
	Karol Herbst <kherbst@...hat.com>, Lyude Paul <lyude@...hat.com>, 
	Danilo Krummrich <dakr@...nel.org>, David Airlie <airlied@...il.com>, 
	Simona Vetter <simona@...ll.ch>, "Liam R. Howlett" <Liam.Howlett@...cle.com>, 
	Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>, 
	Pasha Tatashin <pasha.tatashin@...een.com>, Peter Xu <peterx@...hat.com>, Jason Gunthorpe <jgg@...dia.com>
Subject: Re: [PATCH v1 4/4] mm/memory: document restore_exclusive_pte()

On Thu, Jan 30, 2025 at 04:29:33PM +0100, David Hildenbrand wrote:
> On 30.01.25 14:31, Simona Vetter wrote:
> > On Thu, Jan 30, 2025 at 10:37:06AM +0100, David Hildenbrand wrote:
> > > On 30.01.25 01:27, Alistair Popple wrote:
> > > > On Wed, Jan 29, 2025 at 12:58:02PM +0100, David Hildenbrand wrote:
> > > > > Let's document how this function is to be used, and why the requirement
> > > > > for the folio lock might maybe be dropped in the future.
> > > > 
> > > > Sorry, only just catching up on your other thread. The folio lock was to ensure
> > > > the GPU got a chance to make forward progress by mapping the page. Without it
> > > > the CPU could immediately invalidate the entry before the GPU had a chance to
> > > > retry the fault.
> > > > 
> > > > Obviously performance wise having such thrashing is terrible, so should
> > > > really be avoided by userspace, but the lock at least allowed such programs
> > > > to complete.
> > > 
> > > Thanks for the clarification. So it's relevant that the MMU notifier in
> > > remove_device_exclusive_entry() is sent after taking the folio lock.
> > > 
> > > However, as soon as we drop the folio lock, remove_device_exclusive_entry()
> > > will become active, lock the folio and trigger the MMU notifier.
> > > 
> > > So the time it is actually mapped into the device is rather
> 
> I meant to say "rather short." :)
> 
> > 
> > Looks like you cut off a bit here (or mail transport did that somewhere),
> > but see my other reply I don't think this is a legit use-case. So we don't
> > have to worry.
> 
> In that case, we would need the folio lock in the future.
> 
> > Well beyond documenting that if userspace concurrently thrashes
> > the same page with both device atomics and cpu access it will stall real
> > bad.
> 
> I'm curious, is locking between device-cpu or device-device something that
> can happen frequently? In that case, you would get that thrashing naturally?

It results in terrible performance so in practice it isn't something that I've
seen except when stress testing the driver. Those stress tests were useful for
exposing a range of kernel/driver bugs/issues though, and despite the short time
it is mapped the lock was sufficient to allow atomic thrashing tests to complete
vs. having the device fault endlessly.

So unless it's making things more difficult I'd rather keep the lock.

> -- 
> Cheers,
> 
> David / dhildenb
> 
