Message-ID: <c936083b-68b7-4d8f-a8fc-d188e646f390@redhat.com>
Date: Fri, 19 Apr 2024 11:45:14 +0200
From: David Hildenbrand <david@...hat.com>
To: Shivansh Vij <shivanshvij@...look.com>,
 Ryan Roberts <ryan.roberts@....com>,
 Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
 Andrew Morton <akpm@...ux-foundation.org>, Shuah Khan <shuah@...nel.org>,
 Joey Gouly <joey.gouly@....com>, Ard Biesheuvel <ardb@...nel.org>,
 Mark Rutland <mark.rutland@....com>,
 Anshuman Khandual <anshuman.khandual@....com>,
 Mike Rapoport <rppt@...ux.ibm.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "linux-arm-kernel@...ts.infradead.org"
 <linux-arm-kernel@...ts.infradead.org>,
 "linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>
Subject: Re: [PATCH v1 0/5] arm64/mm: uffd write-protect and soft-dirty
 tracking

On 19.04.24 10:33, Shivansh Vij wrote:
> (Sorry about the previous HTML email, accidentally used the wrong email client)
> 
> Hey All,
> 
>> On 19/04/2024 08:43, Ryan Roberts wrote:
>>> Hi All,
>>>
>>> This series adds uffd write-protect and soft-dirty tracking support for arm64. I
>>> consider the soft-dirty support (patches 3 and 4) as RFC - see rationale below.
>>>
>>> Previous attempts to add these features have failed because of a perceived lack
>>> of available PTE SW bits. However it actually turns out that there are 2
>>> available but they are hidden. PTE_PROT_NONE was previously occupying a SW bit,
>>> but it only applies when PTE_VALID is clear, so this is moved to overlay PTE_UXN
>>> in patch 1, freeing up the SW bit. Bit 63 is marked as "IGNORED" in the Arm ARM,
>>> but it does not currently indicate "reserved for SW use" like it does for the
>>> other SW bits. I've confirmed with the spec owner that this is an oversight; the
>>> bit is intended to be reserved for SW use and the spec will clarify this in a
>>> future update.
>>>
>>> So we have our two bits; patch 2 enables uffd-wp, patch 3 enables soft-dirty and
>>> patches 4 and 5 sort out the selftests so that the soft-dirty tests are compiled
>>> for, and run on arm64.
>>>
>>> That said, these are the last 2 SW bits and we may want to keep 1 bit in reserve
>>> for future use. soft-dirty is only used for CRIU to my knowledge, and it is
>>> thought that their use case could be solved with the more generic uffd-wp. So
>>> unless somebody makes a clear case for the inclusion of soft-dirty support, we
>>> are probably better off dropping patches 3 and 4 and keeping bit 63 for future
>>> use. Although note that the most recent attempt to add soft-dirty for arm64 was
>>> last month [1] so I'd like to give Shivansh Vij the opportunity to make the
>>> case.
>>
>> Ugh, forgot to mention that this applies on top of v6.9-rc3, and all the uffd-wp
>> and soft-dirty tests in the mm selftests suite run and pass. And no regressions
>> are observed in any of the other selftests.
> 
> Appreciate the opportunity to provide input here.
> 
> I personally don't know of any applications other than CRIU that make heavy use of soft-dirty, and my use case is specifically focused on adding live-migration support to CRIU on ARM.
> 
> Cloud providers like AWS have pretty massive discounts for ARM-based spot instances (90% last time I checked), and having live-migration in CRIU would allow more applications to take advantage of that.
> 
> As Ryan mentioned, there are two ways to achieve this - add dirty tracking to ARM (Patch 3/4), or tear out the existing dirty tracking code in CRIU and replace it with uffd-wp.
> 
> I picked option one (dirty tracking in arm) because it seems to be the simplest way to move forward, whereas it would be a relatively heavy effort to add uffd-wp support to CRIU.
> 
> From a performance perspective I am also a little worried that uffd will be slower than just tracking the dirty bits asynchronously with sw dirty, but maybe that's not as much of a concern with the addition of uffd-wp async.
> 
> With all this being said, I'll defer to the wisdom of the crowd about which approach makes more sense - after all, with this patch we should get uffd-wp support on arm so at least there will be _a_ way forward for CRIU (albeit one requiring slightly more work).

Ccing Mike and Peter. In 2017, Mike gave a presentation, "Memory tracking
for iterative container migration" [1], at LPC.

Some key points about uffd-wp are still true, I think:
(1) More flexible and robust than soft-dirty
(2) May obsolete soft-dirty

We also recently added a new UFFD_FEATURE_WP_ASYNC feature as part of
[2], because making soft-dirty return reliable results in some cases
turned out to be rather hard.

We might still have to optimize that approach for some very sparse, large
VMAs, but that should be solvable:

  "The major defect of this approach of dirty tracking is we need to
  populate the pgtables when tracking starts. Soft-dirty doesn't do it
  like that. It's unwanted in the case where the range of memory to track
  is huge and unpopulated (e.g., tracking updates on a 10G file with
  mmap() on top, without having any page cache installed yet). One way to
  improve this is to allow pte markers exist for larger than PTE level
  for PMD+. That will not change the interface if to implemented, so we
  can leave that for later." [3]
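
For anyone comparing the two mechanisms from userspace: both are reported
per page in /proc/<pid>/pagemap. Below is a minimal sketch of decoding one
64-bit pagemap entry, using the bit positions documented in
Documentation/admin-guide/mm/pagemap.rst (bit 55 soft-dirty, bit 57
uffd-wp, bit 63 present). The struct and function names are made up for
illustration, and the "written" helper for async uffd-wp is a
simplification that only looks at present pages and ignores swapped ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions from Documentation/admin-guide/mm/pagemap.rst. */
#define PM_SOFT_DIRTY (1ULL << 55) /* PTE is soft-dirty */
#define PM_UFFD_WP    (1ULL << 57) /* PTE is write-protected via uffd-wp */
#define PM_PRESENT    (1ULL << 63) /* page is present in RAM */

/* Hypothetical helper type: the three flags relevant to this thread. */
struct page_state {
    bool present;
    bool soft_dirty;
    bool uffd_wp;
};

/* Extract the tracking-related flags from one raw pagemap entry. */
static struct page_state decode_pagemap_entry(uint64_t entry)
{
    struct page_state s = {
        .present    = (entry & PM_PRESENT) != 0,
        .soft_dirty = (entry & PM_SOFT_DIRTY) != 0,
        .uffd_wp    = (entry & PM_UFFD_WP) != 0,
    };
    return s;
}

/* soft-dirty: the bit is set once the page is written after a clear. */
static bool written_soft_dirty(struct page_state s)
{
    return s.soft_dirty;
}

/*
 * Async uffd-wp: a write fault drops the write-protect bit, so a present
 * page without the bit counts as written. Simplified: swapped-out pages
 * are not considered here.
 */
static bool written_uffd_wp_async(struct page_state s)
{
    return s.present && !s.uffd_wp;
}
```

Note the inverted sense: soft-dirty marks written pages, whereas with
async uffd-wp the written pages are the ones that lost their marker.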


If we can avoid adding soft-dirty on arm64, that would be great. This
will require work on the CRIU side. One downside of uffd-wp is that it
is currently not available on as many architectures as soft-dirty.

But I'll throw in another idea: do we really need soft-dirty and uffd-wp
to exist at the same time in the same process (or the same VMA)? In
theory, we could have a VMA flag that defines the semantics of the bit and
simply have arch code use a single, abstracted PTE bit. That requires a
bit more work, but the benefit would be that architectures that do
support soft-dirty could support uffd-wp.
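
To make that idea a bit more concrete, here is a rough userspace model,
not kernel code. Every name in it (track_mode, vma_model, PTE_SW_TRACK and
the helpers) is hypothetical, and using bit 63 for the shared bit is only
an assumption for illustration. The point is just that the arch code sets
and clears a single bit, while the per-VMA mode decides how that bit is
interpreted.

```c
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical; none of them exist in the kernel. */
enum track_mode {
    TRACK_NONE,
    TRACK_SOFT_DIRTY, /* shared bit means "written since last clear" */
    TRACK_UFFD_WP,    /* shared bit means "write-protected by uffd" */
};

/* Assumption for illustration: one spare software bit in the PTE. */
#define PTE_SW_TRACK (1ULL << 63)

/* Stand-in for the VMA flag that selects the semantics of the bit. */
struct vma_model {
    enum track_mode mode;
};

/* Arch-level helpers only know about the single abstracted bit. */
static uint64_t pte_mk_tracked(uint64_t pte)
{
    return pte | PTE_SW_TRACK;
}

static uint64_t pte_clear_tracked(uint64_t pte)
{
    return pte & ~PTE_SW_TRACK;
}

/* Generic helpers interpret the bit according to the VMA's mode. */
static bool pte_soft_dirty(const struct vma_model *vma, uint64_t pte)
{
    return vma->mode == TRACK_SOFT_DIRTY && (pte & PTE_SW_TRACK) != 0;
}

static bool pte_uffd_wp(const struct vma_model *vma, uint64_t pte)
{
    return vma->mode == TRACK_UFFD_WP && (pte & PTE_SW_TRACK) != 0;
}
```

With something along these lines, a VMA could use either soft-dirty or
uffd-wp, but never both at once, which is exactly the trade-off in
question.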


[1] 
https://blog.linuxplumbersconf.org/2017/ocw//system/presentations/4724/original/Memory%20tracking%20for%20iterative%20container%20migration.pdf
[2] 
https://lore.kernel.org/all/20230821141518.870589-1-usama.anjum@collabora.com/
[3] 
https://lore.kernel.org/all/20230821141518.870589-2-usama.anjum@collabora.com/

-- 
Cheers,

David / dhildenb

