Message-ID: <5D5DEEED-55EB-457B-9EB7-C6D5B326FE99@vmware.com>
Date:   Mon, 27 Feb 2023 23:09:12 +0000
From:   Nadav Amit <namit@...are.com>
To:     Peter Xu <peterx@...hat.com>
CC:     Muhammad Usama Anjum <usama.anjum@...labora.com>,
        Mike Rapoport <rppt@...nel.org>,
        Michał Mirosław <emmir@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Cyrill Gorcunov <gorcunov@...il.com>,
        Paul Gofman <pgofman@...eweavers.com>,
        Danylo Mocherniuk <mdanylo@...gle.com>,
        Shuah Khan <shuah@...nel.org>,
        Christian Brauner <brauner@...nel.org>,
        Yang Shi <shy828301@...il.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        Yun Zhou <yun.zhou@...driver.com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Alex Sierra <alex.sierra@....com>,
        Matthew Wilcox <willy@...radead.org>,
        Pasha Tatashin <pasha.tatashin@...een.com>,
        Axel Rasmussen <axelrasmussen@...gle.com>,
        "Gustavo A . R . Silva" <gustavoars@...nel.org>,
        Dan Williams <dan.j.williams@...el.com>,
        kernel list <linux-kernel@...r.kernel.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        linux-mm <linux-mm@...ck.org>,
        linux-kselftest <linux-kselftest@...r.kernel.org>,
        Greg KH <gregkh@...uxfoundation.org>,
        "kernel@...labora.com" <kernel@...labora.com>,
        David Hildenbrand <david@...hat.com>,
        Andrei Vagin <avagin@...il.com>
Subject: Re: [PATCH v10 3/6] fs/proc/task_mmu: Implement IOCTL to get and/or
 the clear info about PTEs



> On Feb 27, 2023, at 1:18 PM, Peter Xu <peterx@...hat.com> wrote:
> 
> On Thu, Feb 23, 2023 at 05:11:11PM +0000, Nadav Amit wrote:
>> From my experience with UFFD, proper ordering of events  is crucial, although it
>> is not always done well. Therefore, we should aim for improvement, not
>> regression. I believe that utilizing the pagemap-based mechanism for WP'ing
>> might be a step in the wrong direction. I think that it would have been better
>> to emit a 'UFFD_FEATURE_WP_ASYNC' WP-log (and ordered) with UFFD #PF and
>> events. The 'UFFD_FEATURE_WP_ASYNC'-log may not need to wake waiters on the
>> file descriptor unless the log is full.
> 
> Yes this is an interesting question to think about..
> 
> Keeping the data in the pgtable has one good thing that it doesn't need any
> complexity on maintaining the log, and no possibility of "log full".

I understand your concern, but I think that eventually it might be simpler
to maintain, since the logic of how to process the log is moved to userspace.

At the same time, handling inputs from pagemap and uffd handlers and sync’ing
them would not be too easy for userspace.

But yes, allocation on the heap for userfaultfd_wait_queue-like entries would
be needed, and there are some issues of ordering the events (I think all #PF
and other events should be ordered regardless) and how not to traverse all
async-userfaultfd_wait_queue’s (except those that block if the log is full)
when a wakeup is needed.

> 
> If there's possible "log full" then the next question is whether we should
> let the worker wait the monitor if the monitor is not fast enough to
> collect those data.  It adds some slight dependency on the two threads, I
> think it can make the tracking harder or impossible in latency sensitive
> workloads.

Again, I understand your concern. But this model that I propose is not new.
It is used with PML (page-modification logging) and KVM, and IIRC there is
a similar interface between KVM and QEMU to provide this information. There
are many other examples of similar producer-consumer mechanisms that
might lead to stalls in extreme cases.
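
To make the "log full" case concrete, here is a minimal userspace sketch of
a bounded log of write-protect faults. It is only an illustration of the
producer-consumer model discussed above, not code from the patch series or
from PML/KVM: the names wp_log, wp_log_push and wp_log_drain and the tiny
capacity are assumptions made up for this example, and locking and wakeups
are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WP_LOG_CAPACITY 4  /* deliberately tiny so "log full" is easy to hit */

/* Hypothetical log of write-faulted page addresses; not a kernel structure. */
struct wp_log {
    uint64_t addr[WP_LOG_CAPACITY];
    size_t head;   /* next slot the consumer reads */
    size_t count;  /* entries currently queued */
};

static void wp_log_init(struct wp_log *log)
{
    assert(log != NULL);
    log->head = 0;
    log->count = 0;
}

/*
 * Producer side (the faulting thread). Returns false when the log is full,
 * which is the point where the producer would have to wait for the monitor.
 */
static bool wp_log_push(struct wp_log *log, uint64_t page_addr)
{
    if (log->count == WP_LOG_CAPACITY)
        return false;
    log->addr[(log->head + log->count) % WP_LOG_CAPACITY] = page_addr;
    log->count++;
    return true;
}

/* Consumer side (the monitor). Copies up to max entries in arrival order. */
static size_t wp_log_drain(struct wp_log *log, uint64_t *out, size_t max)
{
    size_t n = 0;

    while (n < max && log->count > 0) {
        out[n++] = log->addr[log->head];
        log->head = (log->head + 1) % WP_LOG_CAPACITY;
        log->count--;
    }
    return n;
}
```

The entries come out in the order the faults were logged, which is the
ordering property I care about. The cost is the false return from
wp_log_push(): that is where a real implementation would have to block
the writer or wake the monitor.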

> 
> The other thing is we can also make the log "never gonna full" by making it
> a bitmap covering any registered ranges, but I don't either know whether
> it'll be worth it for the effort.

I do not see a benefit of half-log half-scan. It tries to take the
data-structure of one format and combine it with another.
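
For comparison, a rough sketch of the bitmap variant, again with made-up
names (wp_bitmap, wp_bitmap_mark, wp_bitmap_collect) and an assumed 4 KiB
page size and a fixed 64-page range. It can never become full, but the
monitor has to scan the range, and the order of the writes is lost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WP_PAGE_SHIFT  12  /* assumes 4 KiB pages */
#define WP_RANGE_PAGES 64  /* one uint64_t covers the whole registered range */

/* Hypothetical per-range dirty bitmap; not a kernel structure. */
struct wp_bitmap {
    uint64_t start;  /* first address of the registered range */
    uint64_t bits;   /* bit i set => page i was written */
};

/* Producer side: marking is idempotent, so repeated writes cost no space. */
static void wp_bitmap_mark(struct wp_bitmap *bm, uint64_t addr)
{
    uint64_t idx = (addr - bm->start) >> WP_PAGE_SHIFT;

    assert(addr >= bm->start && idx < WP_RANGE_PAGES);
    bm->bits |= UINT64_C(1) << idx;
}

/*
 * Consumer side: scans the range and clears the bits it reports. Pages come
 * out in address order, not in the order they were written.
 */
static size_t wp_bitmap_collect(struct wp_bitmap *bm, uint64_t *out, size_t max)
{
    size_t n = 0;

    for (uint64_t idx = 0; idx < WP_RANGE_PAGES && n < max; idx++) {
        uint64_t mask = UINT64_C(1) << idx;

        if (bm->bits & mask) {
            out[n++] = bm->start + (idx << WP_PAGE_SHIFT);
            bm->bits &= ~mask;
        }
    }
    return n;
}
```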

Anyhow, I was just giving my 2 cents. Admittedly, I did not follow the
threads of previous versions, and I have not seen the userspace components
that use the API, so I cannot say anything authoritative. Personally, I do
not find the current API proposal to be very consistent or simple, and it
seems to me that it lets pagemap do userfaultfd-related tasks, which might
be considered inappropriate and non-intuitive.

If I derailed the discussion, I apologize.
