Message-ID: <E1A7D23D-EFCB-482F-9D9D-F8D92F1630D2@fb.com>
Date: Wed, 29 Mar 2023 16:53:58 +0000
From: Song Liu <songliubraving@...a.com>
To: Matthew Wilcox <willy@...radead.org>
CC: Song Liu <songliubraving@...a.com>,
Hugh Dickins <hughd@...gle.com>, Song Liu <song@...nel.org>,
Jiri Olsa <jolsa@...nel.org>,
David Stevens <stevensd@...omium.org>,
Linux-MM <linux-mm@...ck.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Peter Xu <peterx@...hat.com>,
"Kirill A . Shutemov" <kirill@...temov.name>,
Yang Shi <shy828301@...il.com>,
David Hildenbrand <david@...hat.com>,
Jiaqi Yan <jiaqiyan@...gle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v5 3/3] mm/khugepaged: maintain page cache uptodate flag
> On Mar 24, 2023, at 6:31 AM, Matthew Wilcox <willy@...radead.org> wrote:
>
> On Fri, Mar 24, 2023 at 06:03:37AM +0000, Song Liu wrote:
>>
>>
>>> On Mar 23, 2023, at 8:30 PM, Matthew Wilcox <willy@...radead.org> wrote:
>>
>> [...]
>>
>>>
>>> The Uptodate flag check needs to be done by the caller; the
>>> find_get_page() family return !uptodate pages.
>>>
>>> But find_get_page() does not advertise itself as NMI-safe. And I
>>> think it's wrong to try to make it NMI-safe. Most of the kernel is
>>> not NMI-safe. I think it's incumbent on the BPF people to get the
>>> information they need ahead of taking the NMI. NMI handlers are not
>>> supposed to be doing a huge amount of work! I don't really understand
>>> why it needs to do work in NMI context; surely it can note the location of
>>> the fault and queue work to be done later (eg on irq-enable, task-switch
>>> or return-to-user)
>>
>> The use case here is a profiler (similar to perf-record). Parsing the
>> build id inside the NMI makes the profiler a lot simpler. Otherwise,
>> we will need some post processing for each sample.
>
> Simpler for you, maybe. But this is an NMI! It's not supposed to
> be doing printf-formatting or whatever, much less poking around
> in the file cache. Like perf, it should record a sample and then
> convert that later. Maybe it can defer to a tasklet, but I think
> scheduling work is a better option.
>
>> OTOH, it is totally fine if build_id_parse() fails some time, say < 5%.
>> The profiler output is still useful in such cases.
>>
>> I guess the next step is to replace find_get_page() with a NMI-safe
>> version?
>
> No, absolutely not. Stop doing so much work in an NMI.
While I understand the concern, this is not something we can easily remove,
as there are users who rely on this feature. How about we discuss this at
the upcoming LSFMMBPF?
Thanks,
Song