linux-kernel - Re: [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers

Message-ID: <91ad1106-6bd4-7d2c-4d40-7c5be945ba36@amd.com>
Date:   Mon, 2 Jul 2018 14:39:50 +0200
From:   Christian König <christian.koenig@....com>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        "David (ChunMing) Zhou" <David1.Zhou@....com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        Alex Deucher <alexander.deucher@....com>,
        David Airlie <airlied@...ux.ie>,
        Jani Nikula <jani.nikula@...ux.intel.com>,
        Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>,
        Rodrigo Vivi <rodrigo.vivi@...el.com>,
        Doug Ledford <dledford@...hat.com>,
        Jason Gunthorpe <jgg@...pe.ca>,
        Mike Marciniszyn <mike.marciniszyn@...el.com>,
        Dennis Dalessandro <dennis.dalessandro@...el.com>,
        Sudeep Dutt <sudeep.dutt@...el.com>,
        Ashutosh Dixit <ashutosh.dixit@...el.com>,
        Dimitri Sivanich <sivanich@....com>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Juergen Gross <jgross@...e.com>,
        Jérôme Glisse <jglisse@...hat.com>,
        Andrea Arcangeli <aarcange@...hat.com>, kvm@...r.kernel.org,
        amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
        intel-gfx@...ts.freedesktop.org, linux-rdma@...r.kernel.org,
        xen-devel@...ts.xenproject.org, linux-mm@...ck.org,
        David Rientjes <rientjes@...gle.com>,
        Felix Kuehling <felix.kuehling@....com>
Subject: Re: [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers

On 02.07.2018 at 14:35, Michal Hocko wrote:
> On Mon 02-07-18 14:24:29, Christian König wrote:
>> On 02.07.2018 at 14:20, Michal Hocko wrote:
>>> On Mon 02-07-18 14:13:42, Christian König wrote:
>>>> On 02.07.2018 at 13:54, Michal Hocko wrote:
>>>>> On Mon 02-07-18 11:14:58, Christian König wrote:
>>>>>> On 27.06.2018 at 09:44, Michal Hocko wrote:
>>>>>>> This is the v2 of RFC based on the feedback I've received so far. The
>>>>>>> code even compiles as a bonus ;) I haven't runtime tested it yet, mostly
>>>>>>> because I have no idea how.
>>>>>>>
>>>>>>> Any further feedback is highly appreciated of course.
>>>>>> That sounds like it should work and at least the amdgpu changes now look
>>>>>> good to me at first glance.
>>>>>>
>>>>>> Can you split that up further in the usual way? E.g. adding the blockable
>>>>>> flag in one patch and fixing all implementations of the MMU notifier in
>>>>>> follow up patches.
>>>>> But such code would be broken, no? Ignoring the blockable state will
>>>>> simply lead to lockups until the fixup parts get applied.
>>>> Well, to still be bisect-able you only need to get the interface change in
>>>> first, along with fixing the function signatures of the implementations.
>>> That would only work if those functions return -EAGAIN unconditionally.
>>> Otherwise they would pretend to not block while that would be obviously
>>> incorrect. This doesn't sound correct to me.
>>>
>>>> Then add all the new code to the implementations and finally start to
>>>> actually use the new interface.
>>>>
>>>> That is a pattern we use regularly and I think it's good practice to do
>>>> this.
>>> But we do rely on the proper blockable handling.
>> Yeah, but you could add the handling only after you have all the
>> implementations in place. Couldn't you?
> Yeah, but then I would be adding code with no user. And I really
> prefer not to do so because then the code is harder to reason about.
>
>>>>> Is the split up really worth it? I was thinking about that but had a hard
>>>>> time coming up with something that would be bisectable. Well, except
>>>>> for returning -EBUSY until all notifiers are implemented. Which I found
>>>>> confusing.
>>>> It at least makes reviewing changes much easier, because as a driver
>>>> maintainer I can concentrate on the stuff only related to me.
>>>>
>>>> In addition, when you cause some unrelated side effect in a driver, we
>>>> can pinpoint the actual change much more easily later on when the patch
>>>> is smaller.
>>>>
>>>>>> This way I'm pretty sure Felix and I can give an rb on the amdgpu/amdkfd
>>>>>> changes.
>>>>> If you are worried about giving an r-b only for those then this can be
>>>>> done even for larger patches. Just make your Reviewed-by more specific:
>>>>> R-b: name # For BLA BLA
>>>> Yeah, that's a possible alternative, but more work for me when I review it :)
>>> I definitely do not want to add more work to reviewers and I completely
>>> see how massive "flag days" like these are not popular but I really
>>> didn't find a reasonable way around that would be both correct and
>>> wouldn't add much more churn on the way. So if you really insist then I
>>> would really appreciate a hint on the way to achieve the same without any
>>> of the above downsides.
>> Well, I don't insist on this. It's just from my point of view that this
>> patch doesn't need to be one patch, but could be split up.
> Well, if there are more people with the same concern I can try to do
> that. But if your only concern is to focus on your particular part then
> I guess it would be easier both for you and me to simply apply the patch
> and use git show $files_for_your_subsystem on your end. I have put the
> patch on the attempts/oom-vs-mmu-notifiers branch in my tree at
> git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git

I don't want to block something as important as this, so feel free to add
an Acked-by: Christian König <christian.koenig@....com> to the patch.

Let's rather tackle the next topic: Any idea how to runtime test this?

I mean I can rather easily provide a test which crashes an AMD GPU,
which in turn would mean that the MMU notifier would block forever
without this patch.
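
To make the blocking scenario concrete, here is a minimal userspace sketch
in C of the behaviour discussed in this thread: a callback that may sleep
when it is allowed to block, and that backs off with an error instead of
waiting when it is not. This is not code from the patch. The names
demo_lock and demo_invalidate_range_start are hypothetical, a pthread mutex
merely stands in for whatever lock a driver would take, and the -EAGAIN
return value is assumed from the error-return convention mentioned earlier
in the thread.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a driver's per-device lock. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Userspace model of an invalidation callback that honors a "blockable"
 * flag. Returns 0 on success, or -EAGAIN if the lock is contended and the
 * caller said we must not block.
 */
static int demo_invalidate_range_start(bool blockable)
{
    if (blockable) {
        /* Caller allows sleeping: wait for the lock as long as it takes. */
        pthread_mutex_lock(&demo_lock);
    } else if (pthread_mutex_trylock(&demo_lock) != 0) {
        /* Caller must not sleep and the lock is busy: report it. */
        return -EAGAIN;
    }

    /* ... a real implementation would tear down device mappings here ... */

    pthread_mutex_unlock(&demo_lock);
    return 0;
}
```

In this model, a caller that cannot afford to sleep passes false and treats
-EAGAIN as "try again later" rather than waiting on a lock that a hung
device might never release.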

But do you know a way to let the OOM killer kill a specific process?
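
One possible approach, offered as a sketch rather than a verified test
plan: Linux exposes /proc/<pid>/oom_score_adj, and writing 1000 there makes
that process the most preferred OOM victim. The helper below is
hypothetical (set_oom_score_adj is not an existing API) and only wraps
that write.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: write an OOM score adjustment for a process.
 * Valid values range from -1000 (never kill) to 1000 (kill first).
 * Returns 0 on success, -1 on a bad value or any I/O failure.
 */
static int set_oom_score_adj(pid_t pid, int adj)
{
    char path[64];
    FILE *f;
    int ok;

    if (adj < -1000 || adj > 1000)
        return -1;

    snprintf(path, sizeof(path), "/proc/%ld/oom_score_adj", (long)pid);
    f = fopen(path, "w");
    if (!f)
        return -1;

    ok = fprintf(f, "%d\n", adj) > 0;
    /* The kernel may reject the value only when the buffer is flushed. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

After calling set_oom_score_adj(pid, 1000) on the test process, an OOM
still has to be provoked, for example by running the process in a memory
cgroup with a tight limit or by writing 'f' to /proc/sysrq-trigger.
Whether either of those paths exercises the notifier code touched by this
patch is an assumption that would need to be checked.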

Regards,
Christian.