linux-kernel - Re: [PATCH 1/5] mm: Check if mmu notifier callbacks are allowed to fail


Message-ID: <e917a6f3-463b-0abf-66b7-d4934dbb3af9@nvidia.com>
Date:   Wed, 14 Aug 2019 16:34:58 -0700
From:   Ralph Campbell <rcampbell@...dia.com>
To:     Andrew Morton <akpm@...ux-foundation.org>,
        Daniel Vetter <daniel.vetter@...ll.ch>
CC:     LKML <linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
        DRI Development <dri-devel@...ts.freedesktop.org>,
        Intel Graphics Development <intel-gfx@...ts.freedesktop.org>,
        Michal Hocko <mhocko@...e.com>,
        Christian König <christian.koenig@....com>,
        David Rientjes <rientjes@...gle.com>,
        Jérôme Glisse <jglisse@...hat.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Jason Gunthorpe <jgg@...pe.ca>,
        Daniel Vetter <daniel.vetter@...el.com>
Subject: Re: [PATCH 1/5] mm: Check if mmu notifier callbacks are allowed to
 fail


On 8/14/19 3:14 PM, Andrew Morton wrote:
> On Wed, 14 Aug 2019 22:20:23 +0200 Daniel Vetter <daniel.vetter@...ll.ch> wrote:
> 
>> Just a bit of paranoia, since if we start pushing this deep into
>> callchains it's hard to spot all places where an mmu notifier
>> implementation might fail when it's not allowed to.
>>
>> Inspired by some confusion we had discussing i915 mmu notifiers and
>> whether we could use the newly-introduced return value to handle some
>> corner cases. Until we realized that these are only for when a task
>> has been killed by the oom reaper.
>>
>> An alternative approach would be to split the callback into two
>> versions, one with the int return value, and the other with void
>> return value like in older kernels. But that's a lot more churn for
>> fairly little gain I think.
>>
>> Summary from the m-l discussion on why we want something at warning
>> level: This allows automated tooling in CI to catch bugs without
>> humans having to look at everything. If we just upgrade the existing
>> pr_info to a pr_warn, then we'll have false positives. And as-is, no
>> one will ever spot the problem since it's lost in the massive amounts
>> of overall dmesg noise.
>>
>> ...
>>
>> --- a/mm/mmu_notifier.c
>> +++ b/mm/mmu_notifier.c
>> @@ -179,6 +179,8 @@ int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
>>   				pr_info("%pS callback failed with %d in %sblockable context.\n",
>>   					mn->ops->invalidate_range_start, _ret,
>>   					!mmu_notifier_range_blockable(range) ? "non-" : "");
>> +				WARN_ON(mmu_notifier_range_blockable(range) ||
>> +					ret != -EAGAIN);
>>   				ret = _ret;
>>   			}
>>   		}
> 
> A problem with WARN_ON(a || b) is that if it triggers, we don't know
> whether it was because of a or because of b.  Or both.  So I'd suggest
> 
> 	WARN_ON(a);
> 	WARN_ON(b);
> 

This won't quite work. It is OK for mmu_notifier_range_blockable(range)
to be true or false. sync_cpu_device_pagetables() shouldn't return
-EAGAIN unless blockable is true.
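
For reference, here is a minimal userspace sketch of the two forms under
discussion. It is not kernel code: warn_on(), check_combined() and
check_split() are hypothetical stand-ins, and the condition is taken
literally from the quoted hunk, which only runs once a callback has
already returned a nonzero value. Under that reading, the only failure
that does not warn is -EAGAIN in a non-blockable context.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for WARN_ON(): print a message when
 * the condition holds and return the condition.
 */
static bool warn_on(bool cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond;
}

/*
 * Combined form, as in the quoted hunk: a single warning that does not
 * say which half of the condition was true.
 */
static bool check_combined(bool blockable, int ret)
{
	return warn_on(blockable || ret != -EAGAIN,
		       "blockable || ret != -EAGAIN");
}

/*
 * Split form, as suggested in the review: each half reports on its own.
 * A warning fires in exactly the same cases as the combined form.
 */
static bool check_split(bool blockable, int ret)
{
	bool a = warn_on(blockable, "callback failed in blockable context");
	bool b = warn_on(ret != -EAGAIN, "error code is not -EAGAIN");

	return a || b;
}
```

Both helpers return true for the same inputs; the split form differs only
in that the message identifies which half of the condition was true.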