linux-kernel - Re: [RFC PATCH] x86/fault: move might

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87a6hsms2v.ffs@tglx>
Date:   Thu, 25 Nov 2021 11:45:44 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Muchun Song <songmuchun@...edance.com>,
        Dave Hansen <dave.hansen@...el.com>
Cc:     Dave Hansen <dave.hansen@...ux.intel.com>,
        Andrew Lutomirski <luto@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Peter Anvin <hpa@...or.com>, X86 ML <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Xiongchun duan <duanxiongchun@...edance.com>,
        Qi Zheng <zhengqi.arch@...edance.com>
Subject: Re: [RFC PATCH] x86/fault: move might_sleep() out of mmap read lock

Muchun, Dave!

On Mon, Nov 22 2021 at 14:59, Muchun Song wrote:
> On Fri, Nov 19, 2021 at 11:04 PM Dave Hansen <dave.hansen@...el.com> wrote:
>> >
>> > +     might_sleep();
>> > +
>> >       /*
>> >        * Kernel-mode access to the user address space should only occur
>> >        * on well-defined single instructions listed in the exception
>> > @@ -1346,13 +1348,6 @@ void do_user_addr_fault(struct pt_regs *regs,
>> >               }
>> >  retry:
>> >               mmap_read_lock(mm);
>> > -     } else {
>> > -             /*
>> > -              * The above down_read_trylock() might have succeeded in
>> > -              * which case we'll have missed the might_sleep() from
>> > -              * down_read():
>> > -              */
>> > -             might_sleep();
>> >       }
>> >
>> >       vma = find_vma(mm, address);
>>
>> The comment is stale, which isn't great.  The might_sleep() is already
>> in the fast path.  So, moving it up above makes a lot of sense just in
>> terms of simplicity.

I don't think so. The point is:

	if (unlikely(!mmap_read_trylock(mm))) {
		if (!user_mode(regs) && !search_exception_tables(regs->ip)) {
			/*
			 * Fault from code in kernel from
			 * which we do not expect faults.
			 */
			bad_area_nosemaphore(regs, error_code, address);
			return;
		}

Moving it up will make the might_sleep() splat more important than an
unexpected fault when the unexpected fault happens in e.g. a preemption
disabled region. That's wrong because the important information in this
case is not the might sleep splat. The important information is the
fault itself.

But moving it up is even more wrong for spurious faults which are
correctly handled in that case via:

     bad_area_nosemaphore()
       __bad_area_nosemaphore()
         kernelmode_fixup_or_oops()
            handle(AMD erratum #91)
              is_prefetch()

So if such a spurious fault happens in a condition which would trigger
the might_sleep() splat then moving might_sleep() before the trylock()
will cause false positives. So, no. It's going to stay where it is.

> Without this patch, I didn't see the might_sleep() in the fast path. What
> am I missing here?

I have no idea what you are doing. If the trylock() succeeds and the
fault happened in e.g. a preemption disabled region then the
might_sleep() in the else path will trigger no matter what.

Thanks,

        tglx