linux-kernel - Re: [PATCH V4 2/3] x86/efi: Add efi page fault handler to recover from page faults caused by the firmware

Message-ID: <20180907112147.GR24106@hirez.programming.kicks-ass.net>
Date:   Fri, 7 Sep 2018 13:21:48 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Sai Praneeth Prakhya <sai.praneeth.prakhya@...el.com>
Cc:     linux-efi@...r.kernel.org, linux-kernel@...r.kernel.org,
        x86@...nel.org, ricardo.neri@...el.com, matt@...eblueprint.co.uk,
        Al Stone <astone@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Ingo Molnar <mingo@...nel.org>,
        Andy Lutomirski <luto@...nel.org>,
        Bhupesh Sharma <bhsharma@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ard Biesheuvel <ard.biesheuvel@...aro.org>
Subject: Re: [PATCH V4 2/3] x86/efi: Add efi page fault handler to recover
 from page faults caused by the firmware

On Thu, Sep 06, 2018 at 04:27:47PM -0700, Sai Praneeth Prakhya wrote:
> @@ -790,6 +792,13 @@ no_context(struct pt_regs *regs, unsigned long error_code,
>  		return;
>  
>  	/*
> +	 * Buggy firmware could access regions which might page fault, try to
> +	 * recover from such faults.
> +	 */
> +	if (efi_recover_from_page_fault(address))
> +		return;
> +
> +	/*
>  	 * Oops. The kernel tried to access some bad page. We'll have to
>  	 * terminate things with extreme prejudice:
>  	 */

> +int efi_recover_from_page_fault(unsigned long phys_addr)
> +{
> +	/* Recover from page faults caused *only* by the firmware */
> +	if (current->active_mm != &efi_mm)
> +		return 0;
> +
> +	/*
> +	 * Address range 0x0000 - 0x0fff is always mapped in the efi_pgd, so
> +	 * page faulting on these addresses isn't expected.
> +	 */
> +	if (phys_addr >= 0x0000 && phys_addr <= 0x0fff)
> +		return 0;
> +
> +	/*
> +	 * Print stack trace as it might be useful to know which EFI Runtime
> +	 * Service is buggy.
> +	 */
> +	WARN(1, FW_BUG "Page fault caused by firmware at PA: 0x%lx\n",
> +	     phys_addr);
> +
> +	/*
> +	 * Buggy efi_reset_system() is handled differently from other EFI
> +	 * Runtime Services as it doesn't use efi_rts_wq. Although,
> +	 * native_machine_emergency_restart() says that machine_real_restart()
> +	 * could fail, it's better not to complicate this fault handler
> +	 * because this case occurs *very* rarely and hence could be improved
> +	 * on an as-needed basis.
> +	 */
> +	if (efi_rts_work.efi_rts_id == RESET_SYSTEM) {
> +		pr_info("efi_reset_system() buggy! Reboot through BIOS\n");
> +		machine_real_restart(MRR_BIOS);
> +		return 0;
> +	}
> +
> +	/* Firmware has caused page fault, hence, freeze efi_rts_wq. */
> +	set_current_state(TASK_UNINTERRUPTIBLE);

This doesn't freeze anything as such; it only sets the task state. Nothing
blocks until schedule() is called.
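
To illustrate the distinction, here is a minimal userspace toy model, not
kernel code. The toy_* names are hypothetical stand-ins for
set_current_state() and schedule(); the model only assumes that the state
write and the actual descheduling are separate steps.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: names echo the kernel's, but nothing here is kernel API. */
enum toy_state { TOY_RUNNING, TOY_UNINTERRUPTIBLE };

struct toy_task {
	enum toy_state state;	/* what the task has declared about itself */
	bool on_cpu;		/* whether it is actually executing */
};

/* Stand-in for set_current_state(): records the intent to sleep, no more. */
static void toy_set_state(struct toy_task *t, enum toy_state s)
{
	t->state = s;
}

/* Stand-in for schedule(): only here does a non-running task leave the CPU. */
static void toy_schedule(struct toy_task *t)
{
	assert(t->on_cpu);	/* only a task that is executing can call this */
	if (t->state != TOY_RUNNING)
		t->on_cpu = false;
}
```

In this model a task that has called toy_set_state() is still on the CPU,
and everything between that call and toy_schedule() runs as usual.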

> +
> +	/*
> +	 * Before calling EFI Runtime Service, the kernel has switched the
> +	 * calling process to efi_mm. Hence, switch back to task_mm.
> +	 */
> +	arch_efi_call_virt_teardown();
> +
> +	/* Signal error status to the efi caller process */
> +	efi_rts_work.status = EFI_ABORTED;
> +	complete(&efi_rts_work.efi_rts_comp);
> +
> +	clear_bit(EFI_RUNTIME_SERVICES, &efi.flags);
> +	pr_info("Froze efi_rts_wq and disabled EFI Runtime Services\n");

> +	schedule();

So what happens when we get a spurious wakeup and return from this?

Quite possibly you want something like:

	for (;;) {
		set_current_state(TASK_IDLE);
		schedule();
	}

here. A task parked in TASK_UNINTERRUPTIBLE counts toward the load-avg, so
it will stay inflated for as long as the task sleeps; is that what you want?
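
The load-average difference between the two states can be sketched with a
small userspace model. This is an illustration under assumptions, not the
kernel's implementation: the TOY_* bit values are modelled on the task state
flags in include/linux/sched.h and should be checked against the tree in
use, and the toy_* helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values, modelled on include/linux/sched.h; verify before relying. */
#define TOY_TASK_UNINTERRUPTIBLE	0x0002u
#define TOY_TASK_NOLOAD			0x0400u
/* TASK_IDLE is an uninterruptible sleep that opts out of load accounting. */
#define TOY_TASK_IDLE	(TOY_TASK_UNINTERRUPTIBLE | TOY_TASK_NOLOAD)

/* A sleeping task adds to the load only if it is uninterruptible and has
 * not set the no-load bit. */
static bool toy_contributes_to_load(unsigned int state)
{
	return (state & TOY_TASK_UNINTERRUPTIBLE) &&
	       !(state & TOY_TASK_NOLOAD);
}

/* Count how many of the given task states would be added to the load. */
static size_t toy_count_load(const unsigned int *states, size_t n)
{
	size_t i, load = 0;

	assert(states != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (toy_contributes_to_load(states[i]))
			load++;
	return load;
}
```

Under these assumptions a worker parked with TOY_TASK_UNINTERRUPTIBLE is
counted and one parked with TOY_TASK_IDLE is not, which is the reason for
using TASK_IDLE inside the loop.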

> +
> +	return 0;
> +}