lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 11 Nov 2020 18:47:36 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Josh Poimboeuf <jpoimboe@...hat.com>
Cc:     Shinichiro Kawasaki <shinichiro.kawasaki@....com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Nicholas Piggin <npiggin@...il.com>,
        Damien Le Moal <Damien.LeMoal@....com>,
        andrew.cooper3@...rix.com, jgross@...e.com
Subject: Re: WARNING: can't access registers at asm_common_interrupt

On Wed, Nov 11, 2020 at 11:05:36AM -0600, Josh Poimboeuf wrote:
> On Fri, Nov 06, 2020 at 06:04:15AM +0000, Shinichiro Kawasaki wrote:
> > Greetings,
> > 
> > I observe "WARNING: can't access registers at asm_common_interrupt+0x1e/0x40"
> > in my kernel test system repeatedly, which is printed by unwind_next_frame() in
> > "arch/x86/kernel/unwind_orc.c". Syzbot already reported that [1]. Similar
> > warning was reported and discussed [2], but I suppose the cause is not yet
> > clarified.
> > 
> > The warning was observed with v5.10-rc2 and older tags. I bisected and found
> > that the commit 044d0d6de9f5 ("lockdep: Only trace IRQ edges") in v5.9-rc3
> > triggered the warning. Reverting that from 5.10-rc2, the warning disappeared.
> > May I ask comment by expertise on CC how this commit can relate to the warning?
> > 
> > The test condition to reproduce the warning is rather unique (blktests,
> > dm-linear and ZNS device emulation by QEMU). If any action is suggested for
> > further analysis, I'm willing to take it with my test system.
> > 
> > Wish this report helps.
> > 
> > [1] https://lkml.org/lkml/2020/9/6/231
> > [2] https://lkml.org/lkml/2020/9/8/1538
> 
> Shin'ichiro,
> 
> Thanks for all the data.  It looks like the ORC unwinder is getting
> confused by paravirt patching (with runtime-patched pushf/pop changing
> the stack layout).
> 
> <user interrupt>
> 	exit_to_user_mode_prepare()
> 		exit_to_user_mode_loop()
> 			local_irq_disable_exit_to_user()
> 				local_irq_disable()
> 					raw_irqs_disabled()
> 						arch_irqs_disabled()
> 							arch_local_save_flags()
> 								pushfq
> 								<another interrupt>

This is PARAVIRT_XXL only, which is a Xen special. My preference, as
always, is to kill it... Sadly the Xen people have a different opinion.

> Objtool doesn't know about the pushf/pop paravirt patch, so ORC gets
> confused by the changed stack layout.
> 
> I'm thinking we either need to teach objtool how to deal with
> save_fl/restore_fl patches, or we need to just get rid of those nasty
> patches somehow.  Peter, any thoughts?

Don't use Xen? ;-)

So with PARAVIRT_XXL the compiler will emit something like:

  "CALL *pvops.save_fl"

Which we then overwrite at runtime with "pushf; pop %[re]ax" and a few
NOPs.

Now, objtool understands alternatives, and ensures they have the same
stack layout, it has no chance in hell of understanding this, simply
because paravirt_patch.c is magic.

I don't have any immediate clever ideas, but let me ponder it a wee bit.

....

Something really disguisting we could do is recognise the indirect call
offset and emit an extra ORC entry for RIP+1. So the cases are:

	CALL *pv_ops.save_fl	-- 7 bytes IIRC
	CALL $imm;		-- 5 bytes
	PUSHF; POP %[RE]AX	-- 2 bytes

so the RIP+1 (the POP insn) will only ever exist in this case. The
indirect and direct call cases would never land on that IP.

....


> It looks like 044d0d6de9f5 ("lockdep: Only trace IRQ edges") is making
> the problem more likely, by adding the irqs_disabled() check for every
> local_irq_disable().
> 
> Also - Peter, Nicholas - is that irqs_disabled() check really necessary
> in local_irq_disable()?  Presumably irqs would typically be be enabled
> before calling it?

Yeah, so it's all a giant can of worms that; also see:

  https://lkml.kernel.org/r/20200821084738.508092956@infradead.org

The basic idea is to only trace edges, ie. when the hardware state
actually changes. Sadly this means doing a pushf/pop before the cli.
Ideally CLI would store the old IF in CF or something like that, but
alas.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ