linux-kernel - Re: [2/2] x86/unwind: Make CONFIG_UNWINDER_ORC=y the default in kconfig for 64-bit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20171020003822.GA14054@outlook.office365.com>
Date:   Thu, 19 Oct 2017 17:38:23 -0700
From:   Andrei Vagin <avagin@...tuozzo.com>
To:     Josh Poimboeuf <jpoimboe@...hat.com>
Cc:     Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org,
        torvalds@...ux-foundation.org, peterz@...radead.org, hpa@...or.com,
        tglx@...utronix.de
Subject: Re: [2/2] x86/unwind: Make CONFIG_UNWINDER_ORC=y the default in
 kconfig for 64-bit

On Thu, Oct 19, 2017 at 03:35:22PM -0700, Andrei Vagin wrote:
> On Thu, Oct 19, 2017 at 01:16:55PM -0500, Josh Poimboeuf wrote:
> > On Thu, Oct 19, 2017 at 09:51:04AM -0700, Andrei Vagin wrote:
> > > Hi,
> > > 
> > > We run CRIU tests for tip/auto-latest regularly, and a few days ago our
> > > test job started to detect this warning in a kernel log:
> > > 
> > > [   44.235786] WARNING: can't dereference iret registers at ffff8801c5f17fe0 for ip ffffffff95f0d94b
> > > 
> > > What does it mean? How critical is it?
> > > 
> > > Our test job fails if it detects any warning in a kernel log. Maybe we
> > > need to investigate reasons of this warning and try to eliminate it?
> > > 
> > > Here are logs:
> > > https://travis-ci.org/avagin/linux/jobs/289676634
> > 
> > I think it means the unwinder found some bad ORC unwinder metadata.  Any
> > chance you have access to the kernel binary?  I need to know what code
> > corresponds to that ffffffff95f0d94b address.
> > 
> > Or if you can reproduce with the following patch, that should help:
> > 
> > 
> > diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
> > index 570b70d3f604..95b633f0ce51 100644
> > --- a/arch/x86/kernel/unwind_orc.c
> > +++ b/arch/x86/kernel/unwind_orc.c
> > @@ -448,7 +448,7 @@ bool unwind_next_frame(struct unwind_state *state)
> >  
> >  	case ORC_TYPE_REGS_IRET:
> >  		if (!deref_stack_regs(state, sp, &state->ip, &state->sp, false)) {
> > -			orc_warn("can't dereference iret registers at %p for ip %p\n",
> > +			orc_warn("can't dereference iret registers at %p for ip %pB\n",
> >  				 (void *)sp, (void *)orig_ip);
> >  			goto done;
> >  		}
> 
> I applied your patch and rerun tests.
> 
> [   44.947699] WARNING: can't dereference iret registers at ffff880178f5ffe0 for ip int3+0x5b/0x60
> 
> and now here is a warning from kasan:
> 
> [  477.775676] ==================================================================
> [  477.775845] BUG: KASAN: stack-out-of-bounds in deref_stack_reg+0x11d/0x150
> [  477.775952] Read of size 8 at addr ffff880166b7fe90 by task make/16028
> [  477.776055] 
> [  477.776149] CPU: 0 PID: 16028 Comm: make Not tainted 4.14.0-rc5+ #1
> [  477.776152] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> [  477.776155] Call Trace:
> [  477.776159]  <IRQ>
> [  477.776167]  dump_stack+0x5c/0x7e
> [  477.776175]  print_address_description+0x6b/0x290
> [  477.776182]  ? deref_stack_reg+0x11d/0x150
> [  477.776186]  kasan_report+0x25d/0x340
> [  477.776194]  deref_stack_reg+0x11d/0x150
> [  477.776201]  ? __read_once_size_nocheck.constprop.6+0x10/0x10
> [  477.776206]  ? get_stack_info+0x37/0x170
> [  477.776212]  ? stack_access_ok+0xdc/0x150
> [  477.776221]  unwind_next_frame+0xe35/0x1c10
> [  477.776230]  ? do_execveat_common.isra.34+0x78e/0x1890
> [  477.776238]  ? deref_stack_reg+0x150/0x150
> [  477.776247]  ? is_bpf_text_address+0x54/0x60
> [  477.776253]  ? kernel_text_address+0xf4/0x100
> [  477.776257]  ? do_execveat_common.isra.34+0x78e/0x1890
> [  477.776266]  __save_stack_trace+0x73/0xd0
> [  477.776277]  ? do_execveat_common.isra.34+0x78e/0x1890
> [  477.776285]  save_stack+0x33/0xb0
> [  477.776291]  ? kasan_slab_free+0x70/0xc0
> [  477.776298]  ? kmem_cache_free+0x9f/0x230
> [  477.776303]  ? rcu_process_callbacks+0x451/0xd60
> [  477.776307]  ? __do_softirq+0x1d3/0x5e0
> [  477.776312]  ? irq_exit+0x146/0x170
> [  477.776322]  ? smp_apic_timer_interrupt+0x13e/0x3b0
> [  477.776326]  ? apic_timer_interrupt+0x8c/0xa0
> [  477.776331]  ? lock_acquire+0x6b/0x260
> [  477.776336]  ? do_execveat_common.isra.34+0x78e/0x1890
> [  477.776347]  ? update_curr+0x2d6/0x600
> [  477.776354]  ? posix_cpu_timers_exit_group+0x50/0x50
> [  477.776365]  ? trigger_load_balance+0x1fd/0x8a0
> [  477.776374]  ? note_gp_changes+0x14e/0x1b0
> [  477.776384]  ? lock_downgrade+0x590/0x590
> [  477.776389]  ? rcu_accelerate_cbs+0x106/0x5e0
> [  477.776398]  ? lock_acquire+0x113/0x260
> [  477.776402]  ? rcu_process_callbacks+0x407/0xd60
> [  477.776407]  kasan_slab_free+0x70/0xc0
> [  477.776414]  ? rcu_process_callbacks+0x451/0xd60
> [  477.776418]  kmem_cache_free+0x9f/0x230
> [  477.776425]  ? free_inode_nonrcu+0x20/0x20
> [  477.776430]  rcu_process_callbacks+0x451/0xd60
> [  477.776443]  ? note_gp_changes+0x1b0/0x1b0
> [  477.776451]  ? native_apic_msr_write+0x27/0x30
> [  477.776456]  ? lapic_next_event+0x55/0x80
> [  477.776465]  __do_softirq+0x1d3/0x5e0
> [  477.776479]  ? do_execveat_common.isra.34+0x78e/0x1890
> [  477.776483]  irq_exit+0x146/0x170
> [  477.776487]  smp_apic_timer_interrupt+0x13e/0x3b0
> [  477.776494]  apic_timer_interrupt+0x8c/0xa0
> [  477.776497]  </IRQ>
> [  477.776502] RIP: 0010:lock_acquire+0x6b/0x260
> [  477.776505] RSP: 0018:ffff880166b7fd48 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
> [  477.776512] RAX: 0000000000000007 RBX: ffff8801c91cb080 RCX: 0000000000000000
> [  477.776515] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801c91cb8b4
> [  477.776518] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
> [  477.776521] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [  477.776524] R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff9f651fce
> [  477.776528]  ? do_execveat_common.isra.34+0x78e/0x1890
> [  477.776552]  do_execveat_common.isra.34+0x78e/0x1890
> [  477.776559]  ? fs_reclaim_acquire.part.71+0x29/0x30
> [  477.776564]  ? fs_reclaim_acquire.part.71+0x5/0x30
> [  477.776569]  ? kmem_cache_alloc+0x29/0x1f0
> [  477.776577]  ? do_execveat_common.isra.34+0x78e/0x1890
> [  477.776589]  ? strncpy_from_user+0x74/0x260
> [  477.776595]  ? prepare_bprm_creds+0x100/0x100
> [  477.776599]  ? kmem_cache_alloc+0x18d/0x1f0
> [  477.776607]  ? getname_flags+0xff/0x500
> [  477.776615]  ? SyS_execve+0x2c/0x40
> [  477.776623]  ? ptregs_sys_vfork+0x10/0x10
> [  477.776628]  ? do_syscall_64+0x181/0x450
> [  477.776638]  ? entry_SYSCALL64_slow_path+0x25/0x25
> [  477.776653] 
> [  477.776747] The buggy address belongs to the page:
> [  477.776849] page:ffffea00059adfc0 count:0 mapcount:0 mapping:          (null) index:0x0
> [  477.776968] flags: 0x17fff8000000000()
> [  477.777067] raw: 017fff8000000000 0000000000000000 0000000000000000 00000000ffffffff
> [  477.777184] raw: 0000000000000000 dead000000000200 0000000000000000 0000000000000000
> [  477.777298] page dumped because: kasan: bad access detected
> [  477.777404] 
> [  477.777494] Memory state around the buggy address:
> [  477.777594]  ffff880166b7fd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  477.777709]  ffff880166b7fe00: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
> [  477.777823] >ffff880166b7fe80: f4 f4 f4 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00
> [  477.777937]                          ^
> [  477.778034]  ffff880166b7ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  477.778147]  ffff880166b7ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [  477.778260] ==================================================================
> [  477.778376] Disabling lock debugging due to kernel taint
> 
> All logs are here https://travis-ci.org/avagin/linux/jobs/290190646
> 
> Unfortunately vmlinux was not saved for this run. Thanks.

Here is vmlinux:
https://www.dropbox.com/s/e70u6oxxj4pwe2h/vmlinux?dl=0