Date:   Fri, 12 Jan 2018 14:28:06 +0800
From:   Baoquan He <bhe@...hat.com>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>
Cc:     linux-kernel@...r.kernel.org, fenghua.yu@...el.com,
        mingo@...hat.com, tglx@...utronix.de, hpa@...or.com,
        x86@...nel.org, rostedt@...dmis.org, jgross@...e.com,
        peterz@...radead.org, uobergfe@...hat.com, joro@...tes.org,
        myamazak@...hat.com
Subject: Re: [RESEND PATCH 0/3] x86/apic/kexec: Enable legacy irq mode before
 jump to kexec/kdump kernel

On 01/11/18 at 01:05pm, Eric W. Biederman wrote:
> Baoquan He <bhe@...hat.com> writes:
> 
> > Hi all,
> >
> > PING!
> >
> > (Add Fenghua and Eric to this thread)
> >
> > On 01/05/18 at 11:42am, Baoquan He wrote:
> >> On kvm guest, the latest kernel always prints the warning below while the kdump kernel
> >> boots. The reason is that legacy irq mode is disabled before the jump to the kexec/kdump
> >> kernel. So in setup_local_APIC(), the do { xxx } while (queued && max_loops > 0) loop
> >> can't handle a pending irq in the APIC IRR, since the LAPIC is disabled. The loop only
> >> terminates once max_loops drops to zero or below through repeated subtraction. Then
> >> WARN_ON(max_loops <= 0) is triggered.
> 
> Overall this looks like the code in setup_local_APIC is working largely
> as designed.  It does run into a snag so it warns.
> 
> Which leaves the question:  Does QEMU have buggy APIC emulation in this
> case or is that loop simply incapable of dealing with queued interrupts
> in APIC_IRR?

Thanks a lot for looking into this, Eric!

Yes, as you said, setup_local_APIC() is working as designed. It assumes
the current APIC can handle the queued interrupts in APIC_IRR. However,
the current native_machine_crash_shutdown() calls lapic_shutdown(), which
invokes disable_local_APIC() to disable the APIC completely with the code
below. Then, when the kdump kernel reaches setup_local_APIC(), the queued
interrupts in APIC_IRR cannot be handled at all.

void disable_local_APIC(void)
{
......

        /*
         * Disable APIC (implies clearing of registers
         * for 82489DX!).
         */
        value = apic_read(APIC_SPIV);
        value &= ~APIC_SPIV_APIC_ENABLED;
        apic_write(APIC_SPIV, value);
}
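
For reference, the read-modify-write above just clears the software-enable
bit in the spurious interrupt vector register. A tiny standalone
illustration follows. The bit position is the value from
arch/x86/include/asm/apicdef.h as far as I know, while
spiv_software_disable() and spiv_is_enabled() are made-up helpers that work
on a plain integer instead of the real register:

```c
#include <stdbool.h>
#include <stdint.h>

/* Software-enable bit of the spurious interrupt vector register (bit 8). */
#define APIC_SPIV_APIC_ENABLED  (1u << 8)

/* Hypothetical helper: returns the SPIV value with the enable bit cleared. */
static uint32_t spiv_software_disable(uint32_t spiv)
{
	return spiv & ~APIC_SPIV_APIC_ENABLED;
}

/* Hypothetical helper: reports whether the enable bit is set. */
static bool spiv_is_enabled(uint32_t spiv)
{
	return (spiv & APIC_SPIV_APIC_ENABLED) != 0;
}
```

Only bit 8 changes; the spurious vector in the low byte is left untouched.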

With legacy irq mode enabled before jumping to the kdump kernel,
setup_local_APIC() can handle it well.
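
To make the failure mode concrete, here is a rough userspace model of the
drain loop. It is not the kernel code: struct model_apic, model_deliver()
and model_drain() are made-up names, the real loop's budget handling is
reduced to a plain counter, and the assumption that a software-disabled
LAPIC never moves a vector from IRR to ISR is taken from the description
above rather than checked against hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the local APIC; not a kernel structure. */
struct model_apic {
	bool enabled;	/* models APIC_SPIV_APIC_ENABLED */
	uint32_t irr;	/* pending vectors (one 32-bit word for brevity) */
	uint32_t isr;	/* in-service vectors */
};

/* Assumption: only an enabled APIC promotes a pending vector to in-service. */
static void model_deliver(struct model_apic *apic)
{
	uint32_t lowest;

	if (!apic->enabled || !apic->irr)
		return;

	lowest = apic->irr & (~apic->irr + 1);	/* isolate lowest set bit */
	apic->irr &= ~lowest;
	apic->isr |= lowest;
}

/*
 * Simplified drain loop. Returns true when the loop gave up with requests
 * still queued, which is the case where the real code would hit
 * WARN_ON(max_loops <= 0).
 */
static bool model_drain(struct model_apic *apic, long max_loops)
{
	uint32_t queued;

	assert(apic != NULL);

	do {
		queued = apic->irr;	/* anything still pending? */
		apic->isr = 0;		/* ack everything in service */
		model_deliver(apic);	/* let the next vector through */
		if (queued)
			max_loops--;
	} while (queued && max_loops > 0);

	return max_loops <= 0;
}
```

With enabled = true and irr = 0x5, model_drain(&apic, 16) empties IRR after
a few passes and returns false. With enabled = false, IRR never changes, the
counter runs down to zero and the function returns true, which corresponds
to the warning seen in the kdump kernel.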

So if we decide to disable legacy irq mode before jumping to the kdump
kernel, we need to remove the do { xxx } while (queued && max_loops > 0)
code block in setup_local_APIC(), and we also need to change
disable_IO_APIC(), since what it does does not match its name. We would
then just leave those pending irqs until the final APIC mode is set up.

> 
> >> 
> >> [    0.001000] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1467 setup_local_APIC+0x228/0x330
> >> [    0.001000] Modules linked in:
> >> [    0.001000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #3
> >> [    0.001000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014
> >> [    0.001000] RIP: 0010:setup_local_APIC+0x228/0x330
> >> [    0.001000] RSP: 0000:ffffffffb6e03eb8 EFLAGS: 00010286
> >> [    0.001000] RAX: 0000009edb4c4d84 RBX: 0000000000000000 RCX: 00000000b099d800
> >> [    0.001000] RDX: 0000009e00000000 RSI: 0000000000000000 RDI: 0000000000000810
> >> [    0.001000] RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000001
> >> [    0.001000] R10: ffff98ce6a801c00 R11: 0761076d072f0776 R12: 0000000000000001
> >> [    0.001000] R13: 00000000000000f0 R14: 0000000000004000 R15: ffffffffffffc6ff
> >> [    0.001000] FS:  0000000000000000(0000) GS:ffff98ce6bc00000(0000) knlGS:0000000000000000
> >> [    0.001000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [    0.001000] CR2: 00000000ffffffff CR3: 0000000022209000 CR4: 00000000000406b0
> >> [    0.001000] Call Trace:
> >> [    0.001000]  apic_bsp_setup+0x56/0x74
> >> [    0.001000]  x86_late_time_init+0x11/0x16
> >> [    0.001000]  start_kernel+0x3c9/0x486
> >> [    0.001000]  secondary_startup_64+0xa5/0xb0
> >> [    0.001000] Code: 00 85 c9 74 2d 0f 31 c1 e1 0a 48 c1 e2 20 41 89 cf 4c 03 7c 24 08 48 09 d0 49 29 c7 4c 89 3c 24 48 83 3c 24 00 0f 8f 8f fe ff ff <0f> ff e9 10 ff ff ff 48 83 2c 24 01 eb e7 48 83 c4 18 5b 5d 41 
> >> [    0.001000] ---[ end trace b88e71b9a6ebebdd ]---
> >> [    0.001000] masked ExtINT on CPU#0
> >> 
> >> With patch 2/3 applied, the above warning disappeared. And with patch 2/3
> >> applied, the issue mentioned in patch 1/3 can also be fixed because the LAPIC
> >> has been set as ExtINT before jump to kdump kernel, while we had better set it
> >> explicitly. Seems no reason not to enable legacy irq mode before jump to
> >> kexec/kdump kernel, and can make it be consistent with normal kernel.
> >> 
> >> Patch 3/3 is doing clean up, I am fine if people think it's unnecessary.
> >> 
> 
> I don't see these patches so I am simply going on their description.
> 
> >> Baoquan He (3):
> >>   x86/apic: Set up through LAPIC on boot CPU's LINT0 if ioapic is
> >>     disabled
> 
> *scratches my head*  Are you booging the kexec on panic kernel with
			       ~~~ typo, should be 'booting'?
> apics disabled?  When the previous kernel had apics enabled?
> That makes my head really hurt if you are.  Don't do that.

No, not like that. The kernel parameter 'noapic' is just really
misleading. It only disables use of the IO-APICs in the system. I got this
bug reported: the 1st kernel works well with 'noapic' added, while the
kdump kernel does not.

	*****************************************************************
        noapic          [SMP,APIC] Tells the kernel to not make use of any
                        IOAPICs that may be present in the system.

Thanks
Baoquan
