lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZqfJmUF8sXIyuSHN@gmail.com>
Date: Mon, 29 Jul 2024 09:55:53 -0700
From: Breno Leitao <leitao@...ian.org>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
	Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
	"H. Peter Anvin" <hpa@...or.com>, leit@...a.com,
	"Peter Zijlstra (Intel)" <peterz@...radead.org>,
	Wei Liu <wei.liu@...nel.org>, Marc Zyngier <maz@...nel.org>,
	Adrian Huang <ahuang12@...ovo.com>,
	"open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86/apic: Add retry mechanism to add_pin_to_irq_node()

Hello Thomas,

On Mon, Jul 29, 2024 at 06:13:34PM +0200, Thomas Gleixner wrote:
> On Mon, Jul 29 2024 at 07:06, Breno Leitao wrote:
> > I've been running some experiments with failslab fault injector running
> > to detect a different problem, and the machine always crash with the
> > following stack:
> >
> > 	can not alloc irq_pin_list (-1,0,20)
> > 	Kernel panic - not syncing: IO-APIC: failed to add irq-pin. Can not proceed
> >
> > 	Call Trace:
> > 	 panic
> > 	   _printk
> > 	   panic_smp_self_stop
> > 	   rcu_is_watching
> > 	   intel_irq_remapping_free
> 
> This completely lacks context. When does this happen? What's the system
> state? What has intel_irq_remapping_free() to do with the allocation path?

Sorry, let me clarify it a bit better:

1) This happens when the machine is booted up, and being under stress
2) This happens when I have failslab fault injection enabled.
3) The machine crashes after hitting this error.
4) This is reproducible with `stress-ng` using the `--aggressive` parameter
5) This is the full stack (sorry for not decoding the stack, but if you
   need it, I am more than happy to give you a decoded stack)


 04:12:34  can not alloc irq_pin_list (-1,0,20)
           Kernel panic - not syncing: IO-APIC: failed to add irq-pin. Can not proceed
           CPU: 11 UID: 0 PID: 335023 Comm: stress-ng-dev Kdump: loaded Tainted: G S          E    N 6.10.0-12563-gdb0610128a16 #48

           Call Trace:
            <TASK>
            panic+0x4e9/0x590
            ? _printk+0xb3/0xe0
            ? panic_smp_self_stop+0x70/0x70
            ? rcu_is_watching+0xe/0xb0
            ? intel_irq_remapping_free+0x30/0x30
            ? __add_pin_to_irq_node+0xf4/0x2d0
            ? rcu_is_watching+0xe/0xb0
            mp_irqdomain_alloc+0x9ab/0xa80
            ? IO_APIC_get_PCI_irq_vector+0x850/0x850
            ? __kmalloc_cache_node_noprof+0x1e0/0x360
            ? mutex_lock_io_nested+0x1420/0x1420
            irq_domain_alloc_irqs_locked+0x25d/0x8d0
            __irq_domain_alloc_irqs+0x80/0x110
            mp_map_pin_to_irq+0x645/0x890
            ? __acpi_get_override_irq+0x350/0x350
            ? mutex_lock_io_nested+0x1420/0x1420
            ? lockdep_hardirqs_on_prepare+0x400/0x400
            ? mp_map_gsi_to_irq+0xe6/0x1b0
            acpi_register_gsi_ioapic+0xe6/0x150
            ? acpi_unregister_gsi_ioapic+0x40/0x40
            ? mark_held_locks+0x9f/0xe0
            ? _raw_spin_unlock_irq+0x24/0x50
            hpet_open+0x313/0x480
            misc_open+0x306/0x420
            chrdev_open+0x218/0x660
            ? __unregister_chrdev+0xe0/0xe0
            ? security_file_open+0x3d4/0x740
            do_dentry_open+0x4a1/0x1300
            ? __unregister_chrdev+0xe0/0xe0
            vfs_open+0x7e/0x350
            path_openat+0xb46/0x2740
            ? kernel_tmpfile_open+0x60/0x60
            ? lock_acquire+0x1e4/0x650
            do_filp_open+0x1af/0x3e0
            ? path_openat+0x2740/0x2740
            ? do_raw_spin_lock+0x12d/0x270
            ? spin_bug+0x1d0/0x1d0
            ? _raw_spin_unlock+0x29/0x40
            ? alloc_fd+0x1e6/0x640
            do_sys_openat2+0x117/0x150
            ? build_open_flags+0x450/0x450
            ? lock_downgrade+0x690/0x690
            __x64_sys_openat+0x11f/0x1d0
            ? __x64_sys_open+0x1a0/0x1a0
            ? do_syscall_64+0x36/0x190
            do_syscall_64+0x6e/0x190
            entry_SYSCALL_64_after_hwframe+0x4b/0x53
           RIP: 0033:0x7f6c406fd784
           Code: 24 20 eb 8f 66 90 44 89 54 24 0c e8 d6 88 f8 ff 44 8b 54 24 0c 44 89 e2 48 89 ee 41 89 c0 bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 34 44 89 c7 89 44 24 0c e8 28 89 f8 ff 8b 44
           RSP: 002b:00007fff72413a70 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
           RAX: ffffffffffffffda RBX: 00007f6c408c43a8 RCX: 00007f6c406fd784
           RDX: 0000000000000800 RSI: 000055759a5fc910 RDI: 00000000ffffff9c
           RBP: 000055759a5fc910 R08: 0000000000000000 R09: 0000000000000001
           R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000800
           R13: 00007fff72413c90 R14: 000055759a5fc910 R15: 00007f6c408c43a8
            </TASK>

> > This happens because add_pin_to_irq_node() function would panic if
> > adding a pin to an IRQ failed due to -ENOMEM (which was injected by
> > failslab fault injector).  I've been running with this patch in my test
> > cases in order to be able to pick real bugs, and I thought it might be a
> > good idea to have it upstream also, so, other people trying to find real
> > bugs don't stumble upon this one. Also, this makes sense in a real
> > world(?), when retrying a few times might be better than just
> > panicking.
> 
> While it seems to make sense, the reality is that this is mostly early
> boot code. If there is a real world memory allocation failure during
> early boot then retries will not help at all.

This is not happening at early boot, this is reproducible when running
stress-ng in this aggressive mode.

Since I have failslab injecting a kmalloc fault,
__add_pin_to_irq_noder() returns -ENOMEM, which causes the undesired
panic().

> > Introduce a retry mechanism that attempts to add the pin up to 3 times
> > before giving up and panicking. This should improve the robustness of
> > the IO-APIC code in the face of transient errors.
> 
> I'm absolutely not convinced by this loop heuristic. That's just a bad
> hack.

I will not disagree with you here, but I need to use this patch in order
to be able t keep the system not panicking and stable while fault
injecting slab errors and trying to reproduce a real bug in the network
stack.

Thanks for the review,
--breno

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ