Date:   Wed, 10 May 2023 15:24:45 +0800
From:   Yujie Liu <yujie.liu@...el.com>
To:     Thomas Gleixner <tglx@...utronix.de>
CC:     Shanker Donthineni <sdonthineni@...dia.com>,
        <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
        <linux-kernel@...r.kernel.org>, Marc Zyngier <maz@...nel.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        "Michael Walle" <michael@...le.cc>,
        Vikram Sethi <vsethi@...dia.com>
Subject: Re: [PATCH v3 3/3] genirq: Use the maple tree for IRQ descriptors management

Hi Thomas,

On Mon, May 08, 2023 at 11:36:37AM +0200, Thomas Gleixner wrote:
> >> Under the assumption that the code is correct, then the effect of this
> >> patch is that it changes the timing. Sigh.
> >> 
> >>   1) Does this happen with a 64-bit kernel too?
> >
> > It doesn't happen on a 64-bit kernel:
> 
> Ok. So one difference might be that a 64-bit kernel enables interrupt
> remapping. Can you add 'intremap=off' to the kernel command line please?

Sorry, my previous info was incorrect.

The block/008 (do IO while hotplugging CPUs) failure also happens on a
64-bit kernel, with or without 'intremap=off', and it persists when
tested against v6.3. However, the warning in the
default_send_IPI_mask_logical function is not triggered on a 64-bit
kernel. I am not sure whether that function is 32-bit specific, since it
is set in arch/x86/kernel/apic/probe_32.c.

== x86_64 kernel ==

compiler/disk/kconfig/rootfs/tbox_group/test/testcase:
  gcc-11/1SSD/x86_64-rhel-8.3-func/debian-11.1-x86_64-20220510.cgz/lkp-skl-d06/block-group-00/blktests

commit:
  v6.3
  32c58fc685e5c ("genirq: Use the maple tree for IRQ descriptors management")

            v6.3 32c58fc685e5cd6b5947a5f8e9a
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :10          70%           7:7     blktests.block/008.fail
           :10          70%           7:7     blktests.block/012.fail

== i386 kernel ==

compiler/disk/kconfig/rootfs/tbox_group/test/testcase:
  gcc-11/1SSD/i386-debian-10.3-func/debian-11.1-i386-20220923.cgz/lkp-skl-d06/block-group-00/blktests

commit:
  v6.3
  32c58fc685e5c ("genirq: Use the maple tree for IRQ descriptors management")

            v6.3 32c58fc685e5cd6b5947a5f8e9a
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :20          90%          18:49    blktests.block/008.fail
           :20          90%          18:49    blktests.block/012.fail
           :20          80%          16:49    dmesg.EIP:default_send_IPI_mask_logical
           :20          80%          16:49    dmesg.WARNING:at_arch/x86/kernel/apic/ipi.c:#default_send_IPI_mask_logical

> >>   2) Can you enable the irq_vector:vector_*.* tracepoints and provide
> >>      the trace?
> >
> > Nothing was written to trace buffer, seems like no irq_vector events
> > were captured during this test.
> 
> Can you please apply the patch below? No need to enable the irq_vector
> events. It just dumps the information into dmesg.

The dmesgs of 64-bit and 32-bit kernels are attached.

--
Best Regards,
Yujie

> ---
> --- a/kernel/irq/cpuhotplug.c
> +++ b/kernel/irq/cpuhotplug.c
> @@ -57,7 +57,8 @@ static bool migrate_one_irq(struct irq_d
>  	bool maskchip = !irq_can_move_pcntxt(d) && !irqd_irq_masked(d);
>  	const struct cpumask *affinity;
>  	bool brokeaff = false;
> -	int err;
> +	int err, irq = d->irq;
> +	bool move_pending;
>  
>  	/*
>  	 * IRQ chip might be already torn down, but the irq descriptor is
> @@ -101,10 +102,16 @@ static bool migrate_one_irq(struct irq_d
>  	 * there is no move pending or the pending mask does not contain
>  	 * any online CPU, use the current affinity mask.
>  	 */
> -	if (irq_fixup_move_pending(desc, true))
> +	move_pending = irqd_is_setaffinity_pending(d);
> +	if (irq_fixup_move_pending(desc, true)) {
>  		affinity = irq_desc_get_pending_mask(desc);
> -	else
> +		pr_info("IRQ %3d: move_pending=%d pending mask: %*pbl\n",
> +			irq, move_pending, cpumask_pr_args(affinity));
> +	} else {
>  		affinity = irq_data_get_affinity_mask(d);
> +		pr_info("IRQ %3d: move_pending=%d affinity mask: %*pbl\n",
> +			irq, move_pending, cpumask_pr_args(affinity));
> +	}
>  
>  	/* Mask the chip for interrupts which cannot move in process context */
>  	if (maskchip && chip->irq_mask)
> @@ -136,6 +143,9 @@ static bool migrate_one_irq(struct irq_d
>  		brokeaff = false;
>  	}
>  
> +	affinity = irq_data_get_effective_affinity_mask(d);
> +	pr_info("IRQ %3d: Done: %*pbl\n", irq, cpumask_pr_args(affinity));
> +
>  	if (maskchip && chip->irq_unmask)
>  		chip->irq_unmask(d);
>  
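
For readers of the attached dmesgs: the pr_info() calls in the patch above
print each mask with the kernel's %*pbl format, which renders a bitmap as a
comma-separated list of CPU ranges such as "0-3,5". Below is a minimal
userspace sketch of that rendering, assuming at most 64 CPUs held in a
single uint64_t. The helper name format_cpu_list() is hypothetical and is
not a kernel API; the kernel's own implementation lives in its vsprintf
code and works on arbitrary-length bitmaps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical userspace helper (not a kernel function): render the set
 * bits of 'mask' as a range list, e.g. bits 0,1,2,3,5 -> "0-3,5".
 * Returns the number of characters written (excluding the NUL), or -1
 * if 'buf' is too small. An empty mask yields an empty string.
 */
static int format_cpu_list(uint64_t mask, char *buf, size_t len)
{
	size_t off = 0;
	int first = 1;
	int cpu = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	while (cpu < 64) {
		int start, n;

		/* Skip CPUs that are not in the mask. */
		if (!((mask >> cpu) & 1)) {
			cpu++;
			continue;
		}

		/* Extend the run while the next CPU is also set. */
		start = cpu;
		while (cpu + 1 < 64 && ((mask >> (cpu + 1)) & 1))
			cpu++;

		if (start == cpu)
			n = snprintf(buf + off, len - off, "%s%d",
				     first ? "" : ",", start);
		else
			n = snprintf(buf + off, len - off, "%s%d-%d",
				     first ? "" : ",", start, cpu);

		/* snprintf reports the length it wanted; detect truncation. */
		if (n < 0 || (size_t)n >= len - off)
			return -1;

		off += (size_t)n;
		first = 0;
		cpu++;
	}

	return (int)off;
}
```

With this sketch, a mask of 0x2f (CPUs 0 to 3 plus CPU 5) is rendered as
"0-3,5", which is the shape of the "pending mask", "affinity mask" and
"Done" fields in the debug output.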

Attachment: "dmesg_i386.xz" of type "application/x-xz" (42560 bytes)

Attachment: "dmesg_x86_64.xz" of type "application/x-xz" (75076 bytes)
