Message-ID: <CAPv3WKewWHd=23MKar8_-B4YpYQbnX9fqqPH=Ti7aGe2rV6FuQ@mail.gmail.com>
Date:   Wed, 16 Feb 2022 14:32:42 +0100
From:   Marcin Wojtas <mw@...ihalf.com>
To:     Marc Zyngier <maz@...nel.org>
Cc:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        netdev <netdev@...r.kernel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Russell King <linux@...linux.org.uk>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        John Garry <john.garry@...wei.com>, kernel-team@...roid.com
Subject: Re: [PATCH 0/2] net: mvpp2: Survive CPU hotplug events

On Wed, 16 Feb 2022 at 14:29, Marc Zyngier <maz@...nel.org> wrote:
>
> On Wed, 16 Feb 2022 13:19:30 +0000,
> Marcin Wojtas <mw@...ihalf.com> wrote:
> >
> > Hi Marc,
> >
> > On Wed, 16 Feb 2022 at 10:08, Marc Zyngier <maz@...nel.org> wrote:
> > >
> > > I recently realised that playing with CPU hotplug on a system equipped
> > > with a set of MVPP2 devices (Marvell 8040) was fraught with danger and
> > > would result in a rapid lockup or panic.
> > >
> > > As it turns out, the per-CPU nature of the MVPP2 interrupts is
> > > getting in the way. A good solution for this seems to rely on the
> > > kernel's managed interrupt approach, where the core kernel will not
> > > move interrupts around as the CPUs go down, but will simply disable
> > > the corresponding interrupt.
> > >
> > > Converting the driver to this requires a bit of refactoring in the IRQ
> > > subsystem to expose the required primitive, as well as a bit of
> > > surgery in the driver itself.
> > >
> > > Note that although the system now survives such an event, the driver
> > > seems to assume that all queues are always active and doesn't inform
> > > the device that a CPU has gone away. Someone who actually understands
> > > this driver should have a look at it.
> > >
> > > Patches on top of 5.17-rc3, lightly tested on a McBin.
> > >
> >
> > Thank you for the patches. Can you, please, share the commands you
> > used? I'd like to test it more.
>
> Offline CPU3:
> # echo 0 > /sys/devices/system/cpu/cpu3/online
>
> Online CPU3:
> # echo 1 > /sys/devices/system/cpu/cpu3/online
>
> Put that in a loop, using different CPUs.
>
> On my HW, turning off CPU0 leads to odd behaviours (I wouldn't be
> surprised if the firmware was broken in that respect, and also the
> fact that the device keeps trying to send stuff to that CPU...).
>

Thanks, I think stressing the DUT with traffic during CPU hotplug will
be a good scenario - I'll try that.

Marcin
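
The "put that in a loop, using different CPUs" suggestion quoted above
could be sketched roughly as follows. This is only an illustration, not
the script anyone in this thread actually ran: the names CPU_SYSFS,
HOTPLUG_DELAY, hotplug_cycle and hotplug_loop are made up here, and the
one-second default delay is an arbitrary assumption. Only the sysfs
"online" files come from the commands quoted above.

```shell
#!/bin/sh
# Sketch: offline and re-online CPUs in a loop through sysfs.
# CPU_SYSFS and HOTPLUG_DELAY are hypothetical knobs for this example.
CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"

# Offline, wait, then online a single CPU.
# Returns non-zero if the control file is missing or a write fails.
hotplug_cycle() {
    cpu="$1"
    ctl="$CPU_SYSFS/cpu$cpu/online"
    if [ ! -w "$ctl" ]; then
        echo "cpu$cpu: no writable online file" >&2
        return 1
    fi
    echo 0 > "$ctl" || return 1
    sleep "${HOTPLUG_DELAY:-1}"
    echo 1 > "$ctl" || return 1
}

# hotplug_loop ROUNDS CPU...
# Runs ROUNDS passes over the listed CPUs, stopping at the first failure.
hotplug_loop() {
    rounds="$1"
    shift
    i=0
    while [ "$i" -lt "$rounds" ]; do
        for cpu in "$@"; do
            hotplug_cycle "$cpu" || return 1
        done
        i=$((i + 1))
    done
}
```

Run as root, something like "hotplug_loop 100 1 2 3" would cycle CPU1
to CPU3 a hundred times. CPU0 is left out of that example on purpose,
given the odd behaviour reported above when it is turned off.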
