lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <817a2689-15df-f2e4-3b3d-e5831fba7fa3@gmail.com>
Date:   Wed, 10 May 2017 09:21:43 -0700
From:   Florian Fainelli <f.fainelli@...il.com>
To:     Phil Elwell <phil@...pberrypi.org>,
        Marc Zyngier <marc.zyngier@....com>
Cc:     Eric Anholt <eric@...olt.net>,
        Thomas Gleixner <tglx@...utronix.de>,
        Jason Cooper <jason@...edaemon.net>,
        Florian Fainelli <f.fainelli@...il.com>,
        Ray Jui <rjui@...adcom.com>,
        Scott Branden <sbranden@...adcom.com>,
        bcm-kernel-feedback-list@...adcom.com,
        linux-kernel@...r.kernel.org, linux-rpi-kernel@...ts.infradead.org
Subject: Re: [PATCH] irq_bcm2836: Send event when onlining sleeping cores

On 05/10/2017 03:31 AM, Phil Elwell wrote:
> On 10/05/2017 11:09, Marc Zyngier wrote:
>> On 10/05/17 10:05, Phil Elwell wrote:
>>> On 10/05/2017 09:55, Marc Zyngier wrote:
>>>> On Wed, May 10 2017 at  9:27:10 am BST, Phil Elwell <phil@...pberrypi.org> wrote:
>>>>> On 10/05/2017 08:42, Marc Zyngier wrote:
>>>>>> On 09/05/17 20:02, Phil Elwell wrote:
>>>>>>> On 09/05/2017 19:53, Marc Zyngier wrote:
>>>>>>>> On 09/05/17 19:52, Phil Elwell wrote:
>>>>>>>>> On 09/05/2017 19:14, Marc Zyngier wrote:
>>>>>>>>>> On 09/05/17 19:08, Eric Anholt wrote:
>>>>>>>>>>> Marc Zyngier <marc.zyngier@....com> writes:
>>>>>>>>>>>
>>>>>>>>>>>> On 09/05/17 17:59, Eric Anholt wrote:
>>>>>>>>>>>>> Phil Elwell <phil@...pberrypi.org> writes:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> In order to reduce power consumption and bus traffic, it is sensible
>>>>>>>>>>>>>> for secondary cores to enter a low-power idle state when waiting to
>>>>>>>>>>>>>> be started. The wfe instruction causes a core to wait until an event
>>>>>>>>>>>>>> or interrupt arrives before continuing to the next instruction.
>>>>>>>>>>>>>> The sev instruction sends a wakeup event to the other cores, so call
>>>>>>>>>>>>>> it from bcm2836_smp_boot_secondary, the function that wakes up the
>>>>>>>>>>>>>> waiting cores during booting.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It is harmless to use this patch without the corresponding change
>>>>>>>>>>>>>> adding wfe to the ARMv7/ARMv8-32 stubs, but if the stubs are updated
>>>>>>>>>>>>>> and this patch is not applied then the other cores will sleep forever.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> See: https://github.com/raspberrypi/linux/issues/1989
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Signed-off-by: Phil Elwell <phil@...pberrypi.org>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>  drivers/irqchip/irq-bcm2836.c | 3 +++
>>>>>>>>>>>>>>  1 file changed, 3 insertions(+)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/drivers/irqchip/irq-bcm2836.c b/drivers/irqchip/irq-bcm2836.c
>>>>>>>>>>>>>> index e10597c..6dccdf9 100644
>>>>>>>>>>>>>> --- a/drivers/irqchip/irq-bcm2836.c
>>>>>>>>>>>>>> +++ b/drivers/irqchip/irq-bcm2836.c
>>>>>>>>>>>>>> @@ -248,6 +248,9 @@ static int __init bcm2836_smp_boot_secondary(unsigned int cpu,
>>>>>>>>>>>>>>  	writel(secondary_startup_phys,
>>>>>>>>>>>>>>  	       intc.base + LOCAL_MAILBOX3_SET0 + 16 * cpu);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +	dsb(sy); /* Ensure write has completed before waking the other CPUs */
>>>>>>>>>>>>>> +	sev();
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>  	return 0;
>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is also the behavior that the standard arm64 spin-table
>>>>>>>>>>>>> method has,
>>>>>>>>>>>>> which we unfortunately can't quite use.
>>>>>>>>>>>>
>>>>>>>>>>>> And why is that so? Why do you have to reinvent the wheel (and hide the
>>>>>>>>>>>> cloned wheel in an interrupt controller driver)?
>>>>>>>>>>>>
>>>>>>>>>>>> That doesn't seem right to me.
>>>>>>>>>>>
>>>>>>>>>>> The armv8 stubs (firmware-supplied code in the low page that do the
>>>>>>>>>>> spinning) do actually implement arm64's spin-table method.  It's the
>>>>>>>>>>> armv7 stubs that use these registers in the irqchip instead of plain
>>>>>>>>>>> addresses in system memory.
>>>>>>>>>>
>>>>>>>>>> Let's put ARMv7 aside for the time being. If your firmware already
>>>>>>>>>> implements spin-tables, why don't you simply use that at least on arm64?
>>>>>>>>>
>>>>>>>>> We do.
>>>>>>>>
>>>>>>>> Obviously not the way it is intended if you have to duplicate the core
>>>>>>>> architectural code in the interrupt controller driver, which couldn't
>>>>>>>> care less.
>>>>>>>
>>>>>>> If we were using this method on arm64 then the other cores would not start up
>>>>>>> because armstub8.S has always included a wfe. Nothing in the commit mentions
>>>>>>> arm64 - this is an ARCH=arm fix.
>>>>>>
>>>>>> Thanks for the clarification, which you could have added to the commit
>>>>>> message.
>>>>>>
>>>>>> The question still remains: why do we have CPU bring-up code in an
>>>>>> interrupt controller, instead of having it in the architecture code?
>>>>>>
>>>>>> The RPi-2 is the *only* platform to have its SMP bringup code outside of
>>>>>> arch/arm, so the first course of action would be to move that code where
>>>>>> it belongs.
>>>>>
>>>>> You were CC'd on the commit (41f4988cc287e5f836d3f6620c9f900bc9b560e9) that
>>>>> introduced bcm2836_smp_boot_secondary - it seems strange to start objecting
>>>>> now.
>>>>
>>>> Well, I'm far from being perfect. If I had noticed it, I'd have NACKed
>>>> it.
>>>>
>>>>> Yes, I think it is odd that it didn't go into arch/arm/mach-bcm, but in
>>>>> the interests of making changes in small, independent steps, do you have a
>>>>> problem with this commit?
>>>>
>>>> On its own, no. I'm just not keen on adding more unrelated stuff to this
>>>> file, so let's start with dealing with the original bug, and you can
>>>> then add this fix on top.
>>>
>>> That's an interesting use of the word "bug". From Wikipedia:
>>>
>>> "A software bug is an error, flaw, failure or fault in a computer program or
>>> system that causes it to produce an incorrect or unexpected result, or to
>>> behave in unintended ways."
>>
>> Whatever. Should I call it "pile of crap dumped in unsuitable locations"
>> instead? What does Wikipedia says about it?
>>
>>> Although your concerns are valid, the faults you are objecting to are not causing
>>> a malfunction of any kind. If we were to update the RPi firmware before this
>>> patch was merged then upstream users would be left with one wheel on their wagon.
>>
>> And that'd be your problem, not mine. Look, you can argue around this
>> all day, or you can fix this mess. Your choice.
> 
> Is that the opinion of all here?

The choice of word here got largely out of the original topic and I
surely did eat a ton of popcorn here. There are two things that need
fixing, and the time line and process for fixing these is clear:

- your bugfix (Phil) is something that should be applied now, and
backported to -stable trees once the fix hits the irqchip tree (or Linus')

- relocating the code that does the secondary boot out of
drivers/irqchip/ into arch/arm/mach-bcm/ needs to happen (Marc), and
this is 4.13 material, there is no urgency in doing this *right now*,
but it needs to happen

Does that work for everyone?
-- 
Florian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ