lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Fri, 20 Oct 2017 16:59:44 +0200 (CEST)
From:   David Kozub <zub@...ux.fjfi.cvut.cz>
To:     Daniel Lezcano <daniel.lezcano@...aro.org>
cc:     Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] clockevents/drivers/cs5535: improve resilience to spurious
 interrupts

On Fri, 20 Oct 2017, Daniel Lezcano wrote:

> On 20/10/2017 09:49, David Kozub wrote:
>> On Fri, 20 Oct 2017, Daniel Lezcano wrote:
>>
>>> On 20/10/2017 00:25, Thomas Gleixner wrote:
>>>> On Fri, 20 Oct 2017, Daniel Lezcano wrote:
>>>>
>>>>> On 19/10/2017 22:57, David Kozub wrote:
>>>>>> This solves a BUG on ALIX 2c3 where mfgpt_tick is called before
>>>>>> clockevents_config_and_register returns. This caused mfgpt_tick to
>>>>>> call a
>>>>>> null function pointer.
>>>>>>
>>>>>> Thanks to Daniel Lezcano and Thomas Gleixner for helping me analyze
>>>>>> this
>>>>>> and suggesting a solution.
>>>>>>
>>>>>> Suggested-by: Thomas Gleixner <tglx@...utronix.de>
>>>>>> Signed-off-by: David Kozub <zub@...ux.fjfi.cvut.cz>
>>>>>> ---
>>>>>
>>>>> Thank for sending this fix.
>>>>>
>>>>> Can you check if the commit 8f9327cbb is the one introducing 
the
>>>>> regression ? So we can add the proper tags and propagate the fix to
>>>>> stable.
>>>>
>>>> No it's not.
>>>>
>>>> -       if (cs5535_tick_mode == CLOCK_EVT_MODE_SHUTDOWN)
>>>> +       if (clockevent_state_shutdown(&cs5535_clockevent))
>>>>
>>>> This particular problem of the missing detached state check has been
>>>> there
>>>> forever and went unnoticed for whatever reason.
>>>
>>> The detached condition was artificially caught by the initialized
>>> variable:
>>>
>>> -static unsigned int cs5535_tick_mode = CLOCK_EVT_MODE_SHUTDOWN;
>>>
>>> The patch 8f9327cbb removes the variable, so very likely this is where
>>> the problem appeared.
>>
>> I will try to test that. But I won't have access to the device till
>> Sunday evening. I've had big trouble trying to run kernels > 4.1-rc5 on
>> the device and if I'm looking correctly the commit was introduced in
>> 4.3-rc1. But I'll try to figure something out.
>
> David,
>
> thanks again for taking the time to report and investigate this issue.
> Usually people just give up and drop the legacy hardware without letting
> us know the kernel is broken with it. So don't spend too much time with
> it, just check if the commit before works, if not, just add in the log
> the kernel version you noticed the breakage.
>
> In case you are interested in doing some debugging to narrow down the
> offending commit and you know the versions working and not working, you
> can try the command git-bisect. It will use a dichotomy approach to find
> out the culprit.

Hi Daniel,

I originally didn't see this issue simply because 4.1-rc7 would not output 
anything on the serial on the device. For kernels >= 4.1-rc7 the kernel 
either reboots immediatelly or freezes, in both cases without giving any 
output in the serial. I tried to find the cause of this with git bisect 
but in the end I was none the wiser. I also suspected my build 
environment.

Eventually I found out the following:
* 4.1-rc6 boots OK
* 4.1-rc7 restarts immediatelly, but if I revert f18c34e48 it works but I 
have no idea why would this commit break anything, the commit looks OK
* 4.1 - works if I revert f18c34e48
* e75c73ad6 feeezes even after reverting f18c34e48
* 4.2-rc1 - reboots, even after reverting f18c34e48

Only recently I tried a current kernel and I was more lucky: it gave me 
some useful output on the serial.

To verify 8f9327cbb I was thinking I could take it and apply it onto 
4.1-rc6 (if that was possible), but as Thomas suggested it might not be 
worth it, then I forget about that.

Thanks for all the help to you and to Thomas.

Best regards,
David

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ