Message-ID: <4981d7eb-b41e-c597-04ff-3d3295804d5a@nvidia.com>
Date:   Tue, 28 Apr 2020 09:01:56 +0100
From:   Jon Hunter <jonathanh@...dia.com>
To:     Dmitry Osipenko <digetx@...il.com>,
        Thierry Reding <thierry.reding@...il.com>
CC:     Wolfram Sang <wsa@...-dreams.de>,
        Laxman Dewangan <ldewangan@...dia.com>,
        Manikanta Maddireddy <mmaddireddy@...dia.com>,
        Vidya Sagar <vidyas@...dia.com>, <linux-i2c@...r.kernel.org>,
        <linux-tegra@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 1/2] i2c: tegra: Better handle case where CPU0 is busy
 for a long time


On 27/04/2020 16:18, Dmitry Osipenko wrote:
> 27.04.2020 18:12, Thierry Reding wrote:
>> On Mon, Apr 27, 2020 at 05:21:30PM +0300, Dmitry Osipenko wrote:
>>> 27.04.2020 14:00, Thierry Reding wrote:
>>>> On Mon, Apr 27, 2020 at 12:52:10PM +0300, Dmitry Osipenko wrote:
>>>>> 27.04.2020 10:48, Thierry Reding wrote:
>>>>> ...
>>>>>>> Maybe, but all these other problems appear to have existed for some time
>>>>>>> now. We need to fix all, but for the moment we need to figure out what's
>>>>>>> best for v5.7.
>>>>>>
>>>>>> To me it doesn't sound like we have a good handle on what exactly is
>>>>>> going on here and we're mostly just poking around.
>>>>>>
>>>>>> And even if things weren't working quite properly before, it sounds to
>>>>>> me like this patch actually made things worse.
>>>>>
>>>>> There is plenty of time to work on the proper fix now. To me it sounds
>>>>> like you're giving up on fixing the root of the problem, sorry.
>>>>
>>>> We're at -rc3 now and I haven't seen any promising progress in the last
>>>> week. All the while suspend/resume is now broken on at least one board
>>>> and that may end up hiding any other issues that could creep in in the
>>>> meantime.
>>>>
>>>> Furthermore we seem to have a preexisting issue that may very well
>>>> interfere with this patch, so I think the cautious thing is to revert
>>>> for now and then fix the original issue first. We can always come back
>>>> to this once everything is back to normal.
>>>>
>>>> Also, people are now looking at backporting this to v5.6. Unless we
>>>> revert this from v5.7 it may get picked up for backports to other
>>>> kernels and then I have to notify stable kernel maintainers that they
>>>> shouldn't and they have to back things out again. That's going to cause
>>>> a lot of wasted time for a lot of people.
>>>>
>>>> So, sorry, I disagree. I don't think we have "plenty of time".
>>>
>>> There is about a month now before the 5.7 release. It's a bit too early
>>> to start panicking, IMO :)
>>
>> There's no panic. A patch got merged and it broke something, so we
>> revert it and try again. It's very much standard procedure.
>>
>>> Jon already proposed a reasonably simple solution: to keep PCIe
>>> regulators always-ON. In the longer run we may want to have I2C atomic
>>> transfers supported for a late suspend phase.
>>
>> That's not really a solution, though, is it? It's just papering over
>> an issue that this patch introduced or uncovered. I'm much more in
>> favour of fixing problems at the root rather than keep papering over
>> until we lose track of what the actual problems are.
> 
> It's not "papering over an issue". The bug can't be fixed properly
> without introducing I2C atomic transfers support for a late suspend
> phase, I don't see any other solutions for now. Stable kernels do not
> support atomic transfers at all, so that proper solution won't be backportable.


There are a few issues here, but the issue Thierry and I are referring
to is the regression introduced by this change. Yes, this exposes other
problems, but we first need to understand why this breaks resume in
general, regardless of what the PCIe driver is doing. I will look at
this a bit more later this week.

Cheers
Jon

-- 
nvpublic
