lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a0573057-8b93-f6f8-59eb-e8d30ac7035f@redhat.com>
Date:   Mon, 18 Sep 2023 15:10:05 +0200
From:   Hans de Goede <hdegoede@...hat.com>
To:     Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
        Stephen Boyd <swboyd@...omium.org>
Cc:     Mika Westerberg <mika.westerberg@...ux.intel.com>,
        Mark Gross <markgross@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>, patches@...ts.linux.dev,
        platform-driver-x86@...r.kernel.org,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Kuppuswamy Sathyanarayanan 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        Prashant Malani <pmalani@...omium.org>
Subject: Re: [PATCH v4 2/4] platform/x86: intel_scu_ipc: Check status upon
 timeout in ipc_wait_for_interrupt()

Hi Ilpo,

On 9/15/23 15:49, Ilpo Järvinen wrote:
> On Wed, 13 Sep 2023, Stephen Boyd wrote:
> 
>> It's possible for the completion in ipc_wait_for_interrupt() to timeout,
>> simply because the interrupt was delayed in being processed. A timeout
>> in itself is not an error. This driver should check the status register
>> upon a timeout to ensure that scheduling or interrupt processing delays
>> don't affect the outcome of the IPC return value.
>>
>>  CPU0                                                   SCU
>>  ----                                                   ---
>>  ipc_wait_for_interrupt()
>>   wait_for_completion_timeout(&scu->cmd_complete)
>>   [TIMEOUT]                                             status[IPC_STATUS_BUSY]=0
>>
>> Fix this problem by reading the status bit in all cases, regardless of
>> the timeout. If the completion times out, we'll assume the problem was
>> that the IPC_STATUS_BUSY bit was still set, but if the status bit is
>> cleared in the meantime we know that we hit some scheduling delay and we
>> should just check the error bit.
> 
> Hi,
> 
> I don't understand the intent here. What prevents IPC_STATUS_BUSY from 
> changing right after you've read it in ipc_read_status(scu)? Doesn't that 
> end you exactly into the same situation where the returned value is stale 
> so I cannot see how this fixes anything, at best it just plays around the 
> race window that seems to still be there after this fix?

As I understand it the problem before was that the function would
return -ETIMEDOUT; purely based on wait_for_completion_timeout()
without ever actually checking the BUSY bit:

Old code:

	if (!wait_for_completion_timeout(&scu->cmd_complete, IPC_TIMEOUT))
		return -ETIMEDOUT;

This allows for a scenario where when the IRQ processing got delayed
(on say another core) causing the timeout to trigger,
ipc_wait_for_interrupt() would return -ETIMEDOUT even though
the BUSY flag was already cleared by the SCU.

This patch adds an explicit check for the BUSY flag after
the wait_for_completion(), rather then relying on the
wait_for_completion() return value which implies things
are still busy.

As for "What prevents IPC_STATUS_BUSY from 
changing right after you've read it in ipc_read_status(scu)?"

AFAICT in this code path the bit is only ever supposed to go
from being set (busy) to unset (not busy), not the other
way around since no new commands can be submitted until
this function has completed. So that scenario cannot happen.

Regards,

Hans

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ