netdev - Re: PTP vclock: BUG: scheduling while atomic

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <c611562b-f767-ceed-3227-82c89ff4cc52@intel.com>
Date:   Thu, 2 Feb 2023 12:52:34 -0800
From:   Jacob Keller <jacob.e.keller@...el.com>
To:     Miroslav Lichvar <mlichvar@...hat.com>,
        Íñigo Huguet <ihuguet@...hat.com>
CC:     <netdev@...r.kernel.org>, <richardcochran@...il.com>,
        <yangbo.lu@....com>, <gerhard@...leder-embedded.com>,
        <habetsm.xilinx@...il.com>, <ecree.xilinx@...il.com>,
        <davem@...emloft.net>, <edumazet@...gle.com>, <kuba@...nel.org>,
        <pabeni@...hat.com>, <alex.maftei@....com>
Subject: Re: PTP vclock: BUG: scheduling while atomic

On 2/2/2023 8:33 AM, Miroslav Lichvar wrote:
> On Thu, Feb 02, 2023 at 05:02:07PM +0100, Íñigo Huguet wrote:
>> Our QA team was testing PTP vclocks, and they've found this error with sfc NIC/driver:
>>   BUG: scheduling while atomic: ptp5/25223/0x00000002
>>
>> The reason seems to be that vclocks disable interrupts with `spin_lock_irqsave` in
>> `ptp_vclock_gettime`, and then read the timecounter, which in turns ends calling to
>> the driver's `gettime64` callback.
> 
> The same issue was observed with the ice driver:
> https://lists.osuosl.org/pipermail/intel-wired-lan/Week-of-Mon-20221107/030633.html
> 
> I tried to fix it generally in the vclock support, but was not
> successful. There was a hint it would be fixed in the driver. I'm not
> sure what is the best approach here.
> 

This slipped through the cracks. The root cause (for ice) is that the
.gettime callback might sleep while waiting for the HW semaphore registers.

We had a change that fixed this (though we had done it for other
reasons) by simply not blocking gettime access with the semaphore, but
Richard didn't like this approach and NAK'd the patch on netdev:

https://lore.kernel.org/intel-wired-lan/877d0yt0ns.fsf@intel.com/

Alternatives are challenging here as the semaphore is used across
multiple PFs which makes using a spinlock difficult, as the PFs don't
share any references.

We could switch the part that does usleep to udelay instead, so we
wouldn't cause this scheduling bug... not sure if that has any other
side effects.

I'm not sure if there's another way to drop the semaphore and assuage
concerns over correctness. We need to read the time registers and its
possible another thread (or in principle another PF) is modifying the
time. We only expose a single PTP clock device, but all PFs can access
the time for the purpose of caching value used for extending 40bit
timestamps.