Date:   Fri, 2 Dec 2022 10:36:40 -0800
From:   Jacob Keller <jacob.e.keller@...el.com>
To:     Leon Romanovsky <leon@...nel.org>,
        Tony Nguyen <anthony.l.nguyen@...el.com>
CC:     <davem@...emloft.net>, <kuba@...nel.org>, <pabeni@...hat.com>,
        <edumazet@...gle.com>, <netdev@...r.kernel.org>,
        <richardcochran@...il.com>, Gurucharan G <gurucharanx.g@...el.com>
Subject: Re: [PATCH net-next 08/14] ice: protect init and calibrating fields
 with spinlock



On 12/1/2022 1:18 AM, Leon Romanovsky wrote:
> On Wed, Nov 30, 2022 at 11:43:24AM -0800, Tony Nguyen wrote:
>> From: Jacob Keller <jacob.e.keller@...el.com>
>>
>> Ensure that the init and calibrating fields of the PTP Tx timestamp tracker
>> structure are only modified under the spin lock. This ensures that the
>> accesses are consistent and that new timestamp requests will either begin
>> completely or get ignored.
>>
>> Signed-off-by: Jacob Keller <jacob.e.keller@...el.com>
>> Tested-by: Gurucharan G <gurucharanx.g@...el.com> (A Contingent worker at Intel)
>> Signed-off-by: Tony Nguyen <anthony.l.nguyen@...el.com>
>> ---
>>   drivers/net/ethernet/intel/ice/ice_ptp.c | 55 ++++++++++++++++++++++--
>>   drivers/net/ethernet/intel/ice/ice_ptp.h |  2 +-
>>   2 files changed, 52 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
>> index a7d950dd1264..0e39fed7cfca 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_ptp.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
>> @@ -599,6 +599,42 @@ static u64 ice_ptp_extend_40b_ts(struct ice_pf *pf, u64 in_tstamp)
>>   				     (in_tstamp >> 8) & mask);
>>   }
>>   
>> +/**
>> + * ice_ptp_is_tx_tracker_init - Check if the Tx tracker is initialized
>> + * @tx: the PTP Tx timestamp tracker to check
>> + *
>> + * Check that a given PTP Tx timestamp tracker is initialized. Acquires the
>> + * tx->lock spinlock.
>> + */
>> +static bool
>> +ice_ptp_is_tx_tracker_init(struct ice_ptp_tx *tx)
>> +{
>> +	bool init;
>> +
>> +	spin_lock(&tx->lock);
>> +	init = tx->init;
>> +	spin_unlock(&tx->lock);
>> +
>> +	return init;
> 
> How can this type of locking be correct?
> It doesn't protect anything and is equivalent to having no locking at all.
> 
> Thanks

The init field is used by the Tx timestamp work item so that it exits 
if it runs after the Tx tracker has started being de-initialized. I 
guess technically reading the value would be atomic... though using a 
spinlock also ensures the appropriate memory barriers are in place 
around reading the value, preventing re-ordering.
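
For example, the write side that pairs with this during teardown looks 
roughly like the following (sketch only, not the exact hunk from the 
series):

	/* Clear init under the same lock so that readers such as
	 * ice_ptp_is_tx_tracker_init() observe the change with the lock's
	 * implied barriers instead of racing a partial update.
	 */
	spin_lock(&tx->lock);
	tx->init = false;
	spin_unlock(&tx->lock);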

We clear the init field and then the Tx timestamp thread will stop 
re-arming itself. We used to have a kthread work item which we would 
then flush to ensure the last queued instance of the work had been 
removed... but ugh, now that we're using a threaded interrupt I am not 
sure we have any way to guarantee that the Tx work has completed. Hmm.

We can't hold a spin lock for the entire duration of the Tx timestamp 
work because it might sleep while checking for an outstanding Tx 
timestamp via the device Admin queue interface to read the PHY timestamp.

The goal is to ensure that

a) timestamp requests each get a unique index or fail to start. This 
is covered because every timestamp request holds the lock for the 
duration of checking that requests are being accepted, picking the 
index, and setting that index's in_use bit.
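
To sketch what a) looks like in code (field and function names here 
are illustrative approximations, not the exact driver code):

static s8 example_request_ts_index(struct ice_ptp_tx *tx)
{
	unsigned long idx;
	s8 ret = -1;

	/* The whole accept-check / index-pick / mark-busy sequence runs
	 * under tx->lock, so two concurrent requests can never pick the
	 * same slot.
	 */
	spin_lock(&tx->lock);
	if (tx->init && !tx->calibrating) {
		idx = find_first_zero_bit(tx->in_use, tx->len);
		if (idx < tx->len) {
			set_bit(idx, tx->in_use);
			ret = idx;
		}
	}
	spin_unlock(&tx->lock);

	/* A negative return means the request failed to start and is
	 * simply ignored.
	 */
	return ret;
}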

b) timestamp completions are reported only once. The only place that 
processes timestamp completions is the threaded interrupt function, 
which can only have one instance executing at a time. This function 
doesn't hold the spin lock while iterating the in_use bitmap, but it 
does hold the lock when it reports the timestamp, and before doing so 
it rechecks the in_use bit in case the timestamp got flushed. We used 
to have a separate thread that handled discarding old timestamps after 
2 seconds, but we now do that from within the same threaded function. 
It's possible that a request thread executes while this function is 
processing and *sets* a new in_use bit. In that case the function 
either sees the new bit while iterating the in_use bitmap, checks the 
PHY register, and processes the timestamp if it happens to be 
complete, or it misses the timestamp on that pass. At the end of the 
loop, the lock is re-acquired and we check whether any timestamp 
requests are still outstanding. If so, we exit such that the threaded 
interrupt function will re-run the Tx timestamp processing shortly to 
check again.
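
As a rough fragment of that pass (read_phy_tstamp() and 
report_tx_timestamp() are placeholders standing in for the real 
helpers):

	unsigned long idx;

	/* Walk the bitmap without the lock, since reading the PHY
	 * timestamp over the Admin queue may sleep.
	 */
	for_each_set_bit(idx, tx->in_use, tx->len) {
		u64 raw_ts;

		if (read_phy_tstamp(tx, idx, &raw_ts))	/* may sleep */
			continue;

		/* Re-take the lock and recheck the bit before reporting,
		 * so a timestamp that got flushed in the meantime is
		 * dropped and each completion is reported exactly once.
		 */
		spin_lock(&tx->lock);
		if (test_bit(idx, tx->in_use)) {
			clear_bit(idx, tx->in_use);
			report_tx_timestamp(tx, idx, raw_ts);
		}
		spin_unlock(&tx->lock);
	}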

c) in previous versions we had a kthread task which was used to run 
the Tx timestamp function. There, we used a kthread flush after 
clearing the init flag to ensure that the threaded function had 
completed. I think this has been a gap since the commit that switched 
to a threaded interrupt. We need some way for the teardown to ensure 
that no more timestamp processing will occur. There is now a small 
window after we clear init where processing could race with tearing 
down and freeing the Tx tracker memory :(

I am not sure what the best way to fix c) is. Once init is cleared, 
the function will stop re-arming itself, but we need a way for 
teardown to wait until it has actually stopped, and that wait can't 
hold the lock or we'd potentially deadlock... Maybe an additional 
"processing" flag which is set under the lock only if the init flag is 
still set, and cleared when the function exits? Then teardown could 
check and wait for processing to be complete before freeing the 
tracker. Hmm.
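
Very roughly, and completely untested, I'm picturing something like 
this ("processing" being a hypothetical new field):

	/* In the threaded function, before touching the tracker: */
	spin_lock(&tx->lock);
	if (!tx->init) {
		spin_unlock(&tx->lock);
		return;			/* tracker is being torn down */
	}
	tx->processing = true;		/* hypothetical new field */
	spin_unlock(&tx->lock);

	/* ... walk the in_use bitmap and report timestamps as above ... */

	spin_lock(&tx->lock);
	tx->processing = false;
	spin_unlock(&tx->lock);

	/* In teardown, after clearing init under the lock: */
	bool busy;

	do {
		spin_lock(&tx->lock);
		busy = tx->processing;
		spin_unlock(&tx->lock);
		if (busy)
			usleep_range(100, 200);
	} while (busy);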

I realize this whole scheme is rather complicated. The biggest problem 
is that while reading timestamps we need to interact with firmware 
over the Admin queue, but we also need to be able to safely submit new 
timestamp requests from the hot path with minimal disruption. If we 
held the lock over the entire Tx timestamp read process it would block 
the hot path.

I think there's a gap now with the threaded interrupt: we need a way 
to ensure that no further Tx timestamp interrupts will be processed 
once teardown starts, since we can no longer use kthread flushing to 
handle that. I believe the other parts are correct.

Thanks,
Jake
