[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ba949af0-7de6-ab12-6501-46a5af06001f@intel.com>
Date: Fri, 2 Dec 2022 10:36:40 -0800
From: Jacob Keller <jacob.e.keller@...el.com>
To: Leon Romanovsky <leon@...nel.org>,
Tony Nguyen <anthony.l.nguyen@...el.com>
CC: <davem@...emloft.net>, <kuba@...nel.org>, <pabeni@...hat.com>,
<edumazet@...gle.com>, <netdev@...r.kernel.org>,
<richardcochran@...il.com>, Gurucharan G <gurucharanx.g@...el.com>
Subject: Re: [PATCH net-next 08/14] ice: protect init and calibrating fields
with spinlock
On 12/1/2022 1:18 AM, Leon Romanovsky wrote:
> On Wed, Nov 30, 2022 at 11:43:24AM -0800, Tony Nguyen wrote:
>> From: Jacob Keller <jacob.e.keller@...el.com>
>>
>> Ensure that the init and calibrating fields of the PTP Tx timestamp tracker
>> structure are only modified under the spin lock. This ensures that the
>> accesses are consistent and that new timestamp requests will either begin
>> completely or get ignored.
>>
>> Signed-off-by: Jacob Keller <jacob.e.keller@...el.com>
>> Tested-by: Gurucharan G <gurucharanx.g@...el.com> (A Contingent worker at Intel)
>> Signed-off-by: Tony Nguyen <anthony.l.nguyen@...el.com>
>> ---
>> drivers/net/ethernet/intel/ice/ice_ptp.c | 55 ++++++++++++++++++++++--
>> drivers/net/ethernet/intel/ice/ice_ptp.h | 2 +-
>> 2 files changed, 52 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
>> index a7d950dd1264..0e39fed7cfca 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_ptp.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
>> @@ -599,6 +599,42 @@ static u64 ice_ptp_extend_40b_ts(struct ice_pf *pf, u64 in_tstamp)
>> (in_tstamp >> 8) & mask);
>> }
>>
>> +/**
>> + * ice_ptp_is_tx_tracker_init - Check if the Tx tracker is initialized
>> + * @tx: the PTP Tx timestamp tracker to check
>> + *
>> + * Check that a given PTP Tx timestamp tracker is initialized. Acquires the
>> + * tx->lock spinlock.
>> + */
>> +static bool
>> +ice_ptp_is_tx_tracker_init(struct ice_ptp_tx *tx)
>> +{
>> + bool init;
>> +
>> + spin_lock(&tx->lock);
>> + init = tx->init;
>> + spin_unlock(&tx->lock);
>> +
>> + return init;
>
> How this type of locking can be correct?
> It doesn't protect anything and equal to do not have locking at all.
>
> Thanks
The init field is used to by the Tx timestamp work item to have it exit
if it is executed after the Tx tracker starts being de-initialized. I
guess technically reading the value would be atomic.. Though using a
spinlock also ensures the appropriate memory barriers are in place
around reading the value, preventing re-ordering.
We clear the init field and then the Tx timestamp thread will stop
re-arming itself. We used to have a kthread item which we would then
call flush to ensure the last queued instance of the work item was
removed... but ugh now that we're using a threaded interrupt I am not
sure if we have any way to guarantee that the Tx work has completed. I
am not sure how to address that now with a threaded interrupt. Hmm.
We can't hold a spin lock for the entire duration of the Tx timestamp
work because it might sleep while checking for an outstanding Tx
timestamp via the device Admin queue interface to read the PHY timestamp.
The goal is to ensure that
a) timestamp requests guarantee that they each get unique indexes or
fail to start. This is covered by the fact that timestamp requests all
hold the lock over the duration of checking that requests are being
accepted, picking the index, and setting that index's in_use bit
b) timestamp completions are reported only once. The only place that
processes timestamp completions is the threaded interrupt function,
which can only have one instance executing at a time. This function
doesn't hold the spin lock while iterating the in_use bitmap, but it
does hold the lock when it reports the timestamp and before doing so it
rechecks the in_use bit in case the timestamp got flushed. We used to
have a separate thread that handled discarding old timestamps after 2
seconds but we now do that from within the same threaded function. Its
possible that a request thread might execute while this function is
processing and *set* a new in_use bit. In this case, either the function
will see the update when iterating the in_use bitmap, in which case it
will check the PHY register and process the timestamp if it happened to
be completed. Or, it will miss the timestamp in that run. At the end of
the loop, the lock is re-acquired and we check if any timestamp requests
are outstanding. If so, we exit such that the threaded interrupt
function will re-execute the Tx timestamp thread shortly to check again.
c) In previous versions we had a kthread task which was used to run the
Tx timestamp function. In that version we used kthread flush after
clearing the init flag to ensure that the threaded function was
completed. I think this is a gap since the commit that introduced a
threaded interrupt instead. We need some way for the tear down to ensure
that no more timestamp processing will occur. There is a small window
after we clear init that we could race with tearing down and removing
the Tx tracker memory now :(
I am not sure what the best way to fix c) is. Perhaps an additional flag
of some sort which indicates that the timestamp thread function is
processing so that tear down can wait until after the interrupt function
completes. Once init is cleared, the function will stop re-executing,
but we need a way to wait until it has stopped. That method in tear down
can't hold the lock or else we'd potentially deadlock... Maybe an
additional "processing" flag which is set under lock only if the init
flag is set? and then cleared when the function exits. Then the tear
down can check and wait for the processing to be complete? Hmm.
I realize this whole scheme is rather complicated. The biggest problem
is that while reading timestamps we need to interact with firmware over
the Admin queue, but we also need to safely be able to set new timestamp
requests from the hot path with minimal disruption. If we locked over
the entire Tx timestamp read process it would block hot path.
I think there's a gap now with the threaded interrupt needing a way to
ensure that no new Tx timestamp interrupts will be processed since we
can no longer use kthread flushing to handle that. I believe the other
parts are correct.
Thanks,
Jake
Powered by blists - more mailing lists