[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <c3dd8050-3d56-47b3-81a5-a4979ccf8bd9@linux.intel.com>
Date: Fri, 29 Dec 2023 10:15:37 +0800
From: "Abdul Rahim, Faizal" <faizal.abdul.rahim@...ux.intel.com>
To: Vladimir Oltean <vladimir.oltean@....com>
Cc: Vinicius Costa Gomes <vinicius.gomes@...el.com>,
Jamal Hadi Salim <jhs@...atatu.com>, Cong Wang <xiyou.wangcong@...il.com>,
Jiri Pirko <jiri@...nulli.us>, "David S . Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 net 0/4] qbv cycle time extension/truncation
Hi Vladimir,
Sorry for the late reply, was on leave.
On 21/12/2023 9:35 pm, Vladimir Oltean wrote:
> (sorry, I started writing this email yesterday, I noticed the
> conversation continued with Paolo)
>
> On Wed, Dec 20, 2023 at 11:25:09AM +0800, Abdul Rahim, Faizal wrote:
>> Hi Vladimir,
>>
>> No worries, I truly appreciate the time you took to review and reply.
>>
>> What prompted this in general is related to my project requirement to enable
>> software QBV cycle time extension, so there's a validation team that created
>> test cases to properly validate cycle time extension. Then I noticed the
>> code doesn't handle truncation properly also, since it's the same code area,
>> I just fixed it together.
>
> We tend to do patch triage between 'net' and 'net-next' based on the
> balance between the urgency/impact of the fix and its complexity.
>
> While it's undoubtable that there are issues with taprio's handling of
> dynamic schedules, you've mentioned yourself that you only hit those
> issues as part of some new development work - they weren't noticed by
> end users. And fixing them is not quite trivial, there are also FIXMEs
> in taprio which suggest so. I'm worried that the fixes may also impact
> the code from stable trees in unforeseen ways.
>
> So I would recommend moving the development of these fixes to 'net-next',
> if possible.
Got it, will move it to net-next.
>> Each time before sending the patch for upstream review, I normally will run
>> our test cases that only validates cycle time extension. For truncation, I
>> modify the test cases on my own and put logs to check if the
>> cycle_time_correction negative value is within the correct range. I probably
>> should have mentioned sooner that I have tested this myself, sorry about
>> that.
>>
>> Example of the test I run for cycle time extension:
>> 1) 2 boards connected back-to-back with i226 NIC. Board A as sender, Board B
>> as receiver
>> 2) Time is sync between 2 boards with phc2sys and ptp4l
>> 3) Run GCL1 on Board A with cycle time extension enabled:
>> tc qdisc replace dev $INTERFACE parent root handle 100 taprio \
>> num_tc 4 \
>> map 3 2 1 0 3 3 3 3 3 3 3 3 3 3 3 3 \
>> queues 1@0 1@1 1@2 1@3 \
>> base-time 0 \
>> cycle-time-extension 1000000 \
>> sched-entry S 09 500000 \
>> sched-entry S 0a 500000 \
>> clockid CLOCK_TAI
>
> Why do you need PTP sync? Cannot this test run between 2 veth ports?
PTP sync is probably not needed, but the test case already has it (I just
reuse the test case), I assume it's to simulate a complete use case of a
real user.
Let me explore testing using veth ports, haven't tried this before.
>
>> 4) capture tcp dump on Board B
>> 5) Send packets from Board A to Board B with 200us interval via UDP Tai
>
> What is udp_tai? This program?
> https://gist.github.com/jeez/bd3afeff081ba64a695008dd8215866f
>
Yea the base app looks similar to the one that I use, but the one I use is
modified. It's to transmit UDP packets.
>> 6) When packets reached Board B, trigger GCL2 to Board A:
>> CYCLETIME=1000000
>> APPLYTIME=1000000000 # 1s
>> CURRENT=$(date +%s%N)
>> BASE=$(( (CURRENT + APPLYTIME + (2*CYCLETIME)) - ((CURRENT + APPLYTIME)
>> % CYCLETIME) + ((CYCLETIME*3)/5) ))
>> tc qdisc replace dev $INTERFACE parent root handle 100 taprio \
>> num_tc 4 \
>> map 3 2 1 0 3 3 3 3 3 3 3 3 3 3 3 3 \
>> queues 1@0 1@1 1@2 1@3 \
>> base-time $BASE \
>> cycle-time-extension 1000000 \
>> sched-entry S oc 500000 \
>> sched-entry S 08 500000 \
>> clockid CLOCK_TAI
>> 7) Analyze tcp dump data on Board B using wireshark, will observe packets
>> receive pattern changed.
>>
>> Note that I've hidden "Best Effort (default) 7001 → 7001" data from the
>> wireshark log so that it's easier to see the pattern.
>>
>> TIMESTAMP PRIORITY PRIORITY NOTES
>>
>> 1702896645.925014509 Critical Applications 7004 → 7004 GCL1
>> 1702896645.925014893 Critical Applications 7004 → 7004 GCL1
>> 1702896645.925514454 Excellent Effort 7003 → 7003 GCL1
>> 1702896645.925514835 Excellent Effort 7003 → 7003 GCL1
>> 1702896645.926014371 Critical Applications 7004 → 7004 GCL1
>> 1702896645.926014755 Critical Applications 7004 → 7004 GCL1
>> 1702896645.926514620 Excellent Effort 7003 → 7003 GCL1
>> 1702896645.926515004 Excellent Effort 7003 → 7003 GCL1
>> 1702896645.927014408 Critical Applications 7004 → 7004 GCL1
>> 1702896645.927014792 Critical Applications 7004 → 7004 GCL1
>> 1702896645.927514789 Excellent Effort 7003 → 7003 GCL1
>> 1702896645.927515173 Excellent Effort 7003 → 7003 GCL1
>> 1702896645.928168304 Excellent Effort 7003 → 7003 Extended
>> 1702896645.928368780 Excellent Effort 7003 → 7003 Extended
>> 1702896645.928569406 Excellent Effort 7003 → 7003 Extended
>> 1702896645.929614835 Background 7002 → 7002 GCL2
>> 1702896645.929615219 Background 7002 → 7002 GCL2
>> 1702896645.930614643 Background 7002 → 7002 GCL2
>> 1702896645.930615027 Background 7002 → 7002 GCL2
>> 1702896645.931614604 Background 7002 → 7002 GCL2
>> 1702896645.931614991 Background 7002 → 7002 GCL2
>>
>> The extended packets only will happen if cycle_time and interval fields
>> are updated using cycle_time_correction. Without that patch, the extended
>> packets are not received.
>>
>>
>> As for the negative truncation case, I just make the interval quite long,
>> and experimented with GCL2 base-time value so that it hits the "next entry"
>> in advance_sched(). Then I checked my logs in get_cycle_time_correction() to
>> see the truncation case and its values.
>>
>> Based on your feedback of the test required, I think that my existing
>> truncation test is not enough, but the extension test case part should be
>> good right ?
>>
>> Do let me know then, I'm more than willing to do more test for the
>> truncation case as per your suggestion, well basically, anything to help
>> speed up the patches series review process :)
>>
>>
>> Appreciate your suggestion and help a lot, thank you.
>
> Do you think you could automate a test suite which only measures software
> TX timestamps and works on veth?
>
> I prepared this very small patch set just to give you a head start
> (the skeleton). You'll still have to add the logic for individual tests.
> https://lore.kernel.org/netdev/20231221132521.2314811-1-vladimir.oltean@nxp.com/
> I'm terribly sorry, but this is the most I can do due to my current lack
> of spare time, unfortunately.
>
> If you've never run kselftests before, you'll need some kernel options
> to enable VRF support. From my notes I have this list below, but there
> may be more missing options.
>
> CONFIG_IP_MULTIPLE_TABLES=y
> CONFIG_NET_L3_MASTER_DEV=y
> CONFIG_NET_VRF=y
>
> Let me know if you face any trouble or if I can help in some way.
> Thanks for doing this.
Thank you so much for helping with this self test skeleton ! I'll explore
and continue from where you've left. Appreciate it.
Powered by blists - more mailing lists