[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <AM0PR04MB67541E97634368B5B2CF1D5C961A0@AM0PR04MB6754.eurprd04.prod.outlook.com>
Date: Fri, 23 Oct 2020 11:37:44 +0000
From: Claudiu Manoil <claudiu.manoil@....com>
To: Jakub Kicinski <kuba@...nel.org>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"David S . Miller" <davem@...emloft.net>,
"james.jurack@...tek.com" <james.jurack@...tek.com>
Subject: RE: [PATCH net] gianfar: Account for Tx PTP timestamp in the skb
headroom
>-----Original Message-----
>From: Jakub Kicinski <kuba@...nel.org>
>Sent: Friday, October 23, 2020 5:55 AM
>To: Claudiu Manoil <claudiu.manoil@....com>
>Cc: netdev@...r.kernel.org; David S . Miller <davem@...emloft.net>;
>james.jurack@...tek.com
>Subject: Re: [PATCH net] gianfar: Account for Tx PTP timestamp in the skb
>headroom
>
>On Thu, 22 Oct 2020 12:09:00 +0000 Claudiu Manoil wrote:
>> >-----Original Message-----
>> >From: Jakub Kicinski <kuba@...nel.org>
>> >Sent: Wednesday, October 21, 2020 9:00 PM
>> >To: Claudiu Manoil <claudiu.manoil@....com>
>> >Cc: netdev@...r.kernel.org; David S . Miller <davem@...emloft.net>;
>> >james.jurack@...tek.com
>> >Subject: Re: [PATCH net] gianfar: Account for Tx PTP timestamp in the skb
>> >headroom
>> >
>> >On Tue, 20 Oct 2020 20:36:05 +0300 Claudiu Manoil wrote:
>> [...]
>> >>
>> >> if (dev->features & NETIF_F_IP_CSUM ||
>> >> priv->device_flags & FSL_GIANFAR_DEV_HAS_TIMER)
>> >> - dev->needed_headroom = GMAC_FCB_LEN;
>> >> + dev->needed_headroom = GMAC_FCB_LEN +
>GMAC_TXPAL_LEN;
>> >>
>> >> /* Initializing some of the rx/tx queue level parameters */
>> >> for (i = 0; i < priv->num_tx_queues; i++) {
>> >
>> >Claudiu, I think this may be papering over the real issue.
>> >needed_headroom is best effort, if you were seeing crashes
>> >the missing checks for skb being the right geometry are still
>> >out there, they just not get hit in the case needed_headroom
>> >is respected.
>> >
>> >So updating needed_headroom is definitely fine, but the cause of
>> >crashes has to be found first.
>>
>> I agree Jakub, but this is a simple (and old) ring based driver. So the
>> question is what checks or operations may be missing or be wrong
>> in the below sequence of operations made on the skb designated for
>> timestamping?
>> As hinted in the commit description, the code is not crashing when
>> multiple TCP streams are transmitted alone (skb frags manipulation was
>> omitted for brevity below, and this is a well exercised path known to work),
>> but only crashes when multiple TCP stream are run concurrently
>> with a PTP Tx packet flow taking the skb_realloc_headroom() path.
>> I don't find other drivers doing skb_realloc_headroom() for PTP packets,
>> so maybe is this an untested use case of skb usage?
>>
>> init:
>> dev->needed_headroom = GMAC_FCB_LEN;
>>
>> start_xmit:
>> netdev_tx_t gfar_start_xmit(struct sk_buff *skb, struct net_device *dev)
>> {
>> do_tstamp = (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
>> priv->hwts_tx_en;
>> fcb_len = GMAC_FCB_LEN; // can also be 0
>> if (do_tstamp)
>> fcb_len = GMAC_FCB_LEN + GMAC_TXPAL_LEN;
>>
>> if (skb_headroom(skb) < fcb_len) {
>> skb_new = skb_realloc_headroom(skb, fcb_len);
>> if (!skb_new) {
>> dev_kfree_skb_any(skb);
>> return NETDEV_TX_OK;
>> }
>> if (skb->sk)
>> skb_set_owner_w(skb_new, skb->sk);
>> dev_consume_skb_any(skb);
>> skb = skb_new;
>> }
>
>Try replacing this if () with skb_cow_head(). The skbs you're getting
>are probably cloned.
>
That seems to be it(!), Jakub. With this change the test that was failing
immediately before passes now. I'm going to do some more tests, and
I'll send 2 stable fixes for v2 if everything is fine.
Not sure if this particular breakage was triggered by cloned skbs, though
probably timestamping skbs can get cloned, but could also be other aspect
/ side effect of using skb_realloc_headroom/skb_set_owner_w vs skb_cow_head,
as the 2 APIs differ in many ways.
Thanks for the help.
Regards,
Claudiu
Powered by blists - more mailing lists