linux-kernel - Re: [PATCH net v2 2/2] net: ethernet: oa_tc6: fix tx skb race condition between reference pointers

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ba984578-318c-4bf3-8ffb-875ab851ee0e@microchip.com>
Date: Mon, 2 Dec 2024 03:37:01 +0000
From: <Parthiban.Veerasooran@...rochip.com>
To: <pabeni@...hat.com>
CC: <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<UNGLinuxDriver@...rochip.com>, <jacob.e.keller@...el.com>,
	<andrew+netdev@...n.ch>, <davem@...emloft.net>, <edumazet@...gle.com>,
	<kuba@...nel.org>
Subject: Re: [PATCH net v2 2/2] net: ethernet: oa_tc6: fix tx skb race
 condition between reference pointers

Hi Paolo,

Does the below reply clarifies your question and shall I proceed to 
prepare the next version? OR still do you have any comments on that?

Best regards,
Parthiban V
On 28/11/24 11:13 am, Parthiban.Veerasooran@...rochip.com wrote:
> Hi Paolo,
> 
> On 27/11/24 5:41 pm, Paolo Abeni wrote:
>> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
>>
>> On 11/27/24 11:49, Parthiban.Veerasooran@...rochip.com wrote:
>>> On 26/11/24 4:11 pm, Paolo Abeni wrote:
>>>> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
>>>>
>>>> On 11/22/24 11:21, Parthiban Veerasooran wrote:
>>>>> There are two skb pointers to manage tx skb's enqueued from n/w stack.
>>>>> waiting_tx_skb pointer points to the tx skb which needs to be processed
>>>>> and ongoing_tx_skb pointer points to the tx skb which is being processed.
>>>>>
>>>>> SPI thread prepares the tx data chunks from the tx skb pointed by the
>>>>> ongoing_tx_skb pointer. When the tx skb pointed by the ongoing_tx_skb is
>>>>> processed, the tx skb pointed by the waiting_tx_skb is assigned to
>>>>> ongoing_tx_skb and the waiting_tx_skb pointer is assigned with NULL.
>>>>> Whenever there is a new tx skb from n/w stack, it will be assigned to
>>>>> waiting_tx_skb pointer if it is NULL. Enqueuing and processing of a tx skb
>>>>> handled in two different threads.
>>>>>
>>>>> Consider a scenario where the SPI thread processed an ongoing_tx_skb and
>>>>> it moves next tx skb from waiting_tx_skb pointer to ongoing_tx_skb pointer
>>>>> without doing any NULL check. At this time, if the waiting_tx_skb pointer
>>>>> is NULL then ongoing_tx_skb pointer is also assigned with NULL. After
>>>>> that, if a new tx skb is assigned to waiting_tx_skb pointer by the n/w
>>>>> stack and there is a chance to overwrite the tx skb pointer with NULL in
>>>>> the SPI thread. Finally one of the tx skb will be left as unhandled,
>>>>> resulting packet missing and memory leak.
>>>>> To overcome the above issue, protect the moving of tx skb reference from
>>>>> waiting_tx_skb pointer to ongoing_tx_skb pointer so that the other thread
>>>>> can't access the waiting_tx_skb pointer until the current thread completes
>>>>> moving the tx skb reference safely.
>>>>
>>>> A mutex looks overkill. Why don't you use a spinlock? why locking only
>>>> one side (the writer) would be enough?
>>> Ah my bad, missed to protect tc6->waiting_tx_skb = skb. So that it will
>>> become like below,
>>>
>>> mutex_lock(&tc6->tx_skb_lock);
>>> tc6->waiting_tx_skb = skb;
>>> mutex_unlock(&tc6->tx_skb_lock);
>>>
>>> As both are not called from atomic context and they are allowed to
>>> sleep, I used mutex rather than spinlock.
>>>>
>>>> Could you please report the exact sequence of events in a time diagram
>>>> leading to the bug, something alike the following?
>>>>
>>>> CPU0                                    CPU1
>>>> oa_tc6_start_xmit
>>>>     ...
>>>>                                            oa_tc6_spi_thread_handler
>>>>                                             ...
>>> Good case:
>>> ----------
>>> Consider waiting_tx_skb is NULL.
>>>
>>> Thread1 (oa_tc6_start_xmit)   Thread2 (oa_tc6_spi_thread_handler)
>>> ---------------------------   -----------------------------------
>>> - if waiting_tx_skb is NULL
>>> - waiting_tx_skb = skb
>>>                                 - if ongoing_tx_skb is NULL
>>>                                 - ongoing_tx_skb = waiting_tx_skb
>>>                                 - waiting_tx_skb = NULL
>>>                                 ...
>>>                                 - ongoing_tx_skb = NULL
>>> - if waiting_tx_skb is NULL
>>> - waiting_tx_skb = skb
>>>                                 - if ongoing_tx_skb is NULL
>>>                                 - ongoing_tx_skb = waiting_tx_skb
>>>                                 - waiting_tx_skb = NULL
>>>                                 ...
>>>                                 - ongoing_tx_skb = NULL
>>> ....
>>>
>>> Bad case:
>>> ---------
>>> Consider waiting_tx_skb is NULL.
>>>
>>> Thread1 (oa_tc6_start_xmit)   Thread2 (oa_tc6_spi_thread_handler)
>>> ---------------------------   -----------------------------------
>>> - if waiting_tx_skb is NULL
>>> - waiting_tx_skb = skb
>>>                                 - if ongoing_tx_skb is NULL
>>
>> AFAICS, if 'waiting_tx_skb == NULL and Thread2 is in
>> oa_tc6_spi_thread_handler()/oa_tc6_prepare_spi_tx_buf_for_tx_skbs()
>> then ongoing_tx_skb can not be NULL, due to the previous check in:
>>
>> https://elixir.bootlin.com/linux/v6.12/source/drivers/net/ethernet/oa_tc6.c#L1064
>>
>> This looks like a single reader/single write scenarios that does not
>> need any lock to ensure consistency.
>>
>> Do you observe any memory leak in real life scenarios?
>>
>> BTW it looks like both oa_tc6_start_xmit and oa_tc6_spi_thread_handler
>> are possibly lacking memory barriers to avoid missing wake-ups.
> Actually the transmit flow control is done using the TXC reported from
> MAC-PHY and it is done in the below for loop. TXC is Transmit Credit
> Count represents the rooms available to place the tx chunks in the MAC-PHY.
> 
> https://elixir.bootlin.com/linux/v6.12/source/drivers/net/ethernet/oa_tc6.c#L1004
> 
> - Consider a scenario where the TXC reported from the previous transfer
> is 10 and ongoing_tx_skb holds an tx ethernet frame which can be
> transported in 20 TXCs and waiting_tx_skb is still NULL.
> 	tx_credits = 10; /* 21 are filled in the previous transfer */
> 	ongoing_tx_skb = 20;
> 	waiting_tx_skb = NULL; /* Still NULL */
> - So, (tc6->ongoing_tx_skb || tc6->waiting_tx_skb) becomes true.
> - After oa_tc6_prepare_spi_tx_buf_for_tx_skbs()
> 	ongoing_tx_skb = 10;
> 	waiting_tx_skb = NULL; /* Still NULL */
> - Perform SPI transfer.
> - Process SPI rx buffer to get the TXC from footers.
> - Now let's assume previously filled 21 TXCs are freed so we are good to
> transport the next remaining 10 tx chunks from ongoing_tx_skb.
> 	tx_credits = 21;
> 	ongoing_tx_skb = 10;
> 	waiting_tx_skb = NULL;
> - So, (tc6->ongoing_tx_skb || tc6->waiting_tx_skb) becomes true again.
> - In the oa_tc6_prepare_spi_tx_buf_for_tx_skbs()
> 	ongoing_tx_skb = NULL;
> 	waiting_tx_skb = NULL;
> 
> Now the below bad case might happen,
> 
> Thread1 (oa_tc6_start_xmit)	Thread2 (oa_tc6_spi_thread_handler)
> ---------------------------	-----------------------------------
> - if waiting_tx_skb is NULL
> 				- if ongoing_tx_skb is NULL
> 				- ongoing_tx_skb = waiting_tx_skb
> - waiting_tx_skb = skb
> 				- waiting_tx_skb = NULL
> 				...
> 				- ongoing_tx_skb = NULL
> - if waiting_tx_skb is NULL
> - waiting_tx_skb = skb
> 
> Hope this clarifies.
> 
> Best regards,
> Parthiban V
>>
>> Cheers,
>>
>> Paolo
>>
>