Message-ID: <3838b7c2-c32f-aeda-702a-5cb8f712ec0c@huawei.com>
Date:   Tue, 16 Mar 2021 08:35:07 +0800
From:   Yunsheng Lin <linyunsheng@...wei.com>
To:     Jakub Kicinski <kuba@...nel.org>
CC:     <davem@...emloft.net>, <olteanv@...il.com>, <ast@...nel.org>,
        <daniel@...earbox.net>, <andriin@...com>, <edumazet@...gle.com>,
        <weiwan@...gle.com>, <cong.wang@...edance.com>,
        <ap420073@...il.com>, <netdev@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <linuxarm@...neuler.org>,
        <mkl@...gutronix.de>, <linux-can@...r.kernel.org>
Subject: Re: [RFC v2] net: sched: implement TCQ_F_CAN_BYPASS for lockless
 qdisc

On 2021/3/16 2:53, Jakub Kicinski wrote:
> On Mon, 15 Mar 2021 11:10:18 +0800 Yunsheng Lin wrote:
>> @@ -606,6 +623,11 @@ static const u8 prio2band[TC_PRIO_MAX + 1] = {
>>   */
>>  struct pfifo_fast_priv {
>>  	struct skb_array q[PFIFO_FAST_BANDS];
>> +
>> +	/* protect against data race between enqueue/dequeue and
>> +	 * qdisc->empty setting
>> +	 */
>> +	spinlock_t lock;
>>  };
>>  
>>  static inline struct skb_array *band2list(struct pfifo_fast_priv *priv,
>> @@ -623,7 +645,10 @@ static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc *qdisc,
>>  	unsigned int pkt_len = qdisc_pkt_len(skb);
>>  	int err;
>>  
>> -	err = skb_array_produce(q, skb);
>> +	spin_lock(&priv->lock);
>> +	err = __ptr_ring_produce(&q->ring, skb);
>> +	WRITE_ONCE(qdisc->empty, false);
>> +	spin_unlock(&priv->lock);
>>  
>>  	if (unlikely(err)) {
>>  		if (qdisc_is_percpu_stats(qdisc))
>> @@ -642,6 +667,7 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
>>  	struct sk_buff *skb = NULL;
>>  	int band;
>>  
>> +	spin_lock(&priv->lock);
>>  	for (band = 0; band < PFIFO_FAST_BANDS && !skb; band++) {
>>  		struct skb_array *q = band2list(priv, band);
>>  
>> @@ -655,6 +681,7 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
>>  	} else {
>>  		WRITE_ONCE(qdisc->empty, true);
>>  	}
>> +	spin_unlock(&priv->lock);
>>  
>>  	return skb;
>>  }
> 
> I thought pfifo was supposed to be "lockless" and this change
> re-introduces a lock between producer and consumer, no?

Yes, the lock breaks the "lockless" property of the lockless qdisc. For now,
I do not know how to solve the data race below locklessly:

        CPU1:                                   CPU2:
     dequeue skb                                  .
          .                                       .
          .                                  enqueue skb
          .                                       .
          .                       WRITE_ONCE(qdisc->empty, false);
          .                                       .
          .                                       .
WRITE_ONCE(qdisc->empty, true);

If the above happens, qdisc->empty is true even though the qdisc still holds
an skb, which may cause out-of-order delivery or a stuck packet.
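
To make the interleaving concrete, here is a minimal single-threaded sketch that
replays it step by step. It is a user-space toy, not kernel code: the toy_*
names and the one-slot queue are hypothetical stand-ins for the qdisc and its
skb_array, and it assumes CPU1's dequeue finds the queue empty before CPU2
enqueues.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a qdisc: one slot plus a separate "empty" hint. */
struct toy_qdisc {
	void *slot;	/* single-entry queue, NULL means nothing queued */
	bool empty;	/* hint kept separately from the queue itself */
};

/* Enqueue without the lock: produce, then clear the hint. */
static void toy_enqueue(struct toy_qdisc *q, void *skb)
{
	q->slot = skb;
	q->empty = false;
}

/* First half of dequeue: try to take an skb from the queue. */
static void *toy_dequeue_pop(struct toy_qdisc *q)
{
	void *skb = q->slot;

	q->slot = NULL;
	return skb;
}

/* Second half of dequeue: publish the hint based on the earlier result. */
static void toy_dequeue_publish(struct toy_qdisc *q, void *skb)
{
	if (!skb)
		q->empty = true;
}

/* Replays the interleaving; returns true if the hint ends up stale. */
static bool toy_replay_race(void)
{
	static int pkt;
	struct toy_qdisc q = { .slot = NULL, .empty = false };
	void *seen;

	seen = toy_dequeue_pop(&q);	/* CPU1: finds nothing */
	toy_enqueue(&q, &pkt);		/* CPU2: enqueue, empty = false */
	toy_dequeue_publish(&q, seen);	/* CPU1: empty = true, now stale */

	return q.empty && q.slot != NULL;
}
```

toy_replay_race() returns true here: the hint says "empty" while an entry is
still queued, which is the state the lock in the patch is meant to rule out.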

It seems we may need to update the ptr_ring's status (empty or not) atomically
while enqueuing/dequeuing, inside the ptr_ring implementation.
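
One possible shape for such a status, purely as a sketch: track occupancy with
a single atomic counter that is modified with read-modify-write operations, so
a dequeuer that finds nothing writes nothing and cannot overwrite a concurrent
enqueuer's update. This is user-space C11 with hypothetical toy_* names, not
the ptr_ring API, and it is only an assumption that something like it would be
acceptable: it adds an atomic operation on a shared cache line per packet, and
there is still a short window between producing an entry and bumping the
counter.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical: occupancy tracked by one atomic counter instead of a flag. */
struct toy_status {
	atomic_int queued;
};

static void toy_status_init(struct toy_status *s)
{
	atomic_init(&s->queued, 0);
}

/* Call after a successful produce. */
static void toy_status_produced(struct toy_status *s)
{
	atomic_fetch_add_explicit(&s->queued, 1, memory_order_release);
}

/* Call after a successful consume; a failed consume touches nothing. */
static void toy_status_consumed(struct toy_status *s)
{
	atomic_fetch_sub_explicit(&s->queued, 1, memory_order_release);
}

/* Emptiness is derived from the counter rather than stored separately. */
static bool toy_status_empty(struct toy_status *s)
{
	return atomic_load_explicit(&s->queued, memory_order_acquire) == 0;
}
```

Because the increment and decrement are atomic, no ordering of them can leave
the counter at zero while an entry that has already been counted is still
queued.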

Any better idea?
