Message-ID: <4b5f9ff7-12ab-9402-60c1-8a9ee852700d@gmail.com>
Date: Mon, 9 Mar 2020 19:38:12 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Mahesh Bandewar (महेश बंडेवार) <maheshb@...gle.com>,
Eric Dumazet <eric.dumazet@...il.com>
Cc: David Miller <davem@...emloft.net>,
Netdev <netdev@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>,
Mahesh Bandewar <mahesh@...dewar.net>,
syzbot <syzkaller@...glegroups.com>
Subject: Re: [PATCH net] ipvlan: add cond_resched_rcu() while processing
 multicast backlog
On 3/9/20 7:21 PM, Mahesh Bandewar (महेश बंडेवार) wrote:
> On Mon, Mar 9, 2020 at 6:07 PM Eric Dumazet <eric.dumazet@...il.com> wrote:
>>
>>
>>
>> On 3/9/20 3:57 PM, Mahesh Bandewar wrote:
>>> If a substantial number of slaves are created, as simulated by
>>> Syzbot, the backlog processing could take much longer and result
>>> in the issue found in the Syzbot report.
>>>
>>
>> ...
>>
>>>
>>> Fixes: ba35f8588f47 ("ipvlan: Defer multicast / broadcast processing to a work-queue")
>>> Signed-off-by: Mahesh Bandewar <maheshb@...gle.com>
>>> Reported-by: syzbot <syzkaller@...glegroups.com>
>>> ---
>>> drivers/net/ipvlan/ipvlan_core.c | 1 +
>>> 1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
>>> index 53dac397db37..5759e91dec71 100644
>>> --- a/drivers/net/ipvlan/ipvlan_core.c
>>> +++ b/drivers/net/ipvlan/ipvlan_core.c
>>> @@ -277,6 +277,7 @@ void ipvlan_process_multicast(struct work_struct *work)
>>> }
>>> ipvlan_count_rx(ipvlan, len, ret == NET_RX_SUCCESS, true);
>>> local_bh_enable();
>>> + cond_resched_rcu();
>>
>> This does not work: if you release rcu_read_lock() here,
>> then the surrounding loop cannot be continued without risking a use-after-free.
>>
> .. but cond_resched_rcu() is nothing but
> rcu_read_unlock(); cond_resched(); rcu_read_lock();
>
> isn't that sufficient?
It is buggy.
Think about iterating a list under spinlock protection,
then, in the middle of the loop, releasing the spinlock and re-acquiring it.
The cursor in the loop might point to freed memory.
The same applies to RCU.
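
A minimal sketch of the same hazard with a plain spinlock (hypothetical
foo_list / foo_lock, not the actual ipvlan code):

	struct foo *cur;

	spin_lock(&foo_lock);
	list_for_each_entry(cur, &foo_list, node) {
		/* Dropping the lock mid-iteration, which is effectively
		 * what cond_resched_rcu() does for the RCU read-side lock.
		 */
		spin_unlock(&foo_lock);
		cond_resched();
		spin_lock(&foo_lock);
		/* 'cur' may have been freed and reused while the lock was
		 * not held, so advancing the iterator is a use-after-free.
		 */
	}
	spin_unlock(&foo_lock);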
>
>> rcu_read_lock();
>> list_for_each_entry_rcu(ipvlan, &port->ipvlans, pnode) {
>> ...
>> cond_resched_rcu();
>> // after this point bad things can happen
>> }
>>
>>
>> You probably should do this instead:
>>
>> diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
>> index 30cd0c4f0be0b4d1dea2c0a4d68d0e33d1931ebc..57617ff5565fb87035c13dcf1de9fa5431d04e10 100644
>> --- a/drivers/net/ipvlan/ipvlan_core.c
>> +++ b/drivers/net/ipvlan/ipvlan_core.c
>> @@ -293,6 +293,7 @@ void ipvlan_process_multicast(struct work_struct *work)
>> }
>> if (dev)
>> dev_put(dev);
>> + cond_resched();
>> }
>
> The reason this may not work is that the inner loop iterates over the
> slaves for a single packet; if there are 1k slaves, then skb_clone() will
> be called 1k times before cond_resched() runs, and the problem may not
> even get mitigated.
The problem that syzbot found is that queuing IPVLAN_QBACKLOG_LIMIT (1000) packets on the backlog
could force the ipvlan_process_multicast() worker to process 1000 packets.
Multiply this by the number of slaves, say 1000 -> 1,000,000 skb clones.
After the patch, we divide the time taken in one invocation by 1000,
which should be good enough.
You do not need to reschedule after _each_ clone.
Think about netdev_max_backlog, which is set to 1000: we believe it is fine
to process 1000 packets per round.
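
For reference, a simplified sketch of the loop structure implied by the diff
above (paraphrased, not the exact ipvlan_process_multicast() source): the
reschedule point sits at the end of the outer per-packet loop, after the
RCU-protected walk over the slaves has finished, so no list cursor is exposed
while the task sleeps.

	while ((skb = __skb_dequeue(&list)) != NULL) {
		rcu_read_lock();
		list_for_each_entry_rcu(ipvlan, &port->ipvlans, pnode) {
			/* One clone per slave: up to ~1000 skb_clone()
			 * calls for a single dequeued packet.
			 */
			nskb = skb_clone(skb, GFP_ATOMIC);
			...
		}
		rcu_read_unlock();
		...
		if (dev)
			dev_put(dev);
		/* Reschedule once per packet, outside the RCU read-side
		 * critical section.
		 */
		cond_resched();
	}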