Message-ID: <20190315230457.094e2939@elisabeth>
Date: Fri, 15 Mar 2019 23:04:57 +0100
From: Stefano Brivio <sbrivio@...hat.com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: David Miller <davem@...emloft.net>, liuzhiqiang26@...wei.com,
petrm@...lanox.com, idosch@...lanox.com, sd@...asysnail.net,
mousuanming@...wei.com, netdev@...r.kernel.org,
mingfangsen@...wei.com, zhoukang7@...wei.com,
wangxiaogang3@...wei.com
Subject: Re: [PATCH v2] vxlan: remove the redundant gro_cells_destroy()
calling.
On Fri, 15 Mar 2019 14:26:10 -0700
Eric Dumazet <eric.dumazet@...il.com> wrote:
> On 03/15/2019 02:08 PM, Stefano Brivio wrote:
> > On Fri, 15 Mar 2019 11:56:01 -0700
> > Eric Dumazet <eric.dumazet@...il.com> wrote:
> >
> >> On 03/15/2019 11:02 AM, David Miller wrote:
> >>> From: Eric Dumazet <eric.dumazet@...il.com>
> >>> Date: Fri, 15 Mar 2019 09:06:25 -0700
> >>>
> >>>>
> >>>>
> >>>> On 03/15/2019 08:28 AM, Stefano Brivio wrote:
> >>>>> On Fri, 15 Mar 2019 23:18:52 +0800
> >>>>> Zhiqiang Liu <liuzhiqiang26@...wei.com> wrote:
> >>>>>
> >>>>>> In vxlan_destroy_tunnels func, unregister_netdevice_queue is called after
> >>>>>> gro_cells_destroy func. However, the device unregistration queued by
> >>>>>> unregister_netdevice_queue will also call the gro_cells_destroy func, via
> >>>>>> the following routine:
> >>>>>> unregister_netdevice_many() -> rollback_registered_many()
> >>>>>> -> ndo_uninit() -> gro_cells_destroy()
> >>>>>>
> >>>>>> Signed-off-by: Suanming.Mou <mousuanming@...wei.com>
> >>>>>> Reviewed-by: Zhiqiang Liu <liuzhiqiang26@...wei.com>
> >>>>>> Reviewed-by: Stefano Brivio <sbrivio@...hat.com>
> >>>>>
> >>>>> NACK, please read my and Eric's comments to v1 -- giving me more than 23
> >>>>> minutes to answer would have been a nice touch as well :)
> >>>>>
> >>>>
> >>>> Sorry for the confusion, I forgot to add the question marks to my sentences.
> >>>>
> >>>> In fact, this is a bug fix, that we missed in the previous fix.
> >>>>
> >>>> Technically the bug is older.
> >>>
> >>> Please elaborate.
> >>>
> >>
> >> Commit ad6c9986bcb62
> >> ("vxlan: Fix GRO cells race condition between receive and link delete")
> >>
> >> fixed a race condition for the typical case a vxlan device is dismantled from the
> >> current netns.
> >>
> >> But if a netns is dismantled, we call vxlan_destroy_tunnels()
> >> to schedule a unregister_netdevice_queue() of all the vxlan tunnels
> >> that are related to this netns.
> >
> > Won't that happen via ops_exit_list() only after synchronize_rcu() is
> > called by cleanup_net(), though? Is there another path I missed?
>
> Just look at vxlan_destroy_tunnels().
>
> The call to gro_cells_destroy(&vxlan->gro_cells);
> is done _before_
> unregister_netdevice_queue(vxlan->dev, head);
>
> So packets can still fly, the RCU grace period has not yet started.

Wait, what... :/ thanks for pointing that out, I guess it was too
obvious for me to notice.

Zhiqiang, could you maybe update the commit message with these two bits
of information (the real issue explained by Eric, and the different
Fixes: tag), and post v3?

This would be an actual fix and not a clean-up, so it doesn't need to
wait for net-next to re-open.

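For anybody following along, here is a rough userspace model of the
ordering problem Eric describes. It is only a sketch: model_dev,
model_rx() and the other model_* and teardown_* names are made up for
illustration and are not kernel code, and collapsing "unregistered and
past the RCU grace period" into a single rx_reachable flag is a
deliberate simplification. It assumes, as in the call chain quoted
above, that the uninit step destroys the GRO cells on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: none of these names exist in the kernel. */
struct model_dev {
	bool rx_reachable;	/* receive path can still find the device */
	bool gro_cells_alive;	/* GRO cells not yet destroyed */
	int bad_rx;		/* receives that touched destroyed cells */
};

/* Models a packet arriving at some point during teardown. */
static void model_rx(struct model_dev *dev)
{
	if (!dev->rx_reachable)
		return;		/* device is gone, nothing is touched */
	if (!dev->gro_cells_alive)
		dev->bad_rx++;	/* the race: cells used after destroy */
}

/* Stands in for gro_cells_destroy(). */
static void model_gro_cells_destroy(struct model_dev *dev)
{
	dev->gro_cells_alive = false;
}

/*
 * Stands in for the whole unregistration path: receivers can no longer
 * reach the device, then the uninit step destroys the cells if needed.
 */
static void model_unregister(struct model_dev *dev)
{
	dev->rx_reachable = false;
	if (dev->gro_cells_alive)
		model_gro_cells_destroy(dev);
	assert(!dev->gro_cells_alive);
}

/* Ordering under discussion: destroy first, unregister afterwards. */
static int teardown_destroy_first(void)
{
	struct model_dev dev = { true, true, 0 };

	model_gro_cells_destroy(&dev);
	model_rx(&dev);		/* packet lands in the window */
	model_unregister(&dev);
	model_rx(&dev);
	return dev.bad_rx;
}

/* Ordering without the early destroy: uninit handles the cells. */
static int teardown_unregister_only(void)
{
	struct model_dev dev = { true, true, 0 };

	model_rx(&dev);		/* cells still alive, harmless */
	model_unregister(&dev);
	model_rx(&dev);
	return dev.bad_rx;
}
```

In this model, teardown_destroy_first() reports one receive that hit
destroyed cells, while teardown_unregister_only() reports none, which
is why dropping the early call reads as a fix rather than a clean-up.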
--
Stefano