[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJieiUhi_7=nEPiqweFVwiuh3bPNJvP_Rag29ADmUfpTG0XX8Q@mail.gmail.com>
Date: Mon, 17 Dec 2018 07:45:30 -0800
From: Roopa Prabhu <roopa@...ulusnetworks.com>
To: Petr Machata <petrm@...lanox.com>
Cc: netdev <netdev@...r.kernel.org>,
David Miller <davem@...emloft.net>,
Ido Schimmel <idosch@...lanox.com>
Subject: Re: [PATCH net 2/5] vxlan: Don't double-free default FDB entry in __vxlan_dev_create()
On Mon, Dec 17, 2018 at 7:39 AM Petr Machata <petrm@...lanox.com> wrote:
>
> Roopa Prabhu <roopa@...ulusnetworks.com> writes:
>
> > On Mon, Dec 17, 2018 at 6:58 AM Petr Machata <petrm@...lanox.com> wrote:
> >>
> >> When a failure occurs in rtnl_configure_link(), the current code
> >> calls unregister_netdevice() to roll back the earlier call to
> >> register_netdevice(), and jumps to errout, which calls
> >> vxlan_fdb_destroy().
> >>
> >> However unregister_netdevice() calls transitively ndo_uninit(),
> >> which is vxlan_uninit(), and that already takes care of deleting
> >> the default FDB entry by calling vxlan_fdb_delete_default().
> >> Since the entry added earlier in __vxlan_dev_create() is exactly
> >> the default entry, the cleanup code in the errout block always
> >> leads to double free and thus a panic.
> >>
> >> Fix by skipping the rollback code and just returning.
> >
> > I trust your test case that pointed to a potential issue..., but can
> > we add a comment there on
> > why there is an fdb add but no destroy in the roll back code ?. Also,
>
> This is at a place in the code where the clean-up would normally be:
>
> >> + /* the all_zeros_mac entry was deleted at vxlan_uninit */
>
> Is that what you meant?
>
> > the destroy code from uninit path does
> > look for an entry being present before attempting a destroy, its
> > unclear to me yet why that ends up doing a double free ?
>
> That code is called first, finds the FDB entry, destroys it. After it
> returns, another destroy call is invoked from __vxlan_dev_create().
>
ok, I am wondering if we can just avoid calling the fdb destroy in
error path if unregister has already happened on the device with a
flag local to that function or add extra check for the fdb being
present before calling destroy in the error path. This keeps this fdb
destroy checks local to this function.
Powered by blists - more mailing lists