[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iLrVzT38P_uVCwQKFYVeQeSTOsaeA7T57KnThKgrAsYiQ@mail.gmail.com>
Date: Thu, 6 Jul 2023 17:15:10 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: "Ziyang Xuan (William)" <william.xuanziyang@...wei.com>
Cc: netdev <netdev@...r.kernel.org>, David Miller <davem@...emloft.net>,
David Ahern <dsahern@...nel.org>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
hannes@...essinduktion.org, fbl@...hat.com
Subject: Re: [Question] WARNING: refcount bug in addrconf_ifdown
On Thu, Jul 6, 2023 at 3:06 PM Ziyang Xuan (William)
<william.xuanziyang@...wei.com> wrote:
>
> Hello all,
>
> We got the following WARNING several times in our ci:
>
> ------------[ cut here ]------------
> refcount_t: underflow; use-after-free.
> WARNING: CPU: 3 PID: 9 at lib/refcount.c:28 refcount_warn_saturate+0x210/0x330
> ...
> Call trace:
> refcount_warn_saturate+0x210/0x330
> addrconf_ifdown.isra.0+0x1be8/0x1e10
> addrconf_notify+0xa8/0xcf0
> raw_notifier_call_chain+0x90/0x10c
> call_netdevice_notifiers_info+0x9c/0x15c
> unregister_netdevice_many+0x3e4/0x980
> default_device_exit_batch+0x24c/0x2a0
> ops_exit_list+0xcc/0xe4
> cleanup_net+0x2b8/0x550
> process_one_work+0x478/0xb54
> worker_thread+0x120/0x95c
> kthread+0x20c/0x25c
> ret_from_fork+0x10/0x18
>
> The code where the problem occurred is as follows:
>
> static int addrconf_ifdown(struct net_device *dev, bool unregister)
> {
> ...
>
> /* Last: Shot the device (if unregistered) */
> if (unregister) {
> addrconf_sysctl_unregister(idev);
> neigh_parms_release(&nd_tbl, idev->nd_parms);
> neigh_ifdown(&nd_tbl, dev);
> in6_dev_put(idev); // WARNING here for idev->refcnt
> }
> return 0;
> }
>
> Because we enabled KASAN, and no UAF issues reported on idev. So I thought
> the last decrement of idev->refcnt must be by __in6_dev_put() which is just
> decrement no memory free for idev. And idev was not be freed.
>
> The functions that call __in6_dev_put() are addrconf_del_rs_timer(),
> mld_gq_stop_timer(), mld_ifc_stop_timer(), mld_dad_stop_timer(). They
> are all related to timer. I compared the mod_timer functions corresponding
> to these functions. I found that addrconf_mod_rs_timer() is suspicious.
> Analyse as below:
>
> static void addrconf_mod_rs_timer(struct inet6_dev *idev,
> unsigned long when)
> {
> /* rs_timer is pending at time A, condition not established, no in6_dev_hold() */
> if (!timer_pending(&idev->rs_timer))
> in6_dev_hold(idev);
>
> /* rs_timer is not pending when do the following at time B.
> * rs_timer callback addrconf_rs_timer() will be executed later,
> * and in6_dev_put() will be executed in addrconf_rs_timer(),
> * but this is wrong. idev->refcnt has been decreased more one.
> */
> mod_timer(&idev->rs_timer, jiffies + when);
> }
>
> The following implementation for addrconf_mod_rs_timer() is more reasonable,
> and avoid the above potential problem.
>
> static void addrconf_mod_rs_timer(struct inet6_dev *idev,
> unsigned long when)
> {
> if (!mod_timer(&idev->rs_timer, jiffies + when))
> in6_dev_hold(idev);
> }
>
> Because the problem is low probability, and I could not reproduce until now.
> I am not entirely sure that the problem is the cause of my analysis.
>
> Do you think my analysis is reasonable? And do you have more ideas for the problem?
>
> Welcome to give me feedback. Thank you for your help!
I think this makes a lot of sense,
A similar issue was fixed in commit f8a894b218138 ("ipv6: fix calling
in6_ifa_hold incorrectly for dad work")
Please send a formal patch ?
Thanks !
Powered by blists - more mailing lists