netdev - missing retval check of call_netdevice_notifiers in dev_change_net

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHmME9rz0JQfEBfOKkopx6yBbK_gbKVy40rh82exy1d7BZDWGw@mail.gmail.com>
Date:   Sun, 21 Jun 2020 02:58:49 -0600
From:   "Jason A. Donenfeld" <Jason@...c4.com>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>,
        Jiri Pirko <jiri@...lanox.com>,
        Johannes Berg <johannes@...solutions.net>,
        David Miller <davem@...emloft.net>,
        Petr Machata <petrm@...lanox.com>,
        David Ahern <dsahern@...il.com>
Cc:     Netdev <netdev@...r.kernel.org>
Subject: missing retval check of call_netdevice_notifiers in dev_change_net_namespace

Hi,

In register_netdevice, there's a call to
call_netdevice_notifiers(NETDEV_REGISTER), whose return value is
checked to determine whether or not to roll back the device
registration. The reason checking that return value is important is
because this represents an opportunity for the various notifiers to
veto the registration, or for another error to happen. It's good that
the error value is checked when that function is called from
register_netdevice. But from other functions, the error value is not
always checked.

The notification is split up into two stages:

        ret = raw_notifier_call_chain(&net->netdev_chain, val, info);
        if (ret & NOTIFY_STOP_MASK)
                return ret;
        return raw_notifier_call_chain(&netdev_chain, val, info);

One is per-net and the other is global. So there's ample space for
something in either chain to abort the whole process.

The wireguard module uses the notifier to keep track of netns changes
in order to do some reference count bookkeeping. If this bookkeeping
goes wrong, there's UaF potential. However, I noticed that the call to
call_netdevice_notifiers(NETDEV_REGISTER) at the end of
dev_change_net_namespace doesn't check its return value and doesn't
implement any sort of rollback like register_netdevice does:

int dev_change_net_namespace(struct net_device *dev, struct net *net,
const char *pat)
{
       struct net *net_old = dev_net(dev);
       int err, new_nsid, new_ifindex;

       ASSERT_RTNL();
[...]
        /* Add the device back in the hashes */
        list_netdevice(dev);

        /* Notify protocols, that a new device appeared. */
        call_netdevice_notifiers(NETDEV_REGISTER, dev);
[...]
}

Notice that call_netdevice_notifiers isn't checking it's return value there.

It seems like if any device vetoes the notification chain, it's bad
news bears for modules that depend on getting a netns change
notification.

I've been trying to audit the various registered notifiers to see if
any of them pose a risk for wireguard. There are also unexpected
errors that can happen, such as OOM conditions for kmalloc(GFP_KERNEL)
or vmalloc and suchlike, which might be influenceable by attackers. In
other words, relying on those notifications always being delivered
seems kind of brittle. Not _super_ brittle, but brittle enough that
it's at the moment making me a bit nervous. (See: UaF potential.)

I've been trying to come up with a good solution to this.

I'm not sure how reasonable it'd be to implement rollback inside of
dev_change_net_namespace, but I guess that could technically be
possible. The way that'd work would be that when vetoed, the function
would complete, but then would start over again (a "goto top" sort of
pattern), with oldnet and newnet reversed. But that could of course
fail too and we could get ourselves in some sort of infinite loop. Not
good.

Another idea would be to have a different notifier for netns changes
(instead of overloading NETDEV_REGISTER as is the case now), whose
entire chain always gets called, where the return value is always
ignored. This seems like it'd be a bit better, and at least explicit
about its purpose. But that'd be a quasi-new thing.

Finally, it could be the solution is that modules that use the netdev
notifier never really rely too heavily upon getting notifications, and
if they need something reliable, then they rearchitect their code to
not need that. I could imagine this attitude holding sway here
somehow, but it's also not very appealing from a code writing and
refactoring perspective.

I figured before I go to town coding one of these up, maybe I should
bring up the issue here, in case anybody has more developed thoughts
than me about it.

Thanks,
Jason