lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190325161031.GH9224@smile.fi.intel.com>
Date:   Mon, 25 Mar 2019 18:10:31 +0200
From:   Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
To:     "wanghai (M)" <wanghai26@...wei.com>
Cc:     syzbot <syzbot+6024817a931b2830bc93@...kaller.appspotmail.com>,
        alexander.h.duyck@...el.com, amritha.nambiar@...el.com,
        davem@...emloft.net, dmitry.torokhov@...il.com,
        f.fainelli@...il.com, idosch@...lanox.com, joe@...ches.com,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        stephen@...workplumber.org, syzkaller-bugs@...glegroups.com,
        tyhicks@...onical.com, yuehaibing@...wei.com
Subject: Re: kernel BUG at net/core/net-sysfs.c:LINE!

On Mon, Mar 25, 2019 at 11:20:01PM +0800, wanghai (M) wrote:
> thanks , Can it be fixed like this?

I dunno. I think no, it can't.

As far as I can see the issue happened due to freeing entire network device at
the point of putting reference count to the device (struct device is embedded
into struct net_device).

When it happens the access to _any_ field of struct net_device will crash the
system.

Basically it means that put_device() should be carefully placed case-by-case,
because on real hardware the actual device is parent and usually no-one does
access to the child without need. On the contrary the tunX devices are
artificial and are controlled by the network stack.

So, it means we need to do something like

ret = register_netdev(...);
if (ret) {
	put_device(&ndev->dev);
	...
}

But as I mentioned, it would be tricky to not break something else.

P.S. It might be I have missed something, I'm not an expert in network stack.

> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
> index 4ff661f..e609c8d 100644
> --- a/net/core/net-sysfs.c
> +++ b/net/core/net-sysfs.c
> @@ -1745,16 +1745,21 @@ int netdev_register_kobject(struct net_device *ndev)
> 
>         error = device_add(dev);
>         if (error)
> -               return error;
> +               goto error_put_device;
> 
>         error = register_queue_kobjects(ndev);
> -       if (error) {
> -               device_del(dev);
> -               return error;
> -       }
> +       if (error)
> +               goto error_device_del;
> 
>         pm_runtime_set_memalloc_noio(dev, true);
> 
> +       return 0;
> +
> +error_device_del:
> +       device_del(dev);
> +error_put_device:
> +       ndev->reg_state = NETREG_RELEASED;
> +       put_device(dev);
>         return error;
>  }
> 
> 在 2019/3/24 1:16, Andy Shevchenko 写道:
> > Nice.
> > 
> > I looked briefly in the flow of this report and it looks like the patch above
> > should be reverted.
> > 
> > The problem is not so easy to fix. One approach is to initialize device
> > (and thus kobject) somewhere in alloc_netdev() and put device in free_netdev()
> > respectively, but this might produce more interesting regressions.
> 

-- 
With Best Regards,
Andy Shevchenko


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ