lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190325191824.GB97916@dtor-ws>
Date:   Mon, 25 Mar 2019 12:18:24 -0700
From:   Dmitry Torokhov <dmitry.torokhov@...il.com>
To:     Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
Cc:     "wanghai (M)" <wanghai26@...wei.com>,
        syzbot <syzbot+6024817a931b2830bc93@...kaller.appspotmail.com>,
        alexander.h.duyck@...el.com, amritha.nambiar@...el.com,
        davem@...emloft.net, f.fainelli@...il.com, idosch@...lanox.com,
        joe@...ches.com, linux-kernel@...r.kernel.org,
        netdev@...r.kernel.org, stephen@...workplumber.org,
        syzkaller-bugs@...glegroups.com, tyhicks@...onical.com,
        yuehaibing@...wei.com
Subject: Re: kernel BUG at net/core/net-sysfs.c:LINE!

On Mon, Mar 25, 2019 at 11:55:54AM -0700, Dmitry Torokhov wrote:
> On Mon, Mar 25, 2019 at 06:10:31PM +0200, Andy Shevchenko wrote:
> > On Mon, Mar 25, 2019 at 11:20:01PM +0800, wanghai (M) wrote:
> > > thanks , Can it be fixed like this?
> > 
> > I dunno. I think no, it can't.
> 
> I agree, it can't.
> 
> > 
> > As far as I can see the issue happened due to freeing entire network device at
> > the point of putting reference count to the device (struct device is embedded
> > into struct net_device).
> > 
> > When it happens the access to _any_ field of struct net_device will crash the
> > system.
> > 
> > Basically it means that put_device() should be carefully placed case-by-case,
> > because on real hardware the actual device is parent and usually no-one does
> > access to the child without need. On the contrary the tunX devices are
> > artificial and are controlled by the network stack.
> > 
> > So, it means we need to do something like
> > 
> > ret = register_netdev(...);
> > if (ret) {
> > 	put_device(&ndev->dev);
> > 	...
> > }
> > 
> > But as I mentioned, it would be tricky to not break something else.
> 
> I'd say that the entity that called alloc_netdev() should be the one
> that calls put_device() (but the way of free_netdev()), not net/core
> code. Do we have a driver that is messed up and does not do proper
> cleanup?

OK, looking at this some more, I think we need to set dev->reg_state =
NETREG_REGISTERED earlier, right after successful call to device_add()
as at this point the device is alive as far as device core is concerned.
The queue kobjects have to be managed separately, for that I'd pull the
code out of netdev_register_kobject() and move it into
register_netdevice() and ensure that we clean up there properly.

Thanks.

-- 
Dmitry

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ