linux-kernel - Re: [bug report] node: Add memory-side caching attributes

Message-ID: <20210401152213.GI1463678@nvidia.com>
Date:   Thu, 1 Apr 2021 12:22:13 -0300
From:   Jason Gunthorpe <jgg@...dia.com>
To:     Dan Carpenter <dan.carpenter@...cle.com>,
        Christoph Hellwig <hch@....de>
Cc:     kbusch@...nel.org, linux-kernel@...r.kernel.org,
        kernel-janitors@...r.kernel.org
Subject: Re: [bug report] node: Add memory-side caching attributes

On Thu, Apr 01, 2021 at 05:06:52PM +0300, Dan Carpenter wrote:

> > diff --git a/drivers/base/node.c b/drivers/base/node.c
> > index f449dbb2c74666..89c28952863977 100644
> > +++ b/drivers/base/node.c
> > @@ -319,25 +319,24 @@ void node_add_cache(unsigned int nid, struct node_cache_attrs *cache_attrs)
> >  		return;
> >  
> >  	dev = &info->dev;
> > +	device_initialize(dev);
> >  	dev->parent = node->cache_dev;
> >  	dev->release = node_cacheinfo_release;
> >  	dev->groups = cache_groups;
> >  	if (dev_set_name(dev, "index%d", cache_attrs->level))
> 
> Is calling dev_set_name() without doing a device_initialize() a bug?  I
> could write a check for that.

IMHO, yes.

However, Greg may not agree, as dev_set_name() with no error check
followed by device_register() is a very common pattern. If the user
omits the device_initialize(), then dev_set_name() must come
immediately before device_register(), with no error unwind between
them. The error path must not unwind from dev_set_name() to kfree().
(This is really tricky.)

Greg and I have argued on the merits of device_initialize() several
times before.

I argue that with an explicit device_initialize() the error control
flow is simpler and easier to get right; he argues the extra statement
is unneeded complexity and people will get the error unwind wrong.

Every time you find a bug like this, it is someone getting the
complexity around error handling and device_register() wrong, so my
advice is to stop using device_register() (aka device_init_and_add)
and make the initialization explicit so the goto unwind has logical
pairing: put_device() pairs with device_initialize().
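
To make the pairing concrete, here is a minimal userspace model, not
kernel code. Every model_* name is a hypothetical stand-in for the
driver-core function it is named after, and live_allocs exists only so
the leak is observable. It assumes, as in the kernel, that the name set
by dev_set_name() is owned by the device and freed on the final
put_device().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userspace model only: simplified stand-ins, not the kernel's types. */
static int live_allocs;		/* outstanding allocations in the model */

static void *model_alloc(size_t n)
{
	void *p = calloc(1, n);

	if (p)
		live_allocs++;
	return p;
}

static void model_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

struct model_device {
	int refcount;
	char *name;		/* owned by the device once set */
	void (*release)(struct model_device *dev);
};

static void model_device_initialize(struct model_device *dev)
{
	dev->refcount = 1;
}

static int model_dev_set_name(struct model_device *dev, const char *name)
{
	char *copy = model_alloc(strlen(name) + 1);

	if (!copy)
		return -1;
	strcpy(copy, name);
	dev->name = copy;
	return 0;
}

/* Dropping the last reference frees the name, then calls release. */
static void model_put_device(struct model_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0) {
		model_free(dev->name);
		dev->release(dev);
	}
}

static void model_release(struct model_device *dev)
{
	model_free(dev);
}

/* Broken: unwinds to a plain free after the name was set. Returns leaks. */
static int unwind_with_kfree(void)
{
	int before = live_allocs;
	struct model_device *dev = model_alloc(sizeof(*dev));

	if (!dev)
		return -1;
	if (model_dev_set_name(dev, "index0") == 0) {
		/* pretend a later setup step failed here */
	}
	model_free(dev);	/* dev->name is never freed */
	return live_allocs - before;
}

/* Paired: initialize right after allocation, unwind only via put. */
static int unwind_with_put_device(void)
{
	int before = live_allocs;
	struct model_device *dev = model_alloc(sizeof(*dev));

	if (!dev)
		return -1;
	model_device_initialize(dev);
	dev->release = model_release;
	if (model_dev_set_name(dev, "index0"))
		goto err_put;
	/* pretend a later setup step failed here */
err_put:
	model_put_device(dev);
	return live_allocs - before;
}
```

In this model the first function leaves the name allocation
outstanding, while the second leaves nothing behind, because the single
put is the only unwind step once the device has been initialized.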

The tricky bit is establishing the release function, as complex
release functions often have complex init sequences and you need the
setup done enough to go to release. When things become this complex I
advocate splitting alloc into a function:

/* Caller must use put_device(&foo->dev) */
struct foo *foo_allocate(void)
{
   foo = kzalloc();
   allocate release freeing thing 1;
   allocate release freeing thing 2;
   allocate release freeing thing 3;
   device_initialize(&foo->dev);
   return foo;

err_free_thing3:
   free thing 3;
err_free_thing2:
   free thing 2;
err_free_thing1:
   free thing 1;
err_free_foo:
   kfree(foo);
   return ERR_PTR(rc);
}

Thus there is a clear logical separation between the world of 'unwind
to kfree' and the world of 'unwind to put_device'. dev_set_name()
cannot be inside an alloc function.
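
A rough userspace sketch of that split, not kernel code: struct
model_device, model_put_device(), foo_create() and the fail_* knobs are
all hypothetical, added only so both unwind worlds can be exercised.
The refcount assignment stands in for device_initialize() and the name
copy stands in for dev_set_name().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Userspace model only: simplified stand-ins, not the kernel's types. */
static int live_allocs;		/* outstanding allocations in the model */

static void *model_alloc(size_t n)
{
	void *p = calloc(1, n);

	if (p)
		live_allocs++;
	return p;
}

static void model_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

struct model_device {
	int refcount;
	char *name;
	void (*release)(struct model_device *dev);
};

struct foo {
	struct model_device dev;
	void *thing1;
	void *thing2;
};

/* Release frees everything foo_allocate() set up. */
static void foo_release(struct model_device *dev)
{
	struct foo *foo = (struct foo *)((char *)dev - offsetof(struct foo, dev));

	model_free(foo->thing2);
	model_free(foo->thing1);
	model_free(foo);
}

static void model_put_device(struct model_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0) {
		model_free(dev->name);
		dev->release(dev);
	}
}

/* World 1: unwind to free. Caller must use model_put_device(&foo->dev). */
static struct foo *foo_allocate(int fail_thing2)
{
	struct foo *foo = model_alloc(sizeof(*foo));

	if (!foo)
		return NULL;
	foo->thing1 = model_alloc(16);
	if (!foo->thing1)
		goto err_free_foo;
	foo->thing2 = fail_thing2 ? NULL : model_alloc(16);
	if (!foo->thing2)
		goto err_free_thing1;
	foo->dev.release = foo_release;
	foo->dev.refcount = 1;	/* stands in for device_initialize() */
	return foo;

err_free_thing1:
	model_free(foo->thing1);
err_free_foo:
	model_free(foo);
	return NULL;
}

/* World 2: after allocation, every error path is a single put. */
static struct foo *foo_create(const char *name, int fail_thing2, int fail_late)
{
	struct foo *foo = foo_allocate(fail_thing2);

	if (!foo)
		return NULL;
	foo->dev.name = model_alloc(strlen(name) + 1);	/* dev_set_name() stand-in */
	if (!foo->dev.name)
		goto err_put;
	strcpy(foo->dev.name, name);
	if (fail_late)		/* injected failure after the name is set */
		goto err_put;
	return foo;

err_put:
	model_put_device(&foo->dev);
	return NULL;
}
```

The free-style labels live only inside foo_allocate(); once it returns,
the caller has exactly one unwind step, and the name is set outside the
alloc function so it is never unwound by a plain free.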

Simple cases, like here, should just do device_initialize()
immediately after kzalloc() and never have a kfree() error unwind.

I saw Christoph had a similar opinion on 'init and add' patterns being
bad, maybe he has additional colour to share.

Jason