[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250114095541.000000a1@huawei.com>
Date: Tue, 14 Jan 2025 09:55:41 +0000
From: Jonathan Cameron <Jonathan.Cameron@...wei.com>
To: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
CC: <shiju.jose@...wei.com>, <linux-edac@...r.kernel.org>,
<linux-cxl@...r.kernel.org>, <linux-acpi@...r.kernel.org>,
<linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>, <bp@...en8.de>,
<tony.luck@...el.com>, <rafael@...nel.org>, <lenb@...nel.org>,
<mchehab@...nel.org>, <dan.j.williams@...el.com>, <dave@...olabs.net>,
<dave.jiang@...el.com>, <alison.schofield@...el.com>,
<vishal.l.verma@...el.com>, <ira.weiny@...el.com>, <david@...hat.com>,
<Vilas.Sridharan@....com>, <leo.duran@....com>, <Yazen.Ghannam@....com>,
<rientjes@...gle.com>, <jiaqiyan@...gle.com>, <Jon.Grimm@....com>,
<dave.hansen@...ux.intel.com>, <naoya.horiguchi@....com>,
<james.morse@....com>, <jthoughton@...gle.com>, <somasundaram.a@....com>,
<erdemaktas@...gle.com>, <pgonda@...gle.com>, <duenwen@...gle.com>,
<gthelen@...gle.com>, <wschwartz@...erecomputing.com>,
<dferguson@...erecomputing.com>, <wbs@...amperecomputing.com>,
<nifan.cxl@...il.com>, <tanxiaofei@...wei.com>, <prime.zeng@...ilicon.com>,
<roberto.sassu@...wei.com>, <kangkang.shen@...urewei.com>,
<wanghuiqiang@...wei.com>, <linuxarm@...wei.com>
Subject: Re: [PATCH v18 01/19] EDAC: Add support for EDAC device features
control
> > +int edac_dev_register(struct device *parent, char *name,
> > + void *private, int num_features,
> > + const struct edac_dev_feature *ras_features)
> > +{
> > + const struct attribute_group **ras_attr_groups;
> > + struct edac_dev_feat_ctx *ctx;
> > + int attr_gcnt = 0;
> > + int ret, feat;
> > +
> > + if (!parent || !name || !num_features || !ras_features)
> > + return -EINVAL;
> > +
> > + /* Double parse to make space for attributes */
> > + for (feat = 0; feat < num_features; feat++) {
> > + switch (ras_features[feat].ft_type) {
> > + /* Add feature specific code */
> > + default:
> > + return -EINVAL;
> > + }
> > + }
> > +
> > + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
> > + if (!ctx)
> > + return -ENOMEM;
> > +
> > + ras_attr_groups = kcalloc(attr_gcnt + 1, sizeof(*ras_attr_groups), GFP_KERNEL);
> > + if (!ras_attr_groups) {
> > + ret = -ENOMEM;
> > + goto ctx_free;
> > + }
> > +
> > + attr_gcnt = 0;
> > + for (feat = 0; feat < num_features; feat++, ras_features++) {
> > + switch (ras_features->ft_type) {
> > + /* Add feature specific code */
> > + default:
> > + ret = -EINVAL;
> > + goto groups_free;
> > + }
> > + }
> > +
> > + ctx->dev.parent = parent;
> > + ctx->dev.bus = edac_get_sysfs_subsys();
> > + ctx->dev.type = &edac_dev_type;
> > + ctx->dev.groups = ras_attr_groups;
> > + ctx->private = private;
> > + dev_set_drvdata(&ctx->dev, ctx);
> > +
> > + ret = dev_set_name(&ctx->dev, name);
> > + if (ret)
> > + goto groups_free;
> > +
> > + ret = device_register(&ctx->dev);
> > + if (ret) {
> > + put_device(&ctx->dev);
>
> > + return ret;
>
> As register failed, you need to change it to a goto groups_free,
> as edac_dev_release() won't be called.
Boris called this one out as well, so seems it is not that well understood.
I've also tripped over this in the past and it's one of the most common
things I catch in reviews of code calling this stuff.
As discussed offline, it will be called. The device_register() docs
make it clear that whether or not that call succeeds reference counting
is enabled and put_device() is the correct way to free resources.
The actual depends on the fact that device_register() is just a helper
defined as
device_initialize();
return device_add();
So for reasons lost to history (I guess there are cases where other cleanup
needs to happen before the release) it does not handle side effects
of device_initialize() on an error in device_add().
device_initialize() has called
-> kobject_init(&dev->kobj, &device_type);
-> kref_init_internal(kobj) + sets ktype (which has the release callback)
kref_init_internal() sets the reference counter to 1
Hence when we do a device_put() in the error path, the reference counter drops
to 0 and the release from the ktype is called. Here that is edac_dev_release();
If you want to verify replace device_register() with device_initialize() then
call put_device().
If we were going back in history, I'd suggest device_register() should be side
effect free and call put_device() on error and any driver that needs to handle
other stuff before the release should just not use it. I guess that ship
long sailed and maybe there are other reasons I've not thought of.
I took a quick look and seems to go back into at least the 2.5 era.
Jonathan
Powered by blists - more mailing lists