[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CABQgh9HuOB9Zcv277Sstdubz7q-Be8+aGeoNE5xrTfkjdz0u+A@mail.gmail.com>
Date: Fri, 28 Nov 2025 10:08:44 +0800
From: Zhangfei Gao <zhangfei.gao@...aro.org>
To: Christian Brauner <brauner@...nel.org>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Alexander Viro <viro@...iv.linux.org.uk>,
Wenkai Lin <linwenkai6@...ilicon.com>, Chenghai Huang <huangchenghai2@...wei.com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] chardev: fix consistent error handling in cdev_device_add
Hi, Christian
On Wed, 26 Nov 2025 at 19:01, Zhangfei Gao <zhangfei.gao@...aro.org> wrote:
>
> Hi,Christian
>
> On Wed, 26 Nov 2025 at 18:27, Christian Brauner <brauner@...nel.org> wrote:
> >
> > On Wed, Nov 19, 2025 at 10:15:40AM +0000, Zhangfei Gao wrote:
> > > Currently cdev_device_add has inconsistent error handling:
> > >
> > > - If device_add fails, it calls cdev_del(cdev)
> > > - If cdev_add fails, it only returns error without cleanup
> > >
> > > This creates a problem because cdev_set_parent(cdev, &dev->kobj)
> > > establishes a parent-child relationship.
> > > When callers use cdev_del(cdev) to clean up after cdev_add failure,
> > > it also decrements the dev's refcount due to the parent relationship,
> > > causing refcount mismatch.
> > >
> > > To unify error handling:
> > > - Set cdev->kobj.parent = NULL first to break the parent relationship
> > > - Then call cdev_del(cdev) for cleanup
> > >
> > > This ensures that in both error paths,
> > > the dev's refcount remains consistent and callers don't need
> > > special handling for different failure scenarios.
> > >
> > > Signed-off-by: Zhangfei Gao <zhangfei.gao@...aro.org>
> > > ---
> > > fs/char_dev.c | 5 ++++-
> > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/char_dev.c b/fs/char_dev.c
> > > index c2ddb998f3c9..fef6ee1aba66 100644
> > > --- a/fs/char_dev.c
> > > +++ b/fs/char_dev.c
> > > @@ -549,8 +549,11 @@ int cdev_device_add(struct cdev *cdev, struct device *dev)
> > > cdev_set_parent(cdev, &dev->kobj);
> > >
> > > rc = cdev_add(cdev, dev->devt, 1);
> > > - if (rc)
> > > + if (rc) {
> > > + cdev->kobj.parent = NULL;
> > > + cdev_del(cdev);
> > > return rc;
> > > + }
> >
> > There are callers that call cdev_del() on failure of cdev_add():
> >
> > retval = cdev_add(&dvb_device_cdev, dev, MAX_DVB_MINORS);
> > if (retval != 0) {
> > pr_err("dvb-core: unable register character device\n");
> > goto error;
> > }
> >
> > <snip>
> >
> > error:
> > cdev_del(&dvb_device_cdev);
> > unregister_chrdev_region(dev, MAX_DVB_MINORS);
> > return retval;
> >
> > and there are callers that don't. If you change the scheme here then all
> > of these callers need to be adjusted as well - including the one that
> > does a kobject_put() directly...
>
> The situation with cdev_device_add() is different from the standalone
> cdev_add() callers.
>
> cdev_device_add() explicitly establishes a parent-child relationship via
> cdev_set_parent(cdev, &dev->kobj).
> If we simply call cdev_del() on failure here, it will drop the reference
> count of the parent (dev), causing a refcount mismatch.
>
> Direct callers of cdev_add() (like the DVB example) generally do not
> set this parent relationship beforehand,
> so they do not suffer from this specific refcount issue.
>
> This patch aims to fix the cleanup specifically
> within the cdev_device_add() helper.
>
More explanation
Now the code:
int cdev_device_add(struct cdev *cdev, struct device *dev)
{
int rc = 0;
if (dev->devt) {
cdev_set_parent(cdev, &dev->kobj); // here set parent
rc = cdev_add(cdev, dev->devt, 1);
if (rc)
return rc; // case 1
}
rc = device_add(dev);
if (rc && dev->devt)
cdev_del(cdev); // case 2
return rc;
}
Since the cdev_set_parent,
cdev_add will increase parent refcount,
and cdev_del will decrease parent refcount.
So case 2, no problem.
But case 1 has an issue,
if cdev_add fails, it does not increase parent refcount.
If the caller calls cdev_del to handle the error case, the parent
refcount will be decreased
since there is a parent relation, and finally got refcount_t:
underflow; use-after-free.
So for case 1, cdev_add fails, it does not increase parent refcount.
needs to break the parent relation first.
then cdev_del does not decrease parent (dev) refcount.
Finally, both case 1 and case 2, calls cdev_del for error handling,
and refcount is correct.
It is easy for the caller, otherwise it is difficult to distinguish
case 1 or case 2.
Thanks
Powered by blists - more mailing lists