lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Ya6nks3V4/G7aA9Q@google.com>
Date:   Mon, 6 Dec 2021 16:15:14 -0800
From:   Matthias Kaehlcke <mka@...omium.org>
To:     Bjorn Andersson <bjorn.andersson@...aro.org>
Cc:     Sujit Kautkar <sujitka@...omium.org>,
        Andy Gross <agross@...nel.org>,
        Ohad Ben-Cohen <ohad@...ery.com>,
        Sibi Sankar <sibis@...eaurora.org>,
        Stephen Boyd <swboyd@...omium.org>,
        linux-kernel@...r.kernel.org, linux-remoteproc@...r.kernel.org
Subject: Re: [PATCH v3 2/2] rpmsg: glink: Update cdev add/del API in
 rpmsg_ctrldev_release_device()

On Wed, Nov 17, 2021 at 05:29:07PM -0600, Bjorn Andersson wrote:
> On Tue 02 Nov 18:51 CDT 2021, Sujit Kautkar wrote:
> 
> I like Stephen's suggestion about modifying the $subject.
> Also note that the change isn't in the glink driver, so prefix should
> reflect that:
> 
> $ git log --oneline --no-decorate -- drivers/rpmsg/rpmsg_char.c
> f998d48f9b3c rpmsg: glink: Update cdev add/del API in rpmsg_ctrldev_release_device()
> bc774a3887cb rpmsg: char: Remove useless include
> 964e8bedd5a1 rpmsg: char: Return an error if device already open
> ...
> 
> > Replace cdev add/del APIs with cdev_device_add/cdev_device_del to avoid
> > below kernel warning. This correctly takes a reference to the parent
> > device so the parent will not get released until all references to the
> > cdev are released.
> > 
> > | ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x7c
> > | WARNING: CPU: 7 PID: 19892 at lib/debugobjects.c:488 debug_print_object+0x13c/0x1b0
> > | CPU: 7 PID: 19892 Comm: kworker/7:4 Tainted: G        W         5.4.147-lockdep #1
> > | ==================================================================
> > | Hardware name: Google CoachZ (rev1 - 2) with LTE (DT)
> > | Workqueue: events kobject_delayed_cleanup
> > | pstate: 60c00009 (nZCv daif +PAN +UAO)
> > | pc : debug_print_object+0x13c/0x1b0
> > | lr : debug_print_object+0x13c/0x1b0
> > | sp : ffffff83b2ec7970
> > | x29: ffffff83b2ec7970 x28: dfffffd000000000
> > | x27: ffffff83d674f000 x26: dfffffd000000000
> > | x25: ffffffd06b8fa660 x24: dfffffd000000000
> > | x23: 0000000000000000 x22: ffffffd06b7c5108
> > | x21: ffffffd06d597860 x20: ffffffd06e2c21c0
> > | x19: ffffffd06d5974c0 x18: 000000000001dad8
> > | x17: 0000000000000000 x16: dfffffd000000000
> > | BUG: KASAN: use-after-free in qcom_glink_rpdev_release+0x54/0x70
> > | x15: ffffffffffffffff x14: 79616c6564203a74
> > | x13: 0000000000000000 x12: 0000000000000080
> > | Write of size 8 at addr ffffff83d95768d0 by task kworker/3:1/150
> > | x11: 0000000000000001 x10: 0000000000000000
> > | x9 : fc9e8edec0ad0300 x8 : fc9e8edec0ad0300
> > |
> > | x7 : 0000000000000000 x6 : 0000000000000000
> > | x5 : 0000000000000080 x4 : 0000000000000000
> > | CPU: 3 PID: 150 Comm: kworker/3:1 Tainted: G        W         5.4.147-lockdep #1
> > | x3 : ffffffd06c149574 x2 : ffffff83f77f7498
> > | x1 : ffffffd06d596f60 x0 : 0000000000000061
> > | Hardware name: Google CoachZ (rev1 - 2) with LTE (DT)
> > | Call trace:
> > |  debug_print_object+0x13c/0x1b0
> > | Workqueue: events kobject_delayed_cleanup
> > |  __debug_check_no_obj_freed+0x25c/0x3c0
> > |  debug_check_no_obj_freed+0x18/0x20
> > | Call trace:
> > |  slab_free_freelist_hook+0xb4/0x1bc
> > |  kfree+0xe8/0x2d8
> > |  dump_backtrace+0x0/0x27c
> 
> Why is dump_backtrace in the callstack here inbetween
> rpmsg_ctrldev_release_device() and kfree()? Isn't the error that we're
> calling kfree() on an chunk of memory that contains a live object?

When I tried to repro there was no dump_backtrace():

  Call trace:
   debug_print_object+0x13c/0x1b0
   __debug_check_no_obj_freed+0x25c/0x3c0
   debug_check_no_obj_freed+0x18/0x20
   slab_free_freelist_hook+0xbc/0x1e4
   kfree+0xfc/0x2f4
   rpmsg_ctrldev_release_device+0x78/0xb8
   device_release+0x84/0x168
   kobject_cleanup+0x12c/0x298
   kobject_delayed_cleanup+0x10/0x18
   process_one_work+0x578/0x92c
   worker_thread+0x804/0xcf8
   kthread+0x2a8/0x314
   ret_from_fork+0x10/0x18

My guess is that Sujit added a dump_backtrace() for debugging and it was
still there when the backtrace of the commit message was generated. That
would also explain the two 'Call trace:' entries in the log.

> > |  rpmsg_ctrldev_release_device+0x78/0xb8
> > |  device_release+0x68/0x14c
> > |  show_stack+0x20/0x2c
> > |  kobject_cleanup+0x12c/0x298
> > |  kobject_delayed_cleanup+0x10/0x18
> > |  dump_stack+0xe0/0x19c
> > |  process_one_work+0x578/0x92c
> > |  worker_thread+0x804/0xcf8
> > |  print_address_description+0x3c/0x4a8
> > |  kthread+0x2a8/0x314
> > |  ret_from_fork+0x10/0x18
> > |  __kasan_report+0x100/0x124
> > 
> > Signed-off-by: Sujit Kautkar <sujitka@...omium.org>
> > ---
> > Changes in v3:
> > - Remove unecessary error check as per Matthias's comment
> > 
> > Changes in v2:
> > - Fix typo in commit message
> > 
> >  drivers/rpmsg/rpmsg_char.c | 10 ++--------
> >  1 file changed, 2 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/rpmsg/rpmsg_char.c b/drivers/rpmsg/rpmsg_char.c
> > index 876ce43df732b..a6a33155ca859 100644
> > --- a/drivers/rpmsg/rpmsg_char.c
> > +++ b/drivers/rpmsg/rpmsg_char.c
> > @@ -458,7 +458,7 @@ static void rpmsg_ctrldev_release_device(struct device *dev)
> >  
> >  	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
> >  	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
> > -	cdev_del(&ctrldev->cdev);
> > +	cdev_device_del(&ctrldev->cdev, &ctrldev->dev);
> 
> I am not able to find any other instance where cdev_device_del() is
> called from the device's release function itself, which tells me that
> this probably is not the right thing to do. Instead the appropriate way
> seem to put the cdev_device_del() in rpmsg_chrdev_remove().

Yes, it sounds reasonable to me to delete the char device when the control
device is removed.

> That said, we already do device_del() in rpmsg_chrdev_remove() so if the
> warning is trying to tell us that ctrldev->dev is not deleted I think we
> have an unbalanced put_device()?

My understanding is that the situation is analogous to this one:

commit 1413ef638abae4ab5621901cf4d8ef08a4a48ba6
Author: Kevin Hao <haokexin@...il.com>
Date:   Fri Oct 11 23:00:14 2019 +0800

    i2c: dev: Fix the race between the release of i2c_dev and cdev

    The struct cdev is embedded in the struct i2c_dev. In the current code,
    we would free the i2c_dev struct directly in put_i2c_dev(), but the
    cdev is manged by a kobject, and the release of it is not predictable.
    So it is very possible that the i2c_dev is freed before the cdev is
    entirely released. We can easily get the following call trace with
    CONFIG_DEBUG_KOBJECT_RELEASE and CONFIG_DEBUG_OBJECTS_TIMERS enabled.
      ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x38
      WARNING: CPU: 19 PID: 1 at lib/debugobjects.c:325 debug_print_object+0xb0/0xf0

    ...

    This is a common issue when using cdev embedded in a struct.
    Fortunately, we already have a mechanism to solve this kind of issue.
    Please see commit 233ed09d7fda ("chardev: add helper function to
    egister char devs with a struct device") for more detail.

    In this patch, we choose to embed the struct device into the i2c_dev,
    and use the API provided by the commit 233ed09d7fda to make sure that
    the release of i2c_dev and cdev are in sequence.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ