[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <875y2gi33n.ffs@tglx>
Date: Sun, 05 Nov 2023 17:20:28 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Ben Greear <greearb@...delatech.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Cc: Rodolfo Giometti <giometti@...eenne.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH/RFC] debugobjects/slub: Print slab info and backtrace.
On Thu, Nov 02 2023 at 18:49, Ben Greear wrote:
> And here is resulting splat from wireless-next tree I've been
> debugging.
>
> Note the subsequent splats from slub are due to some memory poisoning, for
> one reason or another. Maybe slub changes should not be included in this patch, not
> sure if it can provide useful info in other cases though.
>
> If I understand this correctly, then it appears the bug is related to
> the pps driver.
>
> 16140 Nov 02 17:28:25 ct523c-2103 kernel: ODEBUG: debugobjects: debug_obj allocated at:
> 16141 Nov 02 17:28:25 ct523c-2103 kernel: init_timer_key+0x24/0x160
> 16142 Nov 02 17:28:25 ct523c-2103 kernel: kobject_put+0x14f/0x190
> 16143 Nov 02 17:28:25 ct523c-2103 kernel: pps_device_destruct+0x26/0xb0
> 16144 Nov 02 17:28:25 ct523c-2103 kernel: device_release+0x57/0x100
> 16145 Nov 02 17:28:25 ct523c-2103 kernel: kobject_delayed_cleanup+0xdf/0x140
> 16146 Nov 02 17:28:25 ct523c-2103 kernel: process_one_work+0x475/0x920
> 16147 Nov 02 17:28:25 ct523c-2103 kernel: worker_thread+0x38a/0x680
Can you please provide proper kernel dmesg output next time instead of
this mess?
> ODEBUG: free active (active state 0) object: ffff888181c029a0 object type: timer_list hint: kobject_delayed_cleanup+0x0/0x140
> WARNING: CPU: 1 PID: 104 at lib/debugobjects.c:549 debug_print_object+0xf0/0x170
> CPU: 1 PID: 104 Comm: kworker/1:10 Tainted: G W 6.6.0-rc7+ #17
> Workqueue: events kobject_delayed_cleanup
> RIP: 0010:debug_print_object+0xf0/0x170
> debug_check_no_obj_freed+0x261/0x2b0
> __kmem_cache_free+0x185/0x200
> device_release+0x57/0x100
> kobject_delayed_cleanup+0xdf/0x140
> process_one_work+0x475/0x920
> worker_thread+0x38a/0x680
So what happens is:
pps_unregister_cdev()
device_destroy()
put_device()
device_unregister()
device_del()
put_device() <- Drops final reference to dev->kobj
schedule_delayed_work()
worker thread:
kobject_delayed_cleanup()
device_release()
pps_device_destruct()
cdev_del(&pps->cdev)
kobject_put(&cdev->kobj) <- Drops final reference
schedule_delayed_work()
init_timer(&cdev->kobj.release.timer);
start_timer();
...
kfree(dev);
kfree(pps); <- Debug object detects the active timer to be freed
because cdev and its kobject are embedded in
struct pps_device.
pps_device_destruct() is unfortunately not on the call trace of the
debug objects splat anymore stack because kfree(pps) is a tail call.
So yes, that collected stacktrace is helpful.
>> To try to improve this, store the backtrace of where the
>> debug_obj was created and print that out when problems
>> are found.
<SNIP>
Please trim your replies.
Thanks,
tglx
Powered by blists - more mailing lists