[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2023033124-causing-cassette-4d96@gregkh>
Date: Fri, 31 Mar 2023 18:32:27 +0200
From: Greg KH <gregkh@...uxfoundation.org>
To: Mirsad Goran Todorovac <mirsad.todorovac@....unizg.hr>
Cc: LKML <linux-kernel@...r.kernel.org>,
Thorsten Leemhuis <regressions@...mhuis.info>,
Maxim Levitsky <maximlevitsky@...il.com>,
Alex Dubov <oakad@...oo.com>,
Ulf Hansson <ulf.hansson@...aro.org>,
Jens Axboe <axboe@...nel.dk>,
Christophe JAILLET <christophe.jaillet@...adoo.fr>,
Hannes Reinecke <hare@...e.de>,
Jiasheng Jiang <jiasheng@...as.ac.cn>,
ye xingchen <ye.xingchen@....com.cn>, linux-mmc@...r.kernel.org
Subject: Re: BUG FIX: [PATCH RFC v2] memstick_check() memleak in kernel
6.1.0+ introduced pre 4.17
On Fri, Mar 31, 2023 at 04:46:03PM +0200, Mirsad Goran Todorovac wrote:
> On 29.3.2023. 19:25, Mirsad Goran Todorovac wrote:
> > On 23.12.2022. 14:20, Mirsad Goran Todorovac wrote:
> > > Hi all,
> > >
> > > When building a RPM 6.1.0-rc3 for AlmaLinux 8.6, I have enabled CONFIG_DEBUG_KMEMLEAK=y
> > > and the result showed an unreferenced object in kworker process:
> > >
> > > cat /sys/kernel/debug/kmemleak
> > > unreferenced object 0xffff888105028d80 (size 16):
> > > comm "kworker/u12:5", pid 359, jiffies 4294902898 (age 1620.144s)
> > > hex dump (first 16 bytes):
> > > 6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00 memstick0.......
> > > backtrace:
> > > [<ffffffffb6bb5542>] slab_post_alloc_hook+0xb2/0x340
> > > [<ffffffffb6bbbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
> > > [<ffffffffb6af8175>] __kmalloc_node_track_caller+0x55/0x160
> > > [<ffffffffb6ae34a6>] kstrdup+0x36/0x60
> > > [<ffffffffb6ae3508>] kstrdup_const+0x28/0x30
> > > [<ffffffffb70d0757>] kvasprintf_const+0x97/0xd0
> > > [<ffffffffb7c9cdf4>] kobject_set_name_vargs+0x34/0xc0
> > > [<ffffffffb750289b>] dev_set_name+0x9b/0xd0
> > > [<ffffffffc12d9201>] memstick_check+0x181/0x639 [memstick]
> > > [<ffffffffb676e1d6>] process_one_work+0x4e6/0x7e0
> > > [<ffffffffb676e556>] worker_thread+0x76/0x770
> > > [<ffffffffb677b468>] kthread+0x168/0x1a0
> > > [<ffffffffb6604c99>] ret_from_fork+0x29/0x50
> > >
> > > mtodorov@...ac:~/linux/kernel/linux_stable$ git bisect log
> > > git bisect start
> > > # bad: [f0c4d9fc9cc9462659728d168387191387e903cc] Linux 6.1-rc4
> > > git bisect bad f0c4d9fc9cc9462659728d168387191387e903cc
> > > # bad: [fbd56ddcecab5a3623a89c8e941fdbcc55b41045] Linux 6.0.1
> > > git bisect bad fbd56ddcecab5a3623a89c8e941fdbcc55b41045
> > > # bad: [7e18e42e4b280c85b76967a9106a13ca61c16179] Linux 6.0-rc4
> > > git bisect bad 7e18e42e4b280c85b76967a9106a13ca61c16179
> > > # bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
> > > git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
> > > # bad: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
> > > git bisect bad 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
> > > # bad: [94710cac0ef4ee177a63b5227664b38c95bbf703] Linux 4.18
> > > git bisect bad 94710cac0ef4ee177a63b5227664b38c95bbf703
> > > # bad: [29dcea88779c856c7dc92040a0c01233263101d4] Linux 4.17
> > > git bisect bad 29dcea88779c856c7dc92040a0c01233263101d4
> > >
> > > Greg asked me if I would help bisect the bug, since I failed to
> > > reproduce it on pre 4.17 kernels, because they wouldn't boot (black
> > > screen) on the Lenovo ALmaLinux 8.7 (CentOS fork) desktop box that
> > > only reproduced that bug:
> > >
> > > product: 10TX000VCR (LENOVO_MT_10TX_BU_Lenovo_FM_V530S-07ICB)
> > > vendor: LENOVO
> > > version: V530S-07ICB
> > >
> > > I would welcome any advice.
> > >
> > > Please find attached the lshw output and the build config from the
> > > last kernel version that also exhibits this bug, so the conclusion
> > > is that it is not fixed since the report on November 29th 2022:
> > >
> > > https://lore.kernel.org/regressions/0d9c3f6c-3948-d5d1-bcc1-baf31141beaa@alu.unizg.hr/T/#t
> > >
> > > With the hint of Tvrtko, I was able to extract the correct list of maintainers this time.
> > >
> > > The bug occurs in one kernel memory leak, and it is unobvious
> > > whether a skilled attacker could use an abusive program to trigger
> > > the leak of enough 16 byte slabs (and overhead) to exhaust kernel
> > > memory and cause denial-of-service (crash of the system).
> > >
> > > I apologise for the first unsuccessful attempt.
> >
> > static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
> >
> > calls dev_set_name(&card->dev, "%s", dev_name(&host->dev)); that
> > calls err = kobject_set_name_vargs(&dev->kobj, fmt, vargs); that
> > executes:
> >
> > if (strchr(s, '/')) {
> > char *t;
> >
> > t = kstrdup(s, GFP_KERNEL);
> > kfree_const(s);
> > if (!t)
> > return -ENOMEM;
> > strreplace(t, '/', '!');
> > s = t;
> > }
> > kfree_const(kobj->name);
> > kobj->name = s;
> >
> > so, this kobj->name was never freed in the "goto err_out" case in line 404.
> >
> > 380 static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
> > 381 {
> > 382 struct memstick_dev *card = kzalloc(sizeof(struct memstick_dev),
> > 383 GFP_KERNEL);
> > 384 struct memstick_dev *old_card = host->card;
> > 385 struct ms_id_register id_reg;
> > 386
> > 387 if (card) {
> > 388 card->host = host;
> > 389 dev_set_name(&card->dev, "%s", dev_name(&host->dev));
> > 390 card->dev.parent = &host->dev;
> > 391 card->dev.bus = &memstick_bus_type;
> > 392 card->dev.release = memstick_free_card;
> > 393 card->check = memstick_dummy_check;
> > 394
> > 395 card->reg_addr.r_offset = offsetof(struct ms_register, id);
> > 396 card->reg_addr.r_length = sizeof(id_reg);
> > 397 card->reg_addr.w_offset = offsetof(struct ms_register, id);
> > 398 card->reg_addr.w_length = sizeof(id_reg);
> > 399
> > 400 init_completion(&card->mrq_complete);
> > 401
> > 402 host->card = card;
> > 403 if (memstick_set_rw_addr(card))
> > 404 goto err_out;
> > 405
> > 406 card->next_request = h_memstick_read_dev_id;
> > 407 memstick_new_req(host);
> > 408 wait_for_completion(&card->mrq_complete);
> > 409
> > 410 if (card->current_mrq.error)
> > 411 goto err_out;
> > 412 }
> > 413 host->card = old_card;
> > 414 return card;
> > 415 err_out:
> > 416 host->card = old_card;
> > 421 kfree(card);
> > 422 return NULL;
> > 423 }
> >
> > This little patch fixes it, also at the release() method.
> >
> > However, release() had not yet been tested, and I am not certain that in that case
> > kobj->name would not be kfree_const()-ed automatically.
> >
> > Maybe it ought to be separated in two independent patches?
> >
> > diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
> > index bf7667845459..403ab06e3ffa 100644
> > --- a/drivers/memstick/core/memstick.c
> > +++ b/drivers/memstick/core/memstick.c
> > @@ -191,6 +191,10 @@ static void memstick_free_card(struct device *dev)
> > {
> > struct memstick_dev *card = container_of(dev, struct memstick_dev,
> > dev);
> > + if ((card->dev).kobj.name) {
> > + kfree_const((card->dev).kobj.name);
> > + (card->dev).kobj.name = NULL;
> > + }
> > kfree(card);
> > }
> >
> > @@ -410,6 +414,10 @@ static struct memstick_dev *memstick_alloc_card(struct memstick_host *host)
> > return card;
> > err_out:
> > host->card = old_card;
> > + if ((card->dev).kobj.name) {
> > + kfree_const((card->dev).kobj.name);
> > + (card->dev).kobj.name = NULL;
> > + }
> > kfree(card);
> > return NULL;
> > }
> >
> > This morning I did not feel like we'd fix two memory leaks today.
> >
> > This one was a nag for three months :-)
> >
> > Of course, this requires peer review. The fact that it fixed the /sys/kernel/debug/kmemleak
> > output doesn't mean that it wouldn't break something, does it?
> >
> > It is clearly the good wind of the Providence.
>
> This is the second version of the patch, without the paranoid parentheses.
>
> I am still in the process of convincing Thunderbird not to convert tabs to
> spaces, so please use --ignore-whitespace when testing this patch. :-(
>
> ---
> drivers/memstick/core/memstick.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
> index bf7667845459..390287c23f27 100644
> --- a/drivers/memstick/core/memstick.c
> +++ b/drivers/memstick/core/memstick.c
> @@ -191,6 +191,10 @@ static void memstick_free_card(struct device *dev)
> {
> struct memstick_dev *card = container_of(dev, struct memstick_dev,
> dev);
> + if (card->dev.kobj.name) {
> + kfree_const(card->dev.kobj.name);
Ick, no, please don't mess around with a kobject name from within a
driver like this. That's indicitave that something else is really
wrong.
Yes, the nvme core code does it, but it shouldn't.
Hm, the driver core does it in two places too, that's not good, I'll
look at fixing that up too.
This patch is implying that anyone who calls "dev_set_name()" also has
to do this hack, which shouldn't be the case at all.
thanks,
greg k-h
Powered by blists - more mailing lists