Message-ID: <20200428070651.qbsyivvaakflipr4@mpHalley.localdomain>
Date:   Tue, 28 Apr 2020 09:06:51 +0200
From:   Javier González <javier@...igon.com>
To:     Niklas Cassel <Niklas.Cassel@....com>
Cc:     Keith Busch <kbusch@...nel.org>, Jens Axboe <axboe@...com>,
        Christoph Hellwig <hch@....de>,
        Sagi Grimberg <sagi@...mberg.me>,
        Igor Konopko <igor.j.konopko@...el.com>,
        Matias Bjørling <mb@...htnvm.io>,
        Jens Axboe <axboe@...nel.dk>,
        "linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] nvme: prevent double free in nvme_alloc_ns() error
 handling

On 27.04.2020 18:22, Niklas Cassel wrote:
>On Mon, Apr 27, 2020 at 08:03:11PM +0200, Javier González wrote:
>> On 27.04.2020 14:34, Niklas Cassel wrote:
>> > When jumping to the out_put_disk label, we will call put_disk(), which will
>> > trigger a call to disk_release(), which calls blk_put_queue().
>> >
>> > Later in the cleanup code, we do blk_cleanup_queue(), which will also call
>> > blk_put_queue().
>> >
>> > Putting the queue twice is incorrect, and will generate a KASAN splat.
>> >
>> > Set the disk->queue pointer to NULL, before calling put_disk(), so that the
>> > first call to blk_put_queue() will not free the queue.
>> >
>> > The second call to blk_put_queue() uses another pointer to the same queue,
>> > so this call will still free the queue.
>> >
>> > Fixes: 85136c010285 ("lightnvm: simplify geometry enumeration")
>> > Signed-off-by: Niklas Cassel <niklas.cassel@....com>
>> > ---
>> > drivers/nvme/host/core.c | 2 ++
>> > 1 file changed, 2 insertions(+)
>> >
>> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> > index 91c1bd659947..f2adea96b04c 100644
>> > --- a/drivers/nvme/host/core.c
>> > +++ b/drivers/nvme/host/core.c
>> > @@ -3642,6 +3642,8 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
>> >
>> >       return;
>> >  out_put_disk:
>> > +      /* prevent double queue cleanup */
>> > +      ns->disk->queue = NULL;
>> >       put_disk(ns->disk);
>> >  out_unlink_ns:
>> >       mutex_lock(&ctrl->subsys->lock);
>> > --
>> > 2.25.3
>> >
>> What about delaying the assignment of ns->disk?
>>
>> diff --git i/drivers/nvme/host/core.c w/drivers/nvme/host/core.c
>> index a4d8c90ee7cc..6da4a9ced945 100644
>> --- i/drivers/nvme/host/core.c
>> +++ w/drivers/nvme/host/core.c
>> @@ -3541,7 +3541,6 @@ static int nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
>>         disk->queue = ns->queue;
>>         disk->flags = flags;
>>         memcpy(disk->disk_name, disk_name, DISK_NAME_LEN);
>> -       ns->disk = disk;
>>
>>         __nvme_revalidate_disk(disk, id);
>>
>> @@ -3553,6 +3552,8 @@ static int nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
>>                 }
>>         }
>>
>> +       ns->disk = disk;
>> +
>
>Hello Javier!
>
>
>The only case where we jump to the out_put_disk label, is if the
>nvme_nvm_register() call failed.
>
>In that case, we want to undo the alloc_disk_node() operation, i.e.,
>decrease the refcount.
>
>If we don't set "ns->disk = disk;" before the call to nvme_nvm_register(),
>then, if register fails, and we jump to the out_put_disk label and call
>put_disk(ns->disk), ns->disk will be NULL, so the refcount will not be
>decreased, so I assume that this memory would then be leaked.
>
>
>I think that the problem is that the block functions are a bit messy.
>Most drivers seem to do blk_cleanup_queue() first and then do put_disk(),
>but some drivers do it in the opposite way, so I think that we might have
>some more use-after-free bugs in some of these drivers that do it in the
>opposite way.
>

Hi Niklas,

Yes, the out_put_disk label was introduced at the same time as the
LightNVM entry point. We can do a better job at separating the cleanup
functions, but as far as I can see ns->disk is not used in the LightNVM
initialization, so delaying the initialization should be ok. Part of
this change should also be making out_put_disk call put_disk(disk)
instead of put_disk(ns->disk).

Note that initializing other namespace types here does not require
ns->disk either, so delaying initialization should be ok. We have been
running with this patch locally for some time.

That said, this is just an alternative, as your fix works.

Javier
