lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 6 Jun 2016 12:09:49 -0500
From:	Shaun Tancheff <shaun.tancheff@...gate.com>
To:	Catalin Marinas <catalin.marinas@....com>
Cc:	Christoph Hellwig <hch@....de>, linux-block@...r.kernel.org,
	LKML <linux-kernel@...r.kernel.org>, Jens Axboe <axboe@...com>,
	Larry.Finger@...inger.net, bart.vanassche@...disk.com,
	drysdale@...gle.com
Subject: Re: kmemleak report after 9082e87bfbf8 ("block: remove struct bio_batch")

On Mon, Jun 6, 2016 at 11:12 AM, Catalin Marinas
<catalin.marinas@....com> wrote:
>
> On Mon, Jun 06, 2016 at 04:13:34PM +0200, Christoph Hellwig wrote:
> > I've got a few reports of this over the weekend, but it still doesn't
> > make much sense to me.
> >
> > Could it be that kmemleak can't deal with the bio_batch logic?  I've
> > tried to look at the various bio and biovec number entries in
> > /proc/slabinfo, and while they keep changing a bit during the
> > system runtime there doesn't seem to be a persistent increase
> > even after lots of mkfs calls.
>
> I think the reported leaks settle after about 10-20min (2-3 kmemleak
> periodic scans), so checking /proc/slabinfo may not be sufficient if
> the leak is not growing. The leaks also do not seem to disappear,
> otherwise kmemleak would no longer report them (e.g. after kfree, even
> if they had been previously reported).
>
> What kmemleak reports is objects for which it cannot find a pointer (to
> anywhere inside that object; e.g. list_heads are handled). False
> positives are indeed present sometimes but for cases where pointers are
> stored in non-tracked objects like alloc_pages().
>
> It seems that this leak reports always come in pairs. The first one:
>
> unreferenced object 0xffff880262cbe900 (size 256):
>   comm "NetworkManager", pid 516, jiffies 4294895670 (age 2479.340s)
>   hex dump (first 32 bytes):
>     00 00 00 00 00 00 00 00 c0 f3 ab 8a 00 88 ff ff  ................
>     02 20 00 20 00 00 00 00 11 00 00 00 00 00 00 00  . . ............
>   backtrace:
>     [<ffffffff811d244e>] kmem_cache_alloc+0xfe/0x250
>     [<ffffffff81175b42>] mempool_alloc+0x72/0x190
>     [<ffffffff812dc9f6>] bio_alloc_bioset+0xb6/0x240
>     [<ffffffff812eca7f>] next_bio+0x1f/0x50
>     [<ffffffff812ecf1a>] blkdev_issue_zeroout+0xea/0x1d0
>     [<ffffffffc025b610>] ext4_issue_zeroout+0x40/0x50 [ext4]
>     [<ffffffffc028c58d>] ext4_ext_map_blocks+0x144d/0x1bb0 [ext4]
>     [<ffffffff811839f4>] release_pages+0x254/0x310
>     [<ffffffff8118437a>] __pagevec_release+0x2a/0x40
>     [<ffffffffc025b297>] mpage_prepare_extent_to_map+0x227/0x2c0 [ext4]
>     [<ffffffffc025b793>] ext4_map_blocks+0x173/0x5d0 [ext4]
>     [<ffffffffc025f1d0>] ext4_writepages+0x700/0xd40 [ext4]
>     [<ffffffff8121312e>] legitimize_mnt+0xe/0x50
>     [<ffffffff811d244e>] kmem_cache_alloc+0xfe/0x250
>     [<ffffffff81173fd5>] __filemap_fdatawrite_range+0xc5/0x100
>     [<ffffffff81174113>] filemap_write_and_wait_range+0x33/0x70
>
> is the first mempool_alloc() in bio_alloc_bioset() for struct bio and
> front_pad.
>
> The second report:
>
> unreferenced object 0xffff880036488600 (size 256):
>   comm "NetworkManager", pid 516, jiffies 4294895670 (age 2479.348s)
>   hex dump (first 32 bytes):
>     80 39 08 00 00 ea ff ff 00 10 00 00 00 00 00 00  .9..............
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<ffffffff811d244e>] kmem_cache_alloc+0xfe/0x250
>     [<ffffffff812dc8b7>] bvec_alloc+0x57/0xe0
>     [<ffffffff812dcaaf>] bio_alloc_bioset+0x16f/0x240
>     [<ffffffff812eca7f>] next_bio+0x1f/0x50
>     [<ffffffff812ecf1a>] blkdev_issue_zeroout+0xea/0x1d0
>     [<ffffffffc025b610>] ext4_issue_zeroout+0x40/0x50 [ext4]
>     [<ffffffffc028c58d>] ext4_ext_map_blocks+0x144d/0x1bb0 [ext4]
>     [<ffffffff811839f4>] release_pages+0x254/0x310
>     [<ffffffff8118437a>] __pagevec_release+0x2a/0x40
>     [<ffffffffc025b297>] mpage_prepare_extent_to_map+0x227/0x2c0 [ext4]
>     [<ffffffffc025b793>] ext4_map_blocks+0x173/0x5d0 [ext4]
>     [<ffffffffc025f1d0>] ext4_writepages+0x700/0xd40 [ext4]
>     [<ffffffff8121312e>] legitimize_mnt+0xe/0x50
>     [<ffffffff811d244e>] kmem_cache_alloc+0xfe/0x250
>     [<ffffffff81173fd5>] __filemap_fdatawrite_range+0xc5/0x100
>     [<ffffffff81174113>] filemap_write_and_wait_range+0x33/0x70
>
> is for the struct bio_vec allocation in bvec_alloc() (the one going via
> kmem_cache_alloc).
>
> IIUC, the bio object above allocated via next_bio() ->
> bio_alloc_bioset() is returned to __blkdev_issue_zeroout() which
> eventually submits them either directly for the last one or via
> next_bio().
>
> Regarding bio chaining, I can't figure out what the first bio allocated
> in __blkdev_issue_zeroout() is chained to since bio == NULL initially.
> Subsequent next_bio() allocations are linked to the previous ones via
> bio_chain() but somehow kmemleak loses track of the first one, hence the
> subsequent bios are reported as leaks. That's unless the chaining should
> be the other way around:
>
> diff --git a/block/blk-lib.c b/block/blk-lib.c
> index 23d7f301a196..3bf78b7b74cc 100644
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -15,7 +15,7 @@ static struct bio *next_bio(struct bio *bio, int rw, unsigned int nr_pages,
>         struct bio *new = bio_alloc(gfp, nr_pages);
>
>         if (bio) {
> -               bio_chain(bio, new);
> +               bio_chain(new, bio);
>                 submit_bio(rw, bio);
>         }
>
>
> Also confusing is that chaining is done via bio->bi_private, however
> this is overridden in other places like submit_bio_wait().
>
> However, since I don't fully understand this code, this chaining may not
> even be essential to struct bio freeing (and I'm investigating the wrong
> path).
>
> --
> Catalin
> --
> To unsubscribe from this list: send the line "unsubscribe linux-block" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  https://urldefense.proofpoint.com/v2/url?u=http-3A__vger.kernel.org_majordomo-2Dinfo.html&d=CwIBAg&c=IGDlg0lD0b-nebmJJ0Kp8A&r=Wg5NqlNlVTT7Ugl8V50qIHLe856QW0qfG3WVYGOrWzA&m=uHYaMtk0CBsO1MWg4alX8kHC6bvXsmWCR9wL8nrX28E&s=4EHwF1z6acQYr9R052hraaTE1KUf6IiXcEmd-CSLV48&e=


I'm pretty sure it is missing a bio_put() after submit_bio_wait().

Please excuse the hack-y patch but I think you need to do something
like this ...
(Note tabs eaten by gmail).

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 23d7f30..9e29dc3 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -113,6 +113,7 @@ int blkdev_issue_discard(struct block_device
*bdev, sector_t sector,
                ret = submit_bio_wait(type, bio);
                if (ret == -EOPNOTSUPP)
                        ret = 0;
+               bio_put(bio);
        }
        blk_finish_plug(&plug);

@@ -165,8 +166,10 @@ int blkdev_issue_write_same(struct block_device
*bdev, sector_t sector,
                }
        }

-       if (bio)
+       if (bio) {
                ret = submit_bio_wait(REQ_WRITE | REQ_WRITE_SAME, bio);
+               bio_put(bio);
+       }
        return ret != -EOPNOTSUPP ? ret : 0;
 }
 EXPORT_SYMBOL(blkdev_issue_write_same);
@@ -206,8 +209,11 @@ static int __blkdev_issue_zeroout(struct
block_device *bdev, sector_t sector,
                }
        }

-       if (bio)
-               return submit_bio_wait(WRITE, bio);
+       if (bio) {
+               ret = submit_bio_wait(WRITE, bio);
+               bio_put(bio);
+               return ret;
+       }
        return 0;
 }



-- 
Shaun Tancheff

Powered by blists - more mailing lists