Message-ID: <20140603185014.GA8627@redhat.com>
Date: Tue, 3 Jun 2014 14:50:15 -0400
From: Mike Snitzer <snitzer@...hat.com>
To: roma1390 <roma1390@...il.com>
Cc: linux-kernel@...r.kernel.org,
Morgan Mears <Morgan.Mears@...app.com>,
Mikulas Patocka <mpatocka@...hat.com>,
Joe Thornber <ejt@...hat.com>,
Heinz Mauelshagen <heinzm@...hat.com>, dm-devel@...hat.com
Subject: Re: 3.13-1 dm cache possible race condition
[please cc dm-devel rather than, or in addition to, LKML in the future]
On Sun, May 18 2014 at 11:35am -0400,
roma1390 <roma1390@...il.com> wrote:
> I think that somehow is got broken cache->nr_dirty
>
> # dmsetup status foo0
> 0 32768 cache 7/4096 1728240 256 1675650 0 0 64 64 4294967295 1
> writeback 2 migration_threshold 2048 4 random_threshold 4
> sequential_threshold 512
>
>
> See: 4294967295 this is -1 as is not OK.
>
>
> Kernel: Debian stock 3.13-1-amd64
>
> Actions taken:
>
> modprobe brd
> BLOCKS=$[`blockdev --getsize64 /dev/ram0`/512]
> METADATA_DEV=/dev/ram0
> CACHE_DEV=/dev/ram1
> DATA_DEV=/dev/ram2
> dmsetup create foo0 --table "0 $BLOCKS cache $METADATA_DEV $CACHE_DEV
> $DATA_DEV 512 1 writeback default 0"
Why are you limiting the dm-cache's DM table to $BLOCKS of the metadata
device (/dev/ram0)? You should be using the DATA_DEV for the size of
the cache table.
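
A minimal sketch of sizing the table from the origin (data) device, assuming the same device roles as in the quoted commands. The helper name cache_table is hypothetical, and the sector count is a literal so the snippet runs without real devices. On a real system it could come from `blockdev --getsz "$DATA_DEV"`, which reports 512-byte sectors.

```shell
#!/bin/sh
# Hypothetical helper: print a dm-cache table line whose length is the
# size of the origin (data) device, given in 512-byte sectors.
# Arguments: <origin sectors> <metadata dev> <cache dev> <origin dev>
cache_table() {
    printf '0 %s cache %s %s %s 512 1 writeback default 0\n' "$1" "$2" "$3" "$4"
}

# Illustrative literal (a 16 MiB origin); not measured from a real device.
# On a real system: DATA_SECTORS=$(blockdev --getsz "$DATA_DEV")
DATA_SECTORS=32768
TABLE=$(cache_table "$DATA_SECTORS" /dev/ram0 /dev/ram1 /dev/ram2)
echo "$TABLE"
# The result would then be passed as: dmsetup create foo0 --table "$TABLE"
```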
Anyway, based on the below dmsetup status output I can infer that your
metadata device is only 16MB. But given that you're limiting the origin
size to that 16MB and you're using a cache blocksize of 512 sectors
(256K) there really only needs to be 64 cache blocks to cover the entire
origin device with cache.
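
The arithmetic behind those numbers, as a small sketch. The two input values are taken from the quoted table and status line (a 32768-sector table and a 512-sector cache block size); the variable names are only illustrative.

```shell
#!/bin/sh
# Values taken from the quoted status line and table.
ORIGIN_SECTORS=32768      # table length, in 512-byte sectors
BLOCK_SECTORS=512         # cache block size, in 512-byte sectors

ORIGIN_MIB=$(( ORIGIN_SECTORS * 512 / 1024 / 1024 ))
BLOCK_KIB=$(( BLOCK_SECTORS * 512 / 1024 ))
# Round up so a partial trailing block would still be counted.
CACHE_BLOCKS=$(( (ORIGIN_SECTORS + BLOCK_SECTORS - 1) / BLOCK_SECTORS ))

echo "origin: ${ORIGIN_MIB} MiB, block: ${BLOCK_KIB} KiB, blocks needed: ${CACHE_BLOCKS}"
```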
(BTW, it is easier to just use blockdev --getsize, since DM expects
sizes in units of 512-byte sectors.)
> Test:
> one terminal window:
> while true; do dd if=/dev/zero of=/dev/mapper/foo0 bs=512; done
> second window:
> while sleep .1; do dmsetup status foo0; done
>
>
> after some time from 0 i get to 4294967295, which is think is not
> expected value.
>
>
> More info:
> device just created:
> 0 32768 cache 10/4096 1728259 256 1675650 0 0 0 64 0 1 writeback 2
> migration_threshold 2048 4 random_threshold 4 sequential_threshold 512
...
> 0 32768 cache 10/4096 2737453 256 2679623 0 0 0 64 4294967295 1
> writeback 2 migration_threshold 2048 4 random_threshold 4
> sequential_threshold 512
You clearly are experiencing some bug; there is no way you have that
many cache blocks. nr_dirty should always be bounded by the number of
cache blocks in the cache. So in your case it should be limited to 64
(if I did my math above properly).
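
For illustration, 4294967295 is what a 32-bit unsigned counter shows after being decremented once below zero. The sketch below only emulates that arithmetic and applies the bound described above. That the kernel counter wraps in exactly this way is an assumption, the variable names are made up, and the limit of 64 is the estimate from this reply rather than a measured cache size.

```shell
#!/bin/sh
# Emulate a 32-bit unsigned counter decremented past zero.
NR_DIRTY=0
NR_DIRTY=$(( (NR_DIRTY - 1) & 0xFFFFFFFF ))
echo "nr_dirty after underflow: $NR_DIRTY"

# Sanity check: the dirty count should never exceed the number of
# cache blocks (64 here is the estimate from the reply, an assumption).
CACHE_BLOCKS=64
if [ "$NR_DIRTY" -gt "$CACHE_BLOCKS" ]; then
    VERDICT=bogus
else
    VERDICT=ok
fi
echo "nr_dirty=$NR_DIRTY is $VERDICT for $CACHE_BLOCKS cache blocks"
```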
The newer DM cache versions (in 3.14 and above) provide more useful
status. But unfortunately, I cannot infer from the older status output
you provided how large the cache really is.
Anyway, I suspect something odd is happening due to user error. That
doesn't mean there isn't a bug; it just helps explain why we haven't
seen this.
Will try to reproduce. But in the meantime if you could retry with
>= 3.14 and clearly show the "dmsetup table" (not the shell that creates
it) that'd be helpful.