Message-ID: <20140603185014.GA8627@redhat.com>
Date:	Tue, 3 Jun 2014 14:50:15 -0400
From:	Mike Snitzer <snitzer@...hat.com>
To:	roma1390 <roma1390@...il.com>
Cc:	linux-kernel@...r.kernel.org,
	Morgan Mears <Morgan.Mears@...app.com>,
	Mikulas Patocka <mpatocka@...hat.com>,
	Joe Thornber <ejt@...hat.com>,
	Heinz Mauelshagen <heinzm@...hat.com>, dm-devel@...hat.com
Subject: Re: 3.13-1 dm cache possible race condition

[please cc dm-devel rather than, or in addition to, LKML in the future]

On Sun, May 18 2014 at 11:35am -0400,
roma1390 <roma1390@...il.com> wrote:

> I think that cache->nr_dirty somehow got broken
> 
> # dmsetup status foo0
> 0 32768 cache 7/4096 1728240 256 1675650 0 0 64 64 4294967295 1
> writeback 2 migration_threshold 2048 4 random_threshold 4
> sequential_threshold 512
> 
> 
> See the 4294967295 value: this is -1, which is not OK.
> 
> 
> Kernel: Debian stock 3.13-1-amd64
> 
> Actions taken:
> 
>  modprobe brd
>  BLOCKS=$[`blockdev --getsize64 /dev/ram0`/512]
>  METADATA_DEV=/dev/ram0
>  CACHE_DEV=/dev/ram1
>  DATA_DEV=/dev/ram2
>  dmsetup create foo0 --table "0 $BLOCKS cache $METADATA_DEV $CACHE_DEV
> $DATA_DEV 512 1 writeback default 0"

Why are you limiting the dm-cache DM table to $BLOCKS, i.e. the size of
the metadata device (/dev/ram0)?  You should be using the size of
DATA_DEV for the length of the cache table.
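
For example, something along these lines (untested; assuming the same
ramdisk layout as in your script) sizes the table from the origin device
instead:

  # size the cache target by the origin (data) device, in 512b sectors
  BLOCKS=$(blockdev --getsize $DATA_DEV)
  dmsetup create foo0 --table "0 $BLOCKS cache $METADATA_DEV $CACHE_DEV \
      $DATA_DEV 512 1 writeback default 0"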

Anyway, based on the below dmsetup status output I can infer that your
metadata device is only 16MB.  But given that you're limiting the origin
size to that 16MB and you're using a cache blocksize of 512 sectors
(256K) there really only needs to be 64 cache blocks to cover the entire
origin device with cache.
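
To spell the math out (taking the 32768-sector table length above as the
origin size):

  # 32768-sector origin / 512-sector cache block size = 64 cache blocks
  echo $(( 32768 / 512 ))    # prints 64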

(BTW, easier to just use blockdev --getsize since DM expects units of
512b sectors)
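
For example, with your ramdisk:

  blockdev --getsize64 /dev/ram0   # bytes, needs dividing by 512 for DM
  blockdev --getsize /dev/ram0     # already in 512b sectors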

> Test:
> one terminal window:
>   while true; do dd if=/dev/zero of=/dev/mapper/foo0 bs=512; done
> second window:
>   while sleep .1; do dmsetup status foo0; done
> 
> 
> after some time the value goes from 0 to 4294967295, which I think is
> not an expected value.
> 
> 
> More info:
> device just created:
> 0 32768 cache 10/4096 1728259 256 1675650 0 0 0 64 0 1 writeback 2
> migration_threshold 2048 4 random_threshold 4 sequential_threshold 512

...

> 0 32768 cache 10/4096 2737453 256 2679623 0 0 0 64 4294967295 1
> writeback 2 migration_threshold 2048 4 random_threshold 4
> sequential_threshold 512

You are clearly experiencing some bug; there is no way you have that
many cache blocks.  nr_dirty should always be bounded by the number of
cache blocks in the cache, so in your case it should be limited to 64
(if I did my math above properly).
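
A crude way to spot the wrap from your second terminal (illustrative
only; it just greps the status line for (uint32_t)-1):

  while sleep .1; do
      dmsetup status foo0 | grep -q 4294967295 && echo "nr_dirty wrapped"
  done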

The newer DM cache versions (in 3.14 and above) provide more useful
status.  Unfortunately, with the older status format you provided, I
cannot infer how large the cache really is.

Anyway, I suspect something odd is happening due to user error.  That
doesn't mean there isn't a bug; it just helps explain why we haven't
seen this.

I will try to reproduce.  But in the meantime, if you could retry with
>= 3.14 and clearly show the "dmsetup table" output (not the shell that
creates it), that would be helpful.
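
Something like this (with your actual device name) would capture what I
need:

  dmsetup table foo0    # the constructed table, as the kernel sees it
  dmsetup status foo0   # status output from the 3.14+ cache target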