linux-ext4 - Custom driver FS brokenness at 4GB?


Message-ID: <5565CD0D.4080408@gmail.com>
Date:	Wed, 27 May 2015 09:56:29 -0400
From:	Rob Harris <rob.harris@...il.com>
To:	linux-ext4@...r.kernel.org
Subject: Custom driver FS brokenness at 4GB?

Greetings. I have an odd issue and need some ideas of where to go next 
-- I'm out of hair to rip out.

I'm writing a custom block device driver that talks to custom RAID 
hardware (>32TB) using scatter-gather DMA. The device has no partitions, 
and I'm using make_request() to service all BIO requests to simplify 
debugging. The driver works to the point where running dd against the 
block device seems fine (I'm setting iflag=direct and oflag=direct to 
ensure it's really writing to the disk). I also have the blk_queue set 
to request only a single 4k I/O per BIO (again, to simplify debugging 
for now), and a mutex wraps the entire make_request call so that only 
one request is serviced at a time. This should be as "simple" an 
environment as I can make for debugging this problem.

Once the driver is loaded and I try to create a file system (ext4, but 
the same thing happens with xfs), there seems to be some corruption 
occurring, but only when I set the size of the block device to 4GB or 
more. For instance, when I set the size to 4G, I can run mkfs.ext4, but 
after 2 or 3 mount/umount cycles the FS refuses to mount anymore and the 
kernel log complains that the journal is missing. I discovered this by 
running the following loop:

#!/bin/sh
COUNT=4032

while [ 1 ] ; do

figlet ${COUNT}

( umount /mnt ; rmmod smc ) || true
modprobe smc capacity_in_mb=${COUNT} debug=1
mkfs.ext4 -m 0 /dev/smcd

mount /dev/smcd /mnt
cp count_512m.dat /mnt/test
umount /mnt
mount /dev/smcd /mnt
umount /mnt
mount /dev/smcd /mnt
cmp count_512m.dat /mnt/test
umount /mnt
mount /dev/smcd /mnt # ***
sync
umount /mnt
mount /dev/smcd /mnt
sleep 1
umount /mnt

COUNT=$(( COUNT + 64 ))
sleep 1

done

Sometimes I'll get this in the kernel log:
May 27 09:39:01 febtober kernel: [64547.304695] EXT4-fs (smcd): ext4_check_descriptors: Checksum for group 0 failed (7009!=0)
May 27 09:39:01 febtober kernel: [64547.305744] EXT4-fs (smcd): group descriptors corrupted!

Other times I'll get:
May 27 09:46:49 ryftone-smcdrv kernel: [65014.342850] EXT4-fs (smcd): no journal found

I've seen this loop fail as early as COUNT=4096, but as late as 
COUNT=4220; removing the sync changes the behavior.
When it fails, it usually does so on the 3rd mount (***).
FYI, I effectively call set_capacity(disk, capacity_in_mb * 2048), 
since 2048 * 512 bytes (one kernel sector) = 1M.
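
For reference, here is a minimal sketch of that capacity arithmetic in 
shell. The helper names mb_to_sectors and sectors_to_bytes are 
hypothetical and exist only for this illustration; it assumes a shell 
with 64-bit arithmetic. It shows that the sector counts themselves are 
nowhere near 32 bits at these sizes; only the byte size crosses 2^32, 
and it does so exactly at COUNT=4096.

```shell
#!/bin/sh
# Hypothetical helper: convert a capacity in MB to 512-byte kernel
# sectors, mirroring the capacity_in_mb * 2048 expression above.
mb_to_sectors() {
    echo $(( $1 * 2048 ))
}

# Byte size implied by a count of 512-byte kernel sectors.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

# Sizes taken from the loop: the starting COUNT, the earliest failure,
# and the latest failure seen.
for mb in 4032 4096 4220; do
    sectors=$(mb_to_sectors "$mb")
    echo "${mb} MB -> ${sectors} sectors -> $(sectors_to_bytes "$sectors") bytes"
done
```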

Another example: if I set the size of the disk to 16G, I can run 
mkfs.ext4, but the first mount fails and I see:
May 27 09:07:27 febtober kernel: [62653.269387] EXT4-fs (smcd): ext4_check_descriptors: Block bitmap for group 0 not in group (block 4294967295)!

But, again, if I set the device size < 4G, everything seems fine. I can 
currently dd read and write across that 4G boundary without issue -- 
it's ONLY the filesystem accesses. My gut is screaming that there's a 
32/64-bit overflow condition somewhere, but for the life of me I can't 
find it. Is there something I need to set to tell the block layer I have 
a 64-bit addressable device? set_capacity always takes the number of 
LINUX KERNEL sectors (not what I set 
blk_queue_logical|physical_block_size to), correct?
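
To make the suspected failure mode concrete, here is a small sketch of 
what a 32-bit truncation of a byte offset would look like. This is an 
illustration of the hypothesis only, not a diagnosis: the helper names 
offset64 and offset32 are made up, the 0xFFFFFFFF mask merely simulates 
storing sector * 512 in a 32-bit variable, and it assumes a shell with 
64-bit arithmetic. If something like this happened in a code path, 
accesses just past the 4G boundary would alias onto the start of the 
device.

```shell
#!/bin/sh
# Byte offset of a 512-byte kernel sector, computed at full width.
offset64() {
    echo $(( $1 * 512 ))
}

# The same offset as it would look after being squeezed into a 32-bit
# variable (simulated here by masking to the low 32 bits).
offset32() {
    echo $(( ($1 * 512) & 0xFFFFFFFF ))
}

# Last sector below 4G, first sector at 4G, and one 4k block past it.
for sector in 8388607 8388608 8388616; do
    echo "sector ${sector}: 64-bit offset $(offset64 "$sector"), 32-bit offset $(offset32 "$sector")"
done
```

The last sector below 4G keeps its offset, while the sector at exactly 
4G wraps to offset 0 and the next 4k block wraps to offset 4096.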

I'm currently on 3.16.0 (Ubuntu 14.04.2 LTS) if it matters.

Any help/pointers would be greatly appreciated.

--Rob Harris
