Message-ID: <CAG3eYYSNYHaKB8REe_M16tUGC3iNDjLsHES+KvTQXknnWZSGUg@mail.gmail.com>
Date:	Thu, 10 Oct 2013 23:23:11 +0200
From:	Péter András Felvégi <petschy@...il.com>
To:	linux-kernel@...r.kernel.org
Subject: PROBLEM: udf mount takes forever to fail + proposed solution

Hello,

Recently I made the mistake of trying to mount an unformatted SSD
partition. The mount command hung and could not be killed. top showed
the process in the uninterruptible D state, yet iotop showed slight
activity: about 4M/s read from a disk that nothing else was using.
This was 100% reproducible. sync froze, too, if it was issued after
the mount command. When I tried to shut down the machine, it did not
stop; it just waited for something to happen.

I narrowed down the problem to the UDF filesystem driver. In
fs/udf/super.c, udf_check_vsd() reads the sectors in a for loop, with
the following exit conditions:
- NSR02 or NSR03 descriptor is found
- the read fails
- vsd->stdIdent[0] == 0

I browsed through the UDF 2.60 spec, ECMA 167 and ECMA 119. As I
understand it, the descriptors should start at offset 32768 and form a
contiguous sequence. ECMA 167 states that the sequence is terminated
by an invalid descriptor: unrecorded, or blank (all zeros). However,
this presupposes that the filesystem is UDF.

Since the SSD partition was not formatted, it contained only 0xff
bytes, so none of the exit conditions was met and the function read
through the whole partition, in two passes. The runtime was pathetic:
it took the mount 350 minutes to fail. I have no clue why this was so
slow; reading through the partition with dd takes 482 secs for the
220G, ~450M/s. Setting the blocksize to 512 or 2048 didn't make much
of a difference.
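
To illustrate why none of the exit conditions triggers, here is a
minimal user-space sketch of the loop as described above. It is not the
kernel code: old_scan_sectors() and the VSD_* constants are
hypothetical names, the "device" is just a byte buffer, and running
past the end of the buffer stands in for a failed read. The only layout
fact assumed is that stdIdent is the 5-byte field following the 1-byte
structure type at the start of each descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VSD_FIRST_OFFSET  32768u /* descriptors start here, per the text */
#define VSD_STD_ID_OFFSET 1u     /* stdIdent follows the 1-byte structType */
#define VSD_STD_ID_LEN    5u

/*
 * Hypothetical simulation of the old scan. Returns the number of
 * sectors examined before the loop stops; *found_nsr is set to 1 if
 * an NSR02 or NSR03 descriptor ended the scan.
 */
static size_t old_scan_sectors(const unsigned char *dev, size_t dev_size,
                               size_t sector_size, int *found_nsr)
{
    size_t examined = 0;
    size_t off;

    assert(sector_size >= VSD_STD_ID_OFFSET + VSD_STD_ID_LEN);
    *found_nsr = 0;

    /* Leaving the buffer plays the role of "the read fails". */
    for (off = VSD_FIRST_OFFSET; off + sector_size <= dev_size;
         off += sector_size) {
        const unsigned char *id = dev + off + VSD_STD_ID_OFFSET;

        examined++;
        if (id[0] == 0)
            break;              /* blank descriptor ends the sequence */
        if (memcmp(id, "NSR02", VSD_STD_ID_LEN) == 0 ||
            memcmp(id, "NSR03", VSD_STD_ID_LEN) == 0) {
            *found_nsr = 1;     /* UDF descriptor found */
            break;
        }
        /* 0xff bytes match neither test, so the scan just continues. */
    }
    return examined;
}
```

On a 64 KiB buffer filled with 0xff and a sector size of 2048, this
examines all 16 sectors between offset 32768 and the end; on a zeroed
buffer it stops after the first one.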

I peppered the code with some messages to see what happens:
# time mount -t udf /dev/sdb3 /media/floppy
UDF-fs: check_vsd: sectorsize=2048
UDF-fs: check_vsd: sector offs=32768, s_blocksize=512, s_blocksize_bits=9
UDF-fs: read 107989660 sectors of total size 55290705920 bytes
UDF-fs: warning (device sdb3): udf_load_vrs: No VRS found
UDF-fs: Rescanning with blocksize 2048
UDF-fs: check_vsd: sectorsize=2048
UDF-fs: check_vsd: sector offs=32768, s_blocksize=2048, s_blocksize_bits=11
UDF-fs: read 107989660 sectors of total size 221162823680 bytes
UDF-fs: warning (device sdb3): udf_load_vrs: No VRS found
UDF-fs: warning (device sdb3): udf_fill_super: No partition found (1)
mount: wrong fs type, bad option, bad superblock on /dev/sdb3,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so
real    352m4.740s
user    0m0.000s
sys     27m23.560s

I tried to mount other partitions, too, formatted as ext3, ext4, btrfs
and ntfs. The mount failed sooner with those, but only by accident:
some blocks near the beginning happened to have a zero byte in just
the right place.

Then I prepared an 'all 0xff' 4G image and burnt it to a DVD. The
mount failed, but took only 25 minutes. 'Only', compared to the case
with the SSD. This truly doesn't reflect the throughput of the
devices; hopefully someone with more experience will have a clue.

The proposed solution changes the for() loop exit condition so that it
checks for an invalid descriptor id rather than for a zero byte. Since
all current ids consist only of the characters 0-9 and A-Z, I think
it's plausible to expect that future ids will have the same form. If
not, the code can be changed later anyway.
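
As a rough illustration of that check (the attached patch is the
authoritative version; this is only a user-space sketch, and
vsd_id_is_valid() is a hypothetical name), an id can be treated as
valid when every byte is an ASCII digit or an uppercase ASCII letter:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if all len bytes of id are in
 * '0'-'9' or 'A'-'Z', and 0 otherwise. Explicit range comparisons are
 * used instead of isdigit()/isupper() so the result does not depend
 * on the locale.
 */
static int vsd_id_is_valid(const unsigned char *id, size_t len)
{
    size_t i;

    assert(id != NULL);
    for (i = 0; i < len; i++) {
        unsigned char c = id[i];
        int is_digit = (c >= '0' && c <= '9');
        int is_upper = (c >= 'A' && c <= 'Z');

        if (!is_digit && !is_upper)
            return 0;   /* e.g. 0xff from blank flash, or 0x00 */
    }
    return 1;
}
```

With this rule, ids such as CD001, BEA01 and NSR02 are accepted, while
five 0xff bytes or five zero bytes are both rejected, so an all-zero
descriptor still terminates the sequence.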

Mounting the invalid ssd partition with the patch in debug mode looks like this:
# time mount -t udf /dev/sdb3 /media/floppy
UDF-fs: udf_is_vsd_id_valid: at offset 0x00008000 vsd.stdIdent[] = { ff ff ff ff ff } : invalid
UDF-fs: warning (device sdb3): udf_load_vrs: No anchor found
UDF-fs: Rescanning with blocksize 2048
UDF-fs: udf_is_vsd_id_valid: at offset 0x00008000 vsd.stdIdent[] = { ff ff ff ff ff } : invalid
UDF-fs: warning (device sdb3): udf_load_vrs: No anchor found
UDF-fs: warning (device sdb3): udf_fill_super: No partition found (1)
mount: wrong fs type, bad option, bad superblock on /dev/sdb3,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so
real    0m0.009s
user    0m0.000s
sys     0m0.000s

And with a valid UDF fs:
# time mount -t udf /dev/sr1 /media/floppy
mount: block device /dev/sr1 is write-protected, mounting read-only
UDF-fs: udf_is_vsd_id_valid: at offset 0x00008000 vsd.stdIdent[] = { 43 44 30 30 31 } : valid (CD001)
UDF-fs: udf_is_vsd_id_valid: at offset 0x00008800 vsd.stdIdent[] = { 43 44 30 30 31 } : valid (CD001)
UDF-fs: udf_is_vsd_id_valid: at offset 0x00009000 vsd.stdIdent[] = { 42 45 41 30 31 } : valid (BEA01)
UDF-fs: udf_is_vsd_id_valid: at offset 0x00009800 vsd.stdIdent[] = { 4e 53 52 30 32 } : valid (NSR02)
UDF-fs: Partition marked readonly; forcing readonly mount
real    0m0.290s
user    0m0.000s
sys     0m0.000s

Please comment on the attached patch and merge it if acceptable. It
was made against 0bfd8ff (v3.9.4), but applied successfully to
v3.12-rc3, too.

Kind regards, Peter

Download attachment "super.patch" of type "application/octet-stream" (1989 bytes)
