Message-Id: <478C22A9.5000009@ce.jp.nec.com>
Date:	Tue, 15 Jan 2008 12:04:09 +0900
From:	"K.Tanaka" <k-tanaka@...jp.nec.com>
To:	linux-scsi@...r.kernel.org
CC:	linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org,
	dm-devel@...hat.com
Subject: [RFC] A SCSI fault injection framework using SystemTap.


I would like to introduce a SCSI fault injection framework using SystemTap.

The kernel currently has the fault-injection framework and the "faulty" mode
for md, both of which can be used to test error handling. However, they can
only produce fixed types of errors stochastically. To simulate more realistic
SCSI disk faults, I have created a new, flexible fault injection framework
using SystemTap.

The new fault injection framework has the following features:

 1) The new framework is flexible: the fault conditions can be changed easily
    without changing the kernel, because they are just SystemTap scripts.
    For example, device faults that result in a SCSI command timeout, and media
    faults that can be corrected by writing data to the failed sector,
    can be simulated with this framework.

 2) The new framework generates "pseudo" faults in the SCSI mid-layer.
    Any upper-layer application or driver that uses the SCSI mid-layer can be
    tested with this framework.

 3) The new framework rewrites the status code and sense data of a SCSI command
    and passes them to the upper layer, so the real error handling routines of
    the upper layer for I/O requests can be tested.
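
As a rough illustration of features 1) and 3), here is a user-space toy model
in Python. It is not the SystemTap tool set, which is not included in this
message. The FaultInjector class, its complete() method and the single-LBA
commands are hypothetical names and simplifications. The status value, sense
key and additional sense code follow the SCSI standard (CHECK CONDITION,
MEDIUM ERROR, unrecovered read error) and are laid out as fixed-format sense
data.

```python
import struct

SAM_STAT_GOOD = 0x00
SAM_STAT_CHECK_CONDITION = 0x02
SENSE_KEY_MEDIUM_ERROR = 0x03
ASC_UNRECOVERED_READ_ERROR = 0x11


def build_medium_error_sense(lba):
    """Build 18 bytes of fixed-format sense data for a read error at lba."""
    sense = bytearray(18)
    sense[0] = 0x70 | 0x80                 # response code 0x70, VALID bit set
    sense[2] = SENSE_KEY_MEDIUM_ERROR      # sense key lives in the low 4 bits
    struct.pack_into(">I", sense, 3, lba)  # INFORMATION field: failing LBA
    sense[7] = 10                          # additional sense length (bytes 8..17)
    sense[12] = ASC_UNRECOVERED_READ_ERROR
    sense[13] = 0x00                       # additional sense code qualifier
    return bytes(sense)


class FaultInjector:
    """Hypothetical model of a media fault that a write to the sector repairs."""

    def __init__(self, bad_lbas):
        self.bad_lbas = set(bad_lbas)

    def complete(self, op, lba):
        """Return the (status, sense) pair that the upper layer would see."""
        if op == "write":
            # Writing to the failed sector clears the simulated fault.
            self.bad_lbas.discard(lba)
            return SAM_STAT_GOOD, b""
        if op == "read" and lba in self.bad_lbas:
            # Rewrite a successful completion into a medium error.
            return SAM_STAT_CHECK_CONDITION, build_medium_error_sense(lba)
        return SAM_STAT_GOOD, b""
```

In this model a read of a bad LBA keeps failing with CHECK CONDITION until the
same LBA is written, after which reads succeed again. This is the kind of
correctable media fault described in 1).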

I have tested software RAID (md/dm-mirror) using this framework
and found some bugs, e.g.:
  - The kernel thread for md RAID1 can deadlock when the error handler for
    md RAID1 contends with write access to the md RAID1 array.

  - dm-mirror's redundancy doesn't work. A read error from a disk that is part
    of the array is passed directly to userspace, without reading from
    the other mirror.
    (It turns out that this is a known issue, but the patch has not been merged.
    http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-raid1-handle-read-failures.patch)

There are also some other bugs in the error handling routines in
multiple-fault situations. I will report the details of these bugs later.

The new framework has been tested on Fedora 8 (i386) running kernel 2.6.23.12.
I am currently cleaning up the tool set for release, and plan to post it in
the near future. If you are interested, please take a look at it then.
If you have any comments, please let me know.

-- 
------------------------------------------------------------------------
Kenichi TANAKA    | Open Source Software Platform Development Division
                  | Computers Software Operations Unit, NEC Corporation
                  | k-tanaka@...jp.nec.com


