Message-ID: <loom.20150608T154819-45@post.gmane.org>
Date:	Mon, 8 Jun 2015 14:07:37 +0000 (UTC)
From:	Sergio Callegari <sergio.callegari@...il.com>
To:	linux-kernel@...r.kernel.org
Subject: Probable regression: extremely high IOWAIT on system with Iomega ZIP drive (Parallel ATA interface) after 3.16->3.17 kernel upgrade 

Hi,

I am experiencing a weird issue on an AMD Phenom II system with an AsRock
N68S motherboard (NVIDIA GeForce 7025 / nForce 630a chipset). The system has
an Iomega Zip 100 drive attached via an IDE connector - not exactly recent
hardware.

Everything was working fine up to kernel 3.16.x.

After upgrading the kernel, I occasionally see the system iowait jump to
about 50% and stay there. This typically happens between a few minutes and
a few hours after boot.
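
For anyone who wants to reproduce the measurement without relying on top or
iostat, here is a minimal sketch of how the iowait share can be derived from
two samples of the aggregate "cpu" line in /proc/stat. The function names and
the 5 second interval are my own choices for illustration, not something
taken from an existing tool.

```python
import time


def read_cpu_times(stat_text):
    """Return the counters of the aggregate 'cpu' line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of elapsed CPU time spent in iowait between two samples."""
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest and guest_nice are already accounted in user and nice, so only
    # the first eight counters are summed.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total > 0 else 0.0


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Take two samples 'interval' seconds apart (interval is an assumption)."""
    with open(path) as stat_file:
        before = read_cpu_times(stat_file.read())
    time.sleep(interval)
    with open(path) as stat_file:
        after = read_cpu_times(stat_file.read())
    return iowait_percent(before, after)
```

Calling sample_iowait() on the affected machine should return a value near 50
once the problem has started, if the figure reported by the usual tools is
accurate.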

The high iowait coincides with the kernel reporting hung tasks.

In fact, to see a process hanging, it is sufficient to try mounting a disk
placed in the zip drive. The mount command does not exit. Interestingly, if
I try to mount the zip drive /before/ the iowait jumps high, it is mounted
just fine.

The high iowait by itself does not seem to produce any output in dmesg/syslog.
However, when the mount process hangs, the following output is produced:

[11877.606063] INFO: task mount:14652 blocked for more than 120 seconds.
[11877.606077] Tainted: P C OE 3.19.0-18-generic #18-Ubuntu
[11877.606082] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11877.606088] mount D ffff88006ad038f8 0 14652 14651 0x00000000
[11877.606099] ffff88006ad038f8 ffff880062ecebf0 0000000000014200 ffff88006ad03fd8
[11877.606108] 0000000000014200 ffff88011a818000 ffff880062ecebf0 ffff88011fcd4200
[11877.606115] ffff88006ad03a50 7fffffffffffffff ffff88006ad03a48 ffff880062ecebf0
[11877.606121] Call Trace:
[11877.606139] [<ffffffff817c4f99>] schedule+0x29/0x70
[11877.606149] [<ffffffff817c857c>] schedule_timeout+0x20c/0x280
[11877.606161] [<ffffffff8109ed1d>] ? ttwu_do_activate.constprop.94+0x5d/0x70
[11877.606169] [<ffffffff810a1c19>] ? try_to_wake_up+0x1e9/0x340
[11877.606178] [<ffffffff817c6954>] wait_for_completion+0xa4/0x170
[11877.606183] [<ffffffff810a1de0>] ? wake_up_state+0x20/0x20
[11877.606191] [<ffffffff8108ef1a>] flush_work+0xea/0x1c0
[11877.606200] [<ffffffff8108bb10>] ? destroy_worker+0xa0/0xa0
[11877.606206] [<ffffffff8108f0f8>] __cancel_work_timer+0x98/0x1b0
[11877.606214] [<ffffffff813949f1>] ? exact_lock+0x11/0x20
[11877.606223] [<ffffffff81509d72>] ? kobj_lookup+0x112/0x170
[11877.606230] [<ffffffff813939f0>] ? disk_map_sector_rcu+0x80/0x80
[11877.606237] [<ffffffff8108f243>] cancel_delayed_work_sync+0x13/0x20
[11877.606243] [<ffffffff81395991>] disk_block_events+0x81/0x90
[11877.606252] [<ffffffff8122d64b>] __blkdev_get+0x5b/0x490
[11877.606259] [<ffffffff8122dac1>] blkdev_get+0x41/0x390
[11877.606266] [<ffffffff8122de70>] ? blkdev_get_by_dev+0x60/0x60
[11877.606273] [<ffffffff8122decf>] blkdev_open+0x5f/0x90
[11877.606281] [<ffffffff811f0d82>] do_dentry_open+0x1d2/0x330
[11877.606288] [<ffffffff811f1049>] vfs_open+0x49/0x50
[11877.606296] [<ffffffff81201b47>] do_last+0x227/0x12c0
[11877.606305] [<ffffffff812041e8>] path_openat+0x88/0x610
[11877.606313] [<ffffffff8120598a>] do_filp_open+0x3a/0xb0
[11877.606320] [<ffffffff81212777>] ? __alloc_fd+0xa7/0x130
[11877.606328] [<ffffffff811f299a>] do_sys_open+0x12a/0x280
[11877.606334] [<ffffffff810963ef>] ? __put_cred+0x3f/0x60
[11877.606341] [<ffffffff811f1e70>] ? SyS_access+0x1c0/0x210
[11877.606348] [<ffffffff811f2b0e>] SyS_open+0x1e/0x20
[11877.606356] [<ffffffff817c990d>] system_call_fastpath+0x16/0x1b
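
Frames marked with "?" in the trace are addresses found on the stack that the
unwinder could not confirm, so the interesting chain is the unmarked one
(blkdev_open -> __blkdev_get -> disk_block_events ->
cancel_delayed_work_sync -> flush_work -> wait_for_completion). The following
is a small sketch for filtering a pasted trace down to those frames. The
regular expression is an assumption based on the line format shown above and
may need adjusting for other kernel versions.

```python
import re

# Matches lines such as:
#   [11877.606191] [<ffffffff8108ef1a>] flush_work+0xea/0x1c0
#   [11877.606183] [<ffffffff810a1de0>] ? wake_up_state+0x20/0x20
# Group 1 captures the optional "?" marker, group 2 the symbol name.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(log_text):
    """Return symbol names of the frames that are not marked with '?'."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match and not match.group(1):
            frames.append(match.group(2))
    return frames
```

Feeding it the trace above yields the list from schedule down to
system_call_fastpath with the "?" entries dropped.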

When the high iowait occurs, it often becomes impossible to shut down the
machine cleanly and a hard reset is required. Similarly, with the high iowait
it is hard to test new kernels, since the makeinitramfs or the grub update
phases hang forever.

Detaching the Iomega drive from the system seems to stop the issue.

I have verified that the issue does not exist with kernel 3.16.x by trying
3.16.7. However, the issue is present in 3.17.x, 3.18.x and 3.19.x.

I wonder if someone can point out which changes introduced in the 3.16->3.17
transition could be related to the issue, and what to test in order to
isolate the regression.

Thanks!


