Message-ID: <bug-56821-13602@https.bugzilla.kernel.org/>
Date:	Fri, 19 Apr 2013 12:25:53 +0000 (UTC)
From:	bugzilla-daemon@...zilla.kernel.org
To:	linux-ext4@...r.kernel.org
Subject: [Bug 56821] New: an ext4 commit ee0906f causes weird disk hangs

https://bugzilla.kernel.org/show_bug.cgi?id=56821

           Summary: an ext4 commit ee0906f causes weird disk hangs
           Product: File System
           Version: 2.5
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: ext4
        AssignedTo: fs_ext4@...nel-bugs.osdl.org
        ReportedBy: kynde@...ray.fi
        Regression: Yes


Created an attachment (id=99301)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=99301)
A console msg often seen during the hang

The commit (ee0906fc8da3447d168a73570754a160ecbe399b ext4: use
s_extent_max_zeroout_kb value as number of kb) causes a strange disk/raid/fs
hang for me.
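
The commit subject says the s_extent_max_zeroout_kb value is now used as a
number of kilobytes. A minimal sketch of what that unit change means for the
resulting block count follows. It is illustrative arithmetic, not the kernel
code: the function names, the 32 KiB tunable value and the 4096-byte block size
are assumptions and do not come from this report.

```python
# Illustrative only: compares two ways of turning a KiB tunable into a
# block count. The values below (32 KiB, 4 KiB blocks) are assumptions.


def blocks_if_treated_as_bytes(max_zeroout_kb: int, blocksize_bits: int) -> int:
    """Shift the KiB value as if it were a plain byte count."""
    return max_zeroout_kb >> blocksize_bits


def blocks_if_treated_as_kb(max_zeroout_kb: int, blocksize_bits: int) -> int:
    """Treat the value as KiB: 1 KiB = 2**10 bytes, so shift by 10 bits less.

    Assumes blocksize_bits >= 10 (blocks of at least 1024 bytes).
    """
    return max_zeroout_kb >> (blocksize_bits - 10)


if __name__ == "__main__":
    kb, bits = 32, 12  # assumed: 32 KiB tunable, 4096-byte blocks
    print(blocks_if_treated_as_bytes(kb, bits))  # 0 blocks
    print(blocks_if_treated_as_kb(kb, bits))     # 8 blocks
```

Under these assumed values the threshold goes from 0 blocks to 8 blocks, so
code paths that compare an extent length against it would behave differently
before and after the change.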

Steps to reproduce:
1) log in
2) startx (I've tried with both nv and nvidia)
3) launch thunderbird and wait 3..10 secs

Expected results:
- just another day in the office

Actual results:
- A hang. First some screen refreshes stop happening, and shortly afterwards I
can't do anything besides switch between X and the consoles. I can type on the
terminals that are still live, but any disk access hangs them, too. The message
in the attached console_msg.txt sometimes appears if I wait long enough. What I
do next is Magic SysRq: sync, remount read-only, reboot.
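
For reference, the recovery sequence above corresponds to the SysRq commands
s (sync), u (remount read-only) and b (reboot). A minimal sketch of issuing
them through /proc/sysrq-trigger, assuming root privileges and a kernel with
SysRq enabled; the function name and the dry-run default are hypothetical and
only there so the sketch is safe to run as written.

```python
# Sketch only: SysRq "s", "u", "b" written to /proc/sysrq-trigger.
# emergency_reboot() and its dry_run flag are illustrative names.

SYSRQ_TRIGGER = "/proc/sysrq-trigger"
RECOVERY_KEYS = [
    ("s", "sync all mounted filesystems"),
    ("u", "remount all filesystems read-only"),
    ("b", "reboot immediately"),
]


def emergency_reboot(dry_run=True, trigger=SYSRQ_TRIGGER):
    """Return the SysRq keys in order; write them only when dry_run is False."""
    issued = []
    for key, description in RECOVERY_KEYS:
        print(f"sysrq {key}: {description}")
        if not dry_run:
            # Each write triggers the matching SysRq action; needs root.
            with open(trigger, "w") as f:
                f.write(key)
        issued.append(key)
    return issued


if __name__ == "__main__":
    emergency_reboot()  # dry run: prints the plan, touches nothing
```

With dry_run=False the final write reboots the machine at once, so nothing
after it runs.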

I've used practically every stable release on this box since some time before
3.0 without problems. Ever since 3.8.5 I've been stuck on 3.8.4: I've tried
every stable release from 3.8.5 up to 3.8.8 and none of them work.

The ee0906f commit seems to cause it. I double-checked the surrounding
commits, but nothing beyond that. It takes 10 minutes to resync my raid-1 after
a failure, and that rather limits my enthusiasm for working on it further on my
own. No damage seems to be caused by such an event, though. The raid resync
succeeds every time; it only takes a while.

The setup is an updated Fedora 18 on an AMD 4184, 16 GB RAM, LSI SAS controller
with two 300 GB disks. Each disk has three partitions: the first on both is a
50 GB raid1 ext4 as root, and the second on both is a 100 GB raid1 ext4 as
/home. The third partitions are non-raid old ext3 or ext4 filesystems that
aren't mounted or used.

I haven't managed to cause the hang outside of X. I've tried compiling kernels
and catting files to /dev/null, but no luck. In X (nv or nvidia, it doesn't
matter), though, thunderbird seems to trigger it. It launches fully, but within
a few to ten seconds things start to fail. Another interesting tidbit is that
the disk LEDs in the array both turn off, which is anomalous. Usually they only
blink during access.

I'm willing to provide information and try out things, just let me know what
you need.
