Message-ID: <466E9220.5050507@dgreaves.com>
Date:	Tue, 12 Jun 2007 13:31:28 +0100
From:	David Greaves <david@...eaves.com>
To:	David Chinner <dgc@....com>
Cc:	Tejun Heo <htejun@...il.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	"Rafael J. Wysocki" <rjw@...k.pl>, xfs@....sgi.com,
	"'linux-kernel@...r.kernel.org'" <linux-kernel@...r.kernel.org>,
	linux-pm <linux-pm@...ts.osdl.org>, Neil Brown <neilb@...e.de>
Subject: Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)

[RESEND since I sent this late last Friday and it's probably been buried by now.]

I had this as a PS, then I thought, we could all be wasting our time...

I don't like these "Section mismatch" warnings, but that's because I'm paranoid
rather than because I know what they mean. I'll be happier when someone says
"That's OK, I know about them, they're not the problem."

WARNING: arch/i386/kernel/built-in.o(.text+0x968f): Section mismatch: reference
to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0x9781): Section mismatch: reference
to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0x9786): Section mismatch: reference
to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0xa25c): Section mismatch: reference
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa303): Section mismatch: reference
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa31b): Section mismatch: reference
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa344): Section mismatch: reference
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.exit.text+0x19): Section mismatch:
reference to .init.text: (between 'cache_remove_dev' and 'powernow_k6_exit')
WARNING: arch/i386/kernel/built-in.o(.data+0x2160): Section mismatch: reference
to .init.text: (between 'thermal_throttle_cpu_notifier' and 'mce_work')
WARNING: kernel/built-in.o(.text+0x14502): Section mismatch: reference to
.init.text: (between 'kthreadd' and 'init_waitqueue_head')
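
A list like this is easier to scan when grouped by object file and referencing section. The following is a minimal sketch, assuming the warnings come from a raw build log where each one sits on a single line (the wrapped lines in this mail would need joining first); the function name summarise_mismatches is hypothetical.

```shell
# Hypothetical helper: count modpost "Section mismatch" warnings per
# object file and referencing section. Reads a build log on stdin and
# assumes one warning per line.
summarise_mismatches() {
    # Keep only "<object> <section>" from each matching WARNING line.
    sed -n 's/^WARNING: \([^(]*\)(\([^+]*\)+0x[0-9a-f]*): Section mismatch.*/\1 \2/p' |
        sort | uniq -c |
        # Normalise the padding that uniq -c adds, then list the largest counts first.
        awk '{ print $1, $2, $3 }' |
        sort -k1,1nr -k2,3
}
```

Usage would be something like `make 2>&1 | summarise_mismatches`. It only counts and groups; it says nothing about which of the references are actually harmful.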

I'm paranoid because Andrew Morton said a couple of weeks ago:
> Could the people who write these bugs, please, like, fix them?
> It's not trivial noise.  These things lead to kernel crashes.

Anyhow...

David Chinner wrote:
> sync just guarantees that metadata changes are logged and data is
> on disk - it doesn't stop the filesystem from doing anything after
> the sync...
No, but there are no apps accessing the filesystem. It's just available for NFS
serving. Seems safer before potentially hanging the machine?


Also I made these changes to the kernel config:
cu:/boot# diff config-2.6.22-rc4-TejuTst-dbg3-dirty config-2.6.22-rc4-TejuTst-dbg1-dirty
3,4c3,4
< # Linux kernel version: 2.6.22-rc4-TejuTst-dbg3
< # Thu Jun  7 20:00:34 2007
---
> # Linux kernel version: 2.6.22-rc4-TejuTst3
> # Thu Jun  7 10:59:21 2007
242,244c242
< CONFIG_PM_DEBUG=y
< CONFIG_DISABLE_CONSOLE_SUSPEND=y
< # CONFIG_PM_TRACE is not set
---
> # CONFIG_PM_DEBUG is not set

positive: I can now get sysrq-t :)
negative: if I build skge into the kernel the behaviour changes so I can't run
netconsole

Just to be sure, I tested: this kernel suspends/restores with /huge unmounted.
It still hangs if /huge is left mounted, so the behaviour is the same.

> Ok, so a clean inode is sufficient to prevent hibernate from working.
> 
> So, what's different between a sync and a remount?
> 
> do_remount_sb() does:
> 
>     599         shrink_dcache_sb(sb);
>     600         fsync_super(sb);
> 
> of which a sync does neither. sync does what fsync_super() does in
> a different sort of way, but does not call sync_blockdev() on each
> block device. It looks like those are the two main differences between
> sync and remount - remount trims the dentry cache and syncs the blockdev,
> sync doesn't.
> 
>>> What about freezing the filesystem?
>> cu:~# xfs_freeze -f /huge
>> cu:~# /usr/net/bin/hibernate
>> [but this doesn't even hibernate - same as the 'touch']
> 
> I suspect that the frozen filesystem might cause other problems
> in the hibernate process. However, while a freeze calls sync_blockdev()
> it does not trim the dentry cache.....
> 
> So, rather than a remount before hibernate, let's see if we can
> remove the dentries some other way to determine if removing excess
> dentries/inodes from the caches makes a difference. Can you do:
> 
> # touch /huge/foo
> # sync
> # echo 1 > /proc/sys/vm/drop_caches
> # hibernate
success
> 
> # touch /huge/bar
> # sync
> # echo 2 > /proc/sys/vm/drop_caches
> # hibernate
success
> 
> # touch /huge/baz
> # sync
> # echo 3 > /proc/sys/vm/drop_caches
> # hibernate
success

So I added
# touch /huge/bork
# sync
# hibernate

And it still succeeded - sigh.
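
For repeating these cases after each clean boot, the sequence can be wrapped in a small helper. This is only a sketch under assumptions: the names run and run_case are hypothetical, and `hibernate` is assumed to be on the PATH as in the quoted commands. It defaults to a dry run that just prints the commands, since a real run needs root and may hang the machine. The drop_caches values are the standard ones: 1 frees the page cache, 2 frees dentries and inodes, 3 frees both.

```shell
# Hypothetical helper (not from the original thread): replay one test case.
# Defaults to a dry run that only prints the commands; set DRY_RUN=0 and
# run as root to execute them for real.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        sh -c "$*"
    fi
}

# run_case FILE [MODE]
#   FILE: file to touch on the filesystem under test
#   MODE: optional drop_caches value
#         (1 = page cache, 2 = dentries and inodes, 3 = both)
run_case() {
    file=$1
    mode=${2:-}
    run "touch $file"
    run "sync"
    if [ -n "$mode" ]; then
        run "echo $mode > /proc/sys/vm/drop_caches"
    fi
    run "hibernate"
}
```

For example, `run_case /huge/foo 1` corresponds to the first case above, and `run_case /huge/bork` to the one without drop_caches.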

So I thought a bit and did:
rm /huge/b* /huge/foo

> Clean boot
> # touch /huge/bar
> # sync
> # echo 2 > /proc/sys/vm/drop_caches
> # hibernate
hangs on suspend (sysrq-b doesn't work)

> Clean boot
> # touch /huge/baz
> # sync
> # echo 3 > /proc/sys/vm/drop_caches
> # hibernate
hangs on suspend (sysrq-b doesn't work)

So I rebooted and hibernated to make sure I'm not having random behaviour - yep,
hang on resume (as per usual).

Now I wonder if any other mounts have an effect...
Rebooted and unmounted the xfs fs on /dev/hdb2 - still hangs on hibernate.


I'm confused. I'm going to order Chinese takeaway and then find a serial cable...

David
PS 2.6.21.1 works fine.
PPS the takeaway was nice.


