Message-ID: <5743C2BD.3080902@lwfinger.net>
Date: Mon, 23 May 2016 21:55:57 -0500
From: Larry Finger <Larry.Finger@...inger.net>
To: Al Viro <viro@...IV.linux.org.uk>
Cc: LKML <linux-kernel@...r.kernel.org>
Subject: Re: Regression in 4.6.0-git - bisected to commit dd254f5a382c
On 05/23/2016 07:18 PM, Al Viro wrote:
> On Mon, May 23, 2016 at 04:30:43PM -0500, Larry Finger wrote:
>> The mainline kernels past 4.6.0 hang when logging in. There are no
>> error messages, and the machine seems to be waiting for some event that
>> never happens.
>>
>> The problem has been bisected to commit dd254f5a382c ("fold checks into
>> iterate_and_advance()"). The bisection has been verified.
>>
>> The problem is the call from iov_iter_advance(). When I reinstated the old
>> macro with a new name and used it in that routine, the system works.
>> Obviously, the call that seems to be incorrect has some benefits. My
>> quick-and-dirty patch is attached.
>>
>> I will be willing to test any patch you prepare.
>
> Hangs where and how? A reproducer, please... This is really weird - the
> only change there is in the cases when
> * iov_iter_advance(i, n) is called with n greater than the remaining
> amount. It's a bug, plain and simple - old variant would've been left in
> seriously buggered state and at the very least we want to catch any such
> places for the sake of backports
> * iov_iter_advance(i, 0) - both old and new code leave *i unchanged,
> but the old one dereferences i->iov[0], which may be pointing beyond the end of
> array by that point. The value read from there was not used by the old code,
> at that.
>
> Could you slap WARN_ON(size > i->count) in the very beginning of
> iov_iter_advance() (the mainline variant) and see what triggers on your
> reproducer?

The hang occurs when you try to log in. The system asks for a password and never
returns, and nothing is logged. It will still switch between the various
CTRL-ALT-Fn screens, but that is about the most it will do.

Adding WARN_ON(size > i->count) showed nothing. I got the same result for
WARN_ON(!i->count). A WARN_ON(!size) does trigger the following traceback:

[ 15.030907] ------------[ cut here ]------------
[ 15.030913] WARNING: CPU: 0 PID: 353 at lib/iov_iter.c:529 iov_iter_advance+0xf6/0x240
[ 15.030914] Modules linked in: af_packet nfs fscache arc4 rtsx_pci_sdmmc mmc_core rtsx_pci_ms memstick x86_pkg_temp_thermal kvm_intel iwlmvm kvm mac80211 snd_hda_codec_generic irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper iwlwifi ablk_helper cryptd snd_hda_intel snd_hda_codec e1000e snd_hwdep snd_hda_core snd_pcm cfg80211 pcspkr serio_raw rtsx_pci snd_timer xhci_pci snd lpc_ich ptp mfd_core pps_core xhci_hcd soundcore thermal toshiba_acpi toshiba_bluetooth sparse_keymap wmi rfkill battery acpi_cpufreq ac processor dm_mod i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm sr_mod cdrom video button sg autofs4
[ 15.030965] CPU: 0 PID: 353 Comm: systemd-journal Not tainted 4.6.0-09084-g75b5796-dirty #89
[ 15.030966] Hardware name: TOSHIBA TECRA A50-A/TECRA A50-A, BIOS Version 4.20 04/17/2014
[ 15.030968] 0000000000000000 ffff88021fc07d40 ffffffff813e4d1e 0000000000000000
[ 15.030972] 0000000000000000 ffff88021fc07d80 ffffffff810702b1 00000211c4105ac0
[ 15.030975] ffff88021fc07e08 0000000000000000 ffff88021fc07f08 ffffffff814bacc0
[ 15.030978] Call Trace:
[ 15.030981] [<ffffffff813e4d1e>] dump_stack+0x67/0x99
[ 15.030985] [<ffffffff810702b1>] __warn+0xd1/0xf0
[ 15.030989] [<ffffffff814bacc0>] ? tty_compat_ioctl+0xe0/0xe0
[ 15.030991] [<ffffffff8107039d>] warn_slowpath_null+0x1d/0x20
[ 15.030994] [<ffffffff813f7716>] iov_iter_advance+0xf6/0x240
[ 15.030997] [<ffffffff81223161>] do_loop_readv_writev+0x51/0xc0
[ 15.030999] [<ffffffff814bacc0>] ? tty_compat_ioctl+0xe0/0xe0
[ 15.031002] [<ffffffff812245ff>] do_readv_writev+0x1ef/0x210
[ 15.031006] [<ffffffff81238c86>] ? do_vfs_ioctl+0x96/0x6a0
[ 15.031008] [<ffffffff8122484f>] vfs_writev+0x3f/0x50
[ 15.031010] [<ffffffff812248b5>] do_writev+0x55/0xd0
[ 15.031013] [<ffffffff812259a0>] SyS_writev+0x10/0x20
[ 15.031016] [<ffffffff81794b65>] entry_SYSCALL_64_fastpath+0x18/0xa8
[ 15.031019] ---[ end trace 8c776b094504066d ]---
Two of these are logged for each boot.
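
The trace shows the size == 0 call arriving through SyS_writev and
do_loop_readv_writev. One way a zero-byte advance could arise on that path is
a writev() whose iovec array contains a zero-length segment. That is an
assumption, not something established in this thread. The userspace sketch
below only shows what such a call looks like; the helper name
writev_with_empty_segment() is hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper (not from the thread): write two strings to fd with a
 * zero-length iovec segment sandwiched between them. Returns whatever
 * writev() returns: the byte count on success, -1 on error.
 */
static ssize_t writev_with_empty_segment(int fd, const char *a, const char *b)
{
	struct iovec iov[3];

	iov[0].iov_base = (void *)a;
	iov[0].iov_len = strlen(a);

	/* The empty segment: a valid pointer, but nothing to transfer. */
	iov[1].iov_base = (void *)"";
	iov[1].iov_len = 0;

	iov[2].iov_base = (void *)b;
	iov[2].iov_len = strlen(b);

	return writev(fd, iov, 3);
}
```

Calling writev_with_empty_segment(STDOUT_FILENO, "hello, ", "world\n") from a
login on a text console would direct the write at a tty. Whether that
reproduces the hang on an affected kernel is untested here.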

If I make iov_iter_advance() look as follows, my system will boot:

void iov_iter_advance(struct iov_iter *i, size_t size)
{
	WARN_ON(!size);
	if (size)
		iterate_and_advance(i, size, v, 0, 0, 0)
	else
		iterate_and_advance_nocheck(i, size, v, 0, 0, 0)
}
EXPORT_SYMBOL(iov_iter_advance);
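
To illustrate why routing size == 0 to the unchecked variant could matter,
here is a simplified userspace model. It is a sketch under assumptions, not
the kernel code: struct model_iter and the model_advance_*() functions are
hypothetical stand-ins, and the idea that the unchecked walk steps past an
empty current segment even for a zero-byte advance is my reading of the
behaviour described above, not a confirmed diagnosis.

```c
#include <stddef.h>

/* Simplified stand-ins; these are NOT the kernel's struct iovec/iov_iter. */
struct seg {
	size_t len;
};

struct model_iter {
	const struct seg *seg;	/* current segment */
	size_t nr_segs;		/* segments left, including the current one */
	size_t skip;		/* bytes already consumed in current segment */
	size_t count;		/* total bytes left */
};

/*
 * Unchecked walk: consume n bytes, then step past the current segment if it
 * is fully consumed. An empty segment counts as fully consumed, so this
 * moves on even when n == 0.
 */
static void model_advance_nocheck(struct model_iter *it, size_t n)
{
	while (n && it->nr_segs) {
		size_t avail = it->seg->len - it->skip;
		size_t step = n < avail ? n : avail;

		it->skip += step;
		it->count -= step;
		n -= step;
		if (n) {	/* segment exhausted, more to consume */
			it->seg++;
			it->nr_segs--;
			it->skip = 0;
		}
	}
	if (it->nr_segs && it->skip == it->seg->len) {
		it->seg++;
		it->nr_segs--;
		it->skip = 0;
	}
}

/* Checked walk: clamp n to what is left and do nothing at all for n == 0. */
static void model_advance_checked(struct model_iter *it, size_t n)
{
	if (n > it->count)
		n = it->count;
	if (n == 0)
		return;		/* iterator left untouched */
	model_advance_nocheck(it, n);
}

/* Same shape as the workaround above: zero goes to the unchecked walk. */
static void model_advance_workaround(struct model_iter *it, size_t n)
{
	if (n)
		model_advance_checked(it, n);
	else
		model_advance_nocheck(it, n);
}
```

In this model, with segments of length 3, 0 and 4, advancing by 3 leaves the
iterator parked on the empty segment. A following model_advance_checked(it, 0)
leaves it there, while model_advance_workaround(it, 0) moves it on to the
third segment. A caller that loops until the byte count reaches zero would
make no progress in the first case.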
Larry