Message-ID: <20170214163437.GA23956@lst.de>
Date: Tue, 14 Feb 2017 17:34:37 +0100
From: "hch@....de" <hch@....de>
To: Dexuan Cui <decui@...rosoft.com>
Cc: "hch@....de" <hch@....de>, Jens Axboe <axboe@...nel.dk>,
Bart Van Assche <Bart.VanAssche@...disk.com>,
"hare@...e.com" <hare@...e.com>, "hare@...e.de" <hare@...e.de>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
"jth@...nel.org" <jth@...nel.org>,
Nick Meier <Nick.Meier@...rosoft.com>,
"Alex Ng (LIS)" <alexng@...rosoft.com>,
Long Li <longli@...rosoft.com>,
"Adrian Suhov (Cloudbase Solutions SRL)" <v-adsuho@...rosoft.com>,
"Chris Valean (Cloudbase Solutions SRL)" <v-chvale@...rosoft.com>
Subject: Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event
lock when scheduling workqueue elements")
> I tested today's linux-next (next-20170214) + the 2 patches just now and got
> a weird result:
> sometimes the VM still hung with a new calltrace (BUG: spinlock bad
> magic), but sometimes the VM did boot up despite the new calltrace!
>
> Attached is the log of a "good" boot.
>
> It looks like we have a memory corruption issue somewhere...
Yes.
> Actually previously I saw the "BUG: spinlock bad magic" message once, but I
> couldn't repro it later, so I didn't mention it to you.
Interesting.
>
> The good news is that now I can repro the "spinlock bad magic" message
> every time.
> I tried to dig into this by enabling Kernel hacking -> Memory debugging,
> but didn't find anything abnormal.
> Is it possible that the SCSI layer passes a wrong memory address?
It's possible, but this looks like it might be a different issue.
A few questions on the dmesg:
[ 6.208794] sd 2:0:0:0: [storvsc] Sense Key : Illegal Request [current]
[ 6.209447] sd 2:0:0:0: [storvsc] Add. Sense: Invalid command operation code
[ 6.210043] sd 3:0:0:0: [storvsc] Sense Key : Illegal Request [current]
[ 6.210618] sd 3:0:0:0: [storvsc] Add. Sense: Invalid command operation code
[ 6.212272] sd 2:0:0:0: [storvsc] Sense Key : Illegal Request [current]
[ 6.212897] sd 2:0:0:0: [storvsc] Add. Sense: Invalid command operation code
[ 6.213474] sd 3:0:0:0: [storvsc] Sense Key : Illegal Request [current]
[ 6.214051] sd 3:0:0:0: [storvsc] Add. Sense: Invalid command operation code
I didn't see anything like this in the other logs. Are these messages
usual on Hyper-V VMs?
[ 6.358405] XFS (sdb1): Mounting V5 Filesystem
[ 6.404478] XFS (sdb1): Ending clean mount
[ 7.535174] BUG: spinlock bad magic on CPU#0, swapper/0/0
[ 7.536807] lock: host_ts+0x30/0xffffffffffffe1a0 [hv_utils], .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
[ 7.538436] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc8-next-20170214+ #1
[ 7.539142] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 04/28/2016
[ 7.539142] Call Trace:
[ 7.539142] <IRQ>
[ 7.539142] dump_stack+0x63/0x82
[ 7.539142] spin_dump+0x78/0xc0
[ 7.539142] do_raw_spin_lock+0xfd/0x160
[ 7.539142] _raw_spin_lock_irqsave+0x4c/0x60
[ 7.539142] ? timesync_onchannelcallback+0x153/0x220 [hv_utils]
[ 7.539142] timesync_onchannelcallback+0x153/0x220 [hv_utils]
Can you resolve this address using gdb to a line of code? Once inside
gdb do:
l *(timesync_onchannelcallback+0x153)
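
For reference, the gdb command above can be derived mechanically from a
call-trace line. The following is only a minimal sketch: the helper name
gdb_list_command and the regular expression are illustrative, not part of
any kernel tooling. It also assumes that gdb is started on an object built
with debug info that contains the symbol (for a module frame like this one,
presumably the hv_utils module object).

```python
import re

# Matches kernel call-trace entries such as
# "timesync_onchannelcallback+0x153/0x220 [hv_utils]".
# The pattern is an assumption based on the trace format shown in this log.
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+(?P<offset>0x[0-9a-fA-F]+)/0x[0-9a-fA-F]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def gdb_list_command(trace_line):
    """Return (module, gdb command) for a call-trace line, or None.

    module is None when the frame has no "[module]" suffix, i.e. the
    symbol is expected to live in the core kernel image.
    """
    m = FRAME_RE.search(trace_line)
    if m is None:
        return None
    command = f"l *({m.group('symbol')}+{m.group('offset')})"
    return m.group("module"), command


if __name__ == "__main__":
    line = "[ 7.539142] timesync_onchannelcallback+0x153/0x220 [hv_utils]"
    print(gdb_list_command(line))
```

The returned string is what gets typed at the gdb prompt; the module name
only indicates which object file to load into gdb first.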