[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150628205347.GA18403@sli.dy.fi>
Date: Sun, 28 Jun 2015 23:53:48 +0300
From: Sami Liedes <sami.liedes@....fi>
To: Goldwyn Rodrigues <rgoldwyn@...e.com>, Neil Brown <neilb@...e.de>,
bugzilla-daemon@...zilla.kernel.org
Cc: linux-raid@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [Bug 100491] New: Oops under bitmap_start_sync [md_mod] at boot
On Thu, Jun 25, 2015 at 09:02:45PM +0000, bugzilla-daemon@...zilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=100491
>
> Bug ID: 100491
> Summary: Oops under bitmap_start_sync [md_mod] at boot
[...]
> Reading all physical valumes. This may take a while...
> Found volume group "rootvg" using metadata type lvm2
> device-mapper: raid: Device 0 specified for rebuild: Clearing superblock
> md/raid1:mdX: active with 1 out of 2 mirrors
> mdX: invalid bitmap file superblock: bad magic
> md-cluster module not found.
> mdX: Could not setup cluster service (256)
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000100
> IP: [<ffffffff8159e4a9>] _raw_spin_lock_irq+0x29/0x70
> PGD 0
> Oops: 0002 [#1] PREEMPT SMP
[...]
I'm marking this as a regression in bugzilla, since this seems to
prevent booting on 4.1.0 at least in certain circumstances (namely
those which I have; I wonder if any raid1 recovery works?) while 4.0.6
boots correctly.
I bisected this down to one of four commits. Well, assuming that the
problem was caused by changes in drivers/md; a fair assumption, I
think. The commits are:
$ git bisect view --oneline
f9209a3 bitmap_create returns bitmap pointer
96ae923 Gather on-going resync information of other nodes
54519c5 Lock bitmap while joining the cluster
b97e9257 Use separate bitmaps for each nodes in the cluster
The crash happens whether or not CONFIG_MD_CLUSTER is enabled.
Here's the versions I tested:
git bisect start '--' 'drivers/md'
# bad: [b953c0d234bc72e8489d3bf51a276c5c4ec85345] Linux 4.1
# good: [39a8804455fb23f09157341d3ba7db6d7ae6ee76] Linux 4.0
# bad: [9ffc8f7cb9647b13dfe4d1ad0d5e1427bb8b46d6] md/raid5: don't do chunk aligned read on degraded array.
# bad: [6dc69c9c460b0cf05b5b3f323a8b944a2e52e76d] md: recover_bitmaps() can be static
# bad: [4b26a08af92c0d9c0bce07612b56ff326112321a] Perform resync for cluster node failure
# good: [cf921cc19cf7c1e99f730a2faa02d80817d684a2] Add node recovery callbacks
# skip: [96ae923ab659e37dd5fc1e05ecbf654e2f94bcbe] Gather on-going resync information of other nodes
# bad: [f9209a323547f054c7439a3bf67c45e64a054bdd] bitmap_create returns bitmap pointer
# skip: [54519c5f4b398bcfe599f652b4ef4004d5fa63ff] Lock bitmap while joining the cluster
Sami
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists