Message-ID: <7d6e6699-348a-af5a-0ab4-a7c4bd917ed1@deltatee.com>
Date: Mon, 30 May 2022 09:57:02 -0600
From: Logan Gunthorpe <logang@...tatee.com>
To: Christoph Hellwig <hch@...radead.org>
Cc: linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org,
Song Liu <song@...nel.org>,
Donald Buczek <buczek@...gen.mpg.de>,
Guoqing Jiang <guoqing.jiang@...ux.dev>,
Xiao Ni <xni@...hat.com>, Stephen Bates <sbates@...thlin.com>,
Martin Oliveira <Martin.Oliveira@...eticom.com>,
David Sloan <David.Sloan@...eticom.com>
Subject: Re: [PATCH v2 13/17] md/raid5-cache: Add RCU protection to conf->log accesses

On 2022-05-30 00:01, Christoph Hellwig wrote:
> On Thu, May 26, 2022 at 10:36:00AM -0600, Logan Gunthorpe wrote:
>> The mdadm test 21raid5cache randomly fails with NULL pointer accesses
>> of conf->log when run repeatedly. conf->log was sort of protected with
>> RCU, but most dereferences were not done with the correct functions.
>>
>> Add rcu_read_locks(), rcu_dereference_protected() and rcu_access_pointers()
>> calls to the appropriate places and mark the pointer with __rcu.
>
> Looking at the code a bit more, is this really enough? Calls to
> r5c_is_writeback / r5c_confi_is_writeback are sprinkled all over the
> code, and my gut feeling is the value is not expected to change over
> way longer critical sections than this. So maybe the answer here is to
> fix up the release to be properly locked as it only affects the non-I/O
> slow path anyway.

Yeah, I think your gut feeling is correct. It looks like all the
is_writeback calls are in the IO path as well. I'll review this again
and see if we can just replace the RCU stuff, and fix the paths that
were hitting the NULL pointer dereference, by taking a lock instead.

Logan
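
For illustration only, here is a minimal userspace sketch of the "take a lock
instead of RCU" idea discussed above. It is not the kernel patch: the struct
layouts, the log_lock field and all function names are hypothetical stand-ins,
and a pthread mutex stands in for whatever kernel lock would actually be used.
It only shows the shape of the approach: readers check the log pointer under
the lock, and the slow-path release clears it under the same lock.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real md/raid5 types. */
enum journal_mode { MODE_WRITE_THROUGH, MODE_WRITE_BACK };

struct log {
	enum journal_mode mode;
};

struct conf {
	pthread_mutex_t log_lock;	/* hypothetical lock guarding ->log */
	struct log *log;		/* NULL when no journal is attached */
};

/* Caller must hold conf->log_lock. */
static bool conf_is_writeback_locked(const struct conf *conf)
{
	return conf->log && conf->log->mode == MODE_WRITE_BACK;
}

/* Reader: the log cannot be freed while log_lock is held. */
static bool conf_is_writeback(struct conf *conf)
{
	bool wb;

	pthread_mutex_lock(&conf->log_lock);
	wb = conf_is_writeback_locked(conf);
	pthread_mutex_unlock(&conf->log_lock);
	return wb;
}

/* Slow path: detach the log under the lock, then free it outside. */
static void conf_release_log(struct conf *conf)
{
	struct log *old;

	pthread_mutex_lock(&conf->log_lock);
	old = conf->log;
	conf->log = NULL;
	pthread_mutex_unlock(&conf->log_lock);
	free(old);
}
```

A longer critical section that must see a stable answer would hold log_lock
across the whole section and call conf_is_writeback_locked() inside it, rather
than re-taking the lock for each check.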