[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d0dfd5ad-12d4-c6d1-68b2-a112d3f3c163@huaweicloud.com>
Date: Thu, 16 Mar 2023 09:26:30 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Logan Gunthorpe <logang@...tatee.com>,
Paul Menzel <pmenzel@...gen.mpg.de>
Cc: Yu Kuai <yukuai1@...weicloud.com>, agk@...hat.com,
snitzer@...nel.org, song@...nel.org, linux-kernel@...r.kernel.org,
linux-raid@...r.kernel.org, yi.zhang@...wei.com,
yangerkun@...wei.com, "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH v2 0/5] md: fix uaf for sync_thread
Hi,
在 2023/03/16 6:55, Logan Gunthorpe 写道:
>
>
> On 2023-03-15 02:30, Paul Menzel wrote:
>> Am 15.03.23 um 07:18 schrieb Yu Kuai:
>>> I tested this pathset with mdadm tests, and there are no new regression,
>>> by the way, following test will failed with or without this patchset:
>>>
>>> 01raid6integ
>>> 04r1update
>>> 05r6tor0
>>> 10ddf-create
>>> 10ddf-fail-spare
>>> 10ddf-fail-stop-readd
>>> 10ddf-geometry
>>
>> As you improved the tests in the past, can you confirm, these failed on
>> your test systems too and are fixed now?
>
> Hmm, well Yu did not claim that those tests were fixed. If you re-read
> what was said, the tests listed failed with or without the new changes.
> As I read it, Yu asserts no new regressions were created with the patch
> set, not that failing tests were fixed.
>
> Unfortunately, the tests listed are largely not ones I saw failing the
> last time I ran the tests (though it's been a few months since I last
> tried). I know 01raid6integ used to fail some of the time, but the other
> 6 tests mentioned worked the last time I ran them; and there are many
> other tests that failed when I ran them. (My notes on which tests are
> broken are included in the most recent mdadm tree in tests/*.broken)
>
> I was going to try and confirm that no new regressions were introduced
> by Yu's patches, but seems the tests are getting worse. I tried running
> the tests on the current md-next branch and found that one of the early
> tests, 00raid5-zero, hangs indefinitely. I quickly ran the same test on
> v6.3-rc2 and found that it runs just fine there. So it looks like
> there's already a regression in md-next that is not part of this series
> and I don't have the time to dig into the root cause right now.
>
> Yu's patches don't apply cleanly to v6.3-rc2 and I can't run the tests
> against md-next; so I didn't bother running them, but I did do a quick
> review. The locking changes make sense to me so it might be worth
> merging for correctness. However, I'm not entirely sure it's the best
> solution -- the md thread stuff seems like a bit of a mess and passing
> an mddev to thread functions that were not related to the mddev to get a
> lock seems to just make the mess a bit worse.
>
> For example, it seems a bit ugly to me for the lock mddev->thread_lock
> to protect the access of a pointer in struct r5l_log. Just spit-balling,
> but perhaps RCU would be more appropriate here. Then md_wakeup_thread()
> would just need to hold the RCU read lock when dereferencing, and
> md_unregister_thread() would just need to synchronize_rcu() before
> stopping and freeing the thread. This has the benefit of not requiring
> the mddev object for every md_thread and would probably require a lot
> less churn than the current patches.
Thanks for your suggestion, this make sense to me. I'll try to use rcu.
Thanks,
Kuai
>
> Logan
>
>
>
>
> .
>
Powered by blists - more mailing lists