linux-kernel - Re: [PATCH -next] raid10: fix leak of io accounting


Message-ID: <76d32496-641e-c93a-df77-9ce9d9c1a1e1@huaweicloud.com>
Date:   Thu, 9 Mar 2023 14:56:49 +0800
From:   Yu Kuai <yukuai1@...weicloud.com>
To:     Guoqing Jiang <guoqing.jiang@...ux.dev>,
        Yu Kuai <yukuai1@...weicloud.com>, song@...nel.org
Cc:     linux-raid@...r.kernel.org, linux-kernel@...r.kernel.org,
        yi.zhang@...wei.com, yangerkun@...wei.com,
        "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH -next] raid10: fix leak of io accounting

Hi,

On 2023/03/09 14:36, Guoqing Jiang wrote:
> Hi,
> 
> What do you mean 'leak' here?

I mean that the inflight count is leaked, because it is incremented
twice for a single io but decremented only once.
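
To illustrate the imbalance, here is a toy user-space model, not kernel
code. All toy_* names are made up for this sketch; the boolean only
mirrors the handle_error parameter added by the patch.

```c
#include <stdbool.h>

/* Toy stand-in for the per-device inflight counter; not kernel code. */
static int inflight;

static void toy_start_io_acct(void) { inflight++; }
static void toy_end_io_acct(void)   { inflight--; }

/* Models raid10_read_request(): account only when this is not a retry. */
static void toy_read_request(bool handle_error)
{
    if (!handle_error)
        toy_start_io_acct();
}

/*
 * One read that fails once and is resubmitted from the error path.
 * Returns the inflight count left over after the io completes.
 */
static int toy_read_with_one_retry(bool skip_acct_on_retry)
{
    inflight = 0;
    toy_read_request(false);               /* initial submission */
    toy_read_request(skip_acct_on_retry);  /* resubmission after the error */
    toy_end_io_acct();                     /* completion is accounted once */
    return inflight;
}
```

Without the flag (skip_acct_on_retry == false) the model ends with one io
still counted as inflight; with it, the counter returns to zero.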
> 
> On 3/4/23 15:01, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@...wei.com>
>>
>> handle_read_error() will resubmit r10_bio via raid10_read_request(), which
>> will call bio_start_io_acct() again, while bio_end_io_acct() will only
>> be called once.
>>
>> Fix the problem by not accounting the io again from handle_read_error().
> 
> My understanding is that it causes inaccurate io stats for a bio which
> had a read error.
> 
>> Fixes: 528bc2cf2fcc ("md/raid10: enable io accounting")
>> Signed-off-by: Yu Kuai <yukuai3@...wei.com>
>> ---
>>   drivers/md/raid10.c | 8 ++++----
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>> index 6c66357f92f5..4f8edb6ea3e2 100644
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
>> @@ -1173,7 +1173,7 @@ static bool regular_request_wait(struct mddev *mddev, struct r10conf *conf,
>>   }
>>   static void raid10_read_request(struct mddev *mddev, struct bio *bio,
>> -                struct r10bio *r10_bio)
>> +                struct r10bio *r10_bio, bool handle_error)
>>   {
>>       struct r10conf *conf = mddev->private;
>>       struct bio *read_bio;
>> @@ -1244,7 +1244,7 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
>>       }
>>       slot = r10_bio->read_slot;
>> -    if (blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
>> +    if (!handle_error && blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
>>           r10_bio->start_time = bio_start_io_acct(bio);
> 
> I think a simpler way is just check R10BIO_ReadError here.

No, I'm afraid that would be incorrect, because handle_read_error() clears
the state before resubmitting the r10bio.
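
A toy user-space sketch of why the flag check would not help, assuming
the error path resets the whole state word before resubmitting. The
struct, the flag value and all toy_* names are hypothetical and only
stand in for r10bio and R10BIO_ReadError.

```c
/* Hypothetical flag bit standing in for R10BIO_ReadError. */
enum { TOY_READ_ERROR = 1 };

/* Hypothetical stand-in for struct r10bio; not the kernel layout. */
struct toy_r10bio {
    unsigned long state;
    int acct_calls;   /* how many times accounting was started */
};

/* Read path using the suggested check on the error flag. */
static void toy_read_request(struct toy_r10bio *r)
{
    if (!(r->state & TOY_READ_ERROR))
        r->acct_calls++;
}

/* Error path: the state is cleared before the request is resubmitted. */
static void toy_handle_read_error(struct toy_r10bio *r)
{
    r->state = 0;
    toy_read_request(r);
}

/* One read that hits an error once; returns the number of accounting starts. */
static int toy_acct_calls_after_retry(void)
{
    struct toy_r10bio r = { 0, 0 };

    toy_read_request(&r);          /* initial submission */
    r.state |= TOY_READ_ERROR;     /* the read fails */
    toy_handle_read_error(&r);     /* flag is gone by the time it is checked */
    return r.acct_calls;
}
```

In this model accounting is still started twice, because the flag has
already been cleared when the read path tests it.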

Thanks,
Kuai
> 
> Thanks,
> Guoqing
> .
>