Message-ID: <b84fb49-bf63-3442-8c99-d565e134f2@redhat.com>
Date: Wed, 29 Nov 2023 18:26:08 +0100 (CET)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Wu Bo <bo.wu@...o.com>
cc: Alasdair Kergon <agk@...hat.com>,
Mike Snitzer <snitzer@...nel.org>, dm-devel@...ts.linux.dev,
linux-kernel@...r.kernel.org, Wu Bo <wubo.oduw@...il.com>,
Eric Biggers <ebiggers@...nel.org>, stable@...r.kernel.org
Subject: Re: [PATCH v2 2/2] dm verity: don't verity if readahead failed
On Tue, 21 Nov 2023, Wu Bo wrote:
> We found an issue in the Android OTA scenario: many BIOs have to go through
> FEC even though the data under dm-verity is 100% complete and not corrupted.
>
> Android OTA has many dm-block layers, from upper to lower:
>   dm-verity
>   dm-snapshot
>   dm-origin & dm-cow
>   dm-linear
>   ufs
>
> The dm tables have to change twice during the Android OTA merging process.
> During a table change, dm-snapshot is suspended for a while.
> During this interval, we found that many readahead IOs are submitted to
> dm-verity from the filesystem. The kverity workers are then busy doing
> FEC, which takes too long to finish the dm-verity IO and causes the
> system to get stuck.
>
> We added some debug logging and found that each readahead IO needs around
> 10s to finish when this situation occurs, because there is an IO
> amplification:
>
> dm-snapshot suspend
> erofs_readahead     // 300+ IOs are submitted
>     dm_submit_bio (dm_verity)
>         dm_submit_bio (dm_snapshot)
>         bio return EIO
>         bio got nothing, it's empty
>     verity_end_io
>         verity_verify_io
>         forloop range(0, io->n_blocks)    // each io->n_blocks ~= 20
>             verity_fec_decode
>                 fec_decode_rsb
>                     fec_read_bufs
>                     forloop range(0, v->fec->rsn)    // v->fec->rsn = 253
>                         new_read
>                         submit_bio (dm_snapshot)
>                     end loop
>         end loop
> dm-snapshot resume
>
> Readahead BIOs get no data while dm-snapshot is suspended, so all of them
> go through FEC.
> Each readahead BIO has to verify io->n_blocks ~= 20 blocks.
> Each block has to go through FEC, and each FEC pass issues
> v->fec->rsn = 253 reads.
> So during the suspend interval (~200ms), 300 readahead BIOs generate
> 300*20*253 IOs on dm-snapshot.
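
For reference, the amplification figure quoted above can be checked with a small user-space sketch. This is only a back-of-the-envelope model, not kernel code: the function name fec_reads_issued and its parameter names are made up for illustration, and the input values are the ones given in the report.

```c
#include <stdint.h>

/*
 * Rough model of the read amplification described in the report.
 * The function and parameter names are illustrative only; nothing
 * here exists in the kernel under these names.
 */
static uint64_t fec_reads_issued(uint64_t readahead_bios,
				 uint64_t blocks_per_bio,
				 uint64_t rsn)
{
	/*
	 * Every failed readahead bio verifies each of its blocks, and
	 * each block's FEC pass issues rsn reads to the lower device.
	 */
	return readahead_bios * blocks_per_bio * rsn;
}
```

With the reported values, fec_reads_issued(300, 20, 253) evaluates to 1518000, i.e. roughly 1.5 million reads hitting dm-snapshot within the ~200ms suspend window.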
>
> Since readahead IO is not required by user space, I think it is better to
> fix this by passing the error to the upper layer and letting it handle it.
>
> Cc: stable@...r.kernel.org
> Fixes: a739ff3f543a ("dm verity: add support for forward error correction")
> Signed-off-by: Wu Bo <bo.wu@...o.com>
> ---
> drivers/md/dm-verity-target.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
> index beec14b6b044..14e58ae70521 100644
> --- a/drivers/md/dm-verity-target.c
> +++ b/drivers/md/dm-verity-target.c
> @@ -667,7 +667,9 @@ static void verity_end_io(struct bio *bio)
>  	struct dm_verity_io *io = bio->bi_private;
>  
>  	if (bio->bi_status &&
> -	    (!verity_fec_is_enabled(io->v) || verity_is_system_shutting_down())) {
> +	    (!verity_fec_is_enabled(io->v) ||
> +	     verity_is_system_shutting_down() ||
> +	     (bio->bi_opf & REQ_RAHEAD))) {
>  		verity_finish_io(io, bio->bi_status);
>  		return;
>  	}
> --
> 2.25.1
>
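
The condition the patch ends up with can be modelled in user space as below. This is a minimal sketch under stated assumptions, not the kernel implementation: struct model_bio, MODEL_REQ_RAHEAD and finish_without_fec are hypothetical names, and the flag value is arbitrary for this model rather than the kernel's REQ_RAHEAD bit.

```c
#include <stdbool.h>

/* Hypothetical flag bit for this model only; not the kernel's REQ_RAHEAD value. */
#define MODEL_REQ_RAHEAD (1u << 0)

/* Hypothetical stand-in for the two struct bio fields the check looks at. */
struct model_bio {
	int status;		/* non-zero models a failed bio (bio->bi_status) */
	unsigned int opf;	/* models bio->bi_opf */
};

/*
 * Returns true when a bio should be completed with its error right away,
 * without attempting FEC, mirroring the patched condition in verity_end_io().
 */
static bool finish_without_fec(const struct model_bio *bio,
			       bool fec_enabled, bool shutting_down)
{
	if (!bio->status)
		return false;	/* no error: normal verification proceeds */

	return !fec_enabled || shutting_down ||
	       (bio->opf & MODEL_REQ_RAHEAD) != 0;
}
```

In this model a failed readahead bio is completed with its error even when FEC is enabled, while a failed non-readahead bio still falls through to the FEC path.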
Reviewed-by: Mikulas Patocka <mpatocka@...hat.com>