Message-Id: <e18ec7ad7449f2aba885b93467005848745f4853.1700555778.git.bo.wu@vivo.com>
Date: Tue, 21 Nov 2023 01:55:29 -0700
From: Wu Bo <bo.wu@...o.com>
To: Alasdair Kergon <agk@...hat.com>,
Mike Snitzer <snitzer@...nel.org>,
Mikulas Patocka <mpatocka@...hat.com>
Cc: dm-devel@...ts.linux.dev, linux-kernel@...r.kernel.org,
Wu Bo <wubo.oduw@...il.com>, Wu Bo <bo.wu@...o.com>
Subject: [PATCH 2/2] dm verity: don't verity if readahead failed
We found an issue in the Android OTA scenario where many BIOs have to
go through FEC even though the data under dm-verity is 100% complete
and not corrupted.
Android OTA has many dm-block layers, from upper to lower:
dm-verity
dm-snapshot
dm-origin & dm-cow
dm-linear
ufs
The dm tables have to change twice during the Android OTA merge process.
During a table change, dm-snapshot is suspended for a while. During
this interval, we found that many readahead IOs are submitted to
dm-verity from the filesystem. The kverity workers are then busy doing
FEC, which takes too long to finish the dm-verity IO and causes the
system to get stuck.
We added some debug logging and found that each readahead IO needs
around 10s to finish when this situation occurs, because there is IO
amplification here:
dm-snapshot suspend
 erofs_readahead     // 300+ IOs are submitted
  dm_submit_bio (dm_verity)
   dm_submit_bio (dm_snapshot)
   bio returns EIO
   bio got nothing, it's empty
  verity_end_io
   verity_verify_io
    forloop range(0, io->n_blocks)    // each io->n_blocks ~= 20
     verity_fec_decode
      fec_decode_rsb
       fec_read_bufs
        forloop range(0, v->fec->rsn) // v->fec->rsn = 253
         new_read
          submit_bio (dm_snapshot)
        end loop
    end loop
dm-snapshot resume
Readahead BIOs get no data while dm-snapshot is suspended, so all of
them go through FEC.
Each readahead BIO needs io->n_blocks ~= 20 block verifications.
Each block needs FEC, and FEC for each block needs v->fec->rsn = 253
reads.
So during the suspend interval (~200ms), 300 readahead BIOs generate
300*20*253 IOs on dm-snapshot.
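To put a number on the product above, here is a minimal user-space
sketch of the arithmetic. The constants are the approximate figures
quoted in this message; the helper name fec_reads() is hypothetical and
does not exist in the kernel.

```c
#include <stdint.h>

/* Approximate figures quoted in the commit message (from debug logs). */
#define READAHEAD_BIOS  300u	/* readahead BIOs submitted during the suspend */
#define BLOCKS_PER_BIO   20u	/* io->n_blocks, roughly */
#define FEC_RSN         253u	/* v->fec->rsn */

/*
 * Hypothetical helper: number of reads issued to the lower device if
 * every block of every failed BIO goes through FEC, and FEC for each
 * block issues rsn reads.
 */
uint64_t fec_reads(uint64_t bios, uint64_t blocks_per_bio, uint64_t rsn)
{
	return bios * blocks_per_bio * rsn;
}
```

With these inputs, fec_reads(READAHEAD_BIOS, BLOCKS_PER_BIO, FEC_RSN)
evaluates to 1518000, i.e. about 1.5 million reads, or 5060 reads per
failed readahead BIO.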
Since readahead IO is not required by user space, I think the better
fix is to skip verification and FEC for a failed readahead BIO and pass
the error to the upper layer to handle.
Signed-off-by: Wu Bo <bo.wu@...o.com>
---
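Note for reviewers: the completion-path condition after this change can
be modeled in user space as the sketch below. Everything here is a
stand-in, not kernel code: struct model_bio, MODEL_REQ_RAHEAD (with an
arbitrary bit value) and skip_verify_on_error() are hypothetical names,
and the two bool parameters stand for the results of
verity_fec_is_enabled() and verity_is_system_shutting_down().

```c
#include <stdbool.h>

/* Hypothetical stand-in for REQ_RAHEAD; the bit value is arbitrary. */
#define MODEL_REQ_RAHEAD (1u << 0)

/* Hypothetical, reduced model of the bio fields the check looks at. */
struct model_bio {
	int status;		/* nonzero models a failed bio (bi_status) */
	unsigned int opf;	/* models bi_opf */
};

/*
 * Returns true when a bio should be completed with its error right
 * away instead of being queued for verification and FEC.
 */
bool skip_verify_on_error(const struct model_bio *bio,
			  bool fec_enabled, bool shutting_down)
{
	return bio->status &&
	       (!fec_enabled || shutting_down ||
		(bio->opf & MODEL_REQ_RAHEAD));
}
```

In this model a failed readahead bio is finished immediately even with
FEC enabled, while a failed non-readahead bio still goes on to
verification and FEC as before.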
drivers/md/dm-verity-target.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 42b2483eb08c..d242e50ec869 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -668,7 +668,9 @@ static void verity_end_io(struct bio *bio)
 		verity_fec_init_io(io);
 	if (bio->bi_status &&
-	    (!verity_fec_is_enabled(io->v) || verity_is_system_shutting_down())) {
+	    (!verity_fec_is_enabled(io->v) ||
+		verity_is_system_shutting_down() ||
+		(bio->bi_opf & REQ_RAHEAD))) {
 		verity_finish_io(io, bio->bi_status);
 		return;
 	}
--
2.25.1