[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87inmv8w7k.fsf@notabene.neil.brown.name>
Date: Mon, 27 Mar 2017 10:17:03 +1100
From: NeilBrown <neilb@...e.com>
To: Ming Lei <tom.leiming@...il.com>
Cc: Christoph Hellwig <hch@...radead.org>,
Jens Axboe <axboe@...nel.dk>,
linux-block <linux-block@...r.kernel.org>,
"open list\:SOFTWARE RAID \(Multiple Disks\) SUPPORT"
<linux-raid@...r.kernel.org>,
"open list\:DEVICE-MAPPER \(LVM\)" <dm-devel@...hat.com>,
Alasdair Kergon <agk@...hat.com>,
Mike Snitzer <snitzer@...hat.com>,
Shaohua Li <shli@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"Martin K . Petersen" <martin.petersen@...cle.com>
Subject: Re: [PATCH v3] block: trace completion of all bios.
On Fri, Mar 24 2017, Ming Lei wrote:
> On Fri, Mar 24, 2017 at 8:07 AM, NeilBrown <neilb@...e.com> wrote:
...
>> @@ -102,6 +102,8 @@ struct bio {
>> #define BIO_REFFED 8 /* bio has elevated ->bi_cnt */
>> #define BIO_THROTTLED 9 /* This bio has already been subjected to
>> * throttling rules. Don't do it again. */
>> +#define BIO_TRACE_COMPLETION 10 /* bio_endio() should trace the final completion
>> + * of this bio. */
>
> This may not be a good idea, since the flag space is quite small(12).
which means there are 2 unused bits and I only want to use 1. It should
fit.
I'm a bit perplexed why 4 bits a reserved for the BVEC_POOL number which
ranges from 0 to 7....
But if these bits really are a scarce resource, we should stop wasting
them.
The patch below makes bit 7 (BIO_CHAIN) available. We could probably make
bit 8 (BIO_REFFED) available using a similar technique.
BIO_QUIET is only relevant if bi_error is non-zero and bi_error has a
limited range, so a bit in there could be used instead. In fact,
MAX_ERRNO is 4096, so bi_error could be a 'short'. That could save 2
whole bytes ... or make 16 more flag bits available.
So I don't think you concern about a shortage of spare flag bits is valid.
Thanks,
NeilBrown
diff --git a/block/bio.c b/block/bio.c
index c1272986133e..1703786a206a 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -274,7 +274,7 @@ void bio_init(struct bio *bio, struct bio_vec *table,
unsigned short max_vecs)
{
memset(bio, 0, sizeof(*bio));
- atomic_set(&bio->__bi_remaining, 1);
+ atomic_set(&bio->__bi_remaining, 0);
atomic_set(&bio->__bi_cnt, 1);
bio->bi_io_vec = table;
@@ -300,7 +300,7 @@ void bio_reset(struct bio *bio)
memset(bio, 0, BIO_RESET_BYTES);
bio->bi_flags = flags;
- atomic_set(&bio->__bi_remaining, 1);
+ atomic_set(&bio->__bi_remaining, 0);
}
EXPORT_SYMBOL(bio_reset);
@@ -1794,20 +1794,15 @@ EXPORT_SYMBOL(bio_flush_dcache_pages);
static inline bool bio_remaining_done(struct bio *bio)
{
/*
- * If we're not chaining, then ->__bi_remaining is always 1 and
+ * If we're not chaining, then ->__bi_remaining is always 0 and
* we always end io on the first invocation.
*/
- if (!bio_flagged(bio, BIO_CHAIN))
+ if (atomic_read(&bio->__bi_remaining) == 0)
return true;
BUG_ON(atomic_read(&bio->__bi_remaining) <= 0);
- if (atomic_dec_and_test(&bio->__bi_remaining)) {
- bio_clear_flag(bio, BIO_CHAIN);
- return true;
- }
-
- return false;
+ return atomic_dec_and_test(&bio->__bi_remaining);
}
/**
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 8e521194f6fc..d15245544111 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -664,13 +664,19 @@ static inline struct bio *bio_list_get(struct bio_list *bl)
}
/*
- * Increment chain count for the bio. Make sure the CHAIN flag update
- * is visible before the raised count.
+ * Increment chain count for the bio.
*/
static inline void bio_inc_remaining(struct bio *bio)
{
- bio_set_flag(bio, BIO_CHAIN);
- smp_mb__before_atomic();
+ /*
+ * Calls to bio_inc_remaining() cannot race
+ * with the final call to bio_end_io(), and
+ * the first call cannot race with other calls,
+ * so if __bi_remaining appears to be zero, there
+ * can be no race which might change it.
+ */
+ if (atomic_read(&bio->__bi_remaining) == 0)
+ atomic_set(&bio->__bi_remaining, 1);
atomic_inc(&bio->__bi_remaining);
}
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index db7a57ee0e58..2deea4501d14 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -46,6 +46,15 @@ struct bio {
unsigned int bi_seg_front_size;
unsigned int bi_seg_back_size;
+ /* __bi_remaining records the number of completion events
+ * (i.e. calls to bi_end_io()) that need to happen before this
+ * bio is truly complete.
+ * A value of '0' means that there will only ever be one completion
+ * event, so there will be no racing and no need for an atomic operation
+ * to detect the last event.
+ * Any other value represents a simple count subject to atomic_inc() and
+ * atomic_dec_and_test().
+ */
atomic_t __bi_remaining;
bio_end_io_t *bi_end_io;
@@ -98,7 +107,7 @@ struct bio {
#define BIO_USER_MAPPED 4 /* contains user pages */
#define BIO_NULL_MAPPED 5 /* contains invalid user pages */
#define BIO_QUIET 6 /* Make BIO Quiet */
-#define BIO_CHAIN 7 /* chained bio, ->bi_remaining in effect */
+/* 7 unused */
#define BIO_REFFED 8 /* bio has elevated ->bi_cnt */
#define BIO_THROTTLED 9 /* This bio has already been subjected to
* throttling rules. Don't do it again. */
Download attachment "signature.asc" of type "application/pgp-signature" (833 bytes)
Powered by blists - more mailing lists