lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 12 Jul 2014 21:50:43 +0000
From:	"Elliott, Robert (Server Storage)" <>
To:	Benjamin LaHaise <>
CC:	Christoph Hellwig <>,
	Jeff Moyer <>, Jens Axboe <>,
	"" <>,
	James Bottomley <>,
	Bart Van Assche <>,
	"" <>,
	"" <>
Subject: RE: scsi-mq V2

> -----Original Message-----
> From: Benjamin LaHaise []
> Sent: Friday, 11 July, 2014 9:55 AM
> To: Elliott, Robert (Server Storage)
> Cc: Christoph Hellwig; Jeff Moyer; Jens Axboe;; James
> Bottomley; Bart Van Assche;; linux-
> Subject: Re: scsi-mq V2
> Can you try the below totally untested patch instead?  It looks like
> put_reqs_available() is not irq-safe.

With that addition alone, fio still runs into the same problem.

I added the same fix to get_reqs_available, which also accesses 
kcpu->reqs_available, and the test has run for 35 minutes with 
no problem.

Patch applied:

diff --git a/fs/aio.c b/fs/aio.c
index e59bba8..8e85e26 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -830,16 +830,20 @@ void exit_aio(struct mm_struct *mm)
 static void put_reqs_available(struct kioctx *ctx, unsigned nr)
 	struct kioctx_cpu *kcpu;
+	unsigned long flags;
 	kcpu = this_cpu_ptr(ctx->cpu);
+	local_irq_save(flags);
 	kcpu->reqs_available += nr;
 	while (kcpu->reqs_available >= ctx->req_batch * 2) {
 		kcpu->reqs_available -= ctx->req_batch;
 		atomic_add(ctx->req_batch, &ctx->reqs_available);
+	local_irq_restore(flags);
@@ -847,10 +851,12 @@ static bool get_reqs_available(struct kioctx *ctx)
 	struct kioctx_cpu *kcpu;
 	bool ret = false;
+	unsigned long flags;
 	kcpu = this_cpu_ptr(ctx->cpu);
+	local_irq_save(flags);
 	if (!kcpu->reqs_available) {
 		int old, avail = atomic_read(&ctx->reqs_available);
@@ -869,6 +875,7 @@ static bool get_reqs_available(struct kioctx *ctx)
 	ret = true;
+	local_irq_restore(flags);
 	return ret;

I will see if that solves the problem with the scsi-mq-3 tree, or 
at least some of the bisect trees leading up to it.

A few other comments:

1. Those changes boost _raw_spin_lock_irqsave into first place
in perf top:

  6.59%  [kernel]                    [k] _raw_spin_lock_irqsave
  4.37%  [kernel]                    [k] put_compound_page
  2.87%  [scsi_debug]                [k] sdebug_q_cmd_hrt_complete
  2.74%  [kernel]                    [k] _raw_spin_lock
  2.73%  [kernel]                    [k] apic_timer_interrupt
  2.41%  [kernel]                    [k] do_blockdev_direct_IO
  2.24%  [kernel]                    [k] __get_page_tail
  1.97%  [kernel]                    [k] _raw_spin_unlock_irqrestore
  1.87%  [kernel]                    [k] scsi_queue_rq
  1.76%  [scsi_debug]                [k] schedule_resp

Maybe (later) kcpu->reqs_available should converted to an atomic,
like ctx->reqs_available, to reduce that overhead?

2. After the f8567a3 patch, aio_complete has one early return that 
bypasses the call to put_reqs_available.  Is that OK, or does
that mean that sync iocbs will now eat up reqs_available?

         * Special case handling for sync iocbs:
         *  - events go directly into the iocb for fast handling
         *  - the sync task with the iocb in its stack holds the single iocb
         *    ref, no other paths have a way to get another ref
         *  - the sync task helpfully left a reference to itself in the iocb
        if (is_sync_kiocb(iocb)) {
                iocb->ki_user_data = res;
                iocb->ki_ctx = ERR_PTR(-EXDEV);

3. The f8567a3 patch renders this comment in aio.c out of date - it's 
no longer incremented when pulled off the ringbuffer, but is now 
incremented when aio_complete is called.

        struct {
                 * This counts the number of available slots in the ringbuffer,
                 * so we avoid overflowing it: it's decremented (if positive)
                 * when allocating a kiocb and incremented when the resulting
                 * io_event is pulled off the ringbuffer.
                 * We batch accesses to it with a percpu version.
                atomic_t        reqs_available;
        } ____cacheline_aligned_in_smp;

Rob Elliott    HP Server Storage

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at
Please read the FAQ at

Powered by blists - more mailing lists