Date:	Tue, 2 Jan 2007 16:49:21 -0800
From:	Zach Brown <zach.brown@...cle.com>
To:	"Chen, Kenneth W" <kenneth.w.chen@...el.com>
Cc:	"'Andrew Morton'" <akpm@...l.org>, <linux-aio@...ck.org>,
	<linux-kernel@...r.kernel.org>,
	"'Benjamin LaHaise'" <bcrl@...ck.org>, <suparna@...ibm.com>
Subject: Re: [patch] aio: add per task aio wait event condition


On Dec 29, 2006, at 6:31 PM, Chen, Kenneth W wrote:

> The AIO wake-up notification from aio_complete is really inefficient
> in current AIO implementation in the presence of process waiting in
> io_getevents().

Yeah, it's a real deficiency.  Thanks for taking a stab at it.

> This patch adds a wait condition to the wait queue and only wake-up
> process when that condition meets.  And this condition is added on a
> per task base for handling multi-threaded app that shares single  
> ioctx.

But only one of the waiting tasks is tested, the one at the head of
the list.  It looks like this change could starve an io_getevents()
with a low min_nr in the presence of another io_getevents() with a
larger min_nr.

> Before:
>  0  0      0 3972608   7056  31312    0    0 14100     0 7885 13747  0  2 98  0
> After:
>  0  0      0 3972608   7056  31312    0    0 13800     0 7885    42  0  2 98  0

Nice.  What min_nr was used in this test?

> +struct aio_wait_queue {
> +	int		nr_wait;	/* wake-up condition */

It appears that this is never assigned a negative value?  Can we make
that explicit in the type so that we reviewers don't have to worry
about wrapping and signed comparisons?

> -	DECLARE_WAITQUEUE(wait, tsk);
> +	struct aio_wait_queue wait;

> +	aio_init_wait(&wait);

This just changed from using default_wake_function() to  
autoremove_wait_function().  Very sneaky!  wait_for_all_aios() should  
be adding the wait queue before going to sleep each time.  (better  
still to just use wait_event()).

Was this on purpose?  I'm all for it as a way to reduce wakeups from  
a stream of completions to a single waiter.

> +	nr_evt = ring->tail - ring->head;
> +	if (nr_evt < 0)
> +		nr_evt += info->nr;

  int = unsigned - unsigned;
  if (int < 0)

My head already hurts.  Can we clean this up so one doesn't have to
live and breathe type conversion rules to tell if this code is correct?
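
Something along these lines is what I have in mind.  This is only a
user-space sketch, assuming head and tail are always kept below the
ring size; the name ring_events_avail() is made up for illustration
and is not in the patch.  Everything stays unsigned, so there is no
signed intermediate to reason about:

```c
#include <assert.h>

/*
 * Hypothetical helper: number of events available in a ring of 'nr'
 * slots.  'head' is the next slot the consumer reads and 'tail' is
 * the next slot the producer writes.  Assumes head < nr and tail < nr.
 */
static unsigned int ring_events_avail(unsigned int head, unsigned int tail,
				      unsigned int nr)
{
	assert(nr > 0 && head < nr && tail < nr);

	if (tail >= head)
		return tail - head;

	/* tail has wrapped around the end of the ring */
	return nr - head + tail;
}
```

With an 8-slot ring, head 6 and tail 1 gives 3 events, the same answer
the signed version produces after its fix-up.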

> +	if (waitqueue_active(&ctx->wait)) {
> +		struct aio_wait_queue *wait;
> +		wait = container_of(ctx->wait.task_list.next,
> +				    struct aio_wait_queue, wait.task_list);
> +		if (nr_evt >= wait->nr_wait)
> +			wake_up(&ctx->wait);
> +	}

First is the fear of starvation as mentioned previously.

issue 2 ops
first io_getevents sleeps with a min_nr of 2
second io_getevents sleeps with min_nr of 3
2 ops complete but only test the second sleeper's min_nr of 3
first sleeper twiddles thumbs
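
To make the scenario concrete, here is a toy user-space model, not
kernel code.  It assumes the waiters are represented as an array of
min_nr values with index 0 standing in for the head of the wait list;
both function names are invented for this sketch.  It compares the
head-only test from the patch with a test that scans every waiter:

```c
#include <assert.h>
#include <stddef.h>

/* Head-only test: consult just the first waiter, as the patch does. */
static int wake_if_head_satisfied(const unsigned int *min_nr, size_t n,
				  unsigned int nr_evt)
{
	assert(min_nr != NULL || n == 0);
	return n > 0 && nr_evt >= min_nr[0];
}

/* Alternative: wake if any waiter's min_nr has been reached. */
static int wake_if_any_satisfied(const unsigned int *min_nr, size_t n,
				 unsigned int nr_evt)
{
	assert(min_nr != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (nr_evt >= min_nr[i])
			return 1;
	return 0;
}
```

With the list { 3, 2 } and two completed events, the head-only test
returns 0 while the scan returns 1.  The scan costs a walk of the list
on every completion, so it is not obviously the right fix either.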

This makes me think this elegant task_list approach is doomed.  I
think this is what stopped Ben and me from being interested in this
last time we talked about it :).

Also, is that container_of() and dereference safe in the presence of
racing wake-ups?  It looks like we could dereference a freed wait
entry, read a bogus nr_wait, and decide not to wake.

Andrew, I fear we should remove this from -mm until it's fixed up.

- z