lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 19 Jan 2011 14:21:23 +0100
From:	Jan Kara <jack@...e.cz>
To:	Nick Piggin <npiggin@...il.com>
Cc:	Jan Kara <jack@...e.cz>, Jeff Moyer <jmoyer@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	linux-kernel@...r.kernel.org
Subject: Re: [patch] fs: aio fix rcu lookup

On Wed 19-01-11 11:20:45, Nick Piggin wrote:
> On Wed, Jan 19, 2011 at 10:52 AM, Jan Kara <jack@...e.cz> wrote:
> > On Wed 19-01-11 09:17:23, Nick Piggin wrote:
> >> On Wed, Jan 19, 2011 at 6:01 AM, Jan Kara <jack@...e.cz> wrote:
> >> > On Tue 18-01-11 10:24:24, Nick Piggin wrote:
> >> >> On Tue, Jan 18, 2011 at 6:07 AM, Jeff Moyer <jmoyer@...hat.com> wrote:
> >> >> > Nick Piggin <npiggin@...il.com> writes:
> >> >> >> Do you agree with the theoretical problem? I didn't try to
> >> >> >> write a racer to break it yet. Inserting a delay before the
> >> >> >> get_ioctx might do the trick.
> >> >> >
> >> >> > I'm not convinced, no.  The last reference to the kioctx is always the
> >> >> > process, released in the exit_aio path, or via sys_io_destroy.  In both
> >> >> > cases, we cancel all aios, then wait for them all to complete before
> >> >> > dropping the final reference to the context.
> >> >>
> >> >> That wouldn't appear to prevent a concurrent thread from doing an
> >> >> io operation that requires ioctx lookup, and taking the last reference
> >> >> after the io_cancel thread drops the ref.
> >> >>
> >> >> > So, while I agree that what you wrote is better, I remain unconvinced of
> >> >> > it solving a real-world problem.  Feel free to push it in as a cleanup,
> >> >> > though.
> >> >>
> >> >> Well I think it has to be technically correct first. If there is indeed a
> >> >> guaranteed ref somehow, it just needs a comment.
> >> >  Hmm, the code in io_destroy() indeed looks fishy. We delete the ioctx
> >> > from the hash table and set ioctx->dead which is supposed to stop
> >> > lookup_ioctx() from finding it (see the !ctx->dead check in
> >> > lookup_ioctx()). There's even a comment in io_destroy() saying:
> >> >        /*
> >> >         * Wake up any waiters.  The setting of ctx->dead must be seen
> >> >         * by other CPUs at this point.  Right now, we rely on the
> >> >         * locking done by the above calls to ensure this consistency.
> >> >         */
> >> > But since lookup_ioctx() is called without any lock or barrier nothing
> >> > really seems to prevent the list traversal and ioctx->dead test to happen
> >> > before io_destroy() and get_ioctx() after io_destroy().
> >> >
> >> > But wouldn't the right fix be to call synchronize_rcu() in io_destroy()?
> >> > Because with your fix we could still return 'dead' ioctx and I don't think
> >> > we are supposed to do that...
> >>
> >> With my fix we won't oops, I was a bit concerned about ->dead,
> >> yes but I don't know what semantics it is attempted to have there.
> >  But wouldn't it do something bad if the memory gets reallocated for
> > something else and set to non-zero? E.g. memory corruption?
> 
> I don't see how it would with my patch.
  Ah, you are right. Since the whole structure gets freed only after the RCU
period expires, we are guaranteed to see 0 in ctx->users until we drop
rcu_read_lock. Maybe a comment like the above would be useful at the place
where you use try_get_ioctx() but your patch works.

> >> synchronize_rcu() in io_destroy()  does not prevent it from returning
> >> as soon as lookup_ioctx drops the rcu_read_lock().
> >  Yes, exactly. So references obtained before synchronize_rcu() would be
> > completely fine and valid references and there would be no references after
> > synchronize_rcu() because they'd see 'dead' set. But looking at the code
> > again it still would not be enough because we could still race with
> > io_submit_one() adding new IO to the ioctx which will be freed just after
> > put_ioctx() in do_io_submit().
> >
> > The patch below implements what I have in mind - it should be probably
> > split into two but I'd like to hear comments on that before doing these
> > cosmetic touches ;)
> 
> Well this seems to solve it too, but it is 2 problems here. It is changing
> the semantics of io_destroy which requires the big synchronize_rcu()
> hammer.
> 
> But I don't believe that's necessarily desirable, or required. In fact it is
> explicitly not reuired because it only says that it _may_ cancel outstanding
> requests.
  Well, we are not required to cancel all the outstanding AIO because of the
API requirement, that's granted. But we must do it because of the way how
the code is written. Outstanding IO requests reference ioctx but they are
not counted in ctx->users but in ctx->reqs_active. So the code relies on
the fact that the reference held by the hash table protects ctx from being
freed and io_destroy() waits for requests before dropping the last
reference to ctx. But there's the second race I describe making it possible
for new IO to be created after io_destroy() has waited for all IO to
finish...

								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ