Message-ID: <CAOMFOmUT-WGbV57qDafM-NJ+=s1OzX-2Q6G6hAh6EV8Vt9yBZw@mail.gmail.com>
Date:	Tue, 20 May 2014 11:39:55 -0700
From:	Anatol Pomozov <anatol.pomozov@...il.com>
To:	Sebastian Ott <sebott@...ux.vnet.ibm.com>
Cc:	Benjamin LaHaise <bcrl@...ck.org>, linux-aio@...ck.org,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: hanging aio process

Hi

On Tue, May 20, 2014 at 11:09 AM, Sebastian Ott
<sebott@...ux.vnet.ibm.com> wrote:
> Hi,
>
> On Tue, 20 May 2014, Anatol Pomozov wrote:
>> On Tue, May 20, 2014 at 6:16 AM, Sebastian Ott
>> <sebott@...ux.vnet.ibm.com> wrote:
>> > On Tue, 20 May 2014, Sebastian Ott wrote:
>> >> On Mon, 19 May 2014, Benjamin LaHaise wrote:
>> >> > It is entirely possible that the bug isn't
>> >> > caused by the referenced commit, as the commit you're pointing to merely
>> >> > makes the io_destroy() syscall wait for all outstanding aio requests to
>> >> > complete before returning.
>> >>
>> >> I cannot reproduce this when I revert said commit (on top of 14186fe).
>> >> In case it matters: the arch is s390.
>> >
>> > Hm, ok - maybe that commit is really just highlighting a refcounting bug.
>> > I just compared traces for a good and a few bad cases. The good case:
>> > # tracer: function
>> > #
>> > # entries-in-buffer/entries-written: 16/16   #P:4
>> > #
>> > #                              _-----=> irqs-off
>> > #                             / _----=> need-resched
>> > #                            | / _---=> hardirq/softirq
>> > #                            || / _--=> preempt-depth
>> > #                            ||| /     delay
>> > #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
>> > #              | |       |   ||||       |         |
>> >              fio-732   [003] ....    17.989315: kill_ioctx <-SyS_io_destroy
>> >              fio-739   [003] ....    18.000563: kill_ioctx <-SyS_io_destroy
>> >      ksoftirqd/3-19    [003] ..s.    18.031673: free_ioctx_users <-percpu_ref_kill_rcu
>> >      ksoftirqd/3-19    [003] ..s.    18.031679: free_ioctx_users <-percpu_ref_kill_rcu
>> >              fio-737   [003] ....    18.038765: kill_ioctx <-SyS_io_destroy
>> >      ksoftirqd/3-19    [003] ..s.    18.062488: free_ioctx_reqs <-percpu_ref_kill_rcu
>> >      ksoftirqd/3-19    [003] ..s.    18.062494: free_ioctx_reqs <-percpu_ref_kill_rcu
>> >      kworker/3:1-57    [003] ....    18.062499: free_ioctx <-process_one_work
>> >      kworker/3:1-57    [003] ....    18.062506: free_ioctx <-process_one_work
>> >      ksoftirqd/3-19    [003] ..s.    18.072275: free_ioctx_users <-percpu_ref_kill_rcu
>> >              fio-738   [003] ....    18.102419: kill_ioctx <-SyS_io_destroy
>> >           <idle>-0     [003] .ns.    18.111668: free_ioctx_reqs <-percpu_ref_kill_rcu
>> >      kworker/3:1-57    [003] ....    18.111675: free_ioctx <-process_one_work
>> >      ksoftirqd/3-19    [003] ..s.    18.138035: free_ioctx_users <-percpu_ref_kill_rcu
>> >           <idle>-0     [003] .ns.    18.191665: free_ioctx_reqs <-percpu_ref_kill_rcu
>> >      kworker/3:1-57    [003] ....    18.191671: free_ioctx <-process_one_work
>> >
>> > (4 fio workers, free_ioctx_reqs is called 4 times)
>> >
>> > One of the bad cases:
>> > # tracer: function
>> > #
>> > # entries-in-buffer/entries-written: 14/14   #P:4
>> > #
>> > #                              _-----=> irqs-off
>> > #                             / _----=> need-resched
>> > #                            | / _---=> hardirq/softirq
>> > #                            || / _--=> preempt-depth
>> > #                            ||| /     delay
>> > #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
>> > #              | |       |   ||||       |         |
>> >              fio-834   [000] ....    51.127359: kill_ioctx <-SyS_io_destroy
>> >           <idle>-0     [000] ..s.    51.170237: free_ioctx_users <-percpu_ref_kill_rcu
>> >              fio-828   [001] ....    51.189717: kill_ioctx <-SyS_io_destroy
>> >              fio-833   [001] ..s.    51.220178: free_ioctx_users <-percpu_ref_kill_rcu
>> >           <idle>-0     [000] .ns.    51.220230: free_ioctx_reqs <-percpu_ref_kill_rcu
>> >      kworker/0:3-661   [000] ....    51.220238: free_ioctx <-process_one_work
>> >           <idle>-0     [001] .ns.    51.260188: free_ioctx_reqs <-percpu_ref_kill_rcu
>> >      kworker/1:2-103   [001] ....    51.260198: free_ioctx <-process_one_work
>> >              fio-833   [002] ....    51.287602: kill_ioctx <-SyS_io_destroy
>> >            udevd-868   [002] ..s1    51.332519: free_ioctx_users <-percpu_ref_kill_rcu
>> >           <idle>-0     [002] .ns.    51.450180: free_ioctx_reqs <-percpu_ref_kill_rcu
>> >      kworker/2:2-191   [002] ....    51.450191: free_ioctx <-process_one_work
>> >              fio-835   [003] ....    51.907530: kill_ioctx <-SyS_io_destroy
>> >      ksoftirqd/3-19    [003] ..s.    52.000232: free_ioctx_users <-percpu_ref_kill_rcu
>> >
>> > (1 fio worker in D state, free_ioctx_reqs is called 3 times)
>>
>>
>> Looking at the second trace: the first three io_destroy() calls lead to
>> free_ioctx_reqs(), but the last one never does.
>> Do you have more logs after the last line?
>
> Nope, that was all.
>
>> If there is no further
>> free_ioctx_reqs() then something is still holding the ctx->reqs
>> refcounter. I suggest adding some logging to the kernel to figure out
>> what the refcount value is at that moment.
>
> Yep, I did that this morning (via atomic_read(&ctx->reqs.count) in
> free_ioctx_users before percpu_ref_kill(&ctx->reqs) is called) and
> the value was always the same
>         1 + (1UL << 31)
> even for the free_ioctx_users invocations that were not followed by
> free_ioctx_reqs.

Could you add atomic_read(&ctx->reqs.count) *after* the
percpu_ref_kill(&ctx->reqs) call as well?

Before percpu_ref_kill() the refcounter is a per-CPU one, and to
correctly calculate the real value you need to take the per-CPU deltas
in &ctx->reqs.pcpu_count into account. But after percpu_ref_kill(),
reqs.count becomes a normal shared refcounter.
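
Something like this is what I mean (an untested sketch of
free_ioctx_users() with the debug output added; the pr_info() calls and
the per-CPU summing loop are mine, not existing kernel code):

static void free_ioctx_users(struct percpu_ref *ref)
{
	struct kioctx *ctx = container_of(ref, struct kioctx, users);
	unsigned long pcpu_sum = 0;
	int cpu;

	/* ... cancel the active requests as before ... */

	/*
	 * Still in per-CPU mode here: the real value is the shared
	 * count plus the per-CPU deltas (racy to read, but good
	 * enough for debugging).
	 */
	for_each_possible_cpu(cpu)
		pcpu_sum += *per_cpu_ptr(ctx->reqs.pcpu_count, cpu);
	pr_info("aio: reqs.count before kill: %d, pcpu sum: %lu\n",
		atomic_read(&ctx->reqs.count), pcpu_sum);

	percpu_ref_kill(&ctx->reqs);

	/*
	 * The per-CPU deltas get folded back into the shared counter
	 * from the RCU callback, so this read may still show the
	 * biased value if it runs before that.
	 */
	pr_info("aio: reqs.count after kill: %d\n",
		atomic_read(&ctx->reqs.count));

	percpu_ref_put(&ctx->reqs);
}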

>
>>
>> But *if* there was one more free_ioctx_reqs() that did not call
>> complete(ctx->requests_done), then it would mean that thread does not
>> see the value of the requests_done field. That would make me suspect
>> a memory-reordering problem. I do not know whether s390 reorders
>> stores, but could you please check this patch anyway?
>
> Nope, s390 doesn't reorder stores - I tried the patch anyway, but it
> didn't help.
>
> Regards,
> Sebastian
>>
>> anatol:linux $ git diff
>> diff --git a/fs/aio.c b/fs/aio.c
>> index a0ed6c7..6e3c0bb 100644
>> --- a/fs/aio.c
>> +++ b/fs/aio.c
>> @@ -514,6 +514,7 @@ static void free_ioctx_reqs(struct percpu_ref *ref)
>>         struct kioctx *ctx = container_of(ref, struct kioctx, reqs);
>>
>>         /* At this point we know that there are no any in-flight requests */
>> +       rmb();
>>         if (ctx->requests_done)
>>                 complete(ctx->requests_done);
>>
>> @@ -758,6 +759,7 @@ static void kill_ioctx(struct mm_struct *mm, struct kioctx *ctx,
>>                         vm_munmap(ctx->mmap_base, ctx->mmap_size);
>>
>>                 ctx->requests_done = requests_done;
>> +               wmb();
>>                 percpu_ref_kill(&ctx->users);
>>         } else {
>>                 if (requests_done)
>>
>>
>
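
For the record, here is the pairing the patch above is trying to
establish, shown in isolation (a minimal sketch, not a copy of the aio
code; since this is CPU-to-CPU ordering rather than MMIO ordering,
smp_wmb()/smp_rmb() would be the more idiomatic primitives, though the
mandatory rmb()/wmb() are at least as strong):

/* Writer side (kill_ioctx): */
ctx->requests_done = requests_done;	/* publish the completion pointer */
wmb();					/* order the store above before the kill */
percpu_ref_kill(&ctx->users);

/* Reader side (free_ioctx_reqs): */
rmb();					/* pairs with the wmb() on the writer side */
if (ctx->requests_done)			/* must observe the pointer if it was set */
	complete(ctx->requests_done);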