lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <200902100045.44386.nickpiggin@yahoo.com.au>
Date:	Tue, 10 Feb 2009 00:45:42 +1100
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	Federico Cuello <fedux@...men.org.ar>
Cc:	Ralf Hildebrandt <Ralf.Hildebrandt@...rite.de>,
	Artem Bityutskiy <Artem.Bityutskiy@...ia.com>,
	linux-kernel@...r.kernel.org
Subject: Re: sync-Regression in 2.6.28.2?

On Thursday 05 February 2009 22:54:16 Federico Cuello wrote:
> Nick Piggin wrote:
> > On Thursday 05 February 2009 04:31:00 Federico Cuello wrote:
> >> Nick Piggin wrote:
> >>> [...]
> >>> Thanks, could you reply-to-all when replying to retain ccs please?
> >>>
> >>> Common theme is ext4, which uses no_nrwrite_index_update, and I
> >>> introduced a bug in there which could possibly cause ext4 to go into a
> >>> loop...
> >>>
> >>> Would it be possible if you can test the following patch?
> >>
> >> I'll test it as soon as I get home.
> >
> > Thanks.
> >
> >> Meanwhile, I think the new patch may be slightly wrong. If I understand
> >> correctly PageWriteback(page) is called before nr_to_write is tested for
> >> being > 0 and then decremented if true, but "done" is  not set to 1
> >> until the next iteration. So another call to PageWriteback(page) while
> >> take place and then "done" will be set to true (if wbc->sync_mode ==
> >> WB_SYNC_NONE).
> >>
> >> If nr_to_write == 1 at the beginning of the loop then two pages will be
> >> written.
> >>
> >> I think the test condition should something like:
> >>
> >> if (--nr_to_write <= 0 && wbc->sync_mode == WB_SYNC_NONE) {
> >>     done = 1;
> >>     break;
> >> }
> >
> > I think you're quite right. Good catch. We probably want to prevent
> > nr_to_write from going -ve, though.
> >
> > I think something like this
> >
> >   if (nr_to_write > 0)
> >     nr_to_write--;
> >   if (!nr_to_write && wbc->sync_mode == WB_SYNC_NONE) {
> >     ...
> >
> > Would you care to send a patch?
>
> Ok, after extensive testing I haven't been able to solve the problem.
>
> I'm posting below the patch I used. I tried 3 different patches with one
> successful test run with the one you sent me. I don't know if it was
> just a coincidence as I had no time to test it again.
>
> Now, with the patch below, it stalls with 50% IO-wait (dual core, one
> core at 100%). Perhaps the patch is part of the solution, I don't know.
>
> I also have the sysrq-W logs and I'm also posting them below.
>
> Thanks,
>
>
>
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 08d2b96..9e2ae50 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -981,13 +981,23 @@ continue_unlock:
>                                 }
>                         }
>
> -                       if (wbc->sync_mode == WB_SYNC_NONE) {
> -                               wbc->nr_to_write--;
> -                               if (wbc->nr_to_write <= 0) {
> +                       if (nr_to_write > 0) {
> +                               nr_to_write--;
> +                               if (nr_to_write == 0 && wbc->sync_mode
> == WB_SYNC_NONE) {
> +                                       /*
> +                                        * We stop writing back only if
> we are not
> +                                        * doing integrity sync. In case
> of integrity
> +                                        * sync we have to keep going
> because someone
> +                                        * may be concurrently dirtying
> pages, and we
> +                                        * might have synced a lot of
> newly appeared
> +                                        * dirty pages, but have not
> synced all of the
> +                                        * old dirty pages.
> +                                        */
>                                         done = 1;
>                                         break;
>                                 }
>                         }
> +
>                         if (wbc->nonblocking && bdi_write_congested(bdi)) {
>                                 wbc->encountered_congestion = 1;
>                                 done = 1;

This patch seems good to me. If you would care to add a changelog and
Signed-off-by: line, then we could get it merged?

I am not too sure about this bug. I have reproduced a strange hang with
ext4 (which does include sys_sync and write_cache_pages traces), and
also turned up a lockdep report. Also, we haven't seen any reports of
this problem on other filesystems. So it could be an ext4 bug.

Your traces also have lots of tasks hung waiting for page lock. It is
possible that wakeups get lost, which is fixed by this commit in
mainline
777c6c5f1f6e757ae49ecca2ed72d6b1f523c007

Which might also be your bug.


Any chance you can test this patch (as well as the existing patches
you are using to fix write_cache_pages?).

Thanks,
Nick

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ