linux-kernel - Re: [PATCH 1/1] suspend: delete sys

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <CAJvTdKn1kJ=2knUnXgO+6iK6m7Ns6q1aLT2RCeERU5+gJnzmqw@mail.gmail.com>
Date:	Sat, 20 Jun 2015 01:26:22 -0400
From:	Len Brown <lenb@...nel.org>
To:	Dave Chinner <david@...morbit.com>
Cc:	NeilBrown <neilb@...e.de>,
	One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
	"Rafael J. Wysocki" <rafael@...nel.org>,
	Ming Lei <tom.leiming@...il.com>,
	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	Linux PM List <linux-pm@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Len Brown <len.brown@...el.com>
Subject: Re: [PATCH 1/1] suspend: delete sys_sync()

>> Putting a function trace on sys_sync and executing sync manually,
>> I was able to see it take 100ms,
>> though function trace itself could be contributing to that...
>
> It would seem that way - you need to get the traces to dump to
> something that has no sync overhead....

I don't think that ftrace buffers have sync overhead --
at least not until they are copied into files.

>> Curiously, in another run, sync ran at 15ms, but sd suspend exploded to 300ms.
>> I've seen that in some other results.  Sometimes sync if fast, but sd
>> then more than makes up for it by being slow:-(
>
> Oh, I see that too. Normally That's because the filesystem hasn't
> been told to enter an idle state and so is doing metadata writeback
> IO after the sync. When that happens the sd suspend has wait for
> request queues to drain, IO to complete and device caches to flush.
> This simply cannot be avoided because suspend never tells the
> filesytems to enter an idle state....

I captured a trace of a slow sd suspend.
Apparently, it does two operations -- first  a sync_cache,
then a stop operation.  The sync was fast.   The stop command
was where all the time went.

I'll look at a more modern drive on the same system next week,
just for comparison.

> i.e. remember what I said initially in this thread about suspend
> actually needing to freeze filesystems, not just sync them?

I think with the complexity of file systems and the underlying
devices, yes, we need to think about how to efficiently
and safely suspend/resume them.

But sys_sync is too expensive to have hard-coded in the kernel suspend path.
Some machines can suspend and resume in under 10ms -- they
absolutely do not want sys_sync() hard-coded in the suspend path.

>> FYI,
>> I ran analyze_suspend.py -x2
>> from current directory /tmp, which is mounted on tmpfs,
>> but still found the 2nd sync was very slow -- 200ms
>> vs 6 - 20 ms for the sync preceding the 1st suspend.
>
> So where did that time go? As I pointed out previously, function
> trace will only tell us if the delay is data writeback or not. We
> seem to have confirmed that the delay is, indeed, writeback of dirty
> data. Now we need to identify what the dirty data belongs to: we
> need to trace individual writeback events to see what dirty inodes
> are actually being written.

I expect that analyze_suspend.py is moving data around between
the back-to-back suspends when the -x2 option is used -- will look into it.

thanks,
Len Brown, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
Please read the FAQ at  http://www.tux.org/lkml/