Message-ID: <20140722103443.GV30979@8bytes.org>
Date: Tue, 22 Jul 2014 12:34:44 +0200
From: Joerg Roedel <joro@...tes.org>
To: "Rafael J. Wysocki" <rjw@...ysocki.net>
Cc: Pavel Machek <pavel@....cz>, Len Brown <len.brown@...el.com>,
linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/6 v2] PM / Hibernate: Memory bitmap scalability
improvements
On Tue, Jul 22, 2014 at 02:41:29AM +0200, Rafael J. Wysocki wrote:
> It looks like some specific need motivated the Joerg's work, however,
> so let's just not dismiss the use case lightly without knowing it.
The motivation was to optimize the data structures for machines with
large amounts of RAM without penalizing average machines. On a 12TB
machine you are close to 100000 pages just for one bitmap. Scanning
through that linearly to find a given bit just doesn't scale anymore in
this case.
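As a rough check of that figure, here is a minimal sketch of the arithmetic. It assumes 4 KiB pages and one bitmap bit per page frame; neither is stated above, and the helper name bitmap_pages is made up for illustration.

```python
# Rough estimate of how many pages one hibernation memory bitmap needs.
# Assumptions (not from the mail): 4 KiB pages, one bit per page frame.
PAGE_SIZE = 4096
GIB = 1 << 30
TIB = 1 << 40


def bitmap_pages(ram_bytes, page_size=PAGE_SIZE):
    """Return the number of pages needed to hold one bit per page frame."""
    pfns = ram_bytes // page_size        # page frames to track
    bits_per_page = page_size * 8        # bits that fit in one bitmap page
    return -(-pfns // bits_per_page)     # ceiling division


print(bitmap_pages(12 * TIB))  # 98304, i.e. "close to 100000"
print(bitmap_pages(16 * GIB))  # 128
```

Under these assumptions a 16GB machine needs only 128 bitmap pages, which is why a linear walk is cheap there and expensive at 12TB.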
Same for the algorithm currently used in swsusp_free(). Scanning over
every pfn also doesn't scale well anymore in these ranges. I agree that
the optimizations are not noticeable on average systems (see below), but
they are still measurable.
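To illustrate why visiting every pfn scales poorly, the sketch below compares testing each pfn in turn with walking only the bits that are actually set. It is a toy model using a Python integer as the bitmap, not the kernel code and not necessarily what the patches do; both function names are hypothetical.

```python
# Toy comparison: visit every pfn vs. visit only the set bits.
# The bitmap is modelled as a Python int; this is an illustration only.

def scan_every_pfn(bitmap, max_pfn):
    """Test the bit of every pfn in [0, max_pfn). Returns (set pfns, steps)."""
    found, steps = [], 0
    for pfn in range(max_pfn):
        steps += 1
        if (bitmap >> pfn) & 1:
            found.append(pfn)
    return found, steps


def scan_set_bits(bitmap):
    """Jump from one set bit to the next. Returns (set pfns, steps)."""
    found, steps = [], 0
    while bitmap:
        lowest = bitmap & -bitmap            # isolate the lowest set bit
        found.append(lowest.bit_length() - 1)
        bitmap ^= lowest                     # clear it and continue
        steps += 1
    return found, steps


bitmap = (1 << 3) | (1 << 5) | (1 << 70000)
slow, slow_steps = scan_every_pfn(bitmap, 100000)
fast, fast_steps = scan_set_bits(bitmap)
print(slow == fast, slow_steps, fast_steps)
```

Both functions return the same pfns; the first does work proportional to the number of page frames, the second proportional to the number of set bits.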
I also see how the problem could be solved differently, but what I
didn't get from the discussion yet is: What is actually *wrong* with
*this* approach?
> That said I would like to know how much time we save through this
> optimization relative to the total hibernation time on systems with
> various amounts of memory (say, 4 GB, 8 GB, 16 GB, 32 GB, more) and
> whether or not it makes hibernation slower in any case.
Okay, I tested on a 16GB system (actually 15GB, one GB is taken by the
GPU). Since the total time for hibernation depends not only on the
amount of RAM in the machine but more on the size of the hibernation
image and the speed of the disk, there is not much value in measuring a
complete resume cycle. The time needed there depends more on the system
and the workload than anything else.
So my test was to resume from a swap partition that contained no image.
Here is the result from the 16GB machine. First with a v3.16-rc6 kernel
without my changes:
kv:~/base # time perf record /usr/sbin/resume /dev/sda1
resume: libgcrypt version: 1.5.3
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data (~823 samples) ]
real 0m0.084s
user 0m0.012s
sys 0m0.064s
Here is the result with my patches on top:
kv:~/hibernate # time perf record /usr/sbin/resume /dev/sda1
resume: libgcrypt version: 1.5.3
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.014 MB perf.data (~602 samples) ]
real 0m0.032s
user 0m0.003s
sys 0m0.027s
So we save around 50ms (or 62% of the run time) already on this 16GB machine.
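For reference, the savings follow directly from the two "real" times reported by time(1) above. A minimal sketch of that arithmetic (variable names are only for illustration):

```python
# Wall-clock ("real") times from the two runs above, in milliseconds.
base_ms = 84      # v3.16-rc6 without the patches
patched_ms = 32   # with the patches applied

saved_ms = base_ms - patched_ms
saved_pct = 100 * saved_ms / base_ms

print(f"saved {saved_ms} ms ({saved_pct:.0f}% of the unpatched run time)")
```

This is a single run per kernel, so the numbers show the direction of the change rather than a precise measurement.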
Thanks,
Joerg