Message-ID: <336532d0-57f2-a430-d195-13c13f70e25a@collabora.com>
Date: Tue, 16 Aug 2016 12:46:51 -0400
From: Robert Foss <robert.foss@...labora.com>
To: Michal Hocko <mhocko@...nel.org>, sonnyrao@...omium.org
Cc: corbet@....net, akpm@...ux-foundation.org, vbabka@...e.cz,
koct9i@...il.com, hughd@...gle.com, n-horiguchi@...jp.nec.com,
minchan@...nel.org, john.stultz@...aro.org,
ross.zwisler@...ux.intel.com, jmarchan@...hat.com,
hannes@...xchg.org, keescook@...omium.org, viro@...iv.linux.org.uk,
gorcunov@...nvz.org, plaguedbypenguins@...il.com,
rientjes@...gle.com, eric.engestrom@...tec.com, jdanis@...gle.com,
calvinowens@...com, adobriyan@...il.com, jann@...jh.net,
kirill.shutemov@...ux.intel.com, ldufour@...ux.vnet.ibm.com,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
Ben Zhang <benzh@...omium.org>,
Bryan Freed <bfreed@...omium.org>,
Filipe Brandenburger <filbranden@...omium.org>,
Mateusz Guzik <mguzik@...hat.com>
Subject: Re: [PACTH v2 0/3] Implement /proc/<pid>/totmaps
On 2016-08-16 03:12 AM, Michal Hocko wrote:
> On Mon 15-08-16 12:25:10, Robert Foss wrote:
>>
>>
>> On 2016-08-15 09:42 AM, Michal Hocko wrote:
> [...]
>>> The use case is to speed up monitoring of
>>> memory consumption in environments where RSS isn't precise.
>>>
>>> For example Chrome tends to have many processes which have hundreds of VMAs
>>> with a substantial amount of shared memory, and the error of using
>>> RSS rather than PSS tends to be very large when looking at overall
>>> memory consumption. PSS isn't kept as a single number that's exported
>>> like RSS, so to calculate PSS means having to parse a very large smaps
>>> file.
>>>
>>> This process is slow and has to be repeated for many processes, and we
>>> found that just the act of doing the parsing was taking up a
>>> significant amount of CPU time, so this patch is an attempt to make
>>> that process cheaper.
>
> Well, this is slow because it requires the pte walk otherwise you cannot
> know how many ptes map the particular shared page. Your patch
> (totmaps_proc_show) does the very same page table walk because in fact
> it is unavoidable. So what exactly is the difference except for the
> userspace parsing which is quite trivial e.g. my currently running Firefox
> has
> $ awk '/^[0-9a-f]/{print}' /proc/4950/smaps | wc -l
> 984
>
> quite some VMAs, yet parsing it spends basically all the time in the kernel...
>
> $ /usr/bin/time -v awk '/^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}' /proc/4950/smaps
> rss:1112288 pss:1096435
> Command being timed: "awk /^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss} /proc/4950/smaps"
> User time (seconds): 0.00
> System time (seconds): 0.02
> Percent of CPU this job got: 91%
> Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.02
>
> So I am not really sure I see the performance benefit.
>
I did some performance measurements of my own, and there seems to be
about a 2x performance gain to be had: 25 reads take 0.45 s of wall
clock time via totmaps versus 0.89 s via awk over smaps (numbers below).
To me that is substantial, and a larger gain than is commonly seen.

There is naturally also the benefit that this is a lot easier to
interact with programmatically.
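
For reference, the userspace work being compared against is summing the
per-VMA Rss and Pss lines of smaps. A minimal Python sketch of that
summation, roughly equivalent to the awk one-liner quoted above, might
look like the following. The function names are hypothetical and the
field layout ("Name:  <value> kB") is assumed from the usual smaps
format; this is an illustration, not part of the patch.

```python
def sum_smaps_fields(smaps_text, fields=("Rss", "Pss")):
    """Sum the per-VMA values (in kB) of the given smaps fields."""
    totals = {name: 0 for name in fields}
    for line in smaps_text.splitlines():
        key, sep, rest = line.partition(":")
        # Value lines are assumed to look like "Pss:        12 kB".
        # VMA header lines also contain ':' (in the device number), but
        # their text before the first ':' never equals a field name.
        if sep and key in totals:
            totals[key] += int(rest.split()[0])
    return totals


def read_process_totals(pid):
    """Read /proc/<pid>/smaps and return summed Rss/Pss in kB."""
    with open(f"/proc/{pid}/smaps") as f:
        return sum_smaps_fields(f.read())
```

Something like read_process_totals(5025) would then return a dict with
the two totals, provided the caller is allowed to read that process's
smaps file.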
$ ps aux | grep firefox
robertfoss 5025 24.3 13.7 3562820 2219616 ? Rl Aug15 277:44 /usr/lib/firefox/firefox https://allg.one/xpb
$ awk '/^[0-9a-f]/{print}' /proc/5025/smaps | wc -l
1503
$ /usr/bin/time -v -p zsh -c "(repeat 25 {cat /proc/5025/totmaps})"
[...]
Command being timed: "zsh -c (repeat 25 {cat /proc/5025/totmaps})"
User time (seconds): 0.00
System time (seconds): 0.40
Percent of CPU this job got: 90%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.45
$ /usr/bin/time -v -p zsh -c "repeat 25 { awk '/^Rss/{rss+=\$2}
/^Pss/{pss+=\$2} END {printf \"rss:%d pss:%d\n\", rss, pss}\'
/proc/5025/smaps }"
[...]
Command being timed: "zsh -c repeat 25 { awk '/^Rss/{rss+=$2}
/^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}\'
/proc/5025/smaps }"
User time (seconds): 0.37
System time (seconds): 0.45
Percent of CPU this job got: 92%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.89
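
As a quick check of the "about 2x" figure, the ratios can be worked out
directly from the two /usr/bin/time reports above. The sketch below only
restates those reported numbers (25 iterations each); the variable and
function names are made up for illustration.

```python
# Figures copied from the two /usr/bin/time -v reports (seconds).
totmaps = {"user": 0.00, "system": 0.40, "elapsed": 0.45}
smaps_awk = {"user": 0.37, "system": 0.45, "elapsed": 0.89}


def speedup(slow, fast):
    """Return how many times larger `slow` is than `fast`."""
    return slow / fast


elapsed_ratio = speedup(smaps_awk["elapsed"], totmaps["elapsed"])
cpu_ratio = speedup(
    smaps_awk["user"] + smaps_awk["system"],
    totmaps["user"] + totmaps["system"],
)
system_ratio = speedup(smaps_awk["system"], totmaps["system"])

print(f"elapsed: {elapsed_ratio:.2f}x")
print(f"user+system: {cpu_ratio:.2f}x")
print(f"system only: {system_ratio:.2f}x")
```

Wall clock time and total CPU time both come out at roughly 2x. System
time alone differs much less (0.45 s versus 0.40 s), so in this
measurement most of the difference is the 0.37 s of user time in awk.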