linux-kernel - Re: [PATCH] proc: faster /proc/*/status


Message-ID: <20160807085322.GB1871@p183.telecom.by>
Date:	Sun, 7 Aug 2016 11:53:22 +0300
From:	Alexey Dobriyan <adobriyan@...il.com>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
	viro@...iv.linux.org.uk
Subject: Re: [PATCH] proc: faster /proc/*/status

On Sat, Aug 06, 2016 at 08:16:27PM -0700, Andi Kleen wrote:
> Alexey Dobriyan <adobriyan@...il.com> writes:
> > -
> > +	seq_printf(m, "State:\t%s", get_task_state(p));
> > +
> > +	seq_puts(m, "\nTgid:\t");
> 
> The only difference should be the format string.
> 
> Scanning the format string really shouldn't be that expensive?!?

Surprise, it is (see my reply to Al).

What seq_put_decimal_ull() did is the equivalent of

	seq << "foo";
	seq << bar;
	seq << '\n';

No precisions, no widths, no padding, no upper- or lowercasing.
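
To make the comparison concrete, here is a minimal userspace sketch of that style of output. It is not the kernel implementation: struct sketch_buf, sketch_puts() and sketch_put_decimal_ull() are hypothetical names standing in for struct seq_file, seq_puts() and seq_put_decimal_ull(), and overflow is handled by simply dropping the write. The point is only that nothing here ever looks at a format string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct seq_file: a fixed buffer and a fill count. */
struct sketch_buf {
	char data[128];
	size_t count;
};

/* Append a literal string as-is; drop it if it does not fit. */
static void sketch_puts(struct sketch_buf *b, const char *s)
{
	size_t len = strlen(s);

	if (len > sizeof(b->data) - b->count)
		return;
	memcpy(b->data + b->count, s, len);
	b->count += len;
}

/* Append an unsigned decimal: no width, precision, padding or case handling. */
static void sketch_put_decimal_ull(struct sketch_buf *b, unsigned long long v)
{
	char tmp[20];	/* 2^64 - 1 has 20 decimal digits */
	size_t n = 0;

	/* Generate digits least-significant first. */
	do {
		tmp[n++] = (char)('0' + v % 10);
		v /= 10;
	} while (v);

	if (n > sizeof(b->data) - b->count)
		return;

	/* Copy them out in reverse so the result reads most-significant first. */
	while (n)
		b->data[b->count++] = tmp[--n];
}
```

Emitting a "Tgid:" line is then three direct calls: sketch_puts(&b, "\nTgid:\t"), sketch_put_decimal_ull(&b, tgid), sketch_puts(&b, "\n"). A printf-style call has to walk the format string and decode each conversion before it can do the same work.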

> It would be better if you could find out why that is slow and optimize
> it. Then you would benefit every seq_printf user, not just this
> special case.
> 
> Perhaps it could benefit from some of the bit masking tricks to
> scan the string with wider tests than a word.

And then what? Parsing the format string will still be there.

This is the first line of the profile of the first function (format_decode):

       │     static noinline_for_stack
       │     int format_decode(const char *fmt, struct printf_spec *spec)
       │     {
 10.38 │       push   %rbp			<===
  1.07 │       mov    %rsp,%rbp
  1.09 │       push   %r12
  4.51 │       mov    %rsi,%r12
  1.40 │       push   %rbx
  1.86 │       mov    %rdi,%rbx
       │       sub    $0x8,%rsp

It is so bloated that gcc needs to be told not to inline it
(noinline_for_stack) to keep stack usage under control.