Message-Id: <200709211651.51851.vda.linux@googlemail.com>
Date:	Fri, 21 Sep 2007 16:51:51 +0100
From:	Denys Vlasenko <vda.linux@...glemail.com>
To:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
Cc:	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [patch 7/8] Immediate Values - Documentation

On Friday 21 September 2007 14:31, Mathieu Desnoyers wrote:
> > Immediates make code bigger, right?
> 
> Nope.
> 
> Example:
> 
> char x;
> 
> void testb(void)
> {
>         if (x > 5)
>                 testa();
> }
> 
> Would turn into:
>   56:   b0 00                   mov    $0x0,%al
>   58:   3c 05                   cmp    $0x5,%al
>   5a:   7e 05                   jle    61 <testb+0x11>
> 
> (6 bytes)
> 
> Rather than:
> 
>   56:   80 3d 00 00 00 00 05    cmpb   $0x5,0x0
>   5d:   7e 05                   jle    64 <testb+0x14>
> 
> (9 bytes)

For a 32-bit value, you won't be so lucky: the mov then carries a 4-byte immediate.

> So actually, immediate values well used make the code smaller. By the
> way, I recommend using the smallest immediate values required, which
> will often be a single byte.

I agree with this wholeheartedly. However, the current kernel mostly uses int
even for yes/no style flags.

> > getppid is one of the lightest syscalls out there.
> > What kind of speedup do you see on a real-world test
> > (two processes exchanging data through pipes, for example)?
> > 
> 
> With the size of the caches we currently have, that kind of workload
> will not show any measurable difference: the signal/noise ratio is way
> too small to detect that kind of performance difference under such
> workload. Try it if you want.

Exactly my point: this speedup is not measurable on a realistic workload.

> The real-world speedup I am interested in is to have almost -zero-
> tracer impact, which implies being undetectable even in the smallest and
> shortest functions. I guess nobody is interested in adding a measurable
> performance hit to kmalloc fast path, right?
> 
> > > +Therefore, not only is it interesting to use the immediate values to dynamically
> > > +activate dormant code such as the markers, but I think it should also be
> > > +considered as a replacement for many of the "read mostly" static variables.
> > 
> > What effect will that have on "size vmlinux" on AMD64?
> 
> Without considering kernel/immediate.o, it will make the code smaller
> and add 3*8bytes=24bytes of data in the __immediate section per
> immediate value reference (data only used for updates).

Yes. *Per immediate value reference*.

Therefore I don't think it's wise to recommend using __immediate
for any variable which is referenced many times, with "many" defined as
"more than ten".

IOW: I think that this last paragraph shouldn't be there:

On Tuesday 18 September 2007 22:07, Mathieu Desnoyers wrote:
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
> ---
>  Documentation/immediate.txt |  228 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 228 insertions(+)
>...
> +Therefore, not only is it interesting to use the immediate values to dynamically
> +activate dormant code such as the markers, but I think it should also be
> +considered as a replacement for many of the "read-mostly" static variables.


A few crazy ideas for how you can make it slightly less painful on 64-bit archs:

* Pack the last long ('size') into the low bits of the other fields.
  (I expect link stage problems, tho)


* Make the last field uint8_t and pack the whole struct into 17 bytes (__attribute__((packed)))
  instead of 24 bytes.
  Expect align-happy folks to faint left and right at such a horrendous crime :) but
  other than that, it will work. Updates of immediates will *maybe* get a tiny bit slower
  (which is unimportant anyway).

  [btw, this can be done for i386 too]


* Turn the longs into int32_t, since the kernel's text addresses (at least on AMD64)
  fit into int32_t (sign-extending gives you the correct 64-bit address):

  ffffffff80200000 A _text
  ffffffff80200000 T startup_64
  ffffffff802000b7 t ident_complete
  ffffffff80200110 T secondary_startup_64
  ffffffff802001a8 T initial_code
  ffffffff802001b0 T init_rsp
  ffffffff802001b8 t bad_address
  ffffffff802001c0 T early_idt_handler

  [I hope there is a suitable reloc type for AMD64 and ld won't complain]
--
vda