lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 21 Sep 2007 09:31:03 -0400
From:	Mathieu Desnoyers <>
To:	Denys Vlasenko <>
Subject: Re: [patch 7/8] Immediate Values - Documentation

* Denys Vlasenko ( wrote:
> On Monday 27 August 2007 16:59, Mathieu Desnoyers wrote:
> > +We can therefore affirm that adding 2 markers to getppid, on a system with high
> > +memory pressure, would have a performance hit of at least 6.0% on the system
> > +call time, all within the uncertainty limits of these tests. The same applies to
> > +other kernel code paths. The smaller those code paths are, the highest the
> > +impact ratio will be.
> Immediates make code bigger, right?



char x;

void testb(void)
        if (x > 5)

Would turn into:
  56:   b0 00                   mov    $0x0,%al
  58:   3c 05                   cmp    $0x5,%al
  5a:   7e 05                   jle    61 <testb+0x11>

(6 bytes)

Rather than:

  56:   80 3d 00 00 00 00 05    cmpb   $0x5,0x0
  5d:   7e 05                   jle    64 <testb+0x14>

(9 bytes)

So actually, immediate values well used make the code smaller. By the
way, I recommend using the smallest immediate values required, which
will often be a single byte.

> What will happen on a system with high *icache* pressure?

It *helps* :) And by the way, icache on recent x86 and x86_64 is a trace
cache, so I don't see your point anyway.

> There a lot of inline happy and/or C++ folks out there
> in the userland, they routinely have programs in tens of megabytes range.
> getppid is one of the lightest syscalls out there.
> What kind of speedup do you see on a real-world test
> (two processes exchaging data through pipes, for example)?

With the size of the caches we currently have, that kind of workload
will not show any measurable difference: the signal/noise ratio is way
to small to detect that kind of performance difference under such
workload. Try it if you want.

The real-world speedup I am interested into is to have almost -zero-
tracer impact, which imples being undetectable even in the smallest and
shortest functions. I guess nobody is interested in adding a measurable
performance hit to kmalloc fast path, right ?

> > +Therefore, not only is it interesting to use the immediate values to dynamically
> > +activate dormant code such as the markers, but I think it should also be
> > +considered as a replacement for many of the "read mostly" static variables.
> What effect that will have on "size vmlinux" on AMD64?

Without considering kernel/immediate.o, it will make the code smaller
and add 3*8bytes=24bytes of data in the __immediate section per
immediate value reference (data only used for updates).


> --
> vda

Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at
Please read the FAQ at

Powered by blists - more mailing lists