lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 19 May 2009 17:03:46 -0400
From:	Jason Baron <jbaron@...hat.com>
To:	linux-kernel@...r.kernel.org
Cc:	fweisbec@...il.com, mingo@...e.hu, laijs@...fujitsu.com,
	rostedt@...dmis.org, peterz@...radead.org,
	mathieu.desnoyers@...ymtl.ca, jiayingz@...gle.com,
	mbligh@...gle.com, roland@...hat.com, fche@...hat.com
Subject: [PATCH 0/3] tracepoints: delay argument evaluation

hi,

After disassembling some of the tracepoints, I've noticed that arguments that
are passed as macros or that perform  dereferences, evaluate prior to the
tracepoint on/off check. This means that we are needlessly impacting the 
off case.

I am proposing to fix this by adding a macro that first checks for on/off and
then calls 'trace_##name', preserving type checking. Thus, callsites have to 
move from:

trace_block_bio_complete(md->queue, bio);

to:

tracepoint_call(block_bio_complete,  md->queue, bio);

I've tried '__always_inline', but that did not fix this issue. Obviously this
change will require changes to all the callsites. But, that shouldn't be
very hard, I've already included the scheduler and block changes with this
patch. I think its important to minimize code execution in the off case, and
thus going through all the callsites is well worth it. If we agree on this
change, I can change the rest in very short order.

Below I'm also showing the assembly in the 'dec_pending()' function before and
after this change to show the difference it makes. The arguments to the
tracepoint are as above, 'md->queue' and 'bio'. Notice the 2 extra instructions,
before the initial 'je', that could be moved after the 'je'.

before:

ffffffff8137b2a3:       83 3d de 90 4b 00 00    cmpl   $0x0,0x4b90de(%rip)        # ffffffff81834388 <__tracepoint_block_bio_complete+0x8>
ffffffff8137b2aa:       49 8b 45 50             mov    0x50(%r13),%rax
ffffffff8137b2ae:       48 89 45 d0             mov    %rax,-0x30(%rbp)
ffffffff8137b2b2:       74 1f                   je     ffffffff8137b2d3 <dec_pending+0x101>
ffffffff8137b2b4:       48 8b 1d d5 90 4b 00    mov    0x4b90d5(%rip),%rbx        # ffffffff81834390 <__tracepoint_block_bio_complete+0x10>
ffffffff8137b2bb:       48 85 db                test   %rbx,%rbx
ffffffff8137b2be:       74 13                   je     ffffffff8137b2d3 <dec_pending+0x101>
ffffffff8137b2c0:       4c 89 f6                mov    %r14,%rsi
ffffffff8137b2c3:       48 8b 7d d0             mov    -0x30(%rbp),%rdi
ffffffff8137b2c7:       ff 13                   callq  *(%rbx)
ffffffff8137b2c9:       48 83 c3 08             add    $0x8,%rbx
ffffffff8137b2cd:       48 83 3b 00             cmpq   $0x0,(%rbx)
ffffffff8137b2d1:       eb eb                   jmp    ffffffff8137b2be <dec_pending+0xec>
ffffffff8137b2d3:       44 89 fe                mov    %r15d,%esi

after:

ffffffff8137b2a3:       83 3d de 90 4b 00 00    cmpl   $0x0,0x4b90de(%rip)        # ffffffff81834388 <__tracepoint_block_bio_complete+0x8>
ffffffff8137b2aa:       74 27                   je     ffffffff8137b2d3 <dec_pending+0x101>
ffffffff8137b2ac:       49 8b 45 50             mov    0x50(%r13),%rax
ffffffff8137b2b0:       48 8b 1d d9 90 4b 00    mov    0x4b90d9(%rip),%rbx        # ffffffff81834390 <__tracepoint_block_bio_complete+0x10>
ffffffff8137b2b7:       48 89 45 d0             mov    %rax,-0x30(%rbp)
ffffffff8137b2bb:       48 85 db                test   %rbx,%rbx
ffffffff8137b2be:       74 13                   je     ffffffff8137b2d3 <dec_pending+0x101>
ffffffff8137b2c0:       4c 89 f6                mov    %r14,%rsi
ffffffff8137b2c3:       48 8b 7d d0             mov    -0x30(%rbp),%rdi
ffffffff8137b2c7:       ff 13                   callq  *(%rbx)
ffffffff8137b2c9:       48 83 c3 08             add    $0x8,%rbx
ffffffff8137b2cd:       48 83 3b 00             cmpq   $0x0,(%rbx)
ffffffff8137b2d1:       eb eb                   jmp    ffffffff8137b2be <dec_pending+0xec>
ffffffff8137b2d3:       44 89 fe                mov    %r15d,%esi


thanks,

-Jason



Jason Baron (3):
  -add wrapper so we don't have argument resolution overhead
  -add scheduler wrapper calls
  -add block layer trace wrappers

 block/blk-core.c           |   27 ++++++++++++++-------------
 block/elevator.c           |    6 +++---
 drivers/md/dm.c            |    7 ++++---
 fs/bio.c                   |    2 +-
 include/linux/tracepoint.h |   20 +++++++++++++++++++-
 kernel/exit.c              |    6 +++---
 kernel/fork.c              |    2 +-
 kernel/kthread.c           |    4 ++--
 kernel/sched.c             |   10 +++++-----
 kernel/signal.c            |    2 +-
 mm/bounce.c                |    2 +-
 11 files changed, 54 insertions(+), 34 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ