lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1388422545.26796.36.camel@joe-AO722>
Date:	Mon, 30 Dec 2013 08:55:45 -0800
From:	Joe Perches <joe@...ches.com>
To:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:	akpm@...ux-foundation.org, geert@...ux-m68k.org, jkosina@...e.cz,
	viro@...iv.linux.org.uk, davem@...emloft.net,
	keescook@...omium.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] lib/vsprintf: add %pT[C012] format specifier

On Sun, 2013-12-29 at 21:13 +0900, Tetsuo Handa wrote:
> Tetsuo Handa wrote:
> > Joe Perches wrote:
> > > On Sat, 2013-12-28 at 12:08 -0800, Andrew Morton wrote:
> > > > On Sat, 28 Dec 2013 11:53:25 -0800 Joe Perches <joe@...ches.com> wrote:
> > > > 
> > > > > > > #define PRINTK_PID      "\002"
> > > > > > > #define PRINTK_TASK_ID  "\003"  /* "comm:pid" */
> > > 
> > > > > > >
> > > > > > >         printk(PRINTK_TASK_ID ": hair on fire\n");
> > > > > > >
> > > > > > > It's certainly compact.  I doubt if there's any existing code which
> > > > > > > deliberately prints control chars?
> > 
> > What about using bytes from \x7F to \xFF ?
> 
> Here is a draft patch. With this approach, we can embed whatever variables
> vsnprintf() can access (not only current thread's attributes seen from init and
> current namespace but also other variables like current time) into the format
> string without passing corresponding argument.
> 
> Is this approach acceptable? (Of course, I'll add lines like
> 
>   #define PRINT_FORMAT    "\x7F"
>   #define PRINT_TASK_COMM "\x80"
>   #define PRINT_TASK_PID  "\x81"
> 
> for final version.) What variables (e.g. current->comm, current->pid,
> task_tgid_vnr(current)) do we want to pass using builtin tokens?
[]
> The vast majority of ->comm accesses are accessing current->comm, for
> debug reasons. I'm counting 350-odd sites. At all these sites it's
> pointless passing `current' to the printk function at all!

I get:

$ grep-2.5.4 -rP --include=*.[ch] -oh \
  "\b(?:printk|[a-z_]+_(?:printk|emerg|alert|crit|err|warn|notice|info|debug|dbg))[^;]*\bcurrent->[\w_]+" * | \
  grep -P -oh "\bcurrent->[\w_]+"| sort | uniq -c | sort -rn
    380 current->pid
    267 current->comm
     34 current->mm
     25 current->thread
     17 current->journal_info
      8 current->flags
      7 current->tgid
      5 current->active_mm
      2 current->sas_ss_sp
      2 current->addr_limit
      1 current->uid
      1 current->signal
      1 current->sas_ss_size
      1 current->ret_stack
      1 current->memcg_batch
      1 current->kernel_stack_page
      1 current->group_leader
      1 current->exit_code
      1 current->cred
      1 current->blocked

> This patch introduces a set of vsprintf tokens for embedding globally visible
> arguments into the format string, so that we can reduce text size a bit (and
> in the future make reading of task_struct->comm consistent) by stop generating
> the code to pass an argument which the callee already has access to, with an
> assumption that we can avoid using bytes >= 0x7F within the format string.
> 
> Some examples for converting direct ->comm users are shown below.
> 
>   pr_info("comm=%s\n", current->comm); => pr_info("comm=\x80\n");
>   pr_info("pid=%d\n", current->pid);   => pr_info("pid=\x81\n");
>   pr_info("%s(%d)\n", current->comm, current->pid); => pr_info("\x80(\x81)\n");
>   pr_info("[%-6.6s]\n", current->comm); => pr_info("\x7F-6.6\x80\n");

I too prefer using the #define CURRENT_<FOO> string approach
rather than direct embedding of string.

	pr_info("comm=" CURRENT_COMM "\n");

Also I prefer using ASCII SUB (26 \x1a \032 ^z) or maybe
PU1 - 145 or PU2 - 146, as an initiator byte as it takes
up much less of the control word space instead of using
multiple values like \x80, \x81, \x82, etc.  Using an
initiator byte seems more extensible too.

http://en.wikipedia.org/wiki/C0_and_C1_control_codes


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ