lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190324223202.GA875@sultan-box.localdomain>
Date:   Sun, 24 Mar 2019 15:32:02 -0700
From:   Sultan Alsawaf <sultan@...neltoast.com>
To:     Rasmus Villemoes <linux@...musvillemoes.dk>
Cc:     akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
        Nathan Chancellor <natechancellor@...il.com>
Subject: Re: [RFCv2] string: Use faster alternatives when constant arguments
 are used

On Sun, Mar 24, 2019 at 10:17:49PM +0100, Rasmus Villemoes wrote:
> gcc already knows the semantics of these functions and can optimize
> accordingly. E.g. for strcpy() of a literal to a buffer, gcc readily
> compiles

The example you gave appears to get optimized accordingly, but there are
numerous sane counterexamples that don't get optimized.

On arm64, with GCC 8.3.0, let's look at the simple strcmp usage in
kernel/trace/preemptirq_delay_test.c:

static int preemptirq_delay_run(void *data)
{
	unsigned long flags;

	if (!strcmp(test_mode, "irq")) {
		local_irq_save(flags);
		busy_wait(delay);
		local_irq_restore(flags);
	} else if (!strcmp(test_mode, "preempt")) {
		preempt_disable();
		busy_wait(delay);
		preempt_enable();
	}

	return 0;
}

This is what happens without my patch:

preemptirq_delay_run:
.LFB2517:
	.cfi_startproc
	stp	x29, x30, [sp, -32]!
	.cfi_def_cfa_offset 32
	.cfi_offset 29, -32
	.cfi_offset 30, -24
	adrp	x1, .LC0
	add	x1, x1, :lo12:.LC0
	mov	x29, sp
	stp	x19, x20, [sp, 16]
	.cfi_offset 19, -16
	.cfi_offset 20, -8
	adrp	x19, .LANCHOR0
	add	x19, x19, :lo12:.LANCHOR0
	mov	x0, x19
>	bl	strcmp
	cbz	w0, .L22
	adrp	x1, .LC1
	mov	x0, x19
	add	x1, x1, :lo12:.LC1
>	bl	strcmp
	cbz	w0, .L23

The calls to strcmp are not optimized out. Here is what this snippet looks like
after my patch:

preemptirq_delay_run:
.LFB2517:
	.cfi_startproc
	stp	x29, x30, [sp, -32]!
	.cfi_def_cfa_offset 32
	.cfi_offset 29, -32
	.cfi_offset 30, -24
	adrp	x0, .LANCHOR0
	mov	w1, 29289
	mov	x29, sp
	ldr	w2, [x0, #:lo12:.LANCHOR0]
	movk	w1, 0x71, lsl 16
	add	x3, x0, :lo12:.LANCHOR0
	cmp	w2, w1
	beq	.L23
	ldr	x1, [x0, #:lo12:.LANCHOR0]
	mov	x0, 29296
	movk	x0, 0x6565, lsl 16
	movk	x0, 0x706d, lsl 32
	movk	x0, 0x74, lsl 48
	cmp	x1, x0
	beq	.L24

The calls to strcmp were optimized out completely, not even replaced by a call
to memcmp.

I can produce lots of kernel examples that don't seem to follow basic str*
optimization, which is what prompted me to write this in the first place.

> Does this even compile? It's a well-known (or perhaps
> not-so-well-known?) pitfall that __builtin_constant_p() is not
> guaranteed to be usable in __builtin_choose_expr() - the latter only
> accepts bona fide integer constant expressions, while evaluation of
> __builtin_constant_p can be delayed until various optimization phases.

It not only compiles, but also seems to work correctly from what I've observed.

> This seems to be buggy - you don't know that src is at least as long as
> dest. And arguing "if it's shorter, there's a nul byte, which will
> differ from dest at that point, so memcmp will/should stop" means that
> the whole word-at-a-time argument would be out.

You mean reading the last word of a string could read out of bounds of the
string when using memcmp? There are lots of memcmp instances using a literal
string in the kernel; I don't think it would be hard to find one that violates
what you've pointed out, so I'm not really sure why it's a problem.

After a few minutes of grepping, it seems like the memcmp usage in
drivers/md/dm-integrity.c can read out of bounds on its arguments:
} else if (!memcmp(opt_string, "internal_hash:", strlen("internal_hash:"))) {
	r = get_alg_and_key(opt_string, &ic->internal_hash_alg, &ti->error,
			    "Invalid internal_hash argument");
	if (r)
		goto bad;
} else if (!memcmp(opt_string, "journal_crypt:", strlen("journal_crypt:"))) {
	r = get_alg_and_key(opt_string, &ic->journal_crypt_alg, &ti->error,
			    "Invalid journal_crypt argument");
	if (r)
		goto bad;
} else if (!memcmp(opt_string, "journal_mac:", strlen("journal_mac:"))) {

Where opt_string is a dynamically-set argument with no specified minimum length.

Sultan

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ