lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 18 Jul 2016 16:16:08 +0200
From:	Philipp Zabel <p.zabel@...gutronix.de>
To:	Ricardo Ribalda Delgado <ricardo.ribalda@...il.com>
Cc:	Jonathan Corbet <corbet@....net>,
	Mauro Carvalho Chehab <mchehab@...nel.org>,
	Hans Verkuil <hverkuil@...all.nl>,
	Markus Heiser <markus.heiser@...marit.de>,
	Laurent Pinchart <laurent.pinchart@...asonboard.com>,
	Helen Mae Koike Fornazier <helen.koike@...labora.co.uk>,
	Antti Palosaari <crope@....fi>,
	Shuah Khan <shuahkh@....samsung.com>,
	linux-doc@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
	linux-media <linux-media@...r.kernel.org>
Subject: Re: [PATCH v4 09/12] [media] vivid: Local optimization

Hi Ricardo,

Am Montag, den 18.07.2016, 15:21 +0200 schrieb Ricardo Ribalda Delgado:
> Hi Philipp
> 
> On Mon, Jul 18, 2016 at 3:13 PM, Philipp Zabel <p.zabel@...gutronix.de> wrote:
> > Since the constant expressions are evaluated at compile time, you are
> > not actually removing shifts. The code generated for precalculate_color
> > by gcc 5.4 even grows by one asr instruction with this patch.
> >
> 
> I dont think that I follow you completely here. The original code was

Sorry, I forgot to mention I compiled both versions for ARMv7-A, saw
that object size increased, had a look the diff between objdump -d
outputs and noticed an additional shift instruction. I have not checked
this for x86_64.

> if (a)
>    y= clamp(y, 16<<4, 235<<4)
> 
> y = clamp(y>>4, 1, 254)
>
> And now is
> 
> if (a)
>    y= clamp(y >>4, 16, 235)
> else
>     y = clamp(y, 1, 254)
     y = clamp(y >>4, 1, 254)

> On the previous case, when a was true there was 2 clamp operations.
> Now it is only one.

Yes. And now there's two shift operations (overall, still just one in
each conditional path).

It seems in my case the compiler was not clever enough to move all the
right shifts out of the conditional paths, so I ended up with one more
than before. You are right that in the limited range path the second
clamps are now avoided though. Basically, feel free to disregard my
comment.

I had the best looking result with this variant, btw:

	y >>= 4;
	cb >>= 4;
	cr >>= 4;
	if (tpg->real_quantization == V4L2_QUANTIZATION_LIM_RANGE) {
		y = clamp(y, 16, 235);
		cb = clamp(cb, 16, 240);
		cr = clamp(cr, 16, 240);
	} else {
		y = clamp(y, 1, 254);
		cb = clamp(cb, 1, 254);
		cr = clamp(cr, 1, 254);
	}

regards
Philipp

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ