Message-ID: <d2d133e5-267c-c42a-7329-22892bbbdee4@synopsys.com>
Date:   Thu, 29 Nov 2018 16:13:04 +0000
From:   Jose Abreu <jose.abreu@...opsys.com>
To:     David Laight <David.Laight@...LAB.COM>,
        "linux-snps-arc@...ts.infradead.org" 
        <linux-snps-arc@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC:     Vineet Gupta <vineet.gupta1@...opsys.com>,
        Alexey Brodkin <alexey.brodkin@...opsys.com>,
        Joao Pinto <joao.pinto@...opsys.com>,
        "Vitor Soares" <vitor.soares@...opsys.com>
Subject: Re: [PATCH v2] ARC: io.h: Implement reads{x}()/writes{x}()

On 29-11-2018 14:42, Jose Abreu wrote:
> On 29-11-2018 14:38, David Laight wrote:
>> From: Jose Abreu
>>> Sent: 29 November 2018 14:29
>>>
>>> Some ARC CPU's do not support unaligned loads/stores. Currently, generic
>>> implementation of reads{b/w/l}()/writes{b/w/l}() is being used with ARC.
>>> This can lead to misfunction of some drivers as generic functions do a
>>> plain dereference of a pointer that can be unaligned.
>>>
>>> Let's use {get/put}_unaligned() helper instead of plain dereference of
>>> pointer in order to fix this.
>> ...
>>> +#define __raw_readsx(t,f) \
>>> +static inline void __raw_reads##f(const volatile void __iomem *addr, \
>>> +				  void *buffer, unsigned int count) \
>>> +{ \
>>> +	if (count) { \
>>> +		const unsigned long bptr = (unsigned long)buffer; \
>>> +		u##t *buf = buffer; \
>>> +\
>>> +		do { \
>>> +			u##t x = __raw_read##f(addr); \
>>> +\
>>> +			/* Some ARC CPU's don't support unaligned accesses */ \
>>> +			if (bptr % ((t) / 8)) { \
>>> +				put_unaligned(x, buf++); \
>>> +			} else { \
>>> +				*buf++ = x; \
>>> +			} \
>>> +		} while (--count); \
>>> +	} \
>>> +}
>> Does the compiler move the alignment test outside the loop?
>> You really want two copies of the loop body.
> Hmm, I would expect so because the if condition takes two const
> args ... I will try check that.

And it did optimize :)

Sample C Source:
--->8---
static noinline void test_readsl(char *buf, int len)
{
        readsl(0xdeadbeef, buf, len);
}
--->8---

And the disassembly:
--->8---
00000e88 <test_readsl>:
 e88:    breq.d r1,0,eac <0xeac>       /* if (count) */
 e8c:    and r2,r0,3

 e90:    mov_s lp_count,r1            /* r1 = count */
 e92:    brne r2,0,eb0 <0xeb0>        /* if (bptr % ((t) / 8)) */

 e96:    sub r0,r0,4
 e9a:    nop_s
 
 e9c:    lp eac <0xeac>                /* first loop */
 ea0:    ld r2,[0xdeadbeef]
 ea8:    st.a r2,[r0,4]
 eac:    j_s [blink]
 eae:    nop_s

 eb0:    lp ed6 <0xed6>                /* second loop */
 eb4:    ld r2,[0xdeadbeef]
 ebc:    lsr r5,r2,8
 ec0:    lsr r4,r2,16
 ec4:    lsr r3,r2,24
 ec8:    stb_s r2,[r0,0]
 eca:    stb r5,[r0,1]
 ece:    stb r4,[r0,2]
 ed2:    stb_s r3,[r0,3]
 ed4:    add_s r0,r0,4
 ed6:    j_s [blink]

--->8---

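Note how the if condition added in this version is checked only once,
at address 0xe92 (<test_readsl+0xa>), outside the loop. The code then
branches into one of two separate loops: word stores for an aligned
buffer, byte stores for an unaligned one.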
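
For illustration, the transformation the compiler did here (hoisting the
loop-invariant alignment test and emitting one loop per case) could be
written by hand roughly as below. This is only a user-space sketch, not
the patch itself: sketch_readsl() is a hypothetical name, memcpy() stands
in for put_unaligned(), and a plain volatile variable stands in for the
MMIO register.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for readsl(): read `count` 32-bit
 * words from one fixed "register" and store them back to back in
 * `buffer`. The alignment test is done once, before either loop.
 */
static void sketch_readsl(const volatile uint32_t *reg, void *buffer,
			  unsigned int count)
{
	if (!count)
		return;

	if ((uintptr_t)buffer % sizeof(uint32_t)) {
		/* Unaligned destination: memcpy() stands in for put_unaligned() */
		unsigned char *p = buffer;

		do {
			uint32_t x = *reg;

			memcpy(p, &x, sizeof(x));
			p += sizeof(x);
		} while (--count);
	} else {
		/* Aligned destination: plain word stores */
		uint32_t *buf = buffer;

		do {
			*buf++ = *reg;
		} while (--count);
	}
}
```

Writing the two loops out by hand would remove the dependency on the
optimizer, at the cost of duplicating the loop body in the macro.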

Thanks and Best Regards,
Jose Miguel Abreu

>
> Thanks and Best Regards,
> Jose Miguel Abreu
>
>> 	David
>>
>> -
>> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
>> Registration No: 1397386 (Wales)
>>
