Date:	Tue, 13 May 2014 14:55:48 -0600
From:	Jason Gunthorpe <jgunthorpe@...idianresearch.com>
To:	Ezequiel Garcia <ezequiel.garcia@...e-electrons.com>
Cc:	Arnd Bergmann <arnd@...db.de>, Jingoo Han <jg1.han@...sung.com>,
	linux-kernel@...r.kernel.org, linux-mtd@...ts.infradead.org,
	Brian Norris <computersforpeace@...il.com>,
	David Woodhouse <dwmw2@...radead.org>,
	linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH 2/2] mtd: orion-nand: fix build error with ARMv4

On Fri, May 09, 2014 at 07:09:15PM -0300, Ezequiel Garcia wrote:
> On 09 May 03:28 PM, Jason Gunthorpe wrote:
> > 
> > > I gave this a try in order to answer Arnd's performance
> > > question. First of all, the patch seems wrong. I guess it's because
> > > readsl reads 4-bytes pieces, instead of 8-bytes.
> > > 
> > > This patch below is tested (but not completely, see below) and works:
> > 
> > Compilers are better now, I think you can just ditch the weirdness:
> > 
> [..]
> > 
> > The below gives:
> > 
> >   c8:   ea000002        b       d8 <orion_nand_read_buf+0x84>
> >   cc:   e5dc0000        ldrb    r0, [ip]
> >   d0:   e7c30001        strb    r0, [r3, r1]
> >   d4:   e2811001        add     r1, r1, #1
> >   d8:   e1510002        cmp     r1, r2
> > 
> > Which looks the same as the asm version to me.
> > 
> 
> Nice! It wasn't really needed but since I have the board here:
> 
> # time nanddump /dev/mtd5 -f /dev/null -q
> real	0m 5.82s
> user	0m 0.20s
> sys	0m 5.60s
> 
> Jason: Care to submit a proper patch?

Sure, but did anyone (Arnd?) have thoughts on a better way to do this:

+#ifdef CONFIG_64BIT
+               buf64[i++] = readq_relaxed(io_base);
+#else
+               buf64[i++] = *(const volatile u64 __force *)io_base;
+#endif

IMHO, readq should exist on any platform that can issue a 64-bit bus
transaction, and I expect many ARMs qualify.
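
As a rough userspace sketch of that idea (not the orion-nand driver
code), the #ifdef can be hidden behind a single 64-bit accessor so the
copy loop stays the same on every platform. The names fifo_read64 and
fifo_read_buf are hypothetical, and plain C does not guarantee that the
compiler turns the volatile 64-bit load into one bus transaction:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical accessor: one 64-bit load from a data register.
 * In a kernel build this is where readq_relaxed() or the
 * volatile-u64 fallback from the patch would be selected.
 */
static inline uint64_t fifo_read64(const volatile void *io_base)
{
	return *(const volatile uint64_t *)io_base;
}

/* Drain len bytes from a FIFO-style data register into buf. */
void fifo_read_buf(const volatile void *io_base, uint8_t *buf, size_t len)
{
	const volatile uint8_t *io8 = io_base;
	size_t i = 0;

	/* Bulk part: 8 bytes per register read. */
	while (len - i >= 8) {
		uint64_t v = fifo_read64(io_base);

		/* memcpy avoids alignment and aliasing assumptions on buf. */
		memcpy(buf + i, &v, sizeof(v));
		i += sizeof(v);
	}

	/* Tail: remaining bytes, one byte per register read. */
	while (i < len)
		buf[i++] = *io8;
}
```

With this shape, only fifo_read64 would need to know whether the
platform provides readq.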

> On 08 May 04:56 PM, Arnd Bergmann wrote:

> Ok, so it takes 5.6 seconds in kernel mode to access 31MB, which
> comes down to 5.60MB/s. That isn't very fast compared to the time
> the CPU should take for those instructions, so I'm surprised it
> actually makes any difference at all.

Likely, what is happening is that the bus interface holds off
returning the read data until it completes the bus cycles; the
response then travels to the CPU, which turns around and issues another.

This creates a dead time where the bus isn't doing anything.

The larger the bus transfer the CPU can do, the smaller the percentage
of time the turnaround takes as overhead.
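
A minimal model of that trade-off, assuming a fixed turnaround cost per
bus transaction and a constant cost per byte moved. The function name
and both timing parameters are hypothetical; no numbers here were
measured on the hardware discussed in this thread:

```c
/*
 * Fraction of bus time spent actually moving data (0..1), assuming
 * each transaction costs width_bytes * ns_per_byte of transfer time
 * plus a fixed turnaround_ns of dead time.
 */
double bus_efficiency(unsigned width_bytes, double ns_per_byte,
		      double turnaround_ns)
{
	double xfer_ns = width_bytes * ns_per_byte;

	return xfer_ns / (xfer_ns + turnaround_ns);
}
```

Under this model, doubling the access width from 4 to 8 bytes always
raises the efficiency, but never doubles the throughput unless the
turnaround dominates completely.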

If the CPU could pipeline two reads then it could reach the highest
possible throughput, but I guess the memory ordering for the mapping
prevents that?

Regarding DMA, who knows if the interface can handle a burst
transfer..

Jason