lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-id: <47B67C06.6060706@mail.usask.ca>
Date:	Sat, 16 Feb 2008 00:00:38 -0600
From:	Robert Hancock <rwh461@...l.usask.ca>
To:	Dan Gora <dan.gora@...il.com>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: PCI Bursting with PIO

Dan Gora wrote:
> Hi,
> 
> I am trying to optimize a driver for a slave only PCI device and am
> having a lot of trouble getting any kind of PCI burst transactions in
> either the read or the write direction.  Using bcopy/memcpy or even a
> hand-crafted while (len) { *pdst++ = *psrc++} (with pdst and psrc
> unsigned long*) I can only get writes to burst and even in that case
> only for 2 data phases (8 bytes) and only on 64 bit machines.  The
> best that I have managed is to use a hand crafted asm function which
> copies the data through mmx registers on i386 machines, but that still
> only bursts a maximum of 16 bytes in the write direction and not at
> all in the read direction.  The source and destination pointers are
> both aligned to 8 byte boundaries, so I don't think that it's an
> alignment issue.

The chipset is being limited by what the CPU is giving it. If the CPU 
sends only a small amount of data in one access then the chipset usually 
does not try to burst more than that.

> 
> Is there any way to get PIO to burst over the PCI bus in the read and
> write direction?  My device has 4 BAR registers, but the area where I
> am transferring data is marked 'prefetchable' (although the others are
> not).  I read here: http://lkml.org/lkml/2004/9/23/393 that this was a
> prerequisite, but it is apparently not sufficient.  He also mentioned
> that the area had to be marked as write-back, but it's not clear how
> you can tell (no /proc/mtrr doesn't tell you) or that it has anything
> to do with bursting reads.
> 
> Any ideas would be really appreciated,

Well, in order for the CPU to batch up more writes you'd have to map the 
  BAR as either write-combining or write-back. If it's not listed in 
/proc/mtrr it will be the default setting of uncacheable. X has code to 
set up the video memory on the video card as write-combining so it can 
get better write performance, you could do something similar.

Setting it as write-back might allow you to get the reads to do bursting 
  as well (since the CPU will do a cache-line fill instead of individual 
accesses) but this if the device is modifying this memory area, unless 
you add code to invalidate those cache lines before reading the data 
you'll get stale data back. You could run into some other less obvious 
issues as well, as normally device memory regions are not mapped write-back.

In general, especially if you need to read data back from the device, 
implementing a DMA engine would be by far the better option. Most 
chipsets seem not at all optimized for handling sequential reads from 
PCI memory from the CPU. (Even in the DMA case, you have to be careful 
with what type of memory read transaction you use when transferring from 
host memory - some chipsets don't like to burst more than one cycle if 
you use normal Memory Read instead of Memory Read Line or Memory Read 
Multiple.)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ