Message-ID: <48A32976.7060504@vlnb.net>
Date: Wed, 13 Aug 2008 22:35:34 +0400
From: Vladislav Bolkhovitin <vst@...b.net>
To: David Miller <davem@...emloft.net>
CC: open-iscsi@...glegroups.com, rdreier@...co.com, rick.jones2@...com,
jgarzik@...ox.com, Steve Wise <swise@...ngridcomputing.com>,
Karen Xie <kxie@...lsio.com>, netdev@...r.kernel.org,
michaelc@...wisc.edu, daisyc@...ibm.com, wenxiong@...ibm.com,
bhua@...ibm.com, Dimitrios Michailidis <dm@...lsio.com>,
Casey Leedom <leedom@...lsio.com>, linux-scsi@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCH 1/1] cxgb3i: cxgb3 iSCSI initiator

Divy Le Ray wrote:
> On Tuesday 12 August 2008 03:02:46 pm David Miller wrote:
>> From: Divy Le Ray <divy@...lsio.com>
>> Date: Tue, 12 Aug 2008 14:57:09 -0700
>>
>>> In any case, such a stateless solution is not yet designed, whereas
>>> accelerated iSCSI is available now, from us and other companies.
>> So, WHAT?!
>>
>> There are TOE pieces of crap out there too.
>
> Well, there is demand for accelerated iSCSI out there, which is the
> driving reason for our driver submission.

As an iSCSI target developer, I'm strongly in favor of hardware iSCSI
offload. Having the possibility of direct data placement is a *HUGE*
performance gain.
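
To make "direct data placement" concrete, here is a minimal userspace
sketch of the mechanism. This is not real driver code; every name in
it (ddp_table, ddp_register, ddp_place) is invented for illustration.
DDP-capable adapters such as Chelsio's do the equivalent in hardware
against a buffer registered for the command:

/*
 * Sketch of direct data placement (DDP) for iSCSI READs.
 *
 * Before the READ goes out, the SCSI layer registers the command's
 * destination buffer under the command's Initiator Task Tag (ITT).
 * A DDP-capable adapter parses each incoming Data-In PDU in
 * hardware, looks up the ITT, and DMAs the payload straight into
 * the registered buffer at the PDU's buffer offset. A plain NIC can
 * only deliver the payload into skb's, so the host CPU has to copy
 * it out afterwards.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TAGS 256

struct ddp_entry {
	void   *buf;	/* destination registered for this command */
	size_t	len;	/* bytes expected */
	size_t	placed;	/* bytes placed so far */
};

static struct ddp_entry ddp_table[MAX_TAGS];

/* Called when the READ command is issued. */
static void ddp_register(uint32_t itt, void *buf, size_t len)
{
	ddp_table[itt % MAX_TAGS] = (struct ddp_entry){ buf, len, 0 };
}

/*
 * What the adapter does per Data-In PDU. memcpy() stands in for the
 * DMA; the point is that the payload reaches its final destination
 * in one step, with no intermediate skb copy by the host CPU.
 */
static void ddp_place(uint32_t itt, size_t offset, const void *payload,
		      size_t n)
{
	struct ddp_entry *e = &ddp_table[itt % MAX_TAGS];

	if (offset + n <= e->len) {
		memcpy((char *)e->buf + offset, payload, n);
		e->placed += n;
	}
}

int main(void)
{
	char *buf = malloc(8192);
	char pdu[4096] = { 0 };

	ddp_register(7, buf, 8192);		/* READ issued, ITT 7 */
	ddp_place(7, 0, pdu, sizeof(pdu));	/* 1st Data-In PDU */
	ddp_place(7, 4096, pdu, sizeof(pdu));	/* 2nd Data-In PDU */
	printf("placed %zu bytes directly\n", ddp_table[7].placed);
	free(buf);
	return 0;
}

The whole point is the lookup: the ITT in the PDU header tells the
adapter where the data belongs, so the payload never needs a second
pass by the host CPU.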

For example, according to measurements done by one iSCSI-SCST user on
a system with an iSCSI initiator and an iSCSI target (running
iSCSI-SCST, http://scst.sourceforge.net/target_iscsi.html), both on
identical modern high-speed hardware with 10GbE cards, the _INITIATOR_
is the bottleneck for READs (data transfers from target to initiator).
This is because the target sends data in a zero-copy manner, so its
CPU can cope with the load, while the initiator performs additional
data copies: from skb's to the page cache and from the page cache to
the application. As a result, the initiator hit nearly 100% CPU load
at only ~500MB/s throughput, while the target ran at ~30% CPU load.
In the opposite direction (WRITEs), where the target does not have
the application data copy, throughput was ~800MB/s, again with nearly
100% CPU load, but this time on the target.

The initiator ran Linux with open-iscsi. The test used real
backstorage: the target ran in BLOCKIO mode (direct BIOs to/from the
backstorage) on a 3ware card. Locally, the target's backstorage could
deliver 900+MB/s for READs and about 1GB/s for WRITEs. In both cases
the command queue was deep enough (20-30 outstanding commands) to
hide the link and processing latencies.
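
To put a number on those copies: at ~500MB/s of READ payload, the
initiator's CPU must move about 1GB/s through copy paths (skb's to
page cache, then page cache to application) on top of the NIC's DMA
into the skb's. The toy program below makes that cost visible. It is
only a rough stand-in under stated assumptions: plain memcpy() for
both kernel copies, a 64MB working set, and no per-skb or protocol
overhead.

/*
 * Toy measurement of what per-byte copies cost, standing in for the
 * skb -> page cache -> application copies on a software iSCSI
 * initiator. Buffer size and pass count are arbitrary.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define BUF_SIZE (64UL * 1024 * 1024)	/* 64MB working set */
#define PASSES   16

static double now_sec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
	char *src = malloc(BUF_SIZE);	/* "skb's" */
	char *mid = malloc(BUF_SIZE);	/* "page cache" */
	char *dst = malloc(BUF_SIZE);	/* "application buffer" */
	double t0, dt;
	int i;

	if (!src || !mid || !dst)
		return 1;
	memset(src, 0x5a, BUF_SIZE);	/* fault all pages in */
	memset(mid, 0, BUF_SIZE);
	memset(dst, 0, BUF_SIZE);

	t0 = now_sec();
	for (i = 0; i < PASSES; i++) {
		memcpy(mid, src, BUF_SIZE);	/* skb -> page cache */
		memcpy(dst, mid, BUF_SIZE);	/* page cache -> app */
	}
	dt = now_sec() - t0;

	/* Each pass delivers BUF_SIZE of payload but copies it twice. */
	printf("payload rate: %.0f MB/s (CPU copy traffic is 2x that)\n",
	       PASSES * (BUF_SIZE / 1048576.0) / dt);
	return 0;
}

Compile with gcc -O2 -std=gnu99 (older glibc also needs -lrt). The
payload rate it prints is bounded by memory bandwidth across two
passes, which is exactly the tax that direct data placement removes.
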
Vlad