linux-kernel - Re: [Announce]: Target_Core_Mod/ConfigFS and LIO-Target v3.0 work

Message-ID: <4947F72E.7000408@vlnb.net>
Date:	Tue, 16 Dec 2008 21:45:02 +0300
From:	Vladislav Bolkhovitin <vst@...b.net>
To:	James Bottomley <James.Bottomley@...senPartnership.com>
CC:	Bart Van Assche <bart.vanassche@...il.com>,
	linux-iscsi-target-dev@...glegroups.com,
	LKML <linux-kernel@...r.kernel.org>,
	linux-scsi <linux-scsi@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Christoph Hellwig <hch@...radead.org>,
	FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>,
	Mike Christie <michaelc@...wisc.edu>,
	Hannes Reinecke <hare@...e.de>,
	Jens Axboe <jens.axboe@...cle.com>,
	scst-devel <scst-devel@...ts.sourceforge.net>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [Announce]: Target_Core_Mod/ConfigFS and LIO-Target v3.0 work

James Bottomley wrote:
> On Sat, 2008-12-13 at 12:56 +0100, Bart Van Assche wrote:
>> On Sat, Dec 13, 2008 at 12:18 PM, Nicholas A. Bellinger
>> <nab@...ux-iscsi.org> wrote:
>>> Of course I fix bugs when people report them.
>> Things have changed then since the beginning of this year. As anyone
>> can see in the threads I referred to, you have done your best to deny
>> that the crashes and system hangs were caused by LIO, although I had
>> posted exact instructions on how to reproduce the bugs. Regarding
>> kernel integration and subsystem maintainership: one of the important
>> tasks of a maintainer is to verify whether reported bugs are
>> reproducible, and if so, to resolve them. I'm happy none of the
>> current kernel maintainers has the habitude of denying bug reports
>> that are 100% reproducible and which contain exact instructions about
>> how to reproduce the bug.
> 
> OK, All of you on this thread, why don't you take time out to step back
> and think about the effects this descent into trench warfare is having
> on your observers.

James,

I'm sorry you needed to intervene in such a manner. I don't want to
continue this LIO vs SCST fight, but I see in your message some
important misunderstandings about SCST, which I feel I need to reply to
in order to clear them up.

>      1. You're both saying the other side isn't production ready ...
>         it's not a stretch for the rest of us to take this at face
>         value ... about both of you.

In http://lkml.org/lkml/2008/12/10/245 I listed the exact reasons why
LIO is far from being production ready, and I can continue that list. In
fact, to call things by their real names, LIO is an iSCSI target which
over the past few months is being hurriedly converted to a generic target
engine and which has a lo-o-ong way to go to complete the conversion.
In other words, LIO might be good as an iSCSI target, but as a generic
target engine it simply *does not exist* yet.

As for SCST not being production ready, can Nicholas Bellinger support
his claims against SCST with something concrete? So far, everything he
has written has been empty words not supported by any real facts. For
instance, he has failed to explain what all those features "missing"
from SCST are needed for.

>      2. This ideological opposition to features the other side
>         implements tells me that if it came to a choice, by going with
>         either one of you I'd get an incomplete feature set.

There's no ideological opposition between SCST and LIO. Both engines are
built around basically the same ideology. The opposition is in a
completely different, non-technical area.

>      3. Making obvious partisans of your user base also tells me that if
>         I had to make a choice, whatever it was I'd piss off a large
>         number of people who'd be very vocal about it.

Unfortunately, being based on an Open Source product isn't something
many people want to be proud of.

But here is a list of companies, taken from the scst-devel mailing list,
that are working on SCST-based products and have made contributions in
the past half year:

@storwize.com
@open-e.com
@enjellic.com

Earlier there were also contributions from @hp.com and
@systemfabricworks.com.

Also, I've already mentioned Mellanox, who developed the SRP target
driver and now sells a product based on it.

Also, a target driver for Marvell SAS hardware is being developed by an
anonymous company, see
http://sourceforge.net/mailarchive/message.php?msg_id=e938503f0809260211r2d4ec37bt293c75c80960eadd%40mail.gmail.com

If you need more, I'll ask permission from companies that are already
selling SCST-based products (BTW, 2 of them are user space VTLs, which
could have been built on STGT, but those companies chose SCST).

It's worth noting here that the scst-devel mailing list has 134
subscribers. Many of them are from well-known storage-related companies.
Unfortunately, the other sf.net statistics constantly lose data and
hence are not trustworthy, so I can't refer to them.

> So stop fighting ... you're not going to backstab your way to inclusion.
> 
> The only identified failing of STGT (and it's theoretical, not
> demonstrated, although I can agree the theory looks correct) is that the
> user space packet processing may cause performance problems on high
> speed networks.  We know from practical tests that these networks have
> to be above 1Gbit because the results were identical for STGT and SCST
> on a 1G network, so it's infiniband or 10Gbit ethernet.

I thought that the SRP measurements in
http://lkml.org/lkml/2008/12/10/245 were sufficient to remove all your
doubts. If you don't object, I'll remind you: there was a >50%
improvement in IOPS on 4K writes (~150K vs ~100K), which corresponds to
a >200MB/s throughput increase, when processing was moved, where
possible, from kernel threads to tasklets. For STGT, by design, no
processing can be moved to tasklets, and context switches between user
space threads are a bit heavier than between kernel threads. In
addition, STGT has some syscall entry/exit overhead. Hence, for the same
processing done in STGT, the difference would be even larger.

Thus, those measurements give a lower bound on the performance
increase. Such a huge increase at 4K block sizes is a big advantage for
any latency-bound application, such as a database.
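
To make the arithmetic behind the throughput figure explicit, here is a
minimal sketch in Python. It assumes 4096-byte writes, the rounded IOPS
figures quoted above (~100K with kernel threads, ~150K with tasklets)
and decimal megabytes. The function and variable names are illustrative
only and are not taken from SCST or STGT.

```python
# Assumption: a "4K write" is exactly 4096 bytes.
BLOCK_SIZE = 4096


def throughput_mb_per_s(iops, block_size=BLOCK_SIZE):
    """Convert an IOPS figure to throughput in decimal MB/s."""
    return iops * block_size / 1_000_000


# Rounded figures from the SRP measurements (approximate values).
threads_iops = 100_000    # processing in kernel threads
tasklets_iops = 150_000   # processing moved to tasklets where possible

gain_iops = tasklets_iops - threads_iops
gain_pct = 100 * gain_iops / threads_iops
gain_mb = throughput_mb_per_s(gain_iops)

print(f"IOPS gain: {gain_iops} ({gain_pct:.0f}%)")
print(f"Throughput gain: {gain_mb:.1f} MB/s")
```

With these rounded inputs the gain comes out at 50% and about 205 MB/s.
The measured figures were only approximate, so this is a consistency
check on the numbers, not a new measurement.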

What else should we do to convince you?

Also, what I can't understand is why you don't want to count the
architectural advantages of SCST over STGT, namely overall simplicity
and the possibility of implementing many features that are impossible
for STGT, like complete pass-through and zero-copy cache IO. In fact,
one such feature has already been implemented: zero-copy transmit in the
iSCSI target. From user space this is impossible, but in the kernel I
implemented it with a very small and simple patch.

> So, what it comes down to is that if we had a kernel side protocol
> accelerator for STGT, the project would no longer suffer from this
> theoretical failing.  *Both* of you have such a thing embedded in your
> respective submissions (all 74k LOC of them) so can't you just enhance
> STGT with whichever one is better ... actually, if you'd both bury the
> hatchet and work on the enhancement together taking the best of each
> project, we'd have something that worked much better and a unified user
> base and neither side would be able to claim sole credit ... just a
> thought.

James, just think of SCST in its current state as STGT in which all
the possible enhancements are already incorporated. It simply has been
cooking outside of the kernel for too long, so you didn't see the
intermediate steps. I'm not joking. I'm absolutely serious. And it is
true. While developing the scst_user module I carefully studied STGT,
and scst_user has everything it could take from it.

When you ask us to improve STGT step by step and implement a kernel side
protocol accelerator for it, you are asking us to go back 2+ years. For
kernel side acceleration, STGT needs to move the SCSI target state
machine and memory management into the kernel, which effectively means
converting it to SCST. What should I do to make this clear to you?

Also, the current integration of STGT with the Linux (initiator) SCSI
subsystem should have a better design; I explained why in
http://lkml.org/lkml/2008/12/10/245. The SCSI initiator and target have
almost nothing to share, so they should be separated.

I am always open to any possible cooperation. In particular, I'm always
willing to make any necessary changes to SCST that will lead to a
better target engine in Linux. But before making any change I, like any
sane engineer, need answers to several simple questions.
Basically, there are 2 such questions:

1. What is the proposed action needed for? I.e., which real life task is
it going to solve?

2. Why is the proposed change the best one among possible implementation 
alternatives?

If you simply take the combined SCST patch at
http://scst.sourceforge.net/patches/scst_combined.patch (all 23 patches
I submitted combined in a single file; BTW, it has 46K LOC, not 76K),
apply it to some 2.6.27 tree and spend a little time looking at it, you
will soon find out that converting STGT to SCST is the worst possible
alternative. Simply try to find places where the STGT in-kernel core is
better than the SCST core, or has a feature which the SCST core doesn't
have. There is only one such feature: OSD support, i.e. bidirectional
transfers, large CDBs, etc. It hasn't been implemented in SCST so far
because there has been no demand for it (hence, no way to test). But
(1) this feature doesn't have any in-kernel user, so nobody will be
affected if STGT is moved to be user space only, and (2) it would not be
hard to add that feature to SCST if there is such demand.

I have been closely following the development of both STGT and LIO since
their beginning, so my words are based on close examination of their
source code, not on a refusal to look at it. They are both inferior to
SCST in all main areas. I believe there is no point in spending time
improving the kernel side of STGT. It would be better to put the effort
into integrating the user space part of STGT with the scst_local SCST
module, as I described in http://lkml.org/lkml/2008/12/10/245. If you
don't agree with me, can you answer question (2) above, please?

From everything I know, SCST is at the moment the best open source SCSI
target engine in the world, and no other target engine, including
Solaris's COMSTAR, can match it in functionality, performance and
stability.

James, you are being offered already *completed* work, in which
everything possible to improve STGT has already been done, so why not
simply accept it?

I'm an engineer, not a salesman, and there are no salesmen in the SCST
team to advertise it. We believe that the source code, its quality,
performance and feature completeness should speak for themselves. That
is how it has been in Linux so far, and we hope it will be so in this
case. Just let the code speak!

Sorry for taking your time with one more huge e-mail. I did my best to
be as laconic as possible.

Thanks,
Vlad