linux-kernel - Re: Integration of SCST in the mainstream Linux kernel

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <47A7986B.1070206@garzik.org>
Date:	Mon, 04 Feb 2008 17:57:47 -0500
From:	Jeff Garzik <jeff@...zik.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
CC:	"J. Bruce Fields" <bfields@...ldses.org>,
	"Nicholas A. Bellinger" <nab@...ux-iscsi.org>,
	James Bottomley <James.Bottomley@...senPartnership.com>,
	Vladislav Bolkhovitin <vst@...b.net>,
	Bart Van Assche <bart.vanassche@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>,
	linux-scsi@...r.kernel.org, scst-devel@...ts.sourceforge.net,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Mike Christie <michaelc@...wisc.edu>
Subject: Re: Integration of SCST in the mainstream Linux kernel

Linus Torvalds wrote:
> So no, performance is not the only reason to move to kernel space. It can 
> easily be things like needing direct access to internal data queues (for a 
> iSCSI target, this could be things like barriers or just tagged commands - 
> yes, you can probably emulate things like that without access to the 
> actual IO queues, but are you sure the semantics will be entirely right?
> 
> The kernel/userland boundary is not just a performance boundary, it's an 
> abstraction boundary too, and these kinds of protocols tend to break 
> abstractions. NFS broke it by having "file handles" (which is not 
> something that really exists in user space, and is almost impossible to 
> emulate correctly), and I bet the same thing happens when emulating a SCSI 
> target in user space.

Well, speaking as a complete nutter who just finished the bare bones of 
an NFSv4 userland server[1]...  it depends on your approach.

If the userland server is the _only_ one accessing the data[2] -- i.e. 
the database server model where ls(1) shows a couple multi-gigabyte 
files or a raw partition -- then it's easy to get all the semantics 
right, including file handles.  You're not racing with local kernel 
fileserving.

Couple that with sendfile(2), sync_file_range(2) and a few other 
Linux-specific syscalls, and you've got an efficient NFS file server.

It becomes a solution similar to Apache or MySQL or Oracle.

I quite grant there are many good reasons to do NFS or iSCSI data path 
in the kernel...  my point is more that "impossible" is just from one 
point of view ;-)

> Maybe not. I _rally_ haven't looked into iSCSI, I'm just guessing there 
> would be things like ordering issues.

iSCSI and NBD were passe ideas at birth.  :)

Networked block devices are attractive because the concepts and 
implementation are more simple than networked filesystems... but usually 
you want to run some sort of filesystem on top.  At that point you might 
as well run NFS or [gfs|ocfs|flavor-of-the-week], and ditch your 
networked block device (and associated complexity).

iSCSI is barely useful, because at least someone finally standardized 
SCSI over LAN/WAN.

But you just don't need its complexity if your filesystem must have its 
own authentication, distributed coordination, multiple-connection 
management code of its own.

	Jeff

P.S.  Clearly my NFSv4 server is NOT intended to replace the kernel one. 
  It's more for experiments, and doing FUSE-like filesystem work.

[1] http://linux.yyz.us/projects/nfsv4.html

[2] well, outside of dd(1) and similar tricks... the same "going around 
its back" tricks that can screw up a mounted filesystem.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/