linux-kernel - Re: [PATCH 1/5 v2] blk-mq: Add prep/unprep support

Message-ID: <20150418201610.GB20311@infradead.org>
Date:	Sat, 18 Apr 2015 13:16:10 -0700
From:	Christoph Hellwig <hch@...radead.org>
To:	Matias Bjorling <m@...rling.me>
Cc:	Christoph Hellwig <hch@...radead.org>, keith.busch@...el.com,
	javier@...etta.io, linux-kernel@...r.kernel.org,
	linux-nvme@...ts.infradead.org, axboe@...com,
	linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 1/5 v2] blk-mq: Add prep/unprep support

On Sat, Apr 18, 2015 at 08:45:19AM +0200, Matias Bjorling wrote:
> The low level drivers will be NVMe and vendors' own PCI-e drivers. They are
> very generic in nature. Each driver would duplicate the same work. Both
> could have normal and open-channel drives attached.

I didn't say the work should move into the driver, but rather that the
driver should talk to the open channel ssd code directly instead of
hooking into the core block code.

> I'd like to keep blk-mq in the loop. I don't think it will be pretty to
> have two data paths in the drivers. For blk-mq, bios are split/merged on
> the way down. Thus, the actual physical addresses needed aren't known before
> the IO is diced to the right size.

But you _do_ have two different data paths already.  Nothing says you
can't use blk-mq for your data path, but it should be a separate entry
point.  Similar to, say, how a SCSI disk and an MMC device both use the
block layer but still use different entry points.

> The reason it shouldn't be under a single block device is that a target
> should be able to provide a global address space. That allows the address
> space to grow/shrink dynamically with the disks, giving a continuously
> growing address space where disks can be added/removed as requirements grow
> or flash ages. Not on a sector level, but on a flash block level.

I don't understand what you mean by a single block device here, but I
suspect we're talking past each other somehow.

> >>In the future, applications can have an API to get/put flash block directly.
> >>(using the blk_nvm_[get/put]_blk interface).
> >
> >s/application/filesystem/?
> >
> 
> Applications. The goal is that key value stores, e.g. RocksDB, Aerospike,
> Ceph and similar have direct access to flash storage. There won't be a
> kernel file-system between.
> 
> The get/put interface can be seen as a space reservation interface for where
> a given process is allowed to access the storage media.
> 
> It can also be seen in the way that we provide a block allocator in the
> kernel, while applications implement the rest of "file-system" in
> user-space, specially optimized for their data structures. This makes a lot
> of sense for a small subset (LSM, Fractal trees, etc.) of database
> applications.

While we'll need a proper API for that first, it's just another reason
why we shouldn't shoehorn the open channel ssd support into the block
layer.
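
To make the "space reservation" reading of the get/put interface quoted
above concrete, here is a minimal user-space sketch in C. It is only a toy
model under stated assumptions: the thread names blk_nvm_[get/put]_blk but
gives no signatures, so struct flash_pool, pool_get_blk, pool_put_blk,
pool_may_access and the integer owner ids are all hypothetical names made
up for illustration, not anything from the patch set.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_FLASH_BLOCKS 8	/* hypothetical pool size, illustration only */
#define NO_OWNER (-1)

/* Toy model: one owner id per flash block, NO_OWNER when the block is free. */
struct flash_pool {
	int owner[NR_FLASH_BLOCKS];
};

/* Mark every flash block in the pool as free. */
static void pool_init(struct flash_pool *p)
{
	for (size_t i = 0; i < NR_FLASH_BLOCKS; i++)
		p->owner[i] = NO_OWNER;
}

/* "get": reserve the first free flash block for owner; -1 if none is free. */
static int pool_get_blk(struct flash_pool *p, int owner)
{
	if (owner < 0)
		return -1;
	for (size_t i = 0; i < NR_FLASH_BLOCKS; i++) {
		if (p->owner[i] == NO_OWNER) {
			p->owner[i] = owner;
			return (int)i;
		}
	}
	return -1;
}

/* Access check: true only if owner currently holds flash block blk. */
static bool pool_may_access(const struct flash_pool *p, int blk, int owner)
{
	return blk >= 0 && blk < NR_FLASH_BLOCKS &&
	       owner >= 0 && p->owner[blk] == owner;
}

/* "put": release a flash block; fails unless the caller holds it. */
static bool pool_put_blk(struct flash_pool *p, int blk, int owner)
{
	if (!pool_may_access(p, blk, owner))
		return false;
	p->owner[blk] = NO_OWNER;
	return true;
}
```

In this model a process that obtained a block via pool_get_blk is the only
one for which pool_may_access returns true on that block, and the block
becomes available to others again only after the holder calls pool_put_blk.
Everything else a "file-system" would do is left to the application.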