linux-kernel - Re: dmaengine/Query: What about scatter/gather for mem to mem transfers.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 20 Dec 2011 15:45:08 +0530
From:	Pratyush Anand <pratyush.anand@...com>
To:	Vinod Koul <vinod.koul@...el.com>
Cc:	"Williams, Dan J" <dan.j.williams@...el.com>,
	Viresh KUMAR <viresh.kumar@...com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Shiraz HASHIM <shiraz.hashim@...com>,
	Armando VISCONTI <armando.visconti@...com>,
	Deepak SIKRI <deepak.sikri@...com>,
	Vipin KUMAR <vipin.kumar@...com>,
	Vipul Kumar SAMAR <vipulkumar.samar@...com>,
	Vincenzo FRASCINO <Vincenzo.FRASCINO@...com>,
	Mirko GARDI <mirko.gardi@...com>,
	Rajeev KUMAR <rajeev-dlh.kumar@...com>,
	Amit VIRDI <Amit.VIRDI@...com>,
	Bhupesh SHARMA <bhupesh.sharma@...com>,
	"linus.walleij@...aro.org" <linus.walleij@...aro.org>
Subject: Re: dmaengine/Query: What about scatter/gather for mem to mem transfers.

On 12/20/2011 2:51 PM, Vinod Koul wrote:
> On Thu, 2011-12-15 at 12:26 +0530, Pratyush Anand wrote:
>>>> That way existing mechanism would work well for you.
>>>> You need to split the chunks properly, which is what dma would do
>> anyway
>>>>
>>>
>>> Yes, they can be split like this, but then splitting onus will go on
>> dma
>>> user driver, and so there would be replication of similar logic at
>>> several places. Therefore, I was thinking to make device_prep_dma_sg
>> as
>>> generic by adding these flags.
> Well I am not sure how adding flags handles this?

device_prep_dma_sg has last argument as flags. dma user driver can pass 
information about inc/dec of src/dst in this flag, which can be used by 
the dma driver. I have put one such implementation at the end of mail.

> There are few things you should consider
> 1) do you have h/w support for these, if yes then we can talk about
> dmaengine APIs doing such a thing

No, there is not any specific HW option.

> 2) if objective is to support such transfers from dma driver POV, then I
> wouldn't agree, as these can be split easily to standard dma sg list.
>> From the code POV, it wouldn't hurt to create a wrapper which take in
> these non standard sg list list and converts then to uniform list

Yes, a wrapper function can do that.

>
> Most important question:
> in which practical scenario would src and dstn lengths be different?

Let me explain my sceberio.

Actual memory has been allocated by malloc in user space and pinned 
using mlock. Now src to dest copy (memcpy) has to be implemented using 
dma. A kernel module takes these addresses and transfer length as input. 
It extracts physical address each page by page. Now, it may happen that 
src address is like page 1-5, 7, 9 while dst address is like page 20-26. 
So here, for src there would have 3 nodes in sg list while for dest 
number of nodes would be 1. One option could be, as you has suggested 
earlier to equalize src and dst addresses gap and length by the dma user 
driver itself and then call device_prep_interleaved_dma. Other option 
what I think,  could be to pass these LLIs as it is and then manage it 
in device_prep_dma_sg using flags.

>
>>
>> I see one more issue in using device_prep_interleaved_dma.
>>
>> Src and Dst address has been allocated in user space.
>> Now a kernel module extracts physical addresses from these pages and
>> prepares a sg list, which it submits to DMA.
>> These addresses would be virtually contiguous and incrementing. But,
>> I
>> am not sure if they are always physically incrementing too. If they
>> are
>> not guaranteed to be incrementing, then I see issue.
>>
>> Otherwise also, a situation can arise when scattered memory is not
>> always incrementing or decrementing in the same sg list.
> What _exactly_ are you trying to do?
>
> DMA would need buffers which are physically contagious. Also the user
> pages can be swapped out, you would need to pin these pages.
>

but, even after using mlock, will it insure that allocated pages are 
always physically incrementing. If not then , probably we can not use 
device_prep_interleaved_dma.

Regards
Pratyush

Code which I referred:

+dwc_prep_dma_sg(struct dma_chan *chan,
+		struct scatterlist *dst_sg, unsigned int dst_nents,
+		struct scatterlist *src_sg, unsigned int src_nents,
+		unsigned long flags)
+{
+	struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
+	struct dw_desc *first = NULL, *desc, *prev;
+	unsigned int offset, xfer_count, xfer_bytes;
+	unsigned int best_width, width, misalignment;
+	u32 ctllo;
+	dma_addr_t src, dst;
+	unsigned int slen = 0, dlen = 0, len = 0, total_len = 0;
+	unsigned int si = 0, di = 0;
+	unsigned long def_flags = 0;
+
+	if (unlikely(src_nents == 0 || dst_nents == 0))
+		return NULL;
+
+	/*
+	 * TODO: Currently We only support src and dst both as either
+	 * incrementing or decrementing.
+	 */
+	if ((flags & DMA_SRC_INC) && !(flags & DMA_DST_INC))
+		return NULL;
+
+	if ((flags & DMA_SRC_DEC) && !(flags & DMA_DST_DEC))
+		return NULL;
+
+	if (flags & DMA_SRC_INC)
+		def_flags |= DWC_CTLL_SRC_INC;
+	else if (flags & DMA_SRC_DEC)
+		def_flags |= DWC_CTLL_SRC_DEC;
+
+	if (flags & DMA_DST_INC)
+		def_flags |= DWC_CTLL_DST_INC;
+	else if (flags & DMA_DST_DEC)
+		def_flags |= DWC_CTLL_DST_DEC;
+
+	/* silence gcc warnings */
+	src = dst = 0;
+	prev = NULL;
+
+	while (si < src_nents || di < dst_nents) {
+		if (slen == 0) {
+			si++;
+			src = sg_dma_address(src_sg);
+			slen = sg_dma_len(src_sg);
+			/* If decrementing then start from end address */
+			if (flags & DMA_SRC_DEC)
+				src = src + slen;
+			src_sg = sg_next(src_sg);
+			if (unlikely (si < src_nents && src_sg == NULL)) {
+				dev_printk(KERN_ERR, chan2dev(chan),
+					"Invalid src scattergather list: "
+					"entry %u of %u has last flag\n",
+					si, src_nents);
+				goto err_sg_len;
+			} else if (unlikely (si == src_nents
+						&& src_sg != NULL)) {
+				dev_printk(KERN_ERR, chan2dev(chan),
+					"Invalid src scattergather list: "
+					"entry %u of %u has no last flag\n",
+					si, src_nents);
+				goto err_sg_len;
+			}
+		} else {
+			if (flags & DMA_SRC_INC)
+				src += len;
+			else if (flags & DMA_SRC_DEC)
+				src -= len;
+			else
+				goto err_sg_len;
+		}
+
+		if (dlen == 0) {
+			di++;
+			dst = sg_dma_address(dst_sg);
+			dlen = sg_dma_len(dst_sg);
+			/* If decrementing then start from end address */
+			if (flags & DMA_DST_DEC)
+				dst = dst + dlen;
+			dst_sg = sg_next(dst_sg);
+			if (unlikely (di < dst_nents && dst_sg == NULL)) {
+				dev_printk(KERN_ERR, chan2dev(chan),
+					"Invalid dst scattergather list: "
+					"entry %u of %u has last flag\n",
+					di, dst_nents);
+				goto err_sg_len;
+			} else if (unlikely (di == dst_nents
+						&& dst_sg != NULL)) {
+				dev_printk(KERN_ERR, chan2dev(chan),
+					"Invalid dst scattergather list: "
+					"entry %u of %u has no last flag\n",
+					di, dst_nents);
+				goto err_sg_len;
+			}
+		} else {
+			if (flags & DMA_DST_INC)
+				dst += len;
+			else if (flags & DMA_DST_DEC)
+				dst -= len;
+			else
+				goto err_sg_len;
+		}
+
+		len = min(slen, dlen);
+		slen -= len;
+		dlen -= len;
+
+		/* src-dst relative misalignment */
+		misalignment = src ^ dst;
+		best_width = __ffs(DWC_MAX_WIDTH | misalignment);
+
+		/* We want the transfer to be performed in 3 parts:
+		 *
+		 * - align src and dest to the best possible alignment
+		 *   (up to the maximum width of 8)
+		 *
+		 * - perform most of the transfer using the most
+		 *   efficient alignment
+		 *
+		 * - complete the transfer using whatever alignment is
+		 *   needed to total len bytes
+		 */
+
+		/* First part of the transfer: align src and dst*/
+
+		/* misalignment of src and dst with respect to 0 */
+		misalignment = src | dst;
+		width = __ffs(DWC_MAX_WIDTH | misalignment);
+
+		/* If we're already aligned, the first part is a
+		 * no-op, skip to the second */
+		if (width != best_width
+				&& len >> best_width > __ffs(DWC_MAX_WIDTH)) {
+			/* align src and dst to (1 << best_width) */
+			unsigned int align = 1 << best_width;
+			xfer_bytes = (align-src) & (align-dst) & (align-1);
+		} else
+			xfer_bytes = len;
+
+		offset = 0;
+		xfer_count = xfer_bytes >> width;
+
+		if (flags & (DMA_SRC_DEC | DMA_DST_DEC)){
+			src -= (1 << width);
+			dst -= (1 << width);
+		}
+
+		do {
+			ctllo = DWC_DEFAULT_CTLLO(chan->private)
+				| DWC_CTLL_DST_WIDTH(width)
+				| DWC_CTLL_SRC_WIDTH(width)
+				| def_flags
+				| DWC_CTLL_FC_M2M;
+
+			while (xfer_count > 0) {
+				xfer_count = min_t(size_t, xfer_count,
+						DWC_MAX_COUNT);
+
+				desc = dwc_desc_get(dwc);
+				if (unlikely (desc == NULL))
+					goto err_desc_get;
+
+				if (flags & (DMA_SRC_INC | DMA_DST_INC)){
+					desc->lli.sar = src + offset;
+					desc->lli.dar = dst + offset;
+				} else {
+					desc->lli.sar = src - offset;
+					desc->lli.dar = dst - offset;
+				}
+				desc->lli.ctllo = ctllo;
+				desc->lli.ctlhi = xfer_count;
+
+				if (first == NULL) {
+					first = desc;
+				} else {
+					prev->lli.llp = desc->txd.phys;
+					dma_sync_single_for_device(
+							chan2parent(chan),
+							prev->txd.phys,
+							sizeof(prev->lli),
+							DMA_TO_DEVICE);
+					list_add_tail(&desc->desc_node,
+							&first->tx_list);
+				}
+
+				prev = desc;
+
+				offset += xfer_count << width;
+				xfer_bytes -= xfer_count << width;
+
+				xfer_count = xfer_bytes >> width;
+			}
+
+			xfer_bytes = len - offset;
+
+			/*
+			 * The first part has been done. If there is
+			 * some data to be transferred using
+			 * best_width-aligned operations, do
+			 * it. Otherwise, choose an alignment that is
+			 * works for all of the remaining transfer.
+			 */
+			if (xfer_bytes >> best_width > 0) {
+				width = best_width;
+			} else {
+				/* misalignment of the remaining part
+				 * of src, dst and len */
+				misalignment = (src+offset) | (dst+offset)
+					| xfer_bytes;
+				width = __ffs(DWC_MAX_WIDTH | misalignment);
+			}
+
+			xfer_count = xfer_bytes >> width;
+		} while (offset < len);
+
+		total_len += len;
+		if (flags & (DMA_SRC_DEC | DMA_DST_DEC)){
+			src += (1 << width);
+			dst += (1 << width);
+		}
+
+	}
+
+	if (flags & DMA_PREP_INTERRUPT)
+		/* Trigger interrupt after last block */
+		prev->lli.ctllo |= DWC_CTLL_INT_EN;
+
+	prev->lli.llp = 0;
+	dma_sync_single_for_device(chan2parent(chan),
+				prev->txd.phys, sizeof(prev->lli),
+				DMA_TO_DEVICE);
+
+	first->txd.flags = flags;
+	first->len = total_len;
+
+	return &first->txd;
+
+err_sg_len:
+err_desc_get:
+	dwc_desc_put(dwc, first);
+	return NULL;
+}
+
+static struct dma_async_tx_descriptor *
  dwc_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl,
  		unsigned int sg_len, enum dma_data_direction direction,
  		unsigned long flags)
@@ -1474,6 +1728,7 @@ static int __init dw_probe(struct platform_device 
*pdev)
  	dw->dma.device_free_chan_resources = dwc_free_chan_resources;

  	dw->dma.device_prep_dma_memcpy = dwc_prep_dma_memcpy;
+	dw->dma.device_prep_dma_sg = dwc_prep_dma_sg;

  	dw->dma.device_prep_slave_sg = dwc_prep_slave_sg;
  	dw->dma.device_control = dwc_control;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/