linux-kernel - Re: [PATCH v20 12/12] null_blk: add support for copy offload

Message-ID: <20240521144629.reyeiktaj72p4lzd@green245>
Date: Tue, 21 May 2024 20:16:29 +0530
From: Nitesh Shetty <nj.shetty@...sung.com>
To: Bart Van Assche <bvanassche@....org>
Cc: Jens Axboe <axboe@...nel.dk>, Jonathan Corbet <corbet@....net>, Alasdair
	Kergon <agk@...hat.com>, Mike Snitzer <snitzer@...nel.org>, Mikulas Patocka
	<mpatocka@...hat.com>, Keith Busch <kbusch@...nel.org>, Christoph Hellwig
	<hch@....de>, Sagi Grimberg <sagi@...mberg.me>, Chaitanya Kulkarni
	<kch@...dia.com>, Alexander Viro <viro@...iv.linux.org.uk>, Christian
	Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
	martin.petersen@...cle.com, david@...morbit.com, hare@...e.de,
	damien.lemoal@...nsource.wdc.com, anuj20.g@...sung.com, joshi.k@...sung.com,
	nitheshshetty@...il.com, gost.dev@...sung.com, Vincent Fu
	<vincent.fu@...sung.com>, linux-block@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
	dm-devel@...ts.linux.dev, linux-nvme@...ts.infradead.org,
	linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH v20 12/12] null_blk: add support for copy offload

On 20/05/24 04:42PM, Bart Van Assche wrote:
>On 5/20/24 03:20, Nitesh Shetty wrote:
>>+	if (blk_rq_nr_phys_segments(req) != BLK_COPY_MAX_SEGMENTS)
>>+		return status;
>
>Why is this check necessary?
>
>>+	/*
>>+	 * First bio contains information about destination and last bio
>>+	 * contains information about source.
>>+	 */
>
>Please check this at runtime (WARN_ON_ONCE()?).
>
>>+	__rq_for_each_bio(bio, req) {
>>+		if (seg == blk_rq_nr_phys_segments(req)) {
>>+			sector_in = bio->bi_iter.bi_sector;
>>+			if (rem != bio->bi_iter.bi_size)
>>+				return status;
>>+		} else {
>>+			sector_out = bio->bi_iter.bi_sector;
>>+			rem = bio->bi_iter.bi_size;
>>+		}
>>+		seg++;
>>+	}
>
>__rq_for_each_bio() iterates over the bios in a request. Does a copy
>offload request always have two bios - one copy destination bio and
>one copy source bio? If so, is 'seg' a bio counter? Why is that bio
>counter compared with the number of physical segments in the request?
>
Yes, your observation is right. We treat the first bio as the destination
and the second as the source. Without that comparison, we would need to
store the bio index in a temporary variable and parse based on its value.
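
For illustration, the index-based alternative mentioned above could look
roughly like the following userspace sketch. It is not the patch's code:
struct fake_bio, struct copy_range and parse_copy_bios() are hypothetical
stand-ins that only model bi_sector, bi_size and the bio chain, and the
sketch assumes a copy request carries exactly two bios of equal size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the fields of struct bio used here. */
struct fake_bio {
	uint64_t sector;        /* models bio->bi_iter.bi_sector */
	uint32_t size;          /* models bio->bi_iter.bi_size */
	struct fake_bio *next;  /* models bio->bi_next */
};

/* Hypothetical result type: parsed copy parameters. */
struct copy_range {
	uint64_t sector_in;     /* source start sector */
	uint64_t sector_out;    /* destination start sector */
	uint32_t len;           /* bytes to copy */
};

/*
 * Walk the bio chain with an explicit bio index instead of comparing a
 * counter against the number of physical segments.
 * Returns 0 on success, -1 if the chain is not exactly two bios of
 * equal size (first = destination, second = source).
 */
static int parse_copy_bios(const struct fake_bio *head, struct copy_range *out)
{
	const struct fake_bio *bio;
	unsigned int idx = 0;

	assert(out != NULL);

	for (bio = head; bio; bio = bio->next) {
		switch (idx) {
		case 0:	/* first bio: destination */
			out->sector_out = bio->sector;
			out->len = bio->size;
			break;
		case 1:	/* second bio: source, must match the destination size */
			if (bio->size != out->len)
				return -1;
			out->sector_in = bio->sector;
			break;
		default: /* more than two bios is unexpected */
			return -1;
		}
		idx++;
	}

	return idx == 2 ? 0 : -1;
}
```

In a kernel context the two failure paths would be natural places for the
WARN_ON_ONCE() suggested above.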

>>+	trace_nullb_copy_op(req, sector_out << SECTOR_SHIFT,
>>+			    sector_in << SECTOR_SHIFT, rem);
>>+
>>+	spin_lock_irq(&nullb->lock);
>>+	while (rem > 0) {
>>+		chunk = min_t(size_t, nullb->dev->blocksize, rem);
>>+		offset_in = (sector_in & SECTOR_MASK) << SECTOR_SHIFT;
>>+		offset_out = (sector_out & SECTOR_MASK) << SECTOR_SHIFT;
>>+
>>+		if (null_cache_active(nullb) && !is_fua)
>>+			null_make_cache_space(nullb, PAGE_SIZE);
>>+
>>+		t_page_in = null_lookup_page(nullb, sector_in, false,
>>+					     !null_cache_active(nullb));
>>+		if (!t_page_in)
>>+			goto err;
>>+		t_page_out = null_insert_page(nullb, sector_out,
>>+					      !null_cache_active(nullb) ||
>>+					      is_fua);
>>+		if (!t_page_out)
>>+			goto err;
>>+
>>+		in = kmap_local_page(t_page_in->page);
>>+		out = kmap_local_page(t_page_out->page);
>>+
>>+		memcpy(out + offset_out, in + offset_in, chunk);
>>+		kunmap_local(out);
>>+		kunmap_local(in);
>>+		__set_bit(sector_out & SECTOR_MASK, t_page_out->bitmap);
>>+
>>+		if (is_fua)
>>+			null_free_sector(nullb, sector_out, true);
>>+
>>+		rem -= chunk;
>>+		sector_in += chunk >> SECTOR_SHIFT;
>>+		sector_out += chunk >> SECTOR_SHIFT;
>>+	}
>>+
>>+	status = 0;
>>+err:
>>+	spin_unlock_irq(&nullb->lock);
>
>In the worst case, how long does this loop disable interrupts?
>
We haven't measured this. But it should be similar to the existing read
and write paths, as we followed a similar approach.
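
Although no timing was measured, the quoted loop does bound the amount of
work done with interrupts disabled: each pass consumes
min(blocksize, rem) bytes, so the number of passes is rem divided by the
block size, rounded up. The userspace sketch below only counts iterations;
it says nothing about the cost of each one. copy_loop_iterations() is a
hypothetical helper, not part of null_blk.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count how many times a loop of the form
 *     chunk = min(blocksize, rem); rem -= chunk;
 * runs before rem reaches zero. This mirrors the control flow of the
 * copy loop only; page lookup, mapping and memcpy are not modelled.
 */
static size_t copy_loop_iterations(size_t rem, size_t blocksize)
{
	size_t iters = 0;

	assert(blocksize > 0);	/* a zero block size would never terminate */

	while (rem > 0) {
		size_t chunk = rem < blocksize ? rem : blocksize;

		rem -= chunk;
		iters++;
	}

	return iters;
}
```

For example, a 1 MiB copy with a 512-byte block size gives 2048 passes
under the lock, while the same copy with a 4096-byte block size gives 256.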

>>+TRACE_EVENT(nullb_copy_op,
>>+		TP_PROTO(struct request *req,
>>+			 sector_t dst, sector_t src, size_t len),
>>+		TP_ARGS(req, dst, src, len),
>>+		TP_STRUCT__entry(
>>+				 __array(char, disk, DISK_NAME_LEN)
>>+				 __field(enum req_op, op)
>>+				 __field(sector_t, dst)
>>+				 __field(sector_t, src)
>>+				 __field(size_t, len)
>>+		),
>
>Isn't __string() preferred over __array() since the former occupies less space
>in the trace buffer?
>
Again, we followed the existing implementation to keep the series simpler
to review.
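
To make the space argument concrete, here is a rough userspace estimate of
the per-event cost of each approach. It rests on assumptions that should be
checked against the tree in use: that DISK_NAME_LEN is 32, and that a
__string() field costs a 4-byte location descriptor in the fixed part of
the entry plus the string and its terminating NUL. Alignment padding is
ignored, and both names below are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: value of DISK_NAME_LEN in the kernel headers. */
#define ASSUMED_DISK_NAME_LEN 32

/* Assumption: size of the per-field descriptor used by dynamic strings. */
#define ASSUMED_DATA_LOC_SIZE 4

/* Bytes used by a fixed __array(char, disk, DISK_NAME_LEN) field. */
static size_t trace_array_cost(void)
{
	return ASSUMED_DISK_NAME_LEN;
}

/* Estimated bytes used by a __string() field holding `name`. */
static size_t trace_string_cost(const char *name)
{
	return ASSUMED_DATA_LOC_SIZE + strlen(name) + 1;
}
```

Under these assumptions a disk named "nullb0" would take about 11 bytes as
a dynamic string versus 32 bytes as a fixed array.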

Thank you,
Nitesh Shetty