Message-ID: <50b07a065455b93b78ee43ba665083ee@walle.cc>
Date: Fri, 07 May 2021 20:14:34 +0200
From: Michael Walle <michael@...le.cc>
To: Pratyush Yadav <p.yadav@...com>
Cc: Tudor Ambarus <tudor.ambarus@...rochip.com>,
Miquel Raynal <miquel.raynal@...tlin.com>,
Richard Weinberger <richard@....at>,
Vignesh Raghavendra <vigneshr@...com>,
Mark Brown <broonie@...nel.org>, linux-mtd@...ts.infradead.org,
linux-kernel@...r.kernel.org, linux-spi@...r.kernel.org
Subject: Re: [PATCH 5/6] mtd: spi-nor: core; avoid odd length/address reads on
8D-8D-8D mode
On 2021-05-07 20:04, Pratyush Yadav wrote:
> On 07/05/21 05:51PM, Michael Walle wrote:
>> On 2021-05-06 21:18, Pratyush Yadav wrote:
>> > On Octal DTR capable flashes like Micron Xcella reads cannot start or
>> > end at an odd address in Octal DTR mode. Extra bytes need to be read at
>> > the start or end to make sure both the start address and length remain
>> > even.
>> >
>> > To avoid allocating too much extra memory, thereby putting unnecessary
>> > memory pressure on the system, the temporary buffer containing the extra
>> > padding bytes is capped at PAGE_SIZE bytes. The rest of the 2-byte
>> > aligned part should be read directly in the main buffer.
>> >
>> > Signed-off-by: Pratyush Yadav <p.yadav@...com>
>> > ---
>> >
>> > drivers/mtd/spi-nor/core.c | 81 +++++++++++++++++++++++++++++++++++++-
>> > 1 file changed, 80 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
>> > index 5cc206b8bbf3..3d66cc34af4d 100644
>> > --- a/drivers/mtd/spi-nor/core.c
>> > +++ b/drivers/mtd/spi-nor/core.c
>> > @@ -1904,6 +1904,82 @@ static const struct flash_info *spi_nor_read_id(struct spi_nor *nor)
>> > return ERR_PTR(-ENODEV);
>> > }
>> >
>> > +/*
>> > + * On Octal DTR capable flashes like Micron Xcella reads cannot start or
>> > + * end at an odd address in Octal DTR mode. Extra bytes need to be read
>> > + * at the start or end to make sure both the start address and length
>> > + * remain even.
>> > + */
>> > +static int spi_nor_octal_dtr_read(struct spi_nor *nor, loff_t from, size_t len,
>> > + u_char *buf)
>> > +{
>> > + u_char *tmp_buf;
>> > + size_t tmp_len;
>> > + loff_t start, end;
>> > + int ret, bytes_read;
>> > +
>> > + if (IS_ALIGNED(from, 2) && IS_ALIGNED(len, 2))
>> > + return spi_nor_read_data(nor, from, len, buf);
>> > + else if (IS_ALIGNED(from, 2) && len > PAGE_SIZE)
>> > + return spi_nor_read_data(nor, from, round_down(len, PAGE_SIZE),
>> > + buf);
>> > +
>> > + tmp_buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
>> > + if (!tmp_buf)
>> > + return -ENOMEM;
>> > +
>> > + start = round_down(from, 2);
>> > + end = round_up(from + len, 2);
>> > +
>> > + /*
>> > + * Avoid allocating too much memory. The requested read length might be
>> > + * quite large. Allocating a buffer just as large (slightly bigger, in
>> > + * fact) would put unnecessary memory pressure on the system.
>> > + *
>> > + * For example if the read is from 3 to 1M, then this will read from 2
>> > + * to 4098. The reads from 4098 to 1M will then not need a temporary
>> > + * buffer so they can proceed as normal.
>> > + */
>> > + tmp_len = min_t(size_t, end - start, PAGE_SIZE);
>> > +
>> > + ret = spi_nor_read_data(nor, start, tmp_len, tmp_buf);
>> > + if (ret == 0) {
>> > + ret = -EIO;
>> > + goto out;
>> > + }
>> > + if (ret < 0)
>> > + goto out;
>> > +
>> > + /*
>> > + * More bytes are read than actually requested, but that number can't be
>> > + * reported to the calling function or it will confuse its calculations.
>> > + * Calculate how many of the _requested_ bytes were read.
>> > + */
>> > + bytes_read = ret;
>> > +
>> > + if (from != start)
>> > + ret -= from - start;
>> > +
>> > + /*
>> > + * Only account for extra bytes at the end if they were actually read.
>> > + * For example, if the total length was truncated because of temporary
>> > + * buffer size limit then the adjustment for the extra bytes at the end
>> > + * is not needed.
>> > + */
>> > + if (start + bytes_read == end)
>> > + ret -= end - (from + len);
>> > +
>> > + if (ret < 0) {
>> > + ret = -EIO;
>> > + goto out;
>> > + }
>> > +
>> > + memcpy(buf, tmp_buf + (from - start), ret);
>> > +out:
>> > + kfree(tmp_buf);
>> > + return ret;
>> > +}
>> > +
>> > static int spi_nor_read(struct mtd_info *mtd, loff_t from, size_t len,
>> > size_t *retlen, u_char *buf)
>> > {
>> > @@ -1921,7 +1997,10 @@ static int spi_nor_read(struct mtd_info *mtd, loff_t from, size_t len,
>> >
>> > addr = spi_nor_convert_addr(nor, addr);
>> >
>> > - ret = spi_nor_read_data(nor, addr, len, buf);
>> > + if (nor->read_proto == SNOR_PROTO_8_8_8_DTR)
>> > + ret = spi_nor_octal_dtr_read(nor, addr, len, buf);
>> > + else
>> > + ret = spi_nor_read_data(nor, addr, len, buf);
>> > if (ret == 0) {
>> > /* We shouldn't see 0-length reads */
>> > ret = -EIO;
>>
>> Reviewed-by: Michael Walle <michael@...le.cc>
>
> Thanks.
>
>>
>> I wonder how much performance is lost if this would just split
>> one transfer into up to three ones: 2 byte, size - 2, 2 bytes.
>
> This case is not really possible since it would try to read PAGE_SIZE
> whenever it can. But there is a situation possible where one transfer is
> split into three. It would look something like: 4096 bytes, size - 4096
> bytes, 2 bytes.
Ah no, I wasn't talking about your implementation, but about a naive one
that doesn't move up to PAGE_SIZE of data around. It would just read
2 bytes at the beginning (if unaligned) and 2 bytes at the end (if
unaligned), and read the part in between as usual because it's then
aligned.
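To make the naive variant concrete, here is a rough userspace sketch of
how the split could be computed. It is not taken from the patch and not
kernel code: struct xfer and split_naive() are hypothetical names used
only for illustration, and the sketch assumes the only constraint is
that every transfer has an even start address and an even length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one flash transfer; not a kernel type. */
struct xfer {
	uint64_t addr;	/* even start address sent to the flash */
	size_t len;	/* even number of bytes transferred */
	size_t skip;	/* leading bytes to drop from the data read */
	size_t keep;	/* bytes handed back to the caller */
};

/*
 * Split the read [from, from + len) into at most three transfers that
 * all start at an even address and have an even length. Returns the
 * number of entries written to out[].
 */
static int split_naive(uint64_t from, size_t len, struct xfer out[3])
{
	int n = 0;

	assert(len > 0);

	if (from & 1) {
		/* Odd start: read the 2-byte unit containing 'from'. */
		out[n++] = (struct xfer){ from - 1, 2, 1, 1 };
		from++;
		len--;
	}

	if (len >= 2) {
		/* Aligned middle part: can go straight to the caller's buffer. */
		size_t mid = len & ~(size_t)1;

		out[n++] = (struct xfer){ from, mid, 0, mid };
		from += mid;
		len -= mid;
	}

	if (len) {
		/* One trailing byte: read 2 bytes, keep only the first. */
		out[n++] = (struct xfer){ from, 2, 0, 1 };
	}

	return n;
}
```

For a read of 6 bytes starting at address 3 this yields three transfers:
2 bytes at 2, 4 bytes at 4 and 2 bytes at 8. Only the first and last
would need a tiny bounce buffer.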
> I am trying to find a balance between minimizing number of reads while
> keeping the size of the temporary buffer to a reasonable limit. This is
> the best I could come up with. It optimizes for smaller transfers so
> while the absolute amount of overhead remains roughly the same, the
> ratio of it relative to read size is smaller.
Yes, with this you will have that memcpy() and one transfer for
transfers up to PAGE_SIZE; the "naive" one above would have up to three,
depending on the alignment.
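As a back-of-the-envelope check of the transfer counts, the following
userspace sketch models both strategies. It is only a simplified model:
it assumes every transfer returns exactly the number of bytes asked for,
it uses BOUNCE_SIZE = 4096 as a stand-in for PAGE_SIZE, and both
function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BOUNCE_SIZE 4096u	/* assumed stand-in for PAGE_SIZE */

/* Model of the patch: count transfers until 'len' bytes are delivered. */
static unsigned int count_bounce_transfers(uint64_t from, size_t len)
{
	unsigned int n = 0;

	while (len) {
		size_t done;

		if (!(from & 1) && !(len & 1)) {
			/* Fully aligned: one direct transfer. */
			done = len;
		} else if (!(from & 1) && len > BOUNCE_SIZE) {
			/* Even start, odd length: read whole pages directly. */
			done = len - len % BOUNCE_SIZE;
		} else {
			/* Bounce buffer path, capped at BOUNCE_SIZE bytes. */
			uint64_t start = from & ~(uint64_t)1;
			uint64_t end = (from + len + 1) & ~(uint64_t)1;
			size_t span = (size_t)(end - start);
			size_t tmp_len = span < BOUNCE_SIZE ? span : BOUNCE_SIZE;

			done = tmp_len - (size_t)(from - start);
			if (start + tmp_len == end)
				done -= (size_t)(end - (from + len));
		}

		assert(done > 0 && done <= len);
		from += done;
		len -= done;
		n++;
	}

	return n;
}

/* Naive split: optional 2-byte head, aligned middle, optional 2-byte tail. */
static unsigned int count_naive_transfers(uint64_t from, size_t len)
{
	unsigned int n = 0;

	if (from & 1) {
		n++;
		len--;
	}
	if (len >= 2)
		n++;
	if (len & 1)
		n++;

	return n;
}
```

Under these assumptions a 100-byte read starting at address 3 takes one
transfer (plus the memcpy()) with the bounce buffer and three with the
naive split, while an 8200-byte read from the same address takes three
transfers either way.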
> You can optimize for read performance if you are willing to waste memory
> by simply allocating a size + 2 bytes long buffer. Then the read can
> proceed in one transaction. But IMO memory is much more important
> compared to read throughput.
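For comparison, the single-transaction variant mentioned in the quote
could look roughly like the userspace sketch below. It is not from the
patch: flash_read_fn, demo_flash_read() and read_single_bounce() are
hypothetical names, malloc() stands in for kmalloc(), and the demo
"flash" simply returns the low byte of each address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical low-level read: returns 0 on success, negative on error. */
typedef int (*flash_read_fn)(uint64_t addr, size_t len, unsigned char *buf);

/* Demo flash for illustration: byte at address A has value (A & 0xff). */
static int demo_flash_read(uint64_t addr, size_t len, unsigned char *buf)
{
	size_t i;

	/* Models the Octal DTR restriction: even address, even length. */
	assert(!(addr & 1) && !(len & 1));

	for (i = 0; i < len; i++)
		buf[i] = (unsigned char)(addr + i);

	return 0;
}

/*
 * Read [from, from + len) in one transfer by allocating a temporary
 * buffer of up to len + 2 bytes, then copy the requested part out.
 */
static int read_single_bounce(flash_read_fn rd, uint64_t from, size_t len,
			      unsigned char *buf)
{
	uint64_t start = from & ~(uint64_t)1;
	uint64_t end = (from + len + 1) & ~(uint64_t)1;
	size_t span = (size_t)(end - start);
	unsigned char *tmp;
	int ret;

	tmp = malloc(span);
	if (!tmp)
		return -1;

	ret = rd(start, span, tmp);
	if (!ret)
		memcpy(buf, tmp + (from - start), len);

	free(tmp);
	return ret;
}
```

The cost is visible in the malloc() call: the temporary buffer grows
with the read size, which is the memory pressure the patch avoids by
capping it at PAGE_SIZE.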
-michael