[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <be8291fc-8e69-b577-d8f4-20aeca0b45cc@nvidia.com>
Date: Thu, 3 Dec 2020 16:22:54 -0800
From: Sowjanya Komatineni <skomatineni@...dia.com>
To: Mark Brown <broonie@...nel.org>
CC: <thierry.reding@...il.com>, <jonathanh@...dia.com>,
<robh+dt@...nel.org>, <linux-spi@...r.kernel.org>,
<linux-tegra@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<devicetree@...r.kernel.org>
Subject: Re: [PATCH v1 3/7] spi: qspi-tegra: Add support for Tegra210 QSPI
controller
On 12/2/20 11:17 AM, Sowjanya Komatineni wrote:
>
> On 12/2/20 9:27 AM, Mark Brown wrote:
>> On Tue, Dec 01, 2020 at 01:12:44PM -0800, Sowjanya Komatineni wrote:
>>> Tegra SoC has a Quad SPI controller starting from Tegra210.
>>>
>>> This patch adds support for Tegra210 QSPI controller.
>> This looks pretty clean but I've got a few questions below about how
>> this integrates with the frameworks as well as some more minor issues.
>>
>>> +config QSPI_TEGRA
>>> + tristate "Nvidia Tegra QSPI Controller"
>> Everything else in this file is SPI_, even the qspi controllers.
> Will rename in v2
>>> +++ b/drivers/spi/qspi-tegra.c
>>> @@ -0,0 +1,1418 @@
>>> +// SPDX-License-Identifier: GPL-2.0-only
>>> +/*
>>> + * Copyright (C) 2020 NVIDIA CORPORATION. All rights reserved.
>>> + */
>> Please make the entire comment a C++ one. It also appears that the "All
>> rights reserved" here conflicts with the GPL-2.0-only SPDX statement...
> Will fix in v2
>>
>>> +static void
>>> +tegra_qspi_copy_client_txbuf_to_qspi_txbuf(struct tegra_qspi_data
>>> *tqspi,
>>> + struct spi_transfer *t)
>>> +{
>>> + /* Make the dma buffer to read by cpu */
>>> + dma_sync_single_for_cpu(tqspi->dev, tqspi->tx_dma_phys,
>>> + tqspi->dma_buf_size, DMA_TO_DEVICE);
>>> +
>>> + if (tqspi->is_packed) {
>>> + unsigned int len = tqspi->curr_dma_words *
>>> + tqspi->bytes_per_word;
>>> +
>>> + memcpy(tqspi->tx_dma_buf, t->tx_buf + tqspi->cur_pos, len);
>>> + tqspi->cur_tx_pos += tqspi->curr_dma_words *
>>> + tqspi->bytes_per_word;
>> It seems weird that this device needs us to do a memcpy() to do DMA,
>> most devices are able to DMA directly from the buffers provided by the
>> SPI API (and let the SPI core sync things). What is going on here?
>
> For transfers of size more than max DMA transfer limit, data transfer
> happens in multiple iterations with each iteration transferring up to
> max DMA transfer limit.
>
> So using separate dma buffers and on every iteration copying them to
> SPI core provided tx/rx buffers.
Also unpack mode needs to manually put the bytes together from read data
to SPI core rx buffer.
>
> Transferring data logic in this driver is similar as Tegra SPI driver
> except register changes and some QSPI specific register programming.
>
>>
>>> + tegra_qspi_writel(tqspi, status, QSPI_FIFO_STATUS);
>>> + while ((status & QSPI_FIFO_EMPTY) != QSPI_FIFO_EMPTY) {
>>> + status = tegra_qspi_readl(tqspi, QSPI_FIFO_STATUS);
>>> + if (time_after(jiffies, timeout)) {
>>> + dev_err(tqspi->dev,
>>> + "timeout waiting for fifo flush\n");
>>> + return -EIO;
>>> + }
>>> +
>>> + udelay(1);
>>> + }
>> It'd be good to put a cpu_relax() in the busy loop.
> Will update in v2.
>>
>>> +static u32 tegra_qspi_setup_transfer_one(struct spi_device *spi,
>>> + struct spi_transfer *t,
>>> + bool is_first_of_msg)
>>> +{
>>> + /* toggle cs to active state */
>>> + if (spi->mode & SPI_CS_HIGH)
>>> + command1 |= QSPI_CS_SW_VAL;
>>> + else
>>> + command1 &= ~QSPI_CS_SW_VAL;
>>> + tegra_qspi_writel(tqspi, command1, QSPI_COMMAND1);
>> This is worrying, the client device might be confused if /CS is doing
>> things outside of the standard handling.
>
> Do you mean to honor spi_transfer cs_change flag?
>
> Tegra QSPI is master and is used only with QSPI flash devices. Looking
> at SPI NOR driver, I see QSPI Flash commands are executed with one
> flash command per spi_message and I dont see cs_change flag usage
> w.r.t QSPI flash. So, using SW based CS control for QSPI.
>
> Please correct me if I miss something to understand here.
>
> Also Tegra186 and later QSPI controller supports combined sequence
> mode where command, address, data phases can be combined in a single GO.
>
> This saves some cycles in transfer and for this we need to use SW
> based CS control only.
>
>
>>> + of_property_read_u32(slave_np, "nvidia,tx-clk-tap-delay",
>>> + &cdata->tx_clk_tap_delay);
>>> + of_property_read_u32(slave_np, "nvidia,rx-clk-tap-delay",
>>> + &cdata->rx_clk_tap_delay);
>> These properties are not mentioned in the binding document.
> Thanks Mark. Missed them. Will add in v2.
>>
>>> +static int tegra_qspi_setup(struct spi_device *spi)
>>> +{
>>> + if (cdata && cdata->tx_clk_tap_delay)
>>> + tx_tap = cdata->tx_clk_tap_delay;
>>> + if (cdata && cdata->rx_clk_tap_delay)
>>> + rx_tap = cdata->rx_clk_tap_delay;
>>> + tqspi->def_command2_reg = QSPI_TX_TAP_DELAY(tx_tap) |
>>> + QSPI_RX_TAP_DELAY(rx_tap);
>>> + tegra_qspi_writel(tqspi, tqspi->def_command2_reg, QSPI_COMMAND2);
>> The setup for one device shouldn't be able to affect the operation of
>> another, already running, device so either these need to be configured
>> as part of the controller probe or these configurations need to be
>> deferred until we're actually doing a transfer.
> We will only have 1 device on QSPI as we only support single chip select.
>>
>>> + /*
>>> + * Tegra QSPI hardware support dummy bytes transfer based on the
>>> + * programmed dummy clock cyles in QSPI register.
>>> + * So, get the total dummy bytes from the dummy bytes transfer in
>>> + * spi_messages and convert to dummy clock cyles.
>>> + */
>>> + list_for_each_entry(xfer, &msg->transfers, transfer_list) {
>>> + if (ntransfers == DUMMY_BYTES_XFER &&
>>> + !(list_is_last(&xfer->transfer_list, &msg->transfers)))
>>> + dummy_cycles = xfer->len * 8 / xfer->tx_nbits;
>>> + ntransfers++;
>>> + }
>> This seems weird, there's some hard coded assumption about particular
>> patterns that the client device is going to send. What's going on here?
>> I don't really understand what this is trying to do.
>
> QSPI flash needs dummy cycles for data read operation which is
> actually the initial read latency and no. of dummy cycles required are
> vendor specific.
>
> SPI NOR driver gets required dummy cycles based on mode clock cycles
> and wait state clock cycles.
>
> During read operations, spi_nor_spimem_read_data() converts dummy
> cycles to number of dummy bytes.
>
> Tegra QSPI controller supports dummy clock cycles register and when
> programmed QSPI controller sends dummy bytes rather than SW handling
> extra cycles for transferring dummy bytes.
>
> Above equation converts this dummy bytes back to dummy clock cycles to
> program into QSPI register and avoid manual SW transfer of dummy bytes.
>
>>
>>> +static irqreturn_t tegra_qspi_isr(int irq, void *context_data)
>>> +{
>>> + struct tegra_qspi_data *tqspi = context_data;
>>> +
>>> + tqspi->status_reg = tegra_qspi_readl(tqspi, QSPI_FIFO_STATUS);
>>> + if (tqspi->cur_direction & DATA_DIR_TX)
>>> + tqspi->tx_status = tqspi->status_reg &
>>> + (QSPI_TX_FIFO_UNF | QSPI_TX_FIFO_OVF);
>>> +
>>> + if (tqspi->cur_direction & DATA_DIR_RX)
>>> + tqspi->rx_status = tqspi->status_reg &
>>> + (QSPI_RX_FIFO_OVF | QSPI_RX_FIFO_UNF);
>>> + tegra_qspi_mask_clear_irq(tqspi);
>>> +
>>> + return IRQ_WAKE_THREAD;
>>> +}
>> It's a bit unclear to me the value we gain from having this handler - if
>> we don't specify a handler genirq will already mask the interrupt until
>> we get to the thread anyway and we could just read the status in the
>> threaded handler. OTOH it doesn't do any harm, just struck me as a bit
>> odd.
>
> I started QSPI driver by taking SPI driver as data transfer and
> interrupt handling are similar.
>
> So kept this handler for clearing status registers and masking
> interrupts as I did not see anything wrong with this.
>
>>
>>> + master = spi_alloc_master(&pdev->dev, sizeof(*tqspi));
>>> + if (!master) {
>>> + dev_err(&pdev->dev, "master allocation failed\n");
>>> + return -ENOMEM;
>>> + }
>> Please switch to using the devm_ version of the API to allocate
>> controller, it makes things much more robust.
> Will update in v2
>>
>>> + if (of_property_read_u32(pdev->dev.of_node, "spi-max-frequency",
>>> + &master->max_speed_hz))
>>> + master->max_speed_hz = QSPI_MAX_SPEED;
>> The core will do this for you.
>
> Will remove this in v2.
>
> Thanks
>
> Sowjanya
>
Powered by blists - more mailing lists