linux-kernel - Re: [PATCH v4 4/6] mtd: rawnand: add NVIDIA Tegra NAND Flash controller driver

Message-ID: <7985c118-af47-624e-bfcc-cd5d8726742a@kernel.dk>
Date:   Tue, 12 Jun 2018 15:24:41 -0600
From:   Jens Axboe <axboe@...nel.dk>
To:     Stefan Agner <stefan@...er.ch>
Cc:     Boris Brezillon <boris.brezillon@...tlin.com>,
        Dmitry Osipenko <digetx@...il.com>, dwmw2@...radead.org,
        computersforpeace@...il.com, marek.vasut@...il.com,
        robh+dt@...nel.org, mark.rutland@....com, thierry.reding@...il.com,
        dev@...xeye.de, miquel.raynal@...tlin.com, richard@....at,
        marcel@...wiler.com, krzk@...nel.org, benjamin.lindqvist@...ian.se,
        jonathanh@...dia.com, pdeschrijver@...dia.com, pgaikwad@...dia.com,
        mirza.krak@...il.com, gaireg@...reg.de,
        linux-mtd@...ts.infradead.org, linux-tegra@...r.kernel.org,
        devicetree@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 4/6] mtd: rawnand: add NVIDIA Tegra NAND Flash
 controller driver

On 6/12/18 2:20 PM, Stefan Agner wrote:
> On 12.06.2018 17:24, Jens Axboe wrote:
>> On 6/12/18 3:17 AM, Stefan Agner wrote:
>>> [also added Jens Axboe]
>>>
>>> On 12.06.2018 10:27, Boris Brezillon wrote:
>>>> On Tue, 12 Jun 2018 10:06:42 +0200
>>>> Stefan Agner <stefan@...er.ch> wrote:
>>>>
>>>>> On 12.06.2018 02:03, Dmitry Osipenko wrote:
>>>>>> On Monday, 11 June 2018 23:52:22 MSK Stefan Agner wrote:
>>>>>>> Add support for the NAND flash controller found on NVIDIA
>>>>>>> Tegra 2 SoCs. This implementation does not make use of the
>>>>>>> command queue feature. Regular operations/data transfers are
>>>>>>> done in PIO mode. Page read/writes with hardware ECC make
>>>>>>> use of the DMA for data transfer.
>>>>>>>
>>>>>>> Signed-off-by: Lucas Stach <dev@...xeye.de>
>>>>>>> Signed-off-by: Stefan Agner <stefan@...er.ch>
>>>>>>> ---
>>>>>>>  MAINTAINERS                       |    7 +
>>>>>>>  drivers/mtd/nand/raw/Kconfig      |    6 +
>>>>>>>  drivers/mtd/nand/raw/Makefile     |    1 +
>>>>>>>  drivers/mtd/nand/raw/tegra_nand.c | 1248 +++++++++++++++++++++++++++++
>>>>>>>  4 files changed, 1262 insertions(+)
>>>>>>>  create mode 100644 drivers/mtd/nand/raw/tegra_nand.c
>>>>>>>
>>>>> [snip]
>>>>>>> +static int tegra_nand_cmd(struct nand_chip *chip,
>>>>>>> +			 const struct nand_subop *subop)
>>>>>>> +{
>>>>>>> +	const struct nand_op_instr *instr;
>>>>>>> +	const struct nand_op_instr *instr_data_in = NULL;
>>>>>>> +	struct tegra_nand_controller *ctrl = to_tegra_ctrl(chip->controller);
>>>>>>> +	unsigned int op_id, size = 0, offset = 0;
>>>>>>> +	bool first_cmd = true;
>>>>>>> +	u32 reg, cmd = 0;
>>>>>>> +	int ret;
>>>>>>> +
>>>>>>> +	for (op_id = 0; op_id < subop->ninstrs; op_id++) {
>>>>>>> +		unsigned int naddrs, i;
>>>>>>> +		const u8 *addrs;
>>>>>>> +		u32 addr1 = 0, addr2 = 0;
>>>>>>> +
>>>>>>> +		instr = &subop->instrs[op_id];
>>>>>>> +
>>>>>>> +		switch (instr->type) {
>>>>>>> +		case NAND_OP_CMD_INSTR:
>>>>>>> +			if (first_cmd) {
>>>>>>> +				cmd |= COMMAND_CLE;
>>>>>>> +				writel_relaxed(instr->ctx.cmd.opcode,
>>>>>>> +					       ctrl->regs + CMD_REG1);
>>>>>>> +			} else {
>>>>>>> +				cmd |= COMMAND_SEC_CMD;
>>>>>>> +				writel_relaxed(instr->ctx.cmd.opcode,
>>>>>>> +					       ctrl->regs + CMD_REG2);
>>>>>>> +			}
>>>>>>> +			first_cmd = false;
>>>>>>> +			break;
>>>>>>> +		case NAND_OP_ADDR_INSTR:
>>>>>>> +			offset = nand_subop_get_addr_start_off(subop, op_id);
>>>>>>> +			naddrs = nand_subop_get_num_addr_cyc(subop, op_id);
>>>>>>> +			addrs = &instr->ctx.addr.addrs[offset];
>>>>>>> +
>>>>>>> +			cmd |= COMMAND_ALE | COMMAND_ALE_SIZE(naddrs);
>>>>>>> +			for (i = 0; i < min_t(unsigned int, 4, naddrs); i++)
>>>>>>> +				addr1 |= *addrs++ << (BITS_PER_BYTE * i);
>>>>>>> +			naddrs -= i;
>>>>>>> +			for (i = 0; i < min_t(unsigned int, 4, naddrs); i++)
>>>>>>> +				addr2 |= *addrs++ << (BITS_PER_BYTE * i);
>>>>>>> +			writel_relaxed(addr1, ctrl->regs + ADDR_REG1);
>>>>>>> +			writel_relaxed(addr2, ctrl->regs + ADDR_REG2);
>>>>>>> +			break;
>>>>>>> +
>>>>>>> +		case NAND_OP_DATA_IN_INSTR:
>>>>>>> +			size = nand_subop_get_data_len(subop, op_id);
>>>>>>> +			offset = nand_subop_get_data_start_off(subop, op_id);
>>>>>>> +
>>>>>>> +			cmd |= COMMAND_TRANS_SIZE(size) | COMMAND_PIO |
>>>>>>> +				COMMAND_RX | COMMAND_A_VALID;
>>>>>>> +
>>>>>>> +			instr_data_in = instr;
>>>>>>> +			break;
>>>>>>> +
>>>>>>> +		case NAND_OP_DATA_OUT_INSTR:
>>>>>>> +			size = nand_subop_get_data_len(subop, op_id);
>>>>>>> +			offset = nand_subop_get_data_start_off(subop, op_id);
>>>>>>> +
>>>>>>> +			cmd |= COMMAND_TRANS_SIZE(size) | COMMAND_PIO |
>>>>>>> +				COMMAND_TX | COMMAND_A_VALID;
>>>>>>> +
>>>>>>> +			memcpy(&reg, instr->ctx.data.buf.out + offset, size);
>>>>>>> +			writel_relaxed(reg, ctrl->regs + RESP);
>>>>>>> +
>>>>>>> +			break;
>>>>>>> +		case NAND_OP_WAITRDY_INSTR:
>>>>>>> +			cmd |= COMMAND_RBSY_CHK;
>>>>>>> +			break;
>>>>>>> +
>>>>>>> +		}
>>>>>>> +	}
>>>>>>> +
>>>>>>> +	cmd |= COMMAND_GO | COMMAND_CE(ctrl->cur_cs);
>>>>>>> +	writel_relaxed(cmd, ctrl->regs + COMMAND);
>>>>>>> +	ret = wait_for_completion_io_timeout(&ctrl->command_complete,
>>>>>>> +					     msecs_to_jiffies(500));
>>>>>>
>>>>>> It's not obvious to me whether _io_ variant is appropriate to use here, would
>>>>>> be nice if somebody could clarify that. Maybe block/ already does the IO
>>>>>> accounting itself and hence the IO time would be counted twice in that case.
>>>>>
>>>>> Good that you bring this up.
>>>>>
>>>>> I don't think that there is any higher layer which could take care of
>>>>> accounting. Usually, with raw nand there is no block layer involved
>>>>> anyway.
>>>>>
>>>>> In a quick test it seems that only when using wait_for_completion_io I/O
>>>>> is properly accounted in the "wait" section of top.
>>>>>
>>>>> So far only a single driver (omap2) used the _io variant, but I think it
>>>>> is the right thing to do! After all, it is I/O...
>>>>>
>>>>> Boris or any other MTD maintainer, any comment on this?
>>>>
>>>> Given this definition of io_schedule_timeout() [1] (which is used when
>>>> you call wait_for_completion_io_timeout()), I'd say it's not useful to
>>>> use the _io_ version, simply because MTD devs are not exposed as blk
>>>> devices, and thus don't need the blk_schedule_flush_plug() that is done
>>>> in io_schedule_prepare(). But that also means MTD I/Os are not
>>>> accounted as I/Os :-(.
>>>
>>> Documentation of wait_for_completion_io says:
>>> "The caller is accounted as waiting for IO (which traditionally means
>>> blkio only)."
>>>
>>> Which sounds as if using _io is only an accounting thing...
>>
>> Yes, you should only use it for waiting for IO off a system call
>> read path. So block IO, or file system IO. Don't use it for internal
>> IO that isn't related to that.
> 
> I guess that would be the case here, since MTD page read/writes are
> typically file system IOs (e.g. UBIFS).
> 
> The problem is just that it is not block related at all since it uses the
> MTD subsystem... And it seems that the _io variants, besides accounting,
> also take a role in the block subsystem's device plugging mechanism. What
> is unclear to me is whether using the _io variant from the MTD subsystem
> potentially disturbs the plugging mechanism...

No, it has nothing to do with plugging at the block level. So if you're
doing regular user IO, then you should use the _io variants.
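
To make the accounting difference discussed above concrete, here is a
minimal userspace model, not kernel code. It assumes, as I understand the
scheduler, that the _io variants mark the waiting task as being in iowait
for the duration of the sleep and restore the previous state afterwards.
Every name below (task_model, cpu_model, sleep_model, wait_plain_model,
wait_io_model, model_selfcheck) is a hypothetical stand-in, and the plug
flush mentioned earlier in the thread is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task "waiting for IO" flag. */
struct task_model {
	bool in_iowait;
};

/* Hypothetical stand-in for per-CPU time accounting buckets. */
struct cpu_model {
	unsigned long idle_ticks;
	unsigned long iowait_ticks;
};

/* Model of a sleep: charge the time to iowait only if the sleeper is flagged. */
static void sleep_model(const struct task_model *t, struct cpu_model *cpu,
			unsigned long ticks)
{
	if (t->in_iowait)
		cpu->iowait_ticks += ticks;
	else
		cpu->idle_ticks += ticks;
}

/* Models a plain wait: the flag is left untouched. */
static void wait_plain_model(struct task_model *t, struct cpu_model *cpu,
			     unsigned long ticks)
{
	sleep_model(t, cpu, ticks);
}

/* Models an _io wait: set the flag, sleep, then restore the old value. */
static void wait_io_model(struct task_model *t, struct cpu_model *cpu,
			  unsigned long ticks)
{
	bool old = t->in_iowait;

	t->in_iowait = true;
	sleep_model(t, cpu, ticks);
	t->in_iowait = old;
}

/* Usage example: the same sleeps land in different accounting buckets. */
void model_selfcheck(void)
{
	struct task_model task = { .in_iowait = false };
	struct cpu_model cpu = { 0, 0 };

	wait_plain_model(&task, &cpu, 5);
	wait_io_model(&task, &cpu, 7);

	assert(cpu.idle_ticks == 5);
	assert(cpu.iowait_ticks == 7);
	assert(!task.in_iowait);
}
```

In this model the only observable difference between the two waits is
which bucket the sleep time is charged to, which would be consistent with
the "wait" column in top only moving when the _io variant is used.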

-- 
Jens Axboe