linux-kernel - Re: [PATCH v4 4/6] mtd: rawnand: add NVIDIA Tegra NAND Flash controller driver

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180612113449.79dddb38@bbrezillon>
Date:   Tue, 12 Jun 2018 11:34:49 +0200
From:   Boris Brezillon <boris.brezillon@...tlin.com>
To:     Stefan Agner <stefan@...er.ch>
Cc:     axboe@...nel.dk, Dmitry Osipenko <digetx@...il.com>,
        dwmw2@...radead.org, computersforpeace@...il.com,
        marek.vasut@...il.com, robh+dt@...nel.org, mark.rutland@....com,
        thierry.reding@...il.com, dev@...xeye.de,
        miquel.raynal@...tlin.com, richard@....at, marcel@...wiler.com,
        krzk@...nel.org, benjamin.lindqvist@...ian.se,
        jonathanh@...dia.com, pdeschrijver@...dia.com, pgaikwad@...dia.com,
        mirza.krak@...il.com, gaireg@...reg.de,
        linux-mtd@...ts.infradead.org, linux-tegra@...r.kernel.org,
        devicetree@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 4/6] mtd: rawnand: add NVIDIA Tegra NAND Flash
 controller driver

On Tue, 12 Jun 2018 11:17:09 +0200
Stefan Agner <stefan@...er.ch> wrote:

> [also added Jens Axboe]
> 
> On 12.06.2018 10:27, Boris Brezillon wrote:
> > On Tue, 12 Jun 2018 10:06:42 +0200
> > Stefan Agner <stefan@...er.ch> wrote:
> >   
> >> On 12.06.2018 02:03, Dmitry Osipenko wrote:  
> >> > On Monday, 11 June 2018 23:52:22 MSK Stefan Agner wrote:  
> >> >> Add support for the NAND flash controller found on NVIDIA
> >> >> Tegra 2 SoCs. This implementation does not make use of the
> >> >> command queue feature. Regular operations/data transfers are
> >> >> done in PIO mode. Page read/writes with hardware ECC make
> >> >> use of the DMA for data transfer.
> >> >>
> >> >> Signed-off-by: Lucas Stach <dev@...xeye.de>
> >> >> Signed-off-by: Stefan Agner <stefan@...er.ch>
> >> >> ---
> >> >>  MAINTAINERS                       |    7 +
> >> >>  drivers/mtd/nand/raw/Kconfig      |    6 +
> >> >>  drivers/mtd/nand/raw/Makefile     |    1 +
> >> >>  drivers/mtd/nand/raw/tegra_nand.c | 1248 +++++++++++++++++++++++++++++
> >> >>  4 files changed, 1262 insertions(+)
> >> >>  create mode 100644 drivers/mtd/nand/raw/tegra_nand.c
> >> >>  
> >> [snip]  
> >> >> +static int tegra_nand_cmd(struct nand_chip *chip,
> >> >> +			 const struct nand_subop *subop)
> >> >> +{
> >> >> +	const struct nand_op_instr *instr;
> >> >> +	const struct nand_op_instr *instr_data_in = NULL;
> >> >> +	struct tegra_nand_controller *ctrl = to_tegra_ctrl(chip->controller);
> >> >> +	unsigned int op_id, size = 0, offset = 0;
> >> >> +	bool first_cmd = true;
> >> >> +	u32 reg, cmd = 0;
> >> >> +	int ret;
> >> >> +
> >> >> +	for (op_id = 0; op_id < subop->ninstrs; op_id++) {
> >> >> +		unsigned int naddrs, i;
> >> >> +		const u8 *addrs;
> >> >> +		u32 addr1 = 0, addr2 = 0;
> >> >> +
> >> >> +		instr = &subop->instrs[op_id];
> >> >> +
> >> >> +		switch (instr->type) {
> >> >> +		case NAND_OP_CMD_INSTR:
> >> >> +			if (first_cmd) {
> >> >> +				cmd |= COMMAND_CLE;
> >> >> +				writel_relaxed(instr->ctx.cmd.opcode,
> >> >> +					       ctrl->regs + CMD_REG1);
> >> >> +			} else {
> >> >> +				cmd |= COMMAND_SEC_CMD;
> >> >> +				writel_relaxed(instr->ctx.cmd.opcode,
> >> >> +					       ctrl->regs + CMD_REG2);
> >> >> +			}
> >> >> +			first_cmd = false;
> >> >> +			break;
> >> >> +		case NAND_OP_ADDR_INSTR:
> >> >> +			offset = nand_subop_get_addr_start_off(subop, op_id);
> >> >> +			naddrs = nand_subop_get_num_addr_cyc(subop, op_id);
> >> >> +			addrs = &instr->ctx.addr.addrs[offset];
> >> >> +
> >> >> +			cmd |= COMMAND_ALE | COMMAND_ALE_SIZE(naddrs);
> >> >> +			for (i = 0; i < min_t(unsigned int, 4, naddrs); i++)
> >> >> +				addr1 |= *addrs++ << (BITS_PER_BYTE * i);
> >> >> +			naddrs -= i;
> >> >> +			for (i = 0; i < min_t(unsigned int, 4, naddrs); i++)
> >> >> +				addr2 |= *addrs++ << (BITS_PER_BYTE * i);
> >> >> +			writel_relaxed(addr1, ctrl->regs + ADDR_REG1);
> >> >> +			writel_relaxed(addr2, ctrl->regs + ADDR_REG2);
> >> >> +			break;
> >> >> +
> >> >> +		case NAND_OP_DATA_IN_INSTR:
> >> >> +			size = nand_subop_get_data_len(subop, op_id);
> >> >> +			offset = nand_subop_get_data_start_off(subop, op_id);
> >> >> +
> >> >> +			cmd |= COMMAND_TRANS_SIZE(size) | COMMAND_PIO |
> >> >> +				COMMAND_RX | COMMAND_A_VALID;
> >> >> +
> >> >> +			instr_data_in = instr;
> >> >> +			break;
> >> >> +
> >> >> +		case NAND_OP_DATA_OUT_INSTR:
> >> >> +			size = nand_subop_get_data_len(subop, op_id);
> >> >> +			offset = nand_subop_get_data_start_off(subop, op_id);
> >> >> +
> >> >> +			cmd |= COMMAND_TRANS_SIZE(size) | COMMAND_PIO |
> >> >> +				COMMAND_TX | COMMAND_A_VALID;
> >> >> +
> >> >> +			memcpy(&reg, instr->ctx.data.buf.out + offset, size);
> >> >> +			writel_relaxed(reg, ctrl->regs + RESP);
> >> >> +
> >> >> +			break;
> >> >> +		case NAND_OP_WAITRDY_INSTR:
> >> >> +			cmd |= COMMAND_RBSY_CHK;
> >> >> +			break;
> >> >> +
> >> >> +		}
> >> >> +	}
> >> >> +
> >> >> +	cmd |= COMMAND_GO | COMMAND_CE(ctrl->cur_cs);
> >> >> +	writel_relaxed(cmd, ctrl->regs + COMMAND);
> >> >> +	ret = wait_for_completion_io_timeout(&ctrl->command_complete,
> >> >> +					     msecs_to_jiffies(500));  
> >> >
> >> > It's not obvious to me whether _io_ variant is appropriate to use here, would
> >> > be nice if somebody could clarify that. Maybe block/ already does the IO
> >> > accounting itself and hence the IO time would be counted twice in that case.  
> >>
> >> Good that you bring this up.
> >>
> >> I don't think that there is any higher layer which could take care of
> >> accounting. Usually, with raw nand there is no block layer involved
> >> anyway.
> >>
> >> In a quick test it seems that only when using wait_for_completion_io I/O
> >> is properly accounted in the "wait" section of top.
> >>
> >> So far only a single driver (omap2) used the _io variant, but I think it
> >> is the right thing to do! After all, it is I/O...
> >>
> >> Boris or any other MTD maintainer, any comment on this?  
> > 
> > Given this definition of io_schedule_timeout() [1] (which is used when
> > you call wait_for_completion_io_timeout()), I'd say it's not useful to
> > use the _io_ version, simply because MTD devs are not exposed as blk
> > devices, and thus don't need the blk_schedule_flush_plug() that is done
> > is io_schedule_prepare(). But that also means MTD I/Os are not
> > accounted as I/Os :-(.  
> 
> Documentation of wait_for_completion_io says:
> "The caller is accounted as waiting for IO (which traditionally means
> blkio only)."
> 
> Which sounds as if it using _io is only an accounting thing...
> 
> The hint about blkio might suggest that there is more to it.
> 
> Is calling blk_schedule_flush_plug a problem for MTD?
> https://elixir.bootlin.com/linux/v4.17.1/source/include/linux/blkdev.h#L1355

It shouldn't. It might add a bit of work when a blk_plug is attached to
the task that is about to wait for I/Os (I honestly don't know what a
blk_plug is :-)).

> 
> > 
> > Let's go for the non-io version for now, since all drivers except omap2
> > seem to use this function.
> >   
> 
> I still think it would be nice and "the right thing" from a user
> perspective...

If we decide to switch to wait_for_completion_io_timeout(), I'd prefer
to do that for all drivers rather than just this one, and I'm still not
sure this is the right thing to do given the comment you pointed out
("which traditionally means blkio only").