lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <db31aac0-6b59-8835-2181-635670e7f176@axentia.se>
Date:   Mon, 28 May 2018 17:52:53 +0200
From:   Peter Rosin <peda@...ntia.se>
To:     Boris Brezillon <boris.brezillon@...tlin.com>
Cc:     Tudor Ambarus <tudor.ambarus@...rochip.com>,
        Nicolas Ferre <nicolas.ferre@...rochip.com>,
        Ludovic Desroches <ludovic.desroches@...rochip.com>,
        Alexandre Belloni <alexandre.belloni@...tlin.com>,
        Marek Vasut <marek.vasut@...il.com>,
        Josh Wu <rainyfeeling@...look.com>,
        Cyrille Pitchen <cyrille.pitchen@...ev4u.fr>,
        linux-kernel@...r.kernel.org, linux-mtd@...ts.infradead.org,
        Richard Weinberger <richard@....at>,
        Brian Norris <computersforpeace@...il.com>,
        David Woodhouse <dwmw2@...radead.org>,
        linux-arm-kernel@...ts.infradead.org,
        Eugen Hristev <eugen.hristev@...rochip.com>
Subject: Re: [PATCH] mtd: nand: raw: atmel: add module param to avoid using
 dma

On 2018-05-28 16:27, Boris Brezillon wrote:
> Hi Peter,
> 
> On Mon, 28 May 2018 12:10:02 +0200
> Peter Rosin <peda@...ntia.se> wrote:
> 
>> On 2018-05-28 00:11, Peter Rosin wrote:
>>> On 2018-05-27 11:18, Peter Rosin wrote:  
>>>> On 2018-05-25 16:51, Tudor Ambarus wrote:  
>>>>> We think the best way is to keep LCD on DDR Ports 2 and 3 (8th and 9th
>>>>> slaves), to have maximum bandwidth and to use DMA on DDR port 1 for NAND
>>>>> (7th slave).  
>>>>
>>>> Exactly how do I accomplish that?
>>>>
>>>> I can see how I can move the LCD between slave DDR port 2 and 3 by
>>>> selecting LCDC DMA master 8 or 9 (but according to the above it should
>>>> not matter). The big question is how I control what slave the NAND flash
>>>> is going to use? I find nothing in the datasheet, and the code is also
>>>> non-transparent enough for me to figure it out by myself without
>>>> throwing out this question first...  
>>>
>>> I added this:
>>>
>>> diff --git a/drivers/mtd/nand/raw/atmel/nand-controller.c b/drivers/mtd/nand/raw/atmel/nand-controller.c
>>> index e686fe73159e..3b33c63d2ed4 100644
>>> --- a/drivers/mtd/nand/raw/atmel/nand-controller.c
>>> +++ b/drivers/mtd/nand/raw/atmel/nand-controller.c
>>> @@ -1991,6 +1991,9 @@ static int atmel_nand_controller_init(struct atmel_nand_controller *nc,
>>>  		nc->dmac = dma_request_channel(mask, NULL, NULL);
>>>  		if (!nc->dmac)
>>>  			dev_err(nc->dev, "Failed to request DMA channel\n");
>>> +
>>> +		dev_info(nc->dev, "using %s for DMA transfers\n",
>>> +			 dma_chan_name(nc->dmac));
>>>  	}
>>>  
>>>  	/* We do not retrieve the SMC syscon when parsing old DTs. */
>>>
>>>
>>>
>>> and the output is
>>>
>>> atmel-nand-controller 10000000.ebi:nand-controller: using dma0chan5 for DMA transfers
>>>
>>> So, DMA controller 0 is in use. I still don't know if IF0, IF1 or IF2 is used
>>> or how to find out. I guess IF2 is not in use since that does not allow any
>>> DDR2 port as slave...
>>>
>>> From the datasheet, DMAC0/IF0 uses DDR2 Port 2, and DMAC0/IF1 uses DDR2 Port 1.
>>> But, by the looks of the register content in my other mail, it seems as if
>>> DMA0/IF1 can also use DDR2 Port 3.
>>>
>>> So, I think I want either
>>>
>>> A) the NAND controller to use DMAC0/IF0 (i.e. DDR2 port 1, and possibly 3) and
>>>    the LCDC to use master 9 (i.e. DDR2 Port 2)
>>>
>>> or
>>>
>>> B) the NAND controller to use DMAC1/IF1 (i.e. DDR2 port 2) and the LCDC to use
>>>    master 8 (i.e. DDR2 Port 3)  
>>
>> Crap, that was not what I meant to express. Sorry for the confusion. This is
>> better.
>>
>> So, I think I want either
>>
>> A) the NAND controller to use master 1 DMAC0/IF0 (i.e. slave 8 DDR2 port 2) and
>>    the LCDC to use master 9 (i.e. slave 9 DDR2 Port 3)
>>
>> or
>>
>> B) the NAND controller to use master 2 DMAC0/IF1 (i.e. slave 7 DDR2 port 1, and
>>    possibly slave 9 DDR2 port 3 (if my previous findings are relevant) and the
>>    LCDC to use master 8 (i.e. slave 8 DDR2 Port 2)
>>
>>> But, again, how do I limit DMAC0 to either of IF0 or IF1 for NAND accesses?  
>>
>> So, I added a horrid patch (attached), which mainly adds printk lines, but
>> additionally does one more thing in atc_prep_dma_memcpy. It changes the DSCR_IF
>> field (from 0) to 1 for DMA-memcpy for dma0chan5 (i.e. the NAND). At least I
>> think it does that?
>>
>> Running with that patch gets me this:
>>
>> # dmesg | grep -i dma
>> at_hdmac ffffe600.dma-controller: Atmel AHB DMA Controller ( cpy set slave ), 8 channels
>> at_hdmac ffffe800.dma-controller: Atmel AHB DMA Controller ( cpy set slave ), 8 channels
>> dma dma0chan0: xlate 0 2
>> dma dma0chan1: xlate 0 2
>> at91_i2c f0014000.i2c: using dma0chan0 (tx) and dma0chan1 (rx) for DMA transfers
>> dma dma1chan0: xlate 0 2
>> dma dma1chan1: xlate 0 2
>> at91_i2c f801c000.i2c: using dma1chan0 (tx) and dma1chan1 (rx) for DMA transfers
>> dma dma0chan2: xlate 0 2
>> dma dma0chan3: xlate 0 2
>> dma dma0chan4: xlate 0 2
>> atmel_mci f0000000.mmc: using dma0chan4 for DMA transfers
>> dma dma1chan2: xlate 0 2
>> dma dma1chan3: xlate 0 2
>> atmel_aes f8038000.aes: Atmel AES - Using dma1chan2, dma1chan3 for DMA transfers
>> dma dma1chan4: xlate 0 2
>> atmel_sha f8034000.sha: using dma1chan4 for DMA transfers
>> dma dma1chan5: xlate 0 2
>> dma dma1chan6: xlate 0 2
>> atmel_tdes f803c000.tdes: using dma1chan5, dma1chan6 for DMA transfers
>> atmel-nand-controller 10000000.ebi:nand-controller: using dma0chan5 for DMA transfers
>> dma dma0chan5: memcpy: 0
>> dma dma0chan5: DSCR_IF: 1
>> dma dma0chan5: memcpy: 1
>>
>> So, output is as expected and I believe that the patch makes the NAND DMA
>> accesses use master 2 DMAC0/IF1 and are thus forced to use slave 7 DDR2 Port 1
>> (and possibly 9). The LCDC is using slave 8 DDR2 Port 2. So there should be no
>> slave conflict?
>>
>> But the on-screen crap remains during NAND accesses.
>>
>> But pressing on.
>>
>> I then changed the priorities for all accesses to 0 in the PRxSy registers, except
>> the ones for masters 8/9 LCDC (slaves 8/9) which I left at priority 3.
>>
>> But the on-screen crap remains during NAND accesses.
>>
>> My guess is that the NAND DMA is doing too long bursts and that the LCDC therefore
>> has to wait too long and simply fails to keep the pipeline from running short?
>>
>> So I tried to reduce the maximum SLOT_CYCLE for slaves 7 and 9 in the SCFGx
>> registers. No noticeable effect either.
>>
>> I then tried to split bursts from master 2 (DMAC0/IF1) with small values in the
>> MCFG2 register. No effect.
>>
>> I'm getting nowhere.
> 
> Could it just be that you're reaching the DDR bus limit. As I said
> previously, when you go through the CPU, and assuming you're consuming
> the data directly, you have:
> 
> 1/ NFC SRAM -> CPU
> 2/ CPU -> L1 data cache --write-back--> DRAM
> 3/ L1-cache -> CPU
> 
> While, if you use DMA you get:
> 
> 1/ NFC SRAM -> DRAM
> 2/ SDRAM -> L1 data cache -> CPU
> 
> So, if you're approaching the limit of (LP)DDR bandwidth, using the CPU
> might make things a bit better. Still, if the limitation really comes
> from the DDR bus, my opinion is that you should maybe use a smaller
> resolution or use a more compact pixel format (RGB565?).

The issue is still there if I use a CLUT mode instead of rgb565, which is
what I normally use (and what I would like to use, CLUT is just alien and
a pain these days).

The panels we are using only supports one resolution (each), but the issue
is there with both 1920x1080@...pp and 1024x768@...p (~60Hz).

> Did you calculate how much of the bandwidth is taken by the HLCDC
> block and compared it to the max (LP)DDR bandwidth?

I did, but don't remember the exact details. There is some room even for
1920x1080@...pp, but not oceans of it. We were a bit uncertain if 16bpp
would be possible, and in fact that was the reason I worked on CLUT
support for atmel-hlcdc last year. But since the problem persists with
much less memory pressure as well, I don't think that's it either.

Admittedly I have not tested these AHB matrix tricks with a smaller
panel (it would take a bit of work to arrange for those tests), but the
issue was there when I last tried (using defaults).

Cheers,
Peter

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ