Date:   Tue, 29 May 2018 23:37:50 +0200
From:   Peter Rosin <peda@...ntia.se>
To:     Boris Brezillon <boris.brezillon@...tlin.com>,
        Eugen Hristev <eugen.hristev@...rochip.com>
Cc:     Tudor Ambarus <tudor.ambarus@...rochip.com>,
        Nicolas Ferre <nicolas.ferre@...rochip.com>,
        Ludovic Desroches <ludovic.desroches@...rochip.com>,
        Alexandre Belloni <alexandre.belloni@...tlin.com>,
        Marek Vasut <marek.vasut@...il.com>,
        Josh Wu <rainyfeeling@...look.com>,
        Cyrille Pitchen <cyrille.pitchen@...ev4u.fr>,
        linux-kernel@...r.kernel.org, linux-mtd@...ts.infradead.org,
        Richard Weinberger <richard@....at>,
        Brian Norris <computersforpeace@...il.com>,
        David Woodhouse <dwmw2@...radead.org>,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH] mtd: nand: raw: atmel: add module param to avoid using
 dma

Hi again!

I have spent some hours bringing out the old hardware with the 1024x768
panel from underneath the usual piles of junk and layers of dust...and
tried the current kernel on that one. And the display was stable even
when stressing with lots of NAND accesses.  *boggle*

Then I remembered that I had lowered the pixel clock (from 71.1 MHz to
65 MHz) and reduced the vertical blanking to maintain the refresh rate.
I didn't notice that this fixed the NAND interference, probably because
I ran NAND without DMA at the time? Anyway, if I reset the pixel clock
to 71.1 MHz (without increasing the vertical blanking, just to be nasty)
I can get the artifacts easily. But running with a pixel clock of 65 MHz
is not a problem at all, so we can consider NAND-DMA with that panel
solved.

However, now we know that this setup needs relatively little to start
working, which might be useful for checking whether other changes have
any effect. I will look into that tomorrow. And we can also get a grip
on the critical bandwidth.

But first, answers to some random questions...


On 2018-05-29 09:25, Eugen Hristev wrote:
> One more thing: what are the actual nand commands which you use when you 
> get the glitches? read/write/erase ... ?

Erase seems to be the least sensitive; read and write are worse (and
similarly bad) according to my unscientific observations.

> What happens if you try to minimize the nand access? you also said at 
> some point that only *some* nand accesses cause glitches.

These systems will normally not access the NAND, but when they do, the
displays look like total crap. It can happen even when sync()ing
small files, but doesn't happen for every little file. Writing out or
reading a large file to/from NAND invariably triggers the issue.

> Another thing : even if the LCD displays a still image, the DMA still 
> feeds data to the LCD right ?

Absolutely. But since we are not playing some large video file (which
could have been stored on the NAND) we typically don't see the problem. It
only turns up in special circumstances. But these circumstances can't be
avoided and the display looks so freaking ugly when it happens...



On 2018-05-29 17:01, Eugen Hristev wrote:
> Then we can then try to force NFC SRAM DMA channels to use just DDR port 
> 1 or 2 for memcpy ?

I *think* my "horrid" patch does that. Specifically this line

+			desc->txd.phys = (desc->txd.phys & ~3) | 1;




On 2018-05-28 18:09, Nicolas Ferre wrote:
> Can you try to make all that you can to maximize the blanking period of 
> your screen (some are more tolerant than others according to that). By 
> doing so, you would allow the LCD FIFO to recover better after each 
> line. You might loose some columns on the side of your display but it 
> would give us a good idea of how far we are from getting rid of those 
> annoying LCD reset glitches (that are due to underruns on LCD FIFO).

I noticed that the 1024x768 panel is using 24bpp, not 16bpp as I
stated previously. Also, the horizontal blanking is 320 pixels, so a
total of 1024+320=1344 pixels/row and a pixel clock of 71.1 MHz yields
18.9 us/row. The needed data during that time is 1024*24 bits so
1.30 Gbit/s. For the 65 MHz pixel clock, I get 1.19 Gbit/s. Assuming,
of course, that the pixel clock is actually what was requested... What
is the granularity of the pixel clock anyway?

For the bigger 1920x1080 panel, I have a horizontal blanking of 200
pixels and a pixel clock of 144 MHz, so 14.7 us/row -> 2.09 Gbit/s.
I suspect that no amount of fiddling with blanking is going to get
that anywhere near the needed ~1.25 Gbit/s. Besides, the specs of the
panel say that the maximum horizontal blanking time is 280 pixels.
Seems futile to even try since this horizontal blanking time is so
much shorter for the larger panel (fewer and faster pixels) and the
longer time wasn't enough for the smaller panel to catch up. But ok,
in combination with something else it might be just enough. Will try
tomorrow...


On 2018-05-28 18:09, Boris Brezillon wrote:
> On Mon, 28 May 2018 17:52:53 +0200 Peter Rosin <peda@...ntia.se> wrote:
>> The panels we are using only support one resolution (each), but the issue
>> is there with both 1920x1080@...pp and 1024x768@...p (~60Hz).
> 
> Duh! This adds to the weirdness of this issue. I'd thought that by
> dividing the required bandwidth by 2 you would get a reliable setup.

I think I might have misremembered seeing the issue with 1024x768@...p.
Sorry. But it *is* there for (the old variant of) 1024x768@...pp, and
that is still only 60% or so of the bandwidth compared to 1920x1080@...pp.

Cheers,
Peter
