linux-kernel - Re: SDHCI long sleep with interrupts off

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1450350190.3163.93.camel@pengutronix.de>
Date:	Thu, 17 Dec 2015 12:03:10 +0100
From:	Lucas Stach <l.stach@...gutronix.de>
To:	David Jander <david@...tonic.nl>
Cc:	linux-mmc@...r.kernel.org, Pierre Ossman <pierre@...man.eu>,
	linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org
Subject: Re: SDHCI long sleep with interrupts off

Am Donnerstag, den 17.12.2015, 11:28 +0100 schrieb David Jander:
> Hi all, 
> 
> I was investigating the source of abnormal irq-latency spikes on an i.MX6
> (ARM) board, and discovered this:
> 
> # tracer: preemptirqsoff
> #
> # preemptirqsoff latency trace v1.1.5 on 4.4.0-rc4+
> # --------------------------------------------------------------------
> # latency: 2068 us, #4/4, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:1)
> #    -----------------
> #    | task: mmcqd/0-92 (uid:0 nice:0 policy:0 rt_prio:0)
> #    -----------------
> #  => started at: _raw_spin_lock_irqsave
> #  => ended at:   _raw_spin_unlock_irqrestore
> #
> #
> #                  _------=> CPU#            
> #                 / _-----=> irqs-off        
> #                | / _----=> need-resched    
> #                || / _---=> hardirq/softirq 
> #                ||| / _--=> preempt-depth   
> #                |||| /     delay            
> #  cmd     pid   ||||| time  |   caller      
> #     \   /      |||||  \    |   /         
>  mmcqd/0-92      0d...    1us#: _raw_spin_lock_irqsave
>  mmcqd/0-92      0.n.1 2066us : _raw_spin_unlock_irqrestore
>  mmcqd/0-92      0.n.1 2070us+: trace_preempt_on <-_raw_spin_unlock_irqrestore
>  mmcqd/0-92      0.n.1 2107us : <stack trace>
>  => sdhci_runtime_resume_host
>  => __rpm_callback
>  => rpm_callback
>  => rpm_resume
>  => __pm_runtime_resume
>  => __mmc_claim_host
>  => mmc_blk_issue_rq
>  => mmc_queue_thread
>  => kthread
>  => ret_from_fork
> 
> 2 ms with interrupts disabled!!! To much dismay, I later discovered that this
> isn't even the worst case scenario. I also discovered that this has been in
> the kernel for a long time without a fix (I have tested from 3.17 to 4.4-rc4).
> There has been an attempt by someone to address this back in 2010, but
> apparently it never hit mainline.
> Going through the code in sdhci.c, I found this troublesome code-path:
> 
> sdhci_do_set_ios() {
> 	spin_lock_irqsave(&host->lock, flags);
> 	...
> 	sdhci_reinit() --> sdhci_init() --> sdhci_do_reset() -->
> 		host->ops->reset() --> sdhci_reset()	
> 	...
> 	spin_unlock_irqrestore(&host->lock, flags);
> }
> 
> And in sdhci_reset(), which may be called with held spinlock:
> 
> 	...
> 	/* Wait max 100 ms */
> 	timeout = 100;
> 
> 	/* hw clears the bit when it's done */
> 	while (sdhci_readb(host, SDHCI_SOFTWARE_RESET) & mask) {
> 		if (timeout == 0) {
> 			pr_err("%s: Reset 0x%x never completed.\n",
> 				mmc_hostname(host->mmc), (int)mask);
> 			sdhci_dumpregs(host);
> 			return;
> 		}
> 		timeout--;
> 		mdelay(1);
> 	}
> 
> I am wondering: There either must be a reason this hasn't been fixed in such a
> long time, or I am not understanding this correctly, so please enlighten me.
> Before trying a cowboy attempt at "fixing" this, I'd really like to know why am
> I seeing this?
> I mean... how can such a problem get unnoticed and unfixed for so long?
> Will an attempt at fixing this issue even be accepted?
> 
I would like to see the sdhci spinlock killed and replaced by a mutex
for exactly this reason.

That said, your problem is card polling, when no card is present in the
slot. This is most probably caused by CD gpios having the wrong
polarity.

Regards,
Lucas

-- 
Pengutronix e.K.             | Lucas Stach                 |
Industrial Linux Solutions   | http://www.pengutronix.de/  |

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/