Message-Id: <FA636A4D-FA8F-48EE-80C4-EDDFD115FB25@goldelico.com>
Date:   Sat, 4 Jun 2022 12:16:28 +0200
From:   "H. Nikolaus Schaller" <hns@...delico.com>
To:     Ulf Hansson <ulf.hansson@...aro.org>
Cc:     Discussions about the Letux Kernel <letux-kernel@...nphoenux.org>,
        kernel@...a-handheld.com, aTc <atc@...-p.org>,
        Tony Lindgren <tony@...mide.com>,
        linux-omap <linux-omap@...r.kernel.org>,
        linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-mmc@...r.kernel.org
Subject: Re: BUG in mmc: core: Disable card detect during shutdown

Hi,

> Am 03.06.2022 um 12:46 schrieb Ulf Hansson <ulf.hansson@...aro.org>:
> 
> On Mon, 30 May 2022 at 18:55, H. Nikolaus Schaller <hns@...delico.com> wrote:
>> 
>> Hi Ulf,
>> users did report a strange issue that the OMAP5 based Pyra does not
>> shutdown if a kernel 5.10.116 is used.
>> 

...

>> mmc_stop_host() is not called but __mmc_stop_host() is called 4 times.
>> There are 4 active MMC interfaces in the Pyra - 3 for (µ)SD slots
>> and one for an SDIO WLAN module.
>> 
>> Now it looks as if 3 of them are properly torn down (two of them
>> seem to have host->slot.cd_irq >= 0) but on the fourth call
>> cancel_delayed_work_sync(&host->detect); does not return. This is
>> likely the location of the stall why we don't see a "reboot: Power down"
>> 
>> Any ideas?
> 
> I guess the call to cancel_delayed_work_sync() in __mmc_stop_host()
> hangs for one of the mmc hosts. This shouldn't happen - and indicates
> that there is something else being wrong.

Yes, you were right...

> 
> See more suggestions below.
> 
>> 
>> BR and thanks,
>> Nikolaus
>> 
>> printk hack:
>> 
>> void __mmc_stop_host(struct mmc_host *host)
>> {
>> printk("%s 1\n", __func__);
>>        if (host->slot.cd_irq >= 0) {
>> printk("%s 2\n", __func__);
>>                mmc_gpio_set_cd_wake(host, false);
>> printk("%s 3\n", __func__);
>>                disable_irq(host->slot.cd_irq);
>> printk("%s 4\n", __func__);
>>        }
>> 
>>        host->rescan_disable = 1;
>> printk("%s 5\n", __func__);
> 
> My guess is that it's the same mmc host that causes the hang. I
> suggest you print the name of the host too, to verify that. Something
> along the lines of the below.
> 
> printk("%s: %s 5\n", mmc_hostname(host), __func__);

To my surprise, this reported an mmc6 host port, although the OMAP5 only has 4...

Yes, we have a special driver for the txs02612 SDIO switch and voltage translator
chip, which makes two ports out of the single mmc2 port of the OMAP5 SoC.

This driver was started about 7 years ago but never finished...

The idea is to give one mmc port several subports. For the Pyra handheld hardware
we needed 5 mmc/sdio interfaces, but the OMAP5 only has 4 of them available to us.

So the txs02612 driver sits between the OMAP5 mmc2 host pins and switches
between a µSD slot and an eMMC.

Therefore, the driver is an mmc client driver (like e.g. the driver of some WiFi chip
connected to some SDIO port) and provides multiple mmc host interfaces.

It should intercept data transfer requests to its multiple mmc hosts, synchronize
(or enqueue) them, control the switch gpio and forward requests to the processor's
mmc host port so that they are processed (after switching).
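
As a rough illustration of that idea only, here is a small user-space C model
(not kernel code, and not taken from our txs02612 fragment). All names in it
(mux_model, mux_enqueue, mux_process) are hypothetical, and the GPIO switch and
the parent host are reduced to a counter and a printf. It only shows the
enqueue, switch and forward order described above.

```c
#include <assert.h>
#include <stdio.h>

#define NUM_SUBPORTS 2  /* assumption: one µSD slot and one eMMC behind the switch */
#define QUEUE_LEN    8  /* arbitrary queue depth for this model */

/* One pending transfer request, tagged with the subport it came from. */
struct mux_request {
	int port;
	int id;
};

/* Hypothetical model of the switch state; not the real driver's data. */
struct mux_model {
	struct mux_request queue[QUEUE_LEN]; /* FIFO ring buffer */
	int head;                            /* index of the oldest entry */
	int count;                           /* number of queued entries */
	int active_port;                     /* subport currently selected, -1 if none */
	int switch_count;                    /* how often the switch was thrown */
	int forwarded[NUM_SUBPORTS];         /* requests forwarded per subport */
};

void mux_init(struct mux_model *m)
{
	m->head = 0;
	m->count = 0;
	m->active_port = -1;
	m->switch_count = 0;
	for (int i = 0; i < NUM_SUBPORTS; i++)
		m->forwarded[i] = 0;
}

/* Queue a request from one subport. Returns 0 on success, -1 if the queue is full. */
int mux_enqueue(struct mux_model *m, int port, int id)
{
	assert(port >= 0 && port < NUM_SUBPORTS);
	if (m->count == QUEUE_LEN)
		return -1;
	int tail = (m->head + m->count) % QUEUE_LEN;
	m->queue[tail].port = port;
	m->queue[tail].id = id;
	m->count++;
	return 0;
}

/* Drain the queue in order: throw the switch if needed, then forward the request. */
void mux_process(struct mux_model *m)
{
	while (m->count > 0) {
		struct mux_request req = m->queue[m->head];

		m->head = (m->head + 1) % QUEUE_LEN;
		m->count--;

		if (m->active_port != req.port) {
			/* stand-in for setting the switch GPIO */
			m->active_port = req.port;
			m->switch_count++;
		}

		/* stand-in for passing the request to the processor's mmc host */
		printf("forward request %d from subport %d\n", req.id, req.port);
		m->forwarded[req.port]++;
	}
}
```

Requests queued for subports 0, 0, 1, 0 would be forwarded in that order and
throw the switch three times. A real driver would additionally need locking
and completion handling, which this model leaves out.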

We never got around to making this work...

What remains is simple code to manually throw the switch through a sysfs
control file after doing an eject and before a fresh partprobe.

Still, the probe function of the txs02612 driver calls mmc_add_host() twice.
These calls seem to make

> 
>>        cancel_delayed_work_sync(&host->detect);

get stuck, most likely because the initialization needed for handling
card detection is not complete.

>> 
>> --- here should be another __mmc_stop_host 6
>> --- and reboot: Power down
> 
> When/if you figured out that it's the same host that hangs, you could
> try to disable that host through the DTS files (add status =
> "disabled" in the device node, for example) - and see if that works.

If we do not call mmc_add_host() in our txs02612 driver fragment, we can
properly shut down the OMAP5. That is the solution with the least effort.
The other would be to make the txs02612 driver work properly...

So in summary there is no bug upstream. It is in our tree.

If you are interested in what our code fragment for the txs02612 looks like:

https://git.goldelico.com/?p=letux-kernel.git;a=shortlog;h=refs/heads/letux/txs02612

Maybe you have some suggestions to make it work?

> 
> Kind regards
> Uffe

BR and thanks,
Nikolaus
