lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <50218bf7-ee13-87d2-6498-e613220f9931@gmail.com>
Date:   Tue, 18 Oct 2022 22:47:57 +0200
From:   Ferry Toth <fntoth@...il.com>
To:     Andrey Smirnov <andrew.smirnov@...il.com>
Cc:     Thinh Nguyen <Thinh.Nguyen@...opsys.com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Felipe Balbi <balbi@...nel.org>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH v2 2/2] Revert "usb: dwc3: Don't switch OTG -> peripheral
 if extcon is present"

Hi,

Op 17-10-2022 om 23:20 schreef Andrey Smirnov:
> On Sun, Oct 16, 2022 at 1:59 PM Ferry Toth <fntoth@...il.com> wrote:
>>
>> Op 15-10-2022 om 21:54 schreef Andrey Smirnov:
>>> On Thu, Oct 13, 2022 at 12:35 PM Ferry Toth <fntoth@...il.com> wrote:
>>>> <SNIP>
>>>>> My end goal here is to find a way to test vanilla v6.0 with the two
>>>>> patches reverted on your end. I thought that during my testing I saw
>>>>> tusb1210 print those timeout messages during its probe and that
>>>>> disabling the driver worked to break the loop, but I went back to
>>>>> double check and it doesn't work so scratch that idea. Configuring
>>>>> extcon as a built-in breaks host functionality with or without patches
>>>>> on my end, so I'm not sure it could be a path.
>>>>>
>>>>> I won't have time to try things with
>>>>> 0043b-TODO-driver-core-Break-infinite-loop-when-deferred-p.patch until
>>>>> the weekend, meanwhile can you give this diff a try with vanilla (no
>>>>> reverts) v6.0:
>>>>>
>>> OK, got a chance to try things with that patch. Both v6.0 and v6.0
>>> with my patches reverted work the same, my Kingston DataTraveller USB
>>> stick enumerates and works as expected.
>>>
>> Iow you don't need the patch at all to get usb to work. There has got to
>> be a difference in our configs.
>>
> My patch? Yeah, it should have zero effect on anything.
> !DWC3_VER_IS_PRIOR(DWC3, 330A) is false for Merrifield, so the logical
> change from my patch is a no-op. It's a pure coincidence that it
> resolved the probe loop that
> 0043b-TODO-driver-core-Break-infinite-loop-when-deferred-p.patch is
> for.
>
>> Did you have a chance to look at mine (here:
>> https://drive.google.com/file/d/1aKJWMqiAXnReeLCvxshzjKwGxIWQ7eJk/view?usp=sharing)
>>
>> Else, send me yours.
>>
> I've been using your config in all of the testing.
>
>>>>> modified   drivers/phy/ti/phy-tusb1210.c
>>>>> @@ -127,6 +127,7 @@ static int tusb1210_set_mode(struct phy *phy, enum
>>>>> phy_mode mode, int submode)
>>>>>      u8 reg;
>>>>>
>>>>>      ret = tusb1210_ulpi_read(tusb, ULPI_OTG_CTRL, &reg);
>>>>> + WARN_ON(ret < 0);
>>>>>      if (ret < 0)
>>>>>      return ret;
>>>>>
>>>>> @@ -152,7 +153,10 @@ static int tusb1210_set_mode(struct phy *phy,
>>>>> enum phy_mode mode, int submode)
>>>>>      }
>>>>>
>>>>>      tusb->otg_ctrl = reg;
>>>>> - return tusb1210_ulpi_write(tusb, ULPI_OTG_CTRL, reg);
>>>>> + ret = tusb1210_ulpi_write(tusb, ULPI_OTG_CTRL, reg);
>>>>> + WARN_ON(ret < 0);
>>>>> + return ret;
>>>>> +
>>>>>     }
>>>>>
>>>>>     #ifdef CONFIG_POWER_SUPPLY
>>>>>
>>>>> ? I'm curious to see if there's masked errors on your end since dwc3
>>>>> driver doesn't check for those.
>>>> root@...a:~# dmesg | grep -i -E 'warn|assert|error|tusb|dwc3'
>>>> 8250_mid: probe of 0000:00:04.0 failed with error -16
>>>> platform regulatory.0: Direct firmware load for regulatory.db failed
>>>> with error -2
>>>> brcmfmac mmc2:0001:1: Direct firmware load for
>>>> brcm/brcmfmac43340-sdio.Intel Corporation-Merrifield.bin failed with
>>>> error -2
>>>> sof-audio-pci-intel-tng 0000:00:0d.0: error: I/O region is too small.
>>>> sof-audio-pci-intel-tng 0000:00:0d.0: error: failed to probe DSP -19
>>>>
>>>>
>>>>>> This is done through configfs only when the switch is set to device mode.
>>>>> Sure, but can it be disabled? We are looking for unknown variables, so
>>>>> excluding this would be a reasonable thing to do.
>>>> It's not enabled until I flip the switch to device mode.
>>> OK to cut this back and forth short, I think it'd be easier to just
>>> ask you to run what I run. Here's vanilla v6.0 bzImage and initrd
>>> (built with your config + CONFIG_PHY_TUSB1210=y) I tested with
>> What do you mean by this? My config is with
>>
>> CONFIG_GENERIC_PHY=y
>> CONFIG_PHY_TUSB1210=y
>>
> $ cat config-6.0.0-edison-acpi-standard | grep 1210
> # CONFIG_PHY_TUSB1210 is not set
> $ md5sum config-6.0.0-edison-acpi-standard
> 3c989c708302c1f9e73c6113e71aed9d  config-6.0.0-edison-acpi-standard
>
> I had to manually enable it, that's what I meant by my comment.

Unbelievable, seems I uploaded the wrong config. I just double checked 
to see if any other differences:

scripts/diffconfig config-6.0.0-edison-acpi-standard-bad 
config-6.0.0-edison-acpi-standard-good
  GENERIC_PHY n -> y
  PHY_TUSB1210 n -> y

>
>>> https://drive.google.com/drive/folders/1H28AL1coPPZ2kLTYskDuDdWo-oE7DRPH?usp=sharing
>>> let's see how it behaves on your setup. There's also the U-Boot binary
>> Ok, it's getting weirder and weirder. The following is with my U-Boot
>> and your kernel/initrd
>>
>> 1) I placed them in /boot which is on my btrfs partition on the emmc (my
>> U-Boot has btrfs enabled)
>>
>> Linux kernel version 6.0.0-edison-acpi-standard
>> (andreysm@...tunefw-builder) #8 SMP PREEMPT_DYNAMIC Sat Oct 15 18:47:19
>> UTC 2022
>> Building boot_params at 0x00090000
>> Loading bzImage at address 100000 (12064480 bytes)
>> Initial RAM disk at linear address 0x06000000, size 25165824 bytes
>> Kernel command line: "quiet root=/dev/mmcblk0p8
>> rootflags=subvol=@,compress=lzo rootfstype=btrfs console=ttyS2,115200n8
>> earlyprintk=ttyS2,115200n8,keep loglevel=4 systemd.unit=multi-user.target"
>> Kernel loaded at 00100000, setup_base=00090000
>>
> You shouldn't be using root from you storage since:
>    a) the initrd I uploaded is self-containing, it doesn't need anything else

Yes I know. With the Yocto image we build our own that does switchroot.

Here I am inside your buildroot initrd, no fs from the emmc are mounted. 
According to dmesg btrfs module is loaded later then dwc3, and scans 
(finds) the btrfs partition in all cases without mounting.

>    b) your local data is another variable we don't want to introduce
>
> just "rootfstype=ramfs" should be enough for this and
>
>   root=/dev/mmcblk0p8 rootflags=subvol=@,compress=lzo rootfstype=btrfs
>
> should be dropped.

After some experimenting it appears "rootfstype=btrfs" causes the 
buildroot rootfs to fail probing tsub1210.

I think you should be able to reproduce this.

However, changing "rootfstype=ramfs" for my (yocto) image (which 
probably should be the right thing to do now I think about it) does not 
resolve the failing to probe tsub1210. Comparing the dmesg with the 
buildroot one shows that in my case a lot of stuff happens prior to dwc3:

raid6 does speed testing (this is used by btrfs)

btrfs is loaded

sdhci probed

acpi tables (for edison-arduino) loaded into configfs

external gpio muxes setup

finally xhci (tusb1210 is before this on the buildroot image)


>
>> Usb drive is not detected regardless booting with stick plugged or
>> plugging later on.
>>
>> # lsusb
>> Bus 001 Device 001: ID 1d6b:0002
>> Bus 002 Device 001: ID 1d6b:0003
>>
>> No TUSB1210 probed
>>
>> # dmesg | grep dwc3
>> #
>>
>> 2) I placed them in my vfat rescue partition
>>
>> Linux kernel version 6.0.0-edison-acpi-standard
>> (andreysm@...tunefw-builder) #8 SMP PREEMPT_DYNAMIC Sat Oct 15 18:47:19
>> UTC 2022
>> Building boot_params at 0x00090000
>> Loading bzImage at address 100000 (12064480 bytes)
>> Initial RAM disk at linear address 0x06000000, size 25165824 bytes
>> Kernel command line: "debugshell=0 tty1 console=ttyS2,115200n8
>> root=/dev/mmcblk0p7 rootfstype=vfat systemd.unit=multi-user.target"
>> Kernel loaded at 00100000, setup_base=00090000
>>
>> Usb drive is detected.
> Yep, that's exactly my point about extra variables. So it looks like
> something in your root btrfs partition is triggering this issue. I
> don't really know the contents of your root file system, so don't
> really have any suggestions there. Maybe old kernel modules are
> getting picked up? Or something else is interfering ¯\_(ツ)_/¯
>
>> # lsusb
>> Bus 001 Device 001: ID 1d6b:0002
>> Bus 001 Device 002: ID 125f:312b
>> Bus 002 Device 001: ID 1d6b:0003
>>
>> TUSB1210 probed
>>
>> # dmesg | grep dwc3
>> [    8.605845] tusb1210 dwc3.0.auto.ulpi: GPIO lookup for consumer reset
>> [    8.605876] tusb1210 dwc3.0.auto.ulpi: using ACPI for GPIO lookup
>> [    8.605927] tusb1210 dwc3.0.auto.ulpi: using lookup tables for GPIO
>> lookup
>> [    8.605941] tusb1210 dwc3.0.auto.ulpi: No GPIO consumer reset found
>> [    8.605956] tusb1210 dwc3.0.auto.ulpi: GPIO lookup for consumer cs
>> [    8.605970] tusb1210 dwc3.0.auto.ulpi: using ACPI for GPIO lookup
>> [    8.606011] tusb1210 dwc3.0.auto.ulpi: using lookup tables for GPIO
>> lookup
>> [    8.606024] tusb1210 dwc3.0.auto.ulpi: No GPIO consumer cs found
>> [    8.669317] tusb1210 dwc3.0.auto.ulpi: error -110 writing val 0x41 to
>> reg 0x80
>>
>> ## note: options debugshell, root and rootfstype are normally handled by
>> a script in my initrd, so I guess here noop.
>>
>>> I use in that folder in case you want to give it a try.
>>>
>>> Now on Merrifield dwc3_get_extcon() doesn't do anything but call
>>> extcon_get_extcon_dev() which doesn't touch any hardware or interact
>>> with other drivers, so assuming
>>>
>>>> So current v6.0 has: dwc3_get_extcon - dwc3_get_dr_mode - ... -
>>>> dwc3_core_init - .. - dwc3_core_init_mode (not working)
>>>>
>>>> I changed to: dwc3_get_dr_mode - dwc3_get_extcon - .. - dwc3_core_init -
>>>> .. - dwc3_core_init_mode (no change)
>>>>
>>>> Then to: dwc3_get_dr_mode - .. - dwc3_core_init - .. - dwc3_get_extcon -
>>>> dwc3_core_init_mode (works)
>>> still holds(did you double check that with vanilla v6.0?) the only
>> I didn't check
>>> difference that I can see is execution timings. It seems to me it's
>>> either an extra delay added by execution of  extcon_get_extcon_dev()
>>> (unlikely) or multiple partial probes that include dwc3_core_init()
>>> that change things. You can try to check the latter by adding an
>>> artificial probe deferral point after dwc3_core_init(). Something like
>>> (didn't test this):
>>>
>>> modified   drivers/usb/dwc3/core.c
>>> @@ -1860,6 +1860,10 @@ static int dwc3_probe(struct platform_device *pdev)
>>>     goto err3;
>>>
>>>     ret = dwc3_core_init(dwc);
>>> + static int deferral_counter = 0;
>>> + if (deferral_counter++ < 9) /* I counted 9 deferrals in my testing */
>>> + ret = -EPROBE_DEFER;
>>> +
>>>     if (ret) {
>>>     dev_err_probe(dev, ret, "failed to initialize core\n");
>>>     goto err4;
>> Not sure how you wanted this tested. So I assume on vanilla booting from
>> btrfs on eemc. It crashes but maybe the trace is usefull. After crash it
>> continues but no USB appears at all.
>>
> I think you'll have to experiment with that code placement to emulate
> a deferred probe for the old location of "get extcon".  I'd focus on
> figuring out the root filesystem variable first before trying to get
> this to work.

Yes, did that as described above. I think that "rootfstype=btrfs" causes 
some ordering issue, like as if xhci goes to soon. It goes before:

spi_master spi5: GPIO lookup for consumer cs

while tusb1210 when it does probe starts with:

tusb1210 dwc3.0.auto.ulpi: GPIO lookup for consumer reset

and xhci follow later.

> To be explicit, at this point I don't think the revert is really
> warranted. I'm also happy to reply/help you with suggestions, but you
> are going to have to start driving this.

I agree that reverting based on a "regression" can not be concluded here 
as dwc3 on merrifield never worked without an out-of-tree patch. And 
your patch makes that out-of-tree patch obsolete - that's a good thing.

But I do think your patch is exposing an older issue that makes dwc3 
sensitive to ordering. I would very much appreciate if you could try 
"rootfstype=btrfs" to reproduce. It think it would be a good thing to 
resolve it so that the effort here has not been for nothing.

My next step will be to move around the code placement as you suggest. 
(I can spend a few hours in the evenings only as this is not my day job, 
so explains if I'm a bit slow to respond here).

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ