[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5dae4b62-24d4-4942-934a-38c548a2fdbc@gmail.com>
Date: Sun, 28 Apr 2024 23:07:36 +0200
From: Ferry Toth <fntoth@...il.com>
To: Hardik Gajjar <hgajjar@...adit-jv.com>,
Andy Shevchenko <andriy.shevchenko@...el.com>
Cc: gregkh@...uxfoundation.org, s.hauer@...gutronix.de, jonathanh@...dia.com,
linux-usb@...r.kernel.org, linux-kernel@...r.kernel.org,
quic_linyyuan@...cinc.com, paul@...pouillou.net, quic_eserrao@...cinc.com,
erosca@...adit-jv.com
Subject: Re: [PATCH v4] usb: gadget: u_ether: Replace netif_stop_queue with
netif_device_detach
Hi,
Op 25-04-2024 om 23:27 schreef Ferry Toth:
> Hi,
>
> Op 17-04-2024 om 17:13 schreef Hardik Gajjar:
>> On Tue, Apr 16, 2024 at 04:48:32PM +0300, Andy Shevchenko wrote:
>>> On Thu, Apr 11, 2024 at 10:52:36PM +0200, Ferry Toth wrote:
>>>> Op 11-04-2024 om 18:39 schreef Andy Shevchenko:
>>>>> On Thu, Apr 11, 2024 at 04:26:37PM +0200, Hardik Gajjar wrote:
>>>>>> On Wed, Apr 10, 2024 at 08:37:42PM +0300, Andy Shevchenko wrote:
>>>>>>> On Sun, Apr 07, 2024 at 10:51:51PM +0200, Ferry Toth wrote:
>>>>>>>> Op 05-04-2024 om 13:38 schreef Hardik Gajjar:
>>>
>>> ...
>>>
>>>>>>>> Exactly. And this didn't happen before the 2 patches.
>>>>>>>>
>>>>>>>> To be precise: /sys/class/net/usb0 is not removed and it is a
>>>>>>>> link, the link
>>>>>>>> target
>>>>>>>> /sys/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/gadget.0/net/usb0 no
>>>>>>>> longer exists
>>>>>> So, it means that the /sys/class/net/usb0 is present, but the
>>>>>> symlink is
>>>>>> broken. In that case, the dwc3 driver should recreate the device,
>>>>>> and the
>>>>>> symlink should become active again
>>>>
>>>> Yes, on first enabling gadget (when device mode is activated):
>>>>
>>>> root@...a:~# ls
>>>> /sys/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/gadget.0/
>>>> driver net power sound subsystem suspended uevent
>>>>
>>>> Then switching to host mode:
>>>>
>>>> root@...a:~# ls
>>>> /sys/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/gadget.0/
>>>> ls: cannot access
>>>> '/sys/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/gadget.0/': No
>>>> such file
>>>> or directory
>>>>
>>>> Then back to device mode:
>>>>
>>>> root@...a:~# ls
>>>> /sys/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/gadget.0/
>>>> driver power sound subsystem suspended uevent
>>>>
>>>> net is missing. But, network functions:
>>>>
>>>> root@...a:~# ping 10.42.0.1
>>>> PING 10.42.0.1 (10.42.0.1): 56 data bytes
>>>>
>>>> Mass storage device is created and removed each time as expected.
>>>
>>> So, what's the conclusion? Shall we move towards revert of those two
>>> changes?
>>
>>
>> As promised, I have the tested the this patch with the dwc3 gadget. I
>> could not reproduce
>> the issue.
>>
>> I can see the usb0 exist all the time and accessible regardless of the
>> role switching of the USB mode (peripheral <-> host)
>>
>> Following are the logs:
>> //Host to device
>>
>> console:/sys/bus/platform/devices/a800000.ssusb # echo "peripheral" >
>> mode
>> console:/sys/bus/platform/devices/a800000.ssusb # ls
>> a800000.dwc3/gadget/net/
>> usb0
>>
>> //device to host
>> console:/sys/bus/platform/devices/a800000.ssusb # echo "host" > mode
>> console:/sys/bus/platform/devices/a800000.ssusb # ls
>> a800000.dwc3/gadget/net/
>> usb0
>
> That is weird. When I switch to host mode (using the physical switch),
> the whole gadget directory is removed (now testing 6.9.0-rc5)
>
> Switching back to device mode, that gadget directory is recreated. And
> gadget/sound as well, but not gadget/net.
>
>> s
>> a800000.dwc3/gadget/net/usb0 <
>> addr_assign_type duplex phys_port_name
>> addr_len flags phys_switch_id
>> address gro_flush_timeout power
>> broadcast ifalias proto_down
>> carrier ifindex queues
>> carrier_changes iflink speed
>> carrier_down_count link_mode statistics
>> carrier_up_count mtu subsystem
>> dev_id name_assign_type tx_queue_len
>> dev_port netdev_group type
>> device operstate uevent
>> dormant phys_port_id waiting_for_supplier
>> console:/sys/bus/platform/devices/a800000.ssusb # ifconfig -a usb0
>> usb0 Link encap:Ethernet HWaddr 3a:8b:63:97:1a:9a
>> BROADCAST MULTICAST MTU:1500 Metric:1
>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>> collisions:0 txqueuelen:1000
>> RX bytes:0 TX bytes:0
>>
>> console:/sys/bus/platform/devices/a800000.ssusb #
>>
>> I strongly advise against reverting the patch solely based on the
>> observed issue of removing the /sys/class/net/usb0 directory while the
>> usb0 interface remains available.
>
> There's more to it. I also mentioned that switching the role or
> unplugging the cable leaves the usb0 connection.
>
> I have while in host mode:
> root@...a:~# ifconfig -a usb0
> usb0: flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu 1500
> inet 10.42.0.221 netmask 255.255.255.0 broadcast 10.42.0.255
> inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid 0x20<link>
>
>
> You don't see that because you didn't create a connection at all.
>
>> Instead, I recommend enabling FTRACE to trace the functions involved
>> and identify which faulty call is responsible for removing usb0.
>
> Switching from device -> host -> device:
>
> root@...a:~# trace-cmd record -p function_graph -l *gether_*
> plugin 'function_graph'
> Hit Ctrl^C to stop recording
> ^CCPU0 data recorded at offset=0x1c8000
> 188 bytes in size (4096 uncompressed)
> CPU1 data recorded at offset=0x1c9000
> 0 bytes in size (0 uncompressed)
> root@...a:~# trace-cmd report
> cpus=2
> irq/68-dwc3-725 [000] 514.575337: funcgraph_entry: #
> 2079.480 us | gether_disconnect();
> irq/68-dwc3-946 [000] 524.263731: funcgraph_entry: +
> 11.640 us | gether_disconnect();
> irq/68-dwc3-946 [000] 524.263743: funcgraph_entry: !
> 116.520 us | gether_connect();
> irq/68-dwc3-946 [000] 524.268029: funcgraph_entry: #
> 2057.260 us | gether_disconnect();
> irq/68-dwc3-946 [000] 524.270089: funcgraph_entry: !
> 109.000 us | gether_connect();
I tried to get a more useful trace:
root@...a:/sys/kernel/tracing# echo 'gether_*' > set_ftrace_filter
root@...a:/sys/kernel/tracing# echo 'eem_*' >> set_ftrace_filter
root@...a:/sys/kernel/tracing# echo function > current_tracer
root@...a:/sys/kernel/tracing# echo 'reset_config' >> set_ftrace_filter
-> switch to host mode then back to device
root@...a:/sys/kernel/tracing# cat trace
# tracer: function
#
# entries-in-buffer/entries-written: 53/53 #P:2
#
# _-----=> irqs-off/BH-disabled
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / _-=> migrate-disable
# |||| / delay
# TASK-PID CPU# ||||| TIMESTAMP FUNCTION
# | | | ||||| | |
irq/68-dwc3-523 [000] D..3. 133.990254: reset_config
<-__composite_disconnect
irq/68-dwc3-523 [000] D..3. 133.992274: eem_disable
<-reset_config
irq/68-dwc3-523 [000] D..3. 133.992276: gether_disconnect
<-reset_config
kworker/1:3-443 [001] ...1. 134.022453: eem_unbind
<-purge_configs_funcs
-> to device mode
kworker/1:3-443 [001] ...1. 148.630773: eem_bind
<-usb_add_function
irq/68-dwc3-734 [000] D..3. 149.155209: eem_set_alt
<-composite_setup
irq/68-dwc3-734 [000] D..3. 149.155215: gether_disconnect
<-eem_set_alt
irq/68-dwc3-734 [000] D..3. 149.155220: gether_connect
<-eem_set_alt
irq/68-dwc3-734 [000] D..3. 149.157287: eem_set_alt
<-composite_setup
irq/68-dwc3-734 [000] D..3. 149.157292: gether_disconnect
<-eem_set_alt
irq/68-dwc3-734 [000] D..3. 149.159338: gether_connect
<-eem_set_alt
irq/68-dwc3-734 [000] D..2. 149.239625: eem_unwrap <-rx_complete
..
I don't know where to look exactly. Any hints?
>
>> According to current kernel architecture of u_ether driver, only
>> gether_cleanup should remove the usb0 interface along with its kobject
>> and sysfs interface.
>> I suggest sharing the analysis here to understand why this practice is
>> not followed in your use case or driver ?
>
> Yes, I'll try to trace where that happens.
>
> Nevertheless, the disappearance of the net/usb0 directory seems
> harmless? But the usb: net device remaining after disconnect or role
> switch is not good, as the route remains.
>
> May be they are 2 separate problems. Could you try to reproduce what
> happens if you make eem connection and then unplug?
>
>> I am curious why the driver was developed without adhering to the
>> kernel's gadget architecture.
I don't know what you mean here. Which driver do you mean?
>>>
>>>>>> I have the dwc3 IP base usb controller, Let me check with this
>>>>>> patch and
>>>>>> share result here. May be we need some fix in dwc3
>>>> Would have been nice if someone could test on other controller as
>>>> well. But
>>>> another instance of dwc3 is also very welcome.
>>>>> It's quite possible, please test on your side.
>>>>> We are happy to test any fixes if you come up with.
>>>
>>> --
>>> With Best Regards,
>>> Andy Shevchenko
>>>
>>>
>
Powered by blists - more mailing lists