Message-ID: <8041106f-0be0-4ed9-990e-1f62902b30e9@gmail.com>
Date: Tue, 30 Apr 2024 21:40:55 +0200
From: Ferry Toth <fntoth@...il.com>
To: Hardik Gajjar <hgajjar@...adit-jv.com>
Cc: Andy Shevchenko <andriy.shevchenko@...el.com>,
gregkh@...uxfoundation.org, s.hauer@...gutronix.de, jonathanh@...dia.com,
linux-usb@...r.kernel.org, linux-kernel@...r.kernel.org,
quic_linyyuan@...cinc.com, paul@...pouillou.net, quic_eserrao@...cinc.com,
erosca@...adit-jv.com
Subject: Re: [PATCH v4] usb: gadget: u_ether: Replace netif_stop_queue with
netif_device_detach
Hi,
On 30-04-2024 at 17:32, Hardik Gajjar wrote:
> On Sun, Apr 28, 2024 at 11:07:36PM +0200, Ferry Toth wrote:
>> Hi,
>>
>> On 25-04-2024 at 23:27, Ferry Toth wrote:
>>> Hi,
>>>
>>> On 17-04-2024 at 17:13, Hardik Gajjar wrote:
>>>> On Tue, Apr 16, 2024 at 04:48:32PM +0300, Andy Shevchenko wrote:
>>>>> On Thu, Apr 11, 2024 at 10:52:36PM +0200, Ferry Toth wrote:
>>>>>> On 11-04-2024 at 18:39, Andy Shevchenko wrote:
>>>>>>> On Thu, Apr 11, 2024 at 04:26:37PM +0200, Hardik Gajjar wrote:
>>>>>>>> On Wed, Apr 10, 2024 at 08:37:42PM +0300, Andy Shevchenko wrote:
>>>>>>>>> On Sun, Apr 07, 2024 at 10:51:51PM +0200, Ferry Toth wrote:
>>>>>>>>>> On 05-04-2024 at 13:38, Hardik Gajjar wrote:
>>>>>
>>>>> ...
>>>>>
>>>>>>>>>> Exactly. And this didn't happen before the 2 patches.
>>>>>>>>>>
>>>>>>>>>> To be precise: /sys/class/net/usb0 is not
>>>>>>>>>> removed and it is a link, the link
>>>>>>>>>> target /sys/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/gadget.0/net/usb0
>>>>>>>>>> no
>>>>>>>>>> longer exists
>>>>>>>> So, it means that the /sys/class/net/usb0 is
>>>>>>>> present, but the symlink is
>>>>>>>> broken. In that case, the dwc3 driver should
>>>>>>>> recreate the device, and the
>>>>>>>> symlink should become active again
>>>>>>
>>>>>> Yes, on first enabling gadget (when device mode is activated):
>>>>>>
>>>>>> root@...a:~# ls
>>>>>> /sys/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/gadget.0/
>>>>>> driver net power sound subsystem suspended uevent
>>>>>>
>>>>>> Then switching to host mode:
>>>>>>
>>>>>> root@...a:~# ls
>>>>>> /sys/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/gadget.0/
>>>>>> ls: cannot access
>>>>>> '/sys/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/gadget.0/':
>>>>>> No such file
>>>>>> or directory
>>>>>>
>>>>>> Then back to device mode:
>>>>>>
>>>>>> root@...a:~# ls
>>>>>> /sys/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/gadget.0/
>>>>>> driver power sound subsystem suspended uevent
>>>>>>
>>>>>> net is missing. But, network functions:
>>>>>>
>>>>>> root@...a:~# ping 10.42.0.1
>>>>>> PING 10.42.0.1 (10.42.0.1): 56 data bytes
>>>>>>
>>>>>> Mass storage device is created and removed each time as expected.
>>>>>
>>>>> So, what's the conclusion? Shall we move towards revert of those
>>>>> two changes?
>>>>
>>>>
>>>> As promised, I have tested this patch with the dwc3 gadget.
>>>> I could not reproduce
>>>> the issue.
>>>>
>>>> I can see that usb0 exists all the time and is accessible regardless of
>>>> the role switching of the USB mode (peripheral <-> host)
>>>>
>>>> Following are the logs:
>>>> //Host to device
>>>>
>>>> console:/sys/bus/platform/devices/a800000.ssusb # echo "peripheral" > mode
>>>> console:/sys/bus/platform/devices/a800000.ssusb # ls
>>>> a800000.dwc3/gadget/net/
>>>> usb0
>>>>
>>>> //device to host
>>>> console:/sys/bus/platform/devices/a800000.ssusb # echo "host" > mode
>>>> console:/sys/bus/platform/devices/a800000.ssusb # ls
>>>> a800000.dwc3/gadget/net/
>>>> usb0
>>>
>>> That is weird. When I switch to host mode (using the physical switch),
>>> the whole gadget directory is removed (now testing 6.9.0-rc5)
>>>
>>> Switching back to device mode, that gadget directory is recreated. And
>>> gadget/sound as well, but not gadget/net.
>>>
>>>> s a800000.dwc3/gadget/net/usb0
>>>> <
>>>> addr_assign_type duplex phys_port_name
>>>> addr_len flags phys_switch_id
>>>> address gro_flush_timeout power
>>>> broadcast ifalias proto_down
>>>> carrier ifindex queues
>>>> carrier_changes iflink speed
>>>> carrier_down_count link_mode statistics
>>>> carrier_up_count mtu subsystem
>>>> dev_id name_assign_type tx_queue_len
>>>> dev_port netdev_group type
>>>> device operstate uevent
>>>> dormant phys_port_id waiting_for_supplier
>>>> console:/sys/bus/platform/devices/a800000.ssusb # ifconfig -a usb0
>>>> usb0 Link encap:Ethernet HWaddr 3a:8b:63:97:1a:9a
>>>> BROADCAST MULTICAST MTU:1500 Metric:1
>>>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>>> collisions:0 txqueuelen:1000
>>>> RX bytes:0 TX bytes:0
>>>>
>>>> console:/sys/bus/platform/devices/a800000.ssusb #
>>>>
>>>> I strongly advise against reverting the patch solely based on the
>>>> observed issue of removing the /sys/class/net/usb0 directory while
>>>> the usb0 interface remains available.
>>>
>>> There's more to it. I also mentioned that switching the role or
>>> unplugging the cable leaves the usb0 connection.
>>>
>>> I have while in host mode:
>>> root@...a:~# ifconfig -a usb0
>>> usb0: flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu 1500
>>> inet 10.42.0.221 netmask 255.255.255.0 broadcast 10.42.0.255
>>> inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid 0x20<link>
>>>
>>>
>>> You don't see that because you didn't create a connection at all.
>>>
>>>> Instead, I recommend enabling FTRACE to trace the functions involved
>>>> and identify which faulty call is responsible for removing usb0.
>>>
>>> Switching from device -> host -> device:
>>>
>>> root@...a:~# trace-cmd record -p function_graph -l *gether_*
>>> plugin 'function_graph'
>>> Hit Ctrl^C to stop recording
>>> ^CCPU0 data recorded at offset=0x1c8000
>>> 188 bytes in size (4096 uncompressed)
>>> CPU1 data recorded at offset=0x1c9000
>>> 0 bytes in size (0 uncompressed)
>>> root@...a:~# trace-cmd report
>>> cpus=2
>>> irq/68-dwc3-725 [000] 514.575337: funcgraph_entry: #
>>> 2079.480 us | gether_disconnect();
>>> irq/68-dwc3-946 [000] 524.263731: funcgraph_entry: +
>>> 11.640 us | gether_disconnect();
>>> irq/68-dwc3-946 [000] 524.263743: funcgraph_entry: !
>>> 116.520 us | gether_connect();
>>> irq/68-dwc3-946 [000] 524.268029: funcgraph_entry: #
>>> 2057.260 us | gether_disconnect();
>>> irq/68-dwc3-946 [000] 524.270089: funcgraph_entry: !
>>> 109.000 us | gether_connect();
>>
>> I tried to get a more useful trace:
>> root@...a:/sys/kernel/tracing# echo 'gether_*' > set_ftrace_filter
>> root@...a:/sys/kernel/tracing# echo 'eem_*' >> set_ftrace_filter
>> root@...a:/sys/kernel/tracing# echo function > current_tracer
>> root@...a:/sys/kernel/tracing# echo 'reset_config' >> set_ftrace_filter
>> -> switch to host mode then back to device
>> root@...a:/sys/kernel/tracing# cat trace
>> # tracer: function
>> #
>> # entries-in-buffer/entries-written: 53/53 #P:2
>> #
>> # _-----=> irqs-off/BH-disabled
>> # / _----=> need-resched
>> # | / _---=> hardirq/softirq
>> # || / _--=> preempt-depth
>> # ||| / _-=> migrate-disable
>> # |||| / delay
>> # TASK-PID CPU# ||||| TIMESTAMP FUNCTION
>> # | | | ||||| | |
>> irq/68-dwc3-523 [000] D..3. 133.990254: reset_config
>> <-__composite_disconnect
>> irq/68-dwc3-523 [000] D..3. 133.992274: eem_disable
>> <-reset_config
>> irq/68-dwc3-523 [000] D..3. 133.992276: gether_disconnect
>> <-reset_config
>> kworker/1:3-443 [001] ...1. 134.022453: eem_unbind
>> <-purge_configs_funcs
>>
>> -> to device mode
>>
>> kworker/1:3-443 [001] ...1. 148.630773: eem_bind
>> <-usb_add_function
>> irq/68-dwc3-734 [000] D..3. 149.155209: eem_set_alt
>> <-composite_setup
>> irq/68-dwc3-734 [000] D..3. 149.155215: gether_disconnect
>> <-eem_set_alt
>> irq/68-dwc3-734 [000] D..3. 149.155220: gether_connect
>> <-eem_set_alt
>> irq/68-dwc3-734 [000] D..3. 149.157287: eem_set_alt
>> <-composite_setup
>> irq/68-dwc3-734 [000] D..3. 149.157292: gether_disconnect
>> <-eem_set_alt
>> irq/68-dwc3-734 [000] D..3. 149.159338: gether_connect
>> <-eem_set_alt
>> irq/68-dwc3-734 [000] D..2. 149.239625: eem_unwrap <-rx_complete
>> ...
>>
>> I don't know where to look exactly. Any hints?
>
> do you see anything related to gether_cleanup() after eem_unbind() ?
Nope. It's a pity that the trace formatting got messed up above. But as
you can see, I traced gether_* and eem_*. After eem_unbind no traced
function is called until I flip the switch to device mode.
The ... at the end is where I cut the uninteresting eem_unwrap <-rx_complete
and eem_wrap <-eth_start_xmit lines.
> If not, then you may try to enable tracing of the TCP/IP stack and network side to check who is deleting the sysfs entry
Yes, but that's a vast number of functions to trace. And I don't see how
that would be related to the patch we're discussing here. I was hoping
for a slightly more targeted hint.
You may recall the whole issue did not occur before this patch got applied.
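
For what it's worth, a narrower filter than the whole network stack might
look like the sketch below. It is only a sketch: the function list
(gether_cleanup, unregister_netdevice_queue, netdev_unregister_kobject,
device_del, netif_device_detach) is a guess at the netdev and sysfs
teardown path, not something taken from the traces above, and some of
these may be inlined or missing on a given kernel. The helper name
setup_netdev_trace is made up for illustration.

```shell
#!/bin/sh
# Sketch only: restrict the function tracer to a handful of netdev/sysfs
# teardown functions and record the call chain for each hit.
# The symbol list is an assumption; symbols the kernel rejects are skipped.

# usage: setup_netdev_trace [TRACEFS_DIR]
setup_netdev_trace() {
    t="${1:-/sys/kernel/tracing}"
    n=0

    echo nop > "$t/current_tracer"      # stop any tracer that is running
    : > "$t/set_ftrace_filter"          # clear the previous filter

    for f in gether_cleanup unregister_netdevice_queue \
             netdev_unregister_kobject device_del netif_device_detach; do
        # Writing an unknown symbol fails on a real tracefs; skip it.
        if echo "$f" >> "$t/set_ftrace_filter" 2>/dev/null; then
            n=$((n + 1))
        else
            echo "skipped: $f" >&2
        fi
    done

    # An empty filter means "trace everything"; do not enable stack
    # traces in that case, it would be far too heavy.
    [ "$n" -gt 0 ] || return 1

    echo 1 > "$t/options/func_stack_trace"   # show who called each function
    echo function > "$t/current_tracer"
}
```

On the target this would be run as root with
"setup_netdev_trace /sys/kernel/tracing", followed by the role switch and
"cat /sys/kernel/tracing/trace". device_del fires for every device that
is removed, so some noise is to be expected.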
> Hardik
>
>
>>
>>>
>>>> According to current kernel architecture of u_ether driver, only
>>>> gether_cleanup should remove the usb0 interface along with its
>>>> kobject and sysfs interface.
>>>> I suggest sharing the analysis here to understand why this practice
>>>> is not followed in your use case or driver ?
>>>
>>> Yes, I'll try to trace where that happens.
>>>
>>> Nevertheless, the disappearance of the net/usb0 directory seems
>>> harmless? But the usb: net device remaining after disconnect or role
>>> switch is not good, as the route remains.
>>>
>>> Maybe they are 2 separate problems. Could you try to reproduce what
>>> happens if you make an eem connection and then unplug?
>>>
>>>> I am curious why the driver was developed without adhering to the
>>>> kernel's gadget architecture.
>>
>> I don't know what you mean here. Which driver do you mean?
>>
>>>>>
>>>>>>>> I have the dwc3 IP based usb controller. Let me check
>>>>>>>> with this patch and
>>>>>>>> share the result here. Maybe we need some fix in dwc3
>>>>>> Would have been nice if someone could test on other
>>>>>> controller as well. But
>>>>>> another instance of dwc3 is also very welcome.
>>>>>>> It's quite possible, please test on your side.
>>>>>>> We are happy to test any fixes if you come up with.
>>>>>
>>>>> --
>>>>> With Best Regards,
>>>>> Andy Shevchenko
>>>>>
>>>>>
>>>