lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20240206233844.GA143470@workstation.local>
Date: Wed, 7 Feb 2024 08:38:44 +0900
From: Takashi Sakamoto <o-takashi@...amocchi.jp>
To: Adam Goldman <adamg@...ox.com>
Cc: linux1394-devel@...ts.sourceforge.net, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] firewire: core: send bus reset promptly on gap count
 error

Hi,

Thanks for the patch. I applied it to for-linus branch and will send it
for v6.8-rc4 in this week. Thanks for your long patience in the review
process.

On Sun, Feb 04, 2024 at 09:21:14PM -0800, Adam Goldman wrote:
> If we are bus manager and the bus has inconsistent gap counts, send a 
> bus reset immediately instead of trying to read the root node's config 
> ROM first. Otherwise, we could spend a lot of time trying to read the 
> config ROM but never succeeding.
> 
> This eliminates a 50+ second delay before the FireWire bus is usable after 
> a newly connected device is powered on in certain circumstances.
> 
> The delay occurs if a gap count inconsistency occurs, we are not the root 
> node, and we become bus manager. One scenario that causes this is with a TI 
> XIO2213B OHCI, the first time a Sony DSR-25 is powered on after being 
> connected to the FireWire cable. In this configuration, the Linux box will 
> not receive the initial PHY configuration packet sent by the DSR-25 as IRM, 
> resulting in the DSR-25 having a gap count of 44 while the Linux box has a 
> gap count of 63.
> 
> FireWire devices have a gap count parameter, which is set to 63 on power-up 
> and can be changed with a PHY configuration packet. This determines the 
> duration of the subaction and arbitration gaps. For reliable communication, 
> all nodes on a FireWire bus must have the same gap count.
> 
> A node may have zero or more of the following roles: root node, bus manager 
> (BM), isochronous resource manager (IRM), and cycle master. Unless a root 
> node was forced with a PHY configuration packet, any node might become root 
> node after a bus reset. Only the root node can become cycle master. If the 
> root node is not cycle master capable, the BM or IRM should force a change 
> of root node.
> 
> After a bus reset, each node sends a self-ID packet, which contains its 
> current gap count. A single bus reset does not change the gap count, but 
> two bus resets in a row will set the gap count to 63. Because a consistent 
> gap count is required for reliable communication, IEEE 1394a-2000 requires 
> that the bus manager generate a bus reset if it detects that the gap count 
> is inconsistent.
> 
> When the gap count is inconsistent, build_tree() will notice this after the 
> self identification process. It will set card->gap_count to the invalid 
> value 0. If we become bus master, this will force bm_work() to send a bus 
> reset when it performs gap count optimization.
> 
> After a bus reset, there is no bus manager. We will almost always try to 
> become bus manager. Once we become bus manager, we will first determine 
> whether the root node is cycle master capable. Then, we will determine if 
> the gap count should be changed. If either the root node or the gap count 
> should be changed, we will generate a bus reset.
> 
> To determine if the root node is cycle master capable, we read its 
> configuration ROM. bm_work() will wait until we have finished trying to 
> read the configuration ROM.
> 
> However, an inconsistent gap count can make this take a long time. 
> read_config_rom() will read the first few quadlets from the config ROM. Due 
> to the gap count inconsistency, eventually one of the reads will time out. 
> When read_config_rom() fails, fw_device_init() calls it again until 
> MAX_RETRIES is reached. This takes 50+ seconds.
> 
> Once we give up trying to read the configuration ROM, bm_work() will wake 
> up, assume that the root node is not cycle master capable, and do a bus 
> reset. Hopefully, this will resolve the gap count inconsistency.
> 
> This change makes bm_work() check for an inconsistent gap count before 
> waiting for the root node's configuration ROM. If the gap count is 
> inconsistent, bm_work() will immediately do a bus reset. This eliminates 
> the 50+ second delay and rapidly brings the bus to a working state.
> 
> I considered that if the gap count is inconsistent, a PHY configuration 
> packet might not be successful, so it could be desirable to skip the PHY 
> configuration packet before the bus reset in this case. However, IEEE 
> 1394a-2000 and IEEE 1394-2008 say that the bus manager may transmit a PHY 
> configuration packet before a bus reset when correcting a gap count error. 
> Since the standard endorses this, I decided it's safe to retain the PHY 
> configuration packet transmission.
> 
> Normally, after a topology change, we will reset the bus a maximum of 5 
> times to change the root node and perform gap count optimization. However, 
> if there is a gap count inconsistency, we must always generate a bus reset. 
> Otherwise the gap count inconsistency will persist and communication will 
> be unreliable. For that reason, if there is a gap count inconstency, we 
> generate a bus reset even if we already reached the 5 reset limit.
> 
> Signed-off-by: Adam Goldman <adamg@...ox.com>
> Link: https://sourceforge.net/p/linux1394/mailman/message/58727806/
> ---


Takashi Sakamoto

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ