linux-kernel - Re: [PATCH v4 4/5] net: rnpgbe: Add basic mbx

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3A381A779EB97B74+20250815063656.GA1148411@nic-Precision-5820-Tower>
Date: Fri, 15 Aug 2025 14:36:56 +0800
From: Yibo Dong <dong100@...se.com>
To: Andrew Lunn <andrew@...n.ch>
Cc: andrew+netdev@...n.ch, davem@...emloft.net, edumazet@...gle.com,
	kuba@...nel.org, pabeni@...hat.com, horms@...nel.org,
	corbet@....net, gur.stavi@...wei.com, maddy@...ux.ibm.com,
	mpe@...erman.id.au, danishanwar@...com, lee@...ger.us,
	gongfan1@...wei.com, lorenzo@...nel.org, geert+renesas@...der.be,
	Parthiban.Veerasooran@...rochip.com, lukas.bulwahn@...hat.com,
	alexanderduyck@...com, richardcochran@...il.com,
	netdev@...r.kernel.org, linux-doc@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 4/5] net: rnpgbe: Add basic mbx_fw support

On Fri, Aug 15, 2025 at 05:27:51AM +0200, Andrew Lunn wrote:
> >  struct mucse_hw {
> > +	u8 pfvfnum;
> >  	void __iomem *hw_addr;
> >  	void __iomem *ring_msix_base;
> >  	struct pci_dev *pdev;
> > +	u32 fw_version;
> > +	u32 axi_mhz;
> > +	u32 bd_uid;
> >  	enum rnpgbe_hw_type hw_type;
> >  	struct mucse_dma_info dma;
> >  	struct mucse_eth_info eth;
> 
> Think about alignment of these structures. The compiler is going to
> put in padding after the u8 phvfnum. The 3 pointers are all the same
> size, no padding. The u32 probably go straight after the pointers. The
> enum it might represent as a single byte, so there is will be padding
> before dma. So consider moving the u8 next to the enum.
> 
> pahole(1) will tell you what the compiler really did, but you will
> find more experienced engineers try to minimise padding, or
> deliberately group hot items on a cache line, and document that.
> 

Got it, I will update this.

> > +static int mucse_fw_send_cmd_wait(struct mucse_hw *hw,
> > +				  struct mbx_fw_cmd_req *req,
> > +				  struct mbx_fw_cmd_reply *reply)
> > +{
> > +	int len = le16_to_cpu(req->datalen) + MBX_REQ_HDR_LEN;
> > +	int retry_cnt = 3;
> > +	int err;
> > +
> > +	err = mutex_lock_interruptible(&hw->mbx.lock);
> > +	if (err)
> > +		return err;
> > +	err = hw->mbx.ops->write_posted(hw, (u32 *)req,
> > +					L_WD(len));
> 
> This L_WD macro is not nice. It seems like a place bugs will be
> introduced, forgetting to call it here. Why not have write_posted()
> take bytes, and have the lowest layer convert to 32 bit words.
> 
> It also seems odd you are adding MBX_REQ_HDR_LEN here but not that
> actual header. Why not increase the length at the point the header is
> actually added? Keep stuff logically together.
> 

Ok, I will have write_posted() take bytes like you suggestion, and try
to improve len here.

> > +	if (err)
> > +		goto quit;
> > +	do {
> > +		err = hw->mbx.ops->read_posted(hw, (u32 *)reply,
> > +					       L_WD(sizeof(*reply)));
> > +		if (err)
> > +			goto quit;
> > +	} while (--retry_cnt >= 0 && reply->opcode != req->opcode);
> > +quit:
> 
> Maybe add some documentation about what is actually going on here. I
> assume you are trying to get the driver and firmware into sync after
> one or other has crashed, burned, and rebooted. You need to flush out
> old replies. You allow up to three old replies to be in the queue, and
> then give up. Since you don't retry the write, you don't expect writes
> to be lost?
> 
> 

write_posted return 0 only after fw acked it, so no need to write.
It is a sync mechanism, tries to get the correct response opcode.

Maybe comment link this?
/* write_posted return 0 means fw has received request, wait for
 * the expect opcode reply with 'retry_cnt' times.
 */

> > +	mutex_unlock(&hw->mbx.lock);
> > +	if (!err && retry_cnt < 0)
> > +		return -ETIMEDOUT;
> > +	if (!err && reply->error_code)
> > +		return -EIO;
> > +	return err;
> > +}
> > +
> > +/**
> > + * mucse_fw_get_capability - Get hw abilities from fw
> > + * @hw: pointer to the HW structure
> > + * @abil: pointer to the hw_abilities structure
> > + *
> > + * mucse_fw_get_capability tries to get hw abilities from
> > + * hw.
> > + *
> > + * @return: 0 on success, negative on failure
> > + **/
> > +static int mucse_fw_get_capability(struct mucse_hw *hw,
> > +				   struct hw_abilities *abil)
> > +{
> > +	struct mbx_fw_cmd_reply reply = {};
> > +	struct mbx_fw_cmd_req req = {};
> > +	int err;
> > +
> > +	build_phy_abilities_req(&req, &req);
> 
> Passing the same parameter twice? Is that correct? It looks very odd.
> 

Got it, I will fix it.

> > +/**
> > + * mbx_cookie_zalloc - Alloc a cookie structure
> > + * @priv_len: private length for this cookie
> > + *
> > + * @return: cookie structure on success
> > + **/
> > +static struct mbx_req_cookie *mbx_cookie_zalloc(int priv_len)
> > +{
> > +	struct mbx_req_cookie *cookie;
> > +
> > +	cookie = kzalloc(struct_size(cookie, priv, priv_len), GFP_KERNEL);
> > +	if (cookie) {
> > +		cookie->timeout_jiffes = 30 * HZ;
> > +		cookie->magic = COOKIE_MAGIC;
> > +		cookie->priv_len = priv_len;
> > +	}
> > +	return cookie;
> 
> > +struct mbx_req_cookie {
> > +	int magic;
> > +#define COOKIE_MAGIC 0xCE
> > +	cookie_cb cb;
> > +	int timeout_jiffes;
> > +	int errcode;
> > +	wait_queue_head_t wait;
> > +	int done;
> > +	int priv_len;
> > +	char priv[];
> > +};
> 
> 
> Using struct_size() makes me think this is supposed to be a flexible
> array? I've never used them myself, but shouldn't be some markup so
> the compiler knows priv_len is the len of priv?
> 

Maybe link this?
struct mbx_req_cookie {
....
	int priv_len;
	char priv[] __counted_by(priv_len);
}

> > +static int mucse_mbx_fw_post_req(struct mucse_hw *hw,
> > +				 struct mbx_fw_cmd_req *req,
> > +				 struct mbx_req_cookie *cookie)
> > +{
> > +	int len = le16_to_cpu(req->datalen) + MBX_REQ_HDR_LEN;
> > +	int err;
> > +
> > +	cookie->errcode = 0;
> > +	cookie->done = 0;
> > +	init_waitqueue_head(&cookie->wait);
> > +	err = mutex_lock_interruptible(&hw->mbx.lock);
> > +	if (err)
> > +		return err;
> > +	err = mucse_write_mbx(hw, (u32 *)req,
> > +			      L_WD(len));
> > +	if (err) {
> > +		mutex_unlock(&hw->mbx.lock);
> 
> Please try to put the unlock at the end of the function, with a goto
> on error.
> 

Got it, I will fix it.

> > +		return err;
> > +	}
> > +	do {
> > +		err = wait_event_interruptible_timeout(cookie->wait,
> > +						       cookie->done == 1,
> > +						       cookie->timeout_jiffes);
> > +	} while (err == -ERESTARTSYS);
> 
> This needs a comment, because i don't understand it.
> 
> 

wait_event_interruptible_timeout return -ERESTARTSYS if it was interrupted
by a signal, which will cause misjudgement about cookie->done is timeout. 
In this case, just wait for timeout.
Maybe comment link this?
/* If it was interrupted by a signal (-ERESTARTSYS), it is not true timeout,
 * just wait again.
 */

> > +	mutex_unlock(&hw->mbx.lock);
> > +	if (!err)
> > +		err = -ETIME;
> 
> I _think_ ETIMEDOUT would be more normal.
> 

Got it, I will update it. 

> > +	else
> > +		err = 0;
> > +	if (!err && cookie->errcode)
> > +		err = cookie->errcode;
> > +
> > +	return err;
> > +}
> > +int mucse_fw_get_macaddr(struct mucse_hw *hw, int pfvfnum,
> > +			 u8 *mac_addr,
> > +			 int lane)
> > +{
> > +	struct mbx_fw_cmd_reply reply = {};
> > +	struct mbx_fw_cmd_req req = {};
> > +	int err;
> > +
> > +	build_get_macaddress_req(&req, 1 << lane, pfvfnum, &req);
> > +	err = mucse_fw_send_cmd_wait(hw, &req, &reply);
> > +	if (err)
> > +		return err;
> > +
> > +	if ((1 << lane) & le32_to_cpu(reply.mac_addr.lanes))
> 
> BIT(). And normally the & would be the other way around.
> 

Maybe changed link this?
...
if (le32_to_cpu(reply.mac_addr.ports) & BIT(lane))
...

> What exactly is a lane here? Normally we would think of a lane is
> -KR4, 4 SERDES lanes making one port. But the MAC address is a
> property of the port, not the lane within a port.
> 

lane is the valid bit in 'reply.mac_addr.ports'.
Maybe change it to 'port', that is more appropriate.

> > +		memcpy(mac_addr, reply.mac_addr.addrs[lane].mac, 6);
> 
> There is a macro for 6, please use it.
> 

Got it, I will use ETH_ALEN.

> > +struct hw_abilities {
> > +	u8 link_stat;
> > +	u8 lane_mask;
> > +	__le32 speed;
> > +	__le16 phy_type;
> > +	__le16 nic_mode;
> > +	__le16 pfnum;
> 
> Another example of a bad structure layout. It would of been much
> better to put the two u8 after speed.
> 
> > +} __packed;
> 
> And because this is packed, and badly aligned, you are forcing the
> compiler to do a lot more work accessing these members.
> 

Yes, It is bad. But FW use this define, I can only follow the define...
Maybe I can add comment here?
/* Must follow FW define here */ 

> > +
> > +static inline void ability_update_host_endian(struct hw_abilities *abi)
> > +{
> > +	u32 host_val = le32_to_cpu(abi->ext_ability);
> > +
> > +	abi->e_host = *(typeof(abi->e_host) *)&host_val;
> > +}
> 
> Please add a comment what this is doing, it is not obvious.
> 
> 

Maybe link this?
/* Converts the little-endian ext_ability field to host byte order,
 * then copies the value into the e_host field by reinterpreting the
 * memory as the type of e_host (likely a bitfield or structure that
 * represents the extended abilities in a host-friendly format).
 */

> > +
> > +#define FLAGS_DD BIT(0)
> > +#define FLAGS_ERR BIT(2)
> > +
> > +/* Request is in little-endian format. Big-endian systems should be considered */
> 
> So the code now sparse clean? If it is, you can probably remove this
> comment.
> 

Yes, sparse clean. I will remove this.

> > +static inline void build_phy_abilities_req(struct mbx_fw_cmd_req *req,
> > +					   void *cookie)
> > +{
> > +	req->flags = 0;
> > +	req->opcode = cpu_to_le16(GET_PHY_ABALITY);
> > +	req->datalen = 0;
> > +	req->reply_lo = 0;
> > +	req->reply_hi = 0;
> > +	req->cookie = cookie;
> > +}
> > +
> > +static inline void build_ifinsmod(struct mbx_fw_cmd_req *req,
> > +				  unsigned int lane,
> > +				  int status)
> > +{
> > +	req->flags = 0;
> > +	req->opcode = cpu_to_le16(DRIVER_INSMOD);
> > +	req->datalen = cpu_to_le16(sizeof(req->ifinsmod));
> > +	req->cookie = NULL;
> > +	req->reply_lo = 0;
> > +	req->reply_hi = 0;
> > +	req->ifinsmod.lane = cpu_to_le32(lane);
> > +	req->ifinsmod.status = cpu_to_le32(status);
> > +}
> > +
> > +static inline void build_reset_phy_req(struct mbx_fw_cmd_req *req,
> > +				       void *cookie)
> > +{
> > +	req->flags = 0;
> > +	req->opcode = cpu_to_le16(RESET_PHY);
> > +	req->datalen = 0;
> > +	req->reply_lo = 0;
> > +	req->reply_hi = 0;
> > +	req->cookie = cookie;
> > +}
> > +
> > +static inline void build_get_macaddress_req(struct mbx_fw_cmd_req *req,
> > +					    int lane_mask, int pfvfnum,
> > +					    void *cookie)
> > +{
> > +	req->flags = 0;
> > +	req->opcode = cpu_to_le16(GET_MAC_ADDRES);
> > +	req->datalen = cpu_to_le16(sizeof(req->get_mac_addr));
> > +	req->cookie = cookie;
> > +	req->reply_lo = 0;
> > +	req->reply_hi = 0;
> > +	req->get_mac_addr.lane_mask = cpu_to_le32(lane_mask);
> > +	req->get_mac_addr.pfvf_num = cpu_to_le32(pfvfnum);
> > +}
> 
> These are rather large for inline functions in a header. Please move
> them into a .c file.
> 
> 	Andrew
> 

Got it. I will update it.

Thanks for your feedback.