linux-kernel - Re: [PATCH v8 3/8] thunderbolt: Communication with the ICM (firmware)

Message-ID: <fe0f0a6f-488b-bbc0-8987-a9c47d9ed9b9@kernel.org>
Date:   Thu, 1 Dec 2016 17:21:01 -0800
From:   Andy Lutomirski <luto@...nel.org>
To:     Amir Levy <amir.jer.levy@...el.com>, gregkh@...uxfoundation.org
Cc:     andreas.noever@...il.com, bhelgaas@...gle.com, corbet@....net,
        linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
        netdev@...r.kernel.org, linux-doc@...r.kernel.org,
        mario_limonciello@...l.com, thunderbolt-linux@...el.com,
        mika.westerberg@...el.com, tomas.winkler@...el.com,
        xiong.y.zhang@...el.com
Subject: Re: [PATCH v8 3/8] thunderbolt: Communication with the ICM (firmware)

On 09/28/2016 07:44 AM, Amir Levy wrote:
> This patch provides the communication protocol between the
> Intel Connection Manager(ICM) firmware that is operational in the
> Thunderbolt controller in non-Apple hardware.
> The ICM firmware-based controller is used for establishing and maintaining
> the Thunderbolt Networking connection - we need to be able to communicate
> with it.

I'm a bit late to the party, but here goes.  I have two big questions:

1. Why is this using netlink at all?  A system has zero or more 
Thunderbolt controllers, they're probed just like any other PCI devices 
(by nhi_probe() if I'm understanding correctly), they'll have nodes in 
sysfs, etc.  Shouldn't there be a simple char device per Thunderbolt 
controller that a daemon can connect to?  This will clean up lots of things:

a) You can actually enforce one-daemon-at-a-time in a very natural way. 
Your current code seems to try, but it's rather buggy.  Your 
subscription count is a guess, your unsubscribe is entirely unchecked, 
and you are entirely unable to detect if a daemon crashes AFAICT.

b) You won't need all of the complexity that's currently there to figure 
out *which* Thunderbolt device a daemon is talking to.

c) You can use regular ioctl passing *structs* instead of netlink attrs.
There's nothing wrong with netlink attrs, except that your driver
seems to have a whole lot of boilerplate that just converts back and
forth to regular structures.

d) The userspace code that does stuff like "send message, wait 150ms, 
receive reply, complain if no reply" goes away because ioctl is 
synchronous.  (Or you can use read and write, but it's still simpler.)

e) You could have one daemon per Thunderbolt device if you were so inclined.

f) You get privilege separation in userspace.  Creating a netlink socket 
and dropping privilege is busted^Winteresting.  Opening a device node 
and dropping privilege works quite nicely.
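
As a rough illustration of point (a), here is a minimal userspace model of
what a per-controller char device's open and release handlers could do.
Everything named *_model is hypothetical and not taken from the patch; a
real driver would hang this off its file_operations rather than call the
functions directly.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical per-controller state; not from the patch. */
struct tbt_ctrl_model {
	atomic_flag in_use;	/* set while one daemon holds the node open */
};

/* Models an .open handler: the first opener wins, later ones get -EBUSY. */
static int tbt_model_open(struct tbt_ctrl_model *ctrl)
{
	if (atomic_flag_test_and_set(&ctrl->in_use))
		return -EBUSY;
	return 0;
}

/*
 * Models a .release handler.  The kernel runs .release when the last
 * reference to the open file goes away, which includes the daemon
 * crashing, so no explicit "unsubscribe" message is needed.
 */
static void tbt_model_release(struct tbt_ctrl_model *ctrl)
{
	atomic_flag_clear(&ctrl->in_use);
}

/* Usage: a second daemon is refused until the first one goes away. */
void tbt_model_open_demo(void)
{
	struct tbt_ctrl_model ctrl = { ATOMIC_FLAG_INIT };

	assert(tbt_model_open(&ctrl) == 0);		/* daemon A */
	assert(tbt_model_open(&ctrl) == -EBUSY);	/* daemon B refused */
	tbt_model_release(&ctrl);			/* A exits or crashes */
	assert(tbt_model_open(&ctrl) == 0);		/* B can now connect */
}
```

Because each controller would have its own node and its own flag, the same
model also covers points (b) and (e): the open file itself identifies the
controller.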

2. Why do you need a daemon anyway?  Functionally, what exactly does it
do?  (Okay, I get that it seems to talk to a giant pile of code running
in SMM, and I get that Intel, for some bizarre reason, wants everyone
except Apple to use this code in SMM, and that Apple (for entirely
understandable reasons) turned it off, but that's beside the point.)
What does the user code do that's useful and that the kernel can't do
all by itself?  The only really interesting bit I can see is the part
that approves PCI devices.



I'm not going to review this in detail, but here's a tiny bit:

> +static int nhi_genl_unsubscribe(__always_unused struct sk_buff *u_skb,
> +				__always_unused struct genl_info *info)
> +{
> +	atomic_dec_if_positive(&subscribers);
> +
> +	return 0;
> +}
> +

This, for example, is really quite buggy: both arguments are ignored, so
nothing checks that the caller ever subscribed in the first place.
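
One way to see the problem is a small userspace model of the counter. The
quoted handler ignores both of its arguments, so the model's unsubscribe
takes no caller identity either. The subscribe side is an assumption (a
plain increment), since that handler is not quoted here, and
dec_if_positive_model() is a stand-in for the kernel's
atomic_dec_if_positive().

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in for the patch's global subscriber counter. */
static atomic_int subscribers_model;

/*
 * Userspace stand-in for atomic_dec_if_positive(): decrement only if the
 * result stays >= 0, and return the old value minus one either way.
 */
static int dec_if_positive_model(atomic_int *v)
{
	int old = atomic_load(v);

	while (old > 0 && !atomic_compare_exchange_weak(v, &old, old - 1))
		;	/* 'old' was refreshed by the failed exchange; retry */
	return old - 1;
}

/* Assumed subscribe path: a plain increment (not shown in the quote). */
static void subscribe_model(void)
{
	atomic_fetch_add(&subscribers_model, 1);
}

/* Mirrors the quoted handler: no idea who is calling. */
static int unsubscribe_model(void)
{
	dec_if_positive_model(&subscribers_model);
	return 0;
}

/* Daemon A subscribes; process B, which never subscribed, unsubscribes. */
void unsubscribe_bug_demo(void)
{
	atomic_store(&subscribers_model, 0);
	subscribe_model();		/* A */
	unsubscribe_model();		/* B */
	/* A is still listening, but the count says nobody is. */
	assert(atomic_load(&subscribers_model) == 0);
}
```

The same happens if A simply sends the unsubscribe message twice, and the
count is never corrected if A dies without sending it at all.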



This entire function here:

> +static int nhi_genl_query_information(__always_unused struct sk_buff *u_skb,
> +				      struct genl_info *info)
> +{
> +	struct tbt_nhi_ctxt *nhi_ctxt;
> +	struct sk_buff *skb;
> +	bool msg_too_long;
> +	int res = -ENODEV;
> +	u32 *msg_head;
> +
> +	if (!info || !info->userhdr)
> +		return -EINVAL;
> +
> +	skb = genlmsg_new(NLMSG_ALIGN(nhi_genl_family.hdrsize) +
> +			  nla_total_size(sizeof(DRV_VERSION)) +
> +			  nla_total_size(sizeof(nhi_ctxt->nvm_ver_offset)) +
> +			  nla_total_size(sizeof(nhi_ctxt->num_ports)) +
> +			  nla_total_size(sizeof(nhi_ctxt->dma_port)) +
> +			  nla_total_size(0),	/* nhi_ctxt->support_full_e2e */
> +			  GFP_KERNEL);
> +	if (!skb)
> +		return -ENOMEM;
> +
> +	msg_head = genlmsg_put_reply(skb, info, &nhi_genl_family, 0,
> +				     NHI_CMD_QUERY_INFORMATION);
> +	if (!msg_head) {
> +		res = -ENOMEM;
> +		goto genl_put_reply_failure;
> +	}
> +
> +	if (mutex_lock_interruptible(&controllers_list_mutex)) {
> +		res = -ERESTART;
> +		goto genl_put_reply_failure;
> +	}
> +
> +	nhi_ctxt = nhi_search_ctxt(*(u32 *)info->userhdr);
> +	if (nhi_ctxt && !nhi_ctxt->d0_exit) {
> +		*msg_head = nhi_ctxt->id;
> +
> +		msg_too_long = !!nla_put_string(skb, NHI_ATTR_DRV_VERSION,
> +						DRV_VERSION);
> +
> +		msg_too_long = msg_too_long ||
> +			       nla_put_u16(skb, NHI_ATTR_NVM_VER_OFFSET,
> +					   nhi_ctxt->nvm_ver_offset);
> +
> +		msg_too_long = msg_too_long ||
> +			       nla_put_u8(skb, NHI_ATTR_NUM_PORTS,
> +					  nhi_ctxt->num_ports);
> +
> +		msg_too_long = msg_too_long ||
> +			       nla_put_u8(skb, NHI_ATTR_DMA_PORT,
> +					  nhi_ctxt->dma_port);
> +
> +		if (msg_too_long) {
> +			res = -EMSGSIZE;
> +			goto release_ctl_list_lock;
> +		}
> +
> +		if (nhi_ctxt->support_full_e2e &&
> +		    nla_put_flag(skb, NHI_ATTR_SUPPORT_FULL_E2E)) {
> +			res = -EMSGSIZE;
> +			goto release_ctl_list_lock;
> +		}
> +		mutex_unlock(&controllers_list_mutex);
> +
> +		genlmsg_end(skb, msg_head);
> +
> +		return genlmsg_reply(skb, info);
> +	}
> +
> +release_ctl_list_lock:
> +	mutex_unlock(&controllers_list_mutex);
> +	genlmsg_cancel(skb, msg_head);
> +
> +genl_put_reply_failure:
> +	nlmsg_free(skb);
> +
> +	return res;
> +}

would be about three lines of code if you used copy_to_user and a struct.
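
For comparison, a sketch of the struct-based version, written as a
userspace model so it can be compiled on its own. The struct layout, the
*_model names and the version string are all assumptions made for
illustration; the field widths follow the nla_put_u16/nla_put_u8 calls in
the quoted code, and copy_to_user_model() is a memcpy stand-in for the
kernel's copy_to_user().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define DRV_VERSION_MODEL "0.0"	/* placeholder, not the real DRV_VERSION */

/* Hypothetical ioctl payload carrying the same fields as the netlink reply. */
struct tbt_info_model {
	char	 drv_version[16];
	uint16_t nvm_ver_offset;
	uint8_t	 num_ports;
	uint8_t	 dma_port;
	uint8_t	 support_full_e2e;
};

/* Only the context fields the quoted function reads. */
struct tbt_nhi_ctxt_model {
	uint16_t nvm_ver_offset;
	uint8_t	 num_ports;
	uint8_t	 dma_port;
	uint8_t	 support_full_e2e;
};

/* Stand-in for copy_to_user(): returns the number of bytes NOT copied. */
static unsigned long copy_to_user_model(void *to, const void *from,
					unsigned long n)
{
	memcpy(to, from, n);
	return 0;
}

static int query_information_model(const struct tbt_nhi_ctxt_model *ctxt,
				   void *uarg)
{
	struct tbt_info_model info;

	/* Zero everything first so padding never leaks stack contents. */
	memset(&info, 0, sizeof(info));
	strncpy(info.drv_version, DRV_VERSION_MODEL,
		sizeof(info.drv_version) - 1);
	info.nvm_ver_offset = ctxt->nvm_ver_offset;
	info.num_ports = ctxt->num_ports;
	info.dma_port = ctxt->dma_port;
	info.support_full_e2e = ctxt->support_full_e2e;

	return copy_to_user_model(uarg, &info, sizeof(info)) ? -EFAULT : 0;
}

/* Usage: the caller gets every field back in one synchronous call. */
void query_information_demo(void)
{
	struct tbt_nhi_ctxt_model ctxt = { 0x10, 2, 5, 1 };
	struct tbt_info_model out;

	assert(query_information_model(&ctxt, &out) == 0);
	assert(out.num_ports == 2 && out.dma_port == 5);
}
```

In this model the controller lookup and its locking disappear as well,
on the assumption that the open file already identifies the controller.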


--Andy