netdev - Re: [PATCH 2/7][RFC] netfilter: add xt

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87bogygots.fsf@xmission.com>
Date:	Sat, 22 Sep 2012 06:38:23 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	John Stultz <john.stultz@...aro.org>
Cc:	LKML <linux-kernel@...r.kernel.org>, JP Abgrall <jpa@...gle.com>,
	netdev@...r.kernel.org, Ashish Sharma <ashishsharma@...gle.com>,
	Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@...el.com>
Subject: Re: [PATCH 2/7][RFC] netfilter: add xt_qtaguid matching module


I just took a quick skim through the patch, and spotted a few things.

This patch is going to need to handle current_fsuid returning a kuid_t.

The permission checks are weird, and ancient looking.

There is no network namespace handling, to the point I don't think
the code will function properly if network namespace support is enabled
in the kernel.

The code probably wants to use struct pid *pid, instead of pid_t values.
Both to cope with the pid namespace and to avoid issues with pid wrap
around.

A few more comments below.

Eric



John Stultz <john.stultz@...aro.org> writes:

> From: JP Abgrall <jpa@...gle.com>
>
> This module allows tracking stats at the socket level for given UIDs.
> It replaces xt_owner.
> If the --uid-owner is not specified, it will just count stats based on
> who the skb belongs to. This will even happen on incoming skbs as it
> looks into the skb via xt_socket magic to see who owns it.
> If an skb is lost, it will be assigned to uid=0.
>
> To control what sockets of what UIDs are tagged by what, one uses:
>   echo t $sock_fd $accounting_tag $the_billed_uid \
>      > /proc/net/xt_qtaguid/ctrl
>  So whenever an skb belongs to a sock_fd, it will be accounted against
>    $the_billed_uid
>   and matching stats will show up under the uid with the given
>    $accounting_tag.
>
> Because the number of allocations for the stats structs is not that big:
>   ~500 apps * 32 per app
> we'll just do it atomic. This avoids walking lists many times, and
> the fancy worker thread handling. Slabs will grow when needed later.
>
> It use netdevice and inetaddr notifications instead of hooks in the core dev
> code to track when a device comes and goes. This removes the need for
> exposed iface_stat.h.
>
> Put procfs dirs in /proc/net/xt_qtaguid/
>   ctrl
>   stats
>   iface_stat/<iface>/...
> The uid stats are obtainable in ./stats.
>
> Cc: netdev@...r.kernel.org
> Cc: JP Abgrall <jpa@...gle.com>
> Cc: Ashish Sharma <ashishsharma@...gle.com>
> Cc: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@...el.com>
> Signed-off-by: JP Abgrall <jpa@...gle.com>
> [Dropped paranoid network ifdefs and minor compile fixups -jstultz]
> Signed-off-by: John Stultz <john.stultz@...aro.org>
> ---


> +static struct qtaguid_event_counts qtu_events;
> +/*----------------------------------------------*/
> +static bool can_manipulate_uids(void)
> +{

You probably want something like a test for capable(CAP_SETUID).
Certainly raw check for the uid == 0 fell out of common practice
for this sort of thing a decade or so ago.
> +	/* root pwnd */
> +	return unlikely(!current_fsuid()) || unlikely(!proc_ctrl_write_gid)
> +		|| in_egroup_p(proc_ctrl_write_gid);
> +}
> +
> +static bool can_impersonate_uid(uid_t uid)
> +{
> +	return uid == current_fsuid() || can_manipulate_uids();
> +}
> +
> +static bool can_read_other_uid_stats(uid_t uid)
> +{
Perhaps may_ptrace?
> +	/* root pwnd */
> +	return unlikely(!current_fsuid()) || uid == current_fsuid()
> +		|| unlikely(!proc_stats_readall_gid)
> +		|| in_egroup_p(proc_stats_readall_gid);
> +}
> +


> +/*
> + * Track tag that this socket is transferring data for, and not necessarily
> + * the uid that owns the socket.
> + * This is the tag against which tag_stat.counters will be billed.
> + * These structs need to be looked up by sock and pid.
> + */
> +struct sock_tag {
> +	struct rb_node sock_node;
> +	struct sock *sk;  /* Only used as a number, never dereferenced */
> +	/* The socket is needed for sockfd_put() */
> +	struct socket *socket;
> +	/* Used to associate with a given pid */
> +	struct list_head list;   /* in proc_qtu_data.sock_tag_list */
> +	pid_t pid;
You probably want this to be "struct pid *pid;"
> +
> +	tag_t tag;
> +};


> + * The qtu uid data is used to track resources that are created directly or
> + * indirectly by processes (uid tracked).
> + * It is shared by the processes with the same uid.
> + * Some of the resource will be counted to prevent further rogue allocations,
> + * some will need freeing once the owner process (uid) exits.
> + */
> +struct uid_tag_data {
> +	struct rb_node node;
> +	uid_t uid;

You probably want to say kuid_t uid here.

> +	/*
> +	 * For the uid, how many accounting tags have been set.
> +	 */
> +	int num_active_tags;
> +	/* Track the number of proc_qtu_data that reference it */
> +	int num_pqd;
> +	struct rb_root tag_ref_tree;
> +	/* No tag_node_tree_lock; use uid_tag_data_tree_lock */
> +};


> +struct proc_qtu_data {
> +	struct rb_node node;
> +	pid_t pid;

I suspect you want to use "struct pid *pid" here.
> +
> +	struct uid_tag_data *parent_tag_data;
> +
> +	/* Tracks the sock_tags that need freeing upon this proc's death */
> +	struct list_head sock_tag_list;
> +	/* No spinlock_t sock_tag_list_lock; use the global one. */
> +};
> +

> diff --git a/net/netfilter/xt_qtaguid.c b/net/netfilter/xt_qtaguid.c
> new file mode 100644
> index 0000000..7c4ac46
> +/*------------------------------------------*/
> +static int __init qtaguid_proc_register(struct proc_dir_entry **res_procdir)
> +{
> +	int ret;
> +	*res_procdir = proc_mkdir(module_procdirname, init_net.proc_net);

You probably want to handle all of the network namespaces.
> +	if (!*res_procdir) {
> +		pr_err("qtaguid: failed to create proc/.../xt_qtaguid\n");
> +		ret = -ENOMEM;
> +		goto no_dir;
> +	}
> +
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html