netdev - Re: [PATCH] af_packet: add interframe drop cmsg (v6)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4ACB712C.9000709@hartkopp.net>
Date:	Tue, 06 Oct 2009 18:32:44 +0200
From:	Oliver Hartkopp <socketcan@...tkopp.net>
To:	Eric Dumazet <eric.dumazet@...il.com>
CC:	Neil Horman <nhorman@...driver.com>,
	David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [PATCH] af_packet: add interframe drop cmsg (v6)

Eric Dumazet wrote:
> Oliver Hartkopp a écrit :
>> Neil Horman wrote:
>>> On Tue, Oct 06, 2009 at 03:42:26PM +0200, Eric Dumazet wrote:
>>>> Neil Horman a écrit :
>>>>> Actually, no, I don't think this is sufficient.  Looking at how the
>>>>> implementation would work, we would query the sk_drop value in the recvmsg path,
>>>>> but we wouldn't record it during frame enqueue, which would skew the data I need
>>>>> to record.  Consider a queue of holding frames with sequence 1 2 3 4 101 102
>>>>> 110.  Assume that missing sequence numbers were dropped during enqueue to the
>>>>> sk_recieve_queue for lack of space.  Using the proposed above implementation,
>>>>> the frames the gap gets reported with is determined by the scheduling of the
>>>>> user space app, rather than the actual order in which they were lost.  I.e if
>>>>> frames 5-100 were received & dropped immediately after the user process called
>>>>> recvmsg for the first time, we would get the cmsg notification when we called
>>>>> recvmsg and read frame 2, but if we lost those frames after the process called
>>>>> recvmsg twice, we'd recognize the loss when we read frame 3.  With my current
>>>>> implementation we recognize the gap on the next enqueued frame after we
>>>>> encountered the loss, keeping everything in order, which is important.  We would
>>>>> need to do that with whatever socket level option we'd consider implementing.
>>>>> Not saying its not doable, but we need to take it into account, and its not a
>>>>> straightforward as simply reading sk_drops on recvmsg.  We'd have to store the
>>>>> sk_drop value in skb->mask universally or some such, and then teach sock_recvmsg
>>>>> to check that field.
>>>>>
>>>>>
>>>> Sorry cannot parse this, its too long to be true :)
>>>>
>>>> If you count every af_packet drops in sk_drops, and every time you _enqueue_
>>>> a frame in sk->receive_queue, you copy sk->sk_drops in skb->mark, you implement
>>>> the thing. All this is protected by a lock.
>>>>
>>>> sk->receive_queue is a FIFO, so there is no loss of information.
>>>>
>>>> You only need to increment sk->sk_drops when necessary, if not already done in af_packet code.
>>>>
>>>> If you dont trust me I can provide patch ;)
>>>>
>>> No I trust you.  The problem (as you just mentioned) is that you didn't parse my
>>> note.  We are saying the same thing.  If you want to implement this properly,
>>> you have to do exactly as you said, take a snapshot of the sk_drops value for
>>> the socket and store it in skb->mark, or some other unsued field, so that the
>>> drop information travels along the queue with the frames.  As I was reading
>>> Olivers note, it sounded to me as though he were proposing that we simply check
>>> sk_drops when calling sock_recvmsg, which doesn't implement the same sort of
>>> functionality.  I wanted to be sure that we were clear on what was trying to be
>>> done.
>> Puh - indeed much to read :-)
>>
>> My intention was exactly your suggestion - as you also implemented the current
>> approach:
>>
>> In socket_recvmsg the formerly stored (skb->mark) snapshot of sk_drops should
>> be used to create a cmsg, when the cmsg creation is enabled by a new setsockopt().
>>
>> Did i get i right?
>>
> 
> Yes, I could spend a moment this evening to cook a patch if you want.
> 

No objections - at least from my side :-)

> 
> int on = 1;
> setsockopt(fd, SOL_SOCKET, SO_COUNTDROPS, &on, sizeof(on));
> 
> diff --git a/include/asm-generic/socket.h b/include/asm-generic/socket.h
> index 538991c..87996c3 100644
> --- a/include/asm-generic/socket.h
> +++ b/include/asm-generic/socket.h
> @@ -63,4 +63,7 @@
>  #define SO_PROTOCOL		38
>  #define SO_DOMAIN		39
>  
> +#define SO_COUNTDROPS		40
> +#define SCM_COUNTDROPS		SO_COUNTDROPS
> +
>  #endif /* __ASM_GENERIC_SOCKET_H */
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index df7b23a..4613ca4 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -390,8 +390,10 @@ struct sk_buff {
>  	__u32			secmark;
>  #endif
>  
> -	__u32			mark;
> -
> +	union {
> +		__u32			mark;
> +		__u32			dropcount;
> +	};

Yeah - that looks better.

Regards,
Oliver

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html