lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 11 Nov 2008 15:59:45 +0100
From:	"Johann Baudy" <johaahn@...il.com>
To:	"Lovich, Vitali" <vlovich@...lcomm.com>
Cc:	"Evgeniy Polyakov" <zbr@...emap.net>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH] Packet socket: mmapped IO: PACKET_TX_RING

Hi Vitali,

>> You don't reset the tp_status flag when you reject a message for being too short.
>> You incorrectly calculate to_write in the situation of sock->type == SOCK_DGRAM.

I've changed this code already, I'll will share my changes.

>>
>> You don't check for the MAX_SKB_FRAGS condition.

Thanks, I'll will add this check.

>>
>> Not your code, but in packet_set_ring, it's redundant to clear the status for all the frames to TP_STATUS_KERNEL (0) since the pages are allocated with kzalloc (pages are already zeroed).  I surrounded this with a CONFIG_DEBUG_KERNEL block.

Ok, so we can skip below block according to this assumption:
                for (i = 0; i < req->tp_block_nr; i++) {
                        void *ptr = pg_vec[i];
                        int k;

                        for (k = 0; k < rb->frames_per_block; k++) {
                                __packet_set_status(po, ptr, TP_STATUS_KERNEL);
                                ptr += req->tp_frame_size;
                        }
                }

>>
>> Earlier, I think I had said that the mapped variable should be placed in packet_ring_buffer struct - I was wrong.  It's correct where it is.  I had forgotten that rx & tx are mapped contiguously and simultaneously.  There needs to be a note in the doc that the rings should be set up before mapping (i.e. cannot set up the rx or tx ring buffer first, mmap, and then try to change it).

 Yes, we can also add a check for po->mapped in
 PACKET_TX_RING/PACKET_RX_RING cases of packet_setsockopt() switch.


>> We do need still need a check for tx_pending_skb when sizing the ring buffer (in the situation where we unfreed skbs created by a call to send from Thread B, while Thread A unmaps the ring buffer and trys a resize). Also, in tpacket_snd, we need to increment mapped before allocating the skb and test if the original value was 0 (we decrement after incrementing tx_pending_skb).  If it was, then that indicates that the user had unmapped the ring buffer, which means we should stop sending.  No need for additional locking because of the atomic relationship this sets up between tx_pending_skb and mapped.

 I'm not sure to understand:

 Let me, sum up :
 * when setting or resizing the ring: we have to return an error if:
 - someone has mapped the socket (1.a)
 - sending is ongoing - tx_pending_skb != 0 (1.b)
 - a send() procedure is ongoing (1.c)

 * when closing the socket, we have to return an error if:
 - sending is ongoing - tx_pending_skb != 0 (2.a)
 - a send() procedure is ongoing (2.b)

 * during do-while loop of send(),:
 - protect ring against simultaneous send() (3.a)
 - ensure that ring is not resized (3.b) (linked with 1.c)
 Our main issue is when: tx_pending_skb = 0, Thread A starts to send a
 packet while Thread B unmaps and resizes the ring.
  Thread B can potentially resizes the ring until increment of
 tx_pending_skb in Thread A.
 - loop can continue even if ring is not mapped (3.c), this has no
 sense but it is not critical

 * when executing the destructor callback:
 - we must ensure that ring structure has not changed since send() (4.a)

 (1.a) can be done with po->mmaped check in packet_setsockopt()
 (1.b) and (2.a) can be done as you said :  tx_pending_skb != 0 check
 when sizing buffer in packet_set_ring()
 (3.a) by a mutex M outside of the do-while loop
 (1.c), (2.b) and (3.b) can be performed thanks to the same mutex M and
 mutex_trylock() in packet_set_ring(). An error is returned if we can't
 take the mutex . The mutex is released at the end of packet_set_ring()
 call.
 (4.a) is validated with tx_pending_skb increment, (3.b),(1.b) and (2.a)

 Do I miss something?

 Many thanks,
 Johann
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ