linux-kernel - Re: [PATCH RFC] ptr_ring: add ptr_ring

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <96dc3e90-b609-a1dc-7e86-95c9c533c58c@redhat.com>
Date:   Wed, 26 Apr 2017 08:10:55 +0800
From:   Jason Wang <jasowang@...hat.com>
To:     "Michael S. Tsirkin" <mst@...hat.com>
Cc:     linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH RFC] ptr_ring: add ptr_ring_unconsume



On 2017年04月25日 23:35, Michael S. Tsirkin wrote:
> On Tue, Apr 25, 2017 at 12:07:01PM +0800, Jason Wang wrote:
>>
>> On 2017年04月24日 20:00, Michael S. Tsirkin wrote:
>>> On Mon, Apr 24, 2017 at 07:54:18PM +0800, Jason Wang wrote:
>>>> On 2017年04月24日 07:28, Michael S. Tsirkin wrote:
>>>>> On Tue, Apr 18, 2017 at 11:07:42AM +0800, Jason Wang wrote:
>>>>>> On 2017年04月17日 07:19, Michael S. Tsirkin wrote:
>>>>>>> Applications that consume a batch of entries in one go
>>>>>>> can benefit from ability to return some of them back
>>>>>>> into the ring.
>>>>>>>
>>>>>>> Add an API for that - assuming there's space. If there's no space
>>>>>>> naturally we can't do this and have to drop entries, but this implies
>>>>>>> ring is full so we'd likely drop some anyway.
>>>>>>>
>>>>>>> Signed-off-by: Michael S. Tsirkin<mst@...hat.com>
>>>>>>> ---
>>>>>>>
>>>>>>> Jason, in my mind the biggest issue with your batching patchset is the
>>>>>>> backet drops on disconnect.  This API will help avoid that in the common
>>>>>>> case.
>>>>>> Ok, I will rebase the series on top of this. (Though I don't think we care
>>>>>> the packet loss).
>>>>> E.g. I care - I often start sending packets to VM before it's
>>>>> fully booted. Several vhost resets might follow.
>>>> Ok.
>>>>
>>>>>>> I would still prefer that we understand what's going on,
>>>>>> I try to reply in another thread, does it make sense?
>>>>>>
>>>>>>>      and I would
>>>>>>> like to know what's the smallest batch size that's still helpful,
>>>>>> Yes, I've replied in another thread, the result is:
>>>>>>
>>>>>>
>>>>>> no batching   1.88Mpps
>>>>>> RX_BATCH=1    1.93Mpps
>>>>>> RX_BATCH=4    2.11Mpps
>>>>>> RX_BATCH=16   2.14Mpps
>>>>>> RX_BATCH=64   2.25Mpps
>>>>>> RX_BATCH=256  2.18Mpps
>>>>> Essentially 4 is enough, other stuf looks more like noise
>>>>> to me. What about 2?
>>>> The numbers are pretty stable, so probably not noise. Retested on top of
>>>> batch zeroing:
>>>>
>>>> no  1.97Mpps
>>>> 1   2.09Mpps
>>>> 2   2.11Mpps
>>>> 4   2.16Mpps
>>>> 8   2.19Mpps
>>>> 16  2.21Mpps
>>>> 32  2.25Mpps
>>>> 64  2.30Mpps
>>>> 128 2.21Mpps
>>>> 256 2.21Mpps
>>>>
>>>> 64 performs best.
>>>>
>>>> Thanks
>>> OK but it might be e.g. a function of the ring size, host cache size or
>>> whatever. As we don't really understand the why, if we just optimize for
>>> your setup we risk regressions in others.  64 entries is a lot, it
>>> increases the queue size noticeably.  Could this be part of the effect?
>>> Could you try changing the queue size to see what happens?
>> I increase tx_queue_len to 1100, but only see less than 1% improvement on
>> pps number (batch = 1) in my machine. If you care about the regression, we
>> probably can leave the choice to user through e.g module parameter. But I'm
>> afraid we have already had too much choices for them. Or I can test this
>> with different CPU types.
>>
>> Thanks
>>
> I agree here. Let's keep it a constant. Testing on more machines would
> be nice but not strictly required.

Ok, I will give a full benchmark (batch=1,4,64) on TCP stream to see how 
it will perform. Let's decide then.

>   I just dislike not understanding why
> it helps because it means we can easily break it by mistake.  So my only
> request really is that you wrap access to this internal buffer in an
> API. Let's see - I think we need
>
> struct vhost_net_buf
> vhost_net_buf_get_ptr
> vhost_net_buf_get_size
> vhost_net_buf_is_empty
> vhost_net_buf_peek
> vhost_net_buf_consume
> vhost_net_buf_produce

Ok. Will do in next version.

Thanks.