netdev - Re: [PATCH 27/53] perf/core: Put size of a sample at the end of it by PERF_SAMPLE


Message-ID: <5695D3CB.3030604@huawei.com>
Date:	Wed, 13 Jan 2016 12:34:19 +0800
From:	"Wangnan (F)" <wangnan0@...wei.com>
To:	Alexei Starovoitov <alexei.starovoitov@...il.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
CC:	<acme@...nel.org>, <linux-kernel@...r.kernel.org>,
	<pi3orama@....com>, <lizefan@...wei.com>, <netdev@...r.kernel.org>,
	<davem@...emloft.net>, Adrian Hunter <adrian.hunter@...el.com>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	David Ahern <dsahern@...il.com>,
	Ingo Molnar <mingo@...nel.org>,
	Yunlong Song <yunlong.song@...wei.com>
Subject: Re: [PATCH 27/53] perf/core: Put size of a sample at the end of it
 by PERF_SAMPLE_TAILSIZE



On 2016/1/13 3:56, Alexei Starovoitov wrote:
> On Tue, Jan 12, 2016 at 08:36:23PM +0800, Wangnan (F) wrote:
>>> hmm, in this kernel patch I see that you're adding 8 bytes for
>>> every record via this extra TAILSIZE flag and in perf you're
>>> walking the ring buffer backwards by reading this 8 byte
>>> sizes, comparing header sizes and so on until reaching beginning,
>>> where you start dumping it as normal.
>>> So for this 'signal to perf' approach to work the ring buffer
>>> will contain tailsizes everywhere just so that user space can
>>> find the beginning. That's not very pretty. imo if kernel
>>> can do header read to adjust data_tail it would make user
>>> space side clean. Maybe there are other solutions.
>>> Adding tailsize seems like brute force hack.
>>> There must be some nicer way.
>> Hi Peter,
>>
>>   What's your opinion? Should we reconsider moving the size field from
>> the header to the end?
>> Or moving the whole header to the end of a record?
> I think moving the whole header under new TAILHEADER flag is
> actually very good idea. The ring buffer will be fully utilized
> and no extra bytes necessary. User space would need to parse it
> backwards, but for this use case it fits well.

I have another crazy suggestion: can we make the kernel write to
the ring buffer from the end to the beginning? For example:

This is the initial state of the ring buffer. The head pointer
points to the end of it:

       -------------> Address increase

                                     head
                                       |
                                       V
  +--+---+-------+----------+------+---+
  |                                    |
  +--+---+-------+----------+------+---+


Write the first event at the end of the ring buffer, and *decrease*
the head pointer:

                                 head
                                   |
                                   V
  +--+---+-------+----------+------+---+
  |                                | A |
  +--+---+-------+----------+------+---+


Another record:
                           head
                            |
                            V
  +--+---+-------+----------+------+---+
  |                         |   B  | A |
  +--+---+-------+----------+------+---+


After the ring buffer wraps around, A is fully overwritten and B is broken:

                                head
                                  |
                                  V
  +--+---+-------+----------+-----+----+
  |F | E |   D   | C        | ... | F  |
  +--+---+-------+----------+-----+----+

At this point the user can parse the ring buffer normally, from
F to C. From the timestamps in the records, the user knows which
one is the oldest.

With this approach perf doesn't need much extra work. There's no
performance penalty at all, and the 8 bytes are saved.
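
To make the idea concrete, here is a minimal userspace sketch of a ring
buffer written from the end towards the beginning. It is only an
illustration, not the kernel's perf code: struct rec_hdr, struct
back_ring, RB_SIZE and all function names are hypothetical, and the
header is a simplified stand-in for the real record header. One
assumption differs from the text above: the reader stops at the first
record that does not fit in the buffer-sized window after head, instead
of comparing timestamps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RB_SIZE 64u                 /* hypothetical size, must be a power of two */
#define RB_MASK (RB_SIZE - 1u)

struct rec_hdr {                    /* hypothetical, simplified record header */
	uint16_t size;              /* total record size, header included */
	uint16_t id;
	uint32_t time;
};

struct back_ring {
	uint8_t  data[RB_SIZE];
	uint64_t head;              /* free-running; *decreases* on every write */
};

static void rb_copy_in(struct back_ring *rb, uint64_t pos,
		       const void *src, size_t len)
{
	const uint8_t *s = src;

	for (size_t i = 0; i < len; i++)
		rb->data[(pos + i) & RB_MASK] = s[i];
}

static void rb_copy_out(const struct back_ring *rb, uint64_t pos,
			void *dst, size_t len)
{
	uint8_t *d = dst;

	for (size_t i = 0; i < len; i++)
		d[i] = rb->data[(pos + i) & RB_MASK];
}

/* Reserve space below head, then store header and payload in normal order. */
static void rb_write(struct back_ring *rb, uint16_t id, uint32_t time,
		     uint16_t payload_len)
{
	struct rec_hdr h = {
		.size = (uint16_t)(sizeof(struct rec_hdr) + payload_len),
		.id = id,
		.time = time,
	};
	uint8_t fill = (uint8_t)id;   /* dummy payload byte */

	assert(h.size <= RB_SIZE);
	rb->head -= h.size;           /* unsigned wrap-around is intended */
	rb_copy_in(rb, rb->head, &h, sizeof(h));
	for (uint16_t i = 0; i < payload_len; i++)
		rb_copy_in(rb, rb->head + sizeof(h) + i, &fill, 1);
}

/* Parse forward from head (newest record first); returns records found. */
static size_t rb_read(const struct back_ring *rb, uint16_t *ids, size_t max)
{
	uint64_t written = (uint64_t)0 - rb->head;
	uint64_t limit = written < RB_SIZE ? written : RB_SIZE;
	uint64_t off = 0;
	size_t n = 0;

	while (n < max && off + sizeof(struct rec_hdr) <= limit) {
		struct rec_hdr h;

		rb_copy_out(rb, rb->head + off, &h, sizeof(h));
		if (h.size < sizeof(h) || off + h.size > limit)
			break;        /* tail of this record was overwritten */
		ids[n++] = h.id;
		off += h.size;
	}
	return n;
}

/* Replays the A..F scenario: ids 1..6 stand for records A..F. */
static size_t demo_backward_ring(uint16_t ids[6])
{
	static const uint16_t payload[6] = { 4, 8, 12, 4, 8, 0 };
	struct back_ring rb = { .head = 0 };

	for (uint16_t i = 0; i < 6; i++)
		rb_write(&rb, (uint16_t)(i + 1), 100u + i, payload[i]);
	return rb_read(&rb, ids, 6);
}
```

With these sizes, 84 bytes are written into a 64-byte buffer, so A is
fully overwritten and B loses its tail. demo_backward_ring() then
reports four intact records, in the order F, E, D, C. The header of B
is still readable, but its size shows that it extends past the window,
so the reader stops there.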

Thoughts?

Thank you.