linux-kernel - Re: [RFC PATCH v4 10/29] bpf tools: Collect map definitions from 'maps' section


Message-ID: <556686FE.105@huawei.com>
Date:	Thu, 28 May 2015 11:09:50 +0800
From:	"Wangnan (F)" <wangnan0@...wei.com>
To:	Alexei Starovoitov <alexei.starovoitov@...il.com>
CC:	<paulus@...ba.org>, <a.p.zijlstra@...llo.nl>, <mingo@...hat.com>,
	<acme@...nel.org>, <namhyung@...nel.org>, <jolsa@...nel.org>,
	<dsahern@...il.com>, <daniel@...earbox.net>,
	<brendan.d.gregg@...il.com>, <masami.hiramatsu.pt@...achi.com>,
	<lizefan@...wei.com>, <linux-kernel@...r.kernel.org>,
	<pi3orama@....com>, xiakaixu 00238161 <xiakaixu@...wei.com>
Subject: Re: [RFC PATCH v4 10/29] bpf tools: Collect map definitions from
 'maps' section



On 2015/5/28 10:28, Alexei Starovoitov wrote:
> On Thu, May 28, 2015 at 10:03:04AM +0800, Wangnan (F) wrote:
>>
>> On 2015/5/28 9:53, Alexei Starovoitov wrote:
>>> On Wed, May 27, 2015 at 05:19:45AM +0000, Wang Nan wrote:
>>>> If maps are used by eBPF programs, corresponding object file(s) should
>>>> contain a section named 'map'. Which contains map definitions. This
>>>> patch copies the data of the whole section. Map data parsing should be
>>>> acted just before map loading.
>>>>
>>>> Signed-off-by: Wang Nan <wangnan0@...wei.com>
>>>> ---
>>> ...
>>>> +static int
>>>> +bpf_object__init_maps(struct bpf_object *obj, void *data,
>>>> +		      size_t size)
>>>> +{
>>>> +	if (size == 0) {
>>>> +		pr_debug("%s doesn't need map definition\n",
>>>> +			 obj->path);
>>>> +		return 0;
>>>> +	}
>>>> +
>>>> +	obj->maps_buf = malloc(size);
>>>> +	if (!obj->maps_buf) {
>>>> +		pr_warning("malloc maps failed: %s\n", obj->path);
>>>> +		return -ENOMEM;
>>>> +	}
>>>> +
>>>> +	obj->maps_buf_sz = size;
>>>> +	memcpy(obj->maps_buf, data, size);
>>> why copy it? To create maps and apply fixups to instructions
>>> relo sections are needed anyway, so elf has to be open while
>>> this section is being processed. So why copy?
>>>
>> When creating maps, ELF file has been closed.
>>
>> I divide libelf info two phases: opening and loading. ELF file is closed
>> at the end of opening phase. I think some caller need 'opening' phase only.
>> For example, checking metadata in an eBPF object file. In this case, we
>> don't need create map file descriptors.
> loading elf into memory, parsing it, copying map, prog, relo sections
> just to check metadata? That doesn't sound like real use case.
> imo it's cleaner to remember where maps and relocations are in a loaded elf,
> then create maps, patch copied progs and release all elf.
> This elfs are all very small, so we're not talking about large memory savings,
> but still.
>

So are you suggesting that I create maps in the opening phase?

In bpf_object__open:

struct bpf_object *bpf_object__open(const char *path)
{
        ....
        if (bpf_object__elf_init(obj))
                goto out;

        /* Real useful things put here */
        ....
        /* Here we collect map information */
        if (bpf_object__elf_collect(obj))
                goto out;
        ....
        /* And ELF file is closed here */
        bpf_object__elf_finish(obj);
        ....
}

As you can see, after bpf_object__open() returns the ELF file is closed, so
we have no further chance to access the map data. Therefore we would have to
create the maps inside bpf_object__open().

However, this breaks a rule of the current design: the opening phase doesn't
talk to the kernel through sys_bpf() at all. All such work is done in the
loading phase. This principle ensures that any system, whether or not it
supports sys_bpf(), can read an eBPF object file without failure.

In fact I didn't separate opening and loading when I started working on
libbpf. However, I soon found that inconvenient:
   1. The uniform design doesn't allow users to adjust things before doing
      the real work;
   2. In my development environment I write code on a server without
      sys_bpf() support, so the uniform design prevents me from testing my
      opening phase code there. I have to test it in QEMU.

In addition, this copying gives libbpf the ability to open an object once and
then load and unload it many times without reopening and reparsing the ELF
file.
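
A minimal sketch of that open-once, load-many pattern, not the actual libbpf
code. The names toy_object, toy_open, toy_load, toy_unload, toy_close and
toy_fake_create_maps are hypothetical; only maps_buf and maps_buf_sz mirror
the fields in the quoted patch. A callback stands in for the map creation
that would go through sys_bpf(), so the opening phase never touches the
kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct bpf_object. */
struct toy_object {
	void   *maps_buf;     /* private copy of the maps section */
	size_t  maps_buf_sz;
	int     loaded;       /* nonzero while the object is loaded */
};

/* Callback that would create maps via sys_bpf() in a real loader. */
typedef int (*toy_create_maps_fn)(const void *buf, size_t size);

static int toy_create_calls;  /* counts calls to the fake creator below */

/* Fake creator: no kernel interaction, it only records the call. */
static int toy_fake_create_maps(const void *buf, size_t size)
{
	(void)buf;
	(void)size;
	toy_create_calls++;
	return 0;
}

/* Opening phase: copy the section data; the ELF may be closed afterwards. */
static int toy_open(struct toy_object *obj, const void *data, size_t size)
{
	assert(obj != NULL);
	memset(obj, 0, sizeof(*obj));
	if (size == 0)
		return 0;                 /* object defines no maps */
	obj->maps_buf = malloc(size);
	if (!obj->maps_buf)
		return -ENOMEM;
	memcpy(obj->maps_buf, data, size);
	obj->maps_buf_sz = size;
	return 0;
}

/* Loading phase: the only step that calls the map creator. */
static int toy_load(struct toy_object *obj, toy_create_maps_fn create_maps)
{
	int err = 0;

	assert(obj != NULL);
	if (obj->loaded)
		return -EBUSY;            /* already loaded, unload first */
	if (obj->maps_buf_sz)
		err = create_maps(obj->maps_buf, obj->maps_buf_sz);
	if (err)
		return err;
	obj->loaded = 1;
	return 0;
}

/* Unloading keeps the copied buffer so the object can be loaded again. */
static void toy_unload(struct toy_object *obj)
{
	obj->loaded = 0;
}

/* Closing releases the copy made in the opening phase. */
static void toy_close(struct toy_object *obj)
{
	free(obj->maps_buf);
	memset(obj, 0, sizeof(*obj));
}
```

Because toy_load() reads only from the private copy, calling toy_load(),
toy_unload() and toy_load() again needs no second pass over the ELF file. The
cost is one extra buffer per object.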

Moreover, we are planning to introduce hardware PMUs to eBPF in a way similar
to maps, to give eBPF programs the ability to read hardware PMU counters. I
haven't thought it through yet, so I haven't discussed it with you and
others. I think it should be something like:

struct bpf_pmu {
   /* attr of the hardware PMU, which will be passed to
    * perf_event_open to create an FD */
};

SEC("hw_pmu")
struct bpf_pmu cache_misses = {
    ...
};

SEC("lock_page=lock_page")
int lock_page_hook(struct pt_regs *ctx)
{
     ...
     counter = bpf_read_pmu_counter(&cache_misses);
     ...
}

(My colleague Xia Kaixu is working on it. I have added him to the CC list.)
Creating those PMU FDs may require perf to adjust more things than programs
and maps do. I believe we shouldn't let libbpf do this on its own without
help from the caller. Therefore the separation of opening and loading is
required.
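
For illustration, a rough sketch of the caller-side step: filling a
perf_event_attr for a hardware cache-miss counter and passing it to
perf_event_open. The helper names pmu_fill_cache_misses and pmu_open are
hypothetical, and nothing here is part of libbpf or of the proposed
struct bpf_pmu; it only shows the kind of setup perf would have to do
between the opening and loading phases.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: describe a hardware cache-miss counter. */
static void pmu_fill_cache_misses(struct perf_event_attr *attr)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;
	attr->config = PERF_COUNT_HW_CACHE_MISSES;
	attr->disabled = 1;           /* caller enables the counter later */
}

/*
 * Hypothetical helper: glibc provides no perf_event_open() wrapper, so the
 * call goes through syscall(2). Returns an FD, or -1 with errno set.
 */
static long pmu_open(struct perf_event_attr *attr, pid_t pid, int cpu)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu,
		       -1 /* group_fd */, 0UL /* flags */);
}
```

The caller, not the library, picks pid, cpu and any further attr tweaks
before the FD is created.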

What do you think?

Thank you.
