netdev - Re: [PATCH net-next 1/5] bpf: use __GFP

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <8f22f67e-437b-6533-4c87-e13d4fe33f95@oracle.com>
Date:   Wed, 12 Sep 2018 13:15:48 -0700
From:   Tushar Dave <tushar.n.dave@...cle.com>
To:     John Fastabend <john.fastabend@...il.com>, ast@...nel.org,
        daniel@...earbox.net, davem@...emloft.net,
        santosh.shilimkar@...cle.com, jakub.kicinski@...ronome.com,
        quentin.monnet@...ronome.com, jiong.wang@...ronome.com,
        sandipan@...ux.vnet.ibm.com, kafai@...com, rdna@...com, yhs@...com,
        netdev@...r.kernel.org, rds-devel@....oracle.com,
        sowmini.varadhan@...cle.com
Subject: Re: [PATCH net-next 1/5] bpf: use __GFP_COMP while allocating page



On 09/12/2018 09:51 AM, John Fastabend wrote:
> On 09/12/2018 09:21 AM, Tushar Dave wrote:
>>
>>
>> On 09/11/2018 12:38 PM, Tushar Dave wrote:
>>> Helper bpg_msg_pull_data() can allocate multiple pages while
>>> linearizing multiple scatterlist elements into one shared page.
>>> However, if the shared page has size > PAGE_SIZE, using
>>> copy_page_to_iter() causes below warning.
>>>
>>> e.g.
>>> [ 6367.019832] WARNING: CPU: 2 PID: 7410 at lib/iov_iter.c:825
>>> page_copy_sane.part.8+0x0/0x8
>>>
>>> To avoid above warning, use __GFP_COMP while allocating multiple
>>> contiguous pages.
>>>
>>> Signed-off-by: Tushar Dave <tushar.n.dave@...cle.com>
>>> ---
>>>    net/core/filter.c | 3 ++-
>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>> index d301134..0b40f95 100644
>>> --- a/net/core/filter.c
>>> +++ b/net/core/filter.c
>>> @@ -2344,7 +2344,8 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>>        if (unlikely(bytes_sg_total > copy))
>>>            return -EINVAL;
>>>    -    page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC, get_order(copy));
>>> +    page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC | __GFP_COMP,
>>> +               get_order(copy));
>>>        if (unlikely(!page))
>>>            return -ENOMEM;
>>>        p = page_address(page);
>>
>> I should have mentioned that I could re-order this patch anywhere in
>> patch series (as long as it doesn't break git bisect). I kept it first
>> because I think it is more like a bug fix. I sent it along with these
>> patch series considering we have a context of why and for what I need
>> this patch!
>>
>> Daniel, John,
>>
>> Not sure if you guys hit this page_copy_sane warning. I hit it when RDS
>> copy sg page to userspace using copy_page_to_iter().
>>
> 
> I have not hit this before but I'm working on a set of patches for
> test_sockmap to test the bpf_msg_pull_data() so I'll add a case
> for this. Currently, we only test the simple case where we pull
> data out of a single page in selftests. This was sufficient for
> my use case but missed a handful of other valid cases.
> 
>> example:
>>
>> RDS packet size 8KB represented in scatterlist:
>> sg_data[0].length = 1400
>> sg_data[1].length = 1448
>> sg_data[2].length = 1448
>> sg_data[3].length = 1448
>> sg_data[4].length = 1448
>> sg_data[5].length = 1000
>>
>> If start=0 and end=8192, bpf_msg_pull_data() will linearize all
>> sg_data elements into one shared page. e.g. sg_data[0].length = 8192.
>> Using this sg_data[0].page in function copy_page_to_iter() causes:
>> WARNING: CPU: 2 PID: 7410 at lib/iov_iter.c:825
>> page_copy_sane.part.8+0x0/0x8
>>
>> (FYI, patch 4 has code that does copy_page_to_iter)
>>
> 
> How about sending it as a bugfix against bpf on its own. It
> looks like we could reproduce this with a combination of
> bpf_msg_pull_data() + redirect (to ingress) perhaps. Either
> way seems like a candidate for the bpf fixes tree to me.

Done.

Thanks.
-Tushar
> 
> Thanks,
> John
> 
>>
>> Comments?
>>
>> Thanks in advance,
>> -Tushar
>>
>>
> 
>