lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sun, 22 Oct 2017 20:23:19 +0200
From:   Pantelis Antoniou <pantelis.antoniou@...sulko.com>
To:     Grant Likely <grant.likely@...retlab.ca>
Cc:     Frank Rowand <frowand.list@...il.com>,
        David Gibson <david@...son.dropbear.id.au>,
        Tom Rini <trini@...sulko.com>,
        Rob Herring <robherring2@...il.com>,
        Franklin S Cooper Jr <fcooper@...com>,
        Matt Porter <mporter@...sulko.com>,
        Simon Glass <sjg@...omium.org>,
        Phil Elwell <philip.j.elwell@...il.com>,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        Marek Vasut <marex@...x.de>,
        Devicetree Compiler <devicetree-compiler@...r.kernel.org>,
        "devicetree@...r.kernel.org" <devicetree@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] yamldt v0.5, now a DTS compiler too

Hi Grant,

> On Oct 22, 2017, at 19:54 , Grant Likely <grant.likely@...retlab.ca> wrote:
> 
> On Fri, Oct 20, 2017 at 8:16 PM, Pantelis Antoniou
> <pantelis.antoniou@...sulko.com> wrote:
>> Hi Grant,
>> 
>>> On Oct 20, 2017, at 20:46 , Grant Likely <grant.likely@...retlab.ca> wrote:
>>> 
>>> On Thu, Sep 28, 2017 at 8:58 PM, Pantelis Antoniou
>>> <pantelis.antoniou@...sulko.com> wrote:
>>>> Hello again,
>>>> 
>>>> Significant progress has been made on yamldt and is now capable of
>>>> not only generating yaml from DTS source but also compiling DTS sources
>>>> and being almost fully compatible with DTC.
>>>> 
>>>> Compiling the kernel's DTBs using yamldt is as simple as using a
>>>> DTC=yamldt.
>>>> 
>>>> Error reporting is accurate and validation against a YAML based schema
>>>> works as well. In a short while I will begin posting patches with
>>>> fixes on bindings and DTS files in the kernel.
>>>> 
>>>> Please try it on your platform and report if you encounter any problems.
>>>> 
>>>> https://github.com/pantoniou/yamldt
>>>> 
>>>> I am eagerly awaiting for your comments.
>>> 
>>> Hi Pantelis,
>>> 
>>> This is good work. I've played around with it and I'm looking forward
>>> to chatting next week.
>>> 
>> 
>> Thanks. I’m looking forward to it too.
>> 
>>> One thing I've done is tried loading the output YAML files into
>>> another YAML interpreter and the current encoding causes problems.
>>> Specifically, in yamldt anchors/aliases are being used as a
>>> replacement for labels/phandles, but that conflicts with the YAML data
>>> model which defines a reference as a way to make a copy of the data
>>> appear in another part of the tree. For example, for the following
>>> snippit:
>>> 
>>> intc: intc@...00 {
>>>   #interrupt-cells = <1>;
>>>   compatible = "acme,intc";
>>>   reg = <0x10000 0x1000>;
>>>   gpio-controller;
>>> };
>>> 
>>> serial@...00 {
>>>   compatible = "acme,uart";
>>>   reg = <0x20000 0x1000>;
>>>   interrupt-parent = <&intc>;
>>>   interrupts = <5>;
>>> };
>>> 
>>> yamldt will encode this as:
>>> 
>>> intc@...00: &intc
>>>   "#interrupt-cells": 1
>>>   compatible: acme,intc
>>>   reg: [0x10000, 0x1000]
>>>   gpio-controller:
>>> 
>>> serial@...00:
>>>   compatible: acme,uart
>>>   reg: [0x20000, 0x1000]
>>>   interrupt-parent: *intc
>>>   interrupts: 5
>>> 
>>> But, the expected behaviour for a YAML parser is expand the alias
>>> '*intc' which results in the following structure:
>>> 
>>> intc@...00: &intc
>>>   "#interrupt-cells": 1
>>>   compatible: acme,intc
>>>   reg: [0x10000, 0x1000]
>>>   gpio-controller:
>>> 
>>> serial@...00:
>>>   compatible: acme,uart
>>>   reg: [0x20000, 0x1000]
>>>   interrupt-parent:
>>>       "#interrupt-cells": 1
>>>       compatible: acme,intc
>>>       reg: [0x10000, 0x1000]
>>>       gpio-controller:
>>>   interrupts: 5
>>> 
>>> See? It results in the entire interrupt controller node to appear as
>>> an instance under the interrupt-parent property, when the intention is
>>> only to create a phandle. yamldt should not redefine the behaviour of
>>> '*' aliases. Instead, it should use a different indicator, either
>>> using an explicit !phandle tag, or by replacing '*' with something
>>> else. I worked around it in my tests by replacing '*' with '$’.
>> 
>> Yes. This is expected. I don’t think pure YAML form is a good match for all
>> the crazy things that are possible (and actually used in practice) with DTS.
> 
> I don’t think it is as dire as that. The DTS structure is not complex
> and I think can easily be mapped into pure YAML. But, it does require
> treating the DTS structure separately from its semantic meaning. For
> example, the grammar of nodes, properties and labels easily maps to
> pure YAML, but phandles and overlay trees have semantic meaning that
> must be resolved by DT specific code. I’ll respond to you’re specific
> examples below…
> 

We are in complete agreement here. Single nodes and properties map to YAML perfectly.
It’s the complex way that we build the complete DTB that stress the YAML structures.

>> 
>> For instance there’s no way a normal YAML parser won’t choke with something like
>> 
>> / {
>>  foo;
>> };
>> 
>> / {
>>  bar;
>> };
>> 
>> Which is a common idiom in DTS files.
> 
> That’s only true when the top level of nodes is encoded as a map, but
> it shouldn’t be. It should be a list instead, for two reasons. First,
> order matters for the top level. Subsequent top level trees have to be
> applied in order to fully resolve the tree, but a map doesn’t preserve
> the ordering. Second, as you rightly point out, the same name can be
> used for multiple top level trees. So, the encoding should be a list
> with each list entry containing the node path {/path,&ref}, and the
> node data. One possible way to do this is with a tuple:
> 
> treedata:
> - - /
>  - foo:
> - - /
>  - bar:
> 
> Another option is to add a special property in the node to contain the name:
> 
> treedata:
> - $path: /
>  foo:
> - $path: /
>  bar:
> 
> Personally, I prefer the special property approach. That would also be
> a good way to encode labels
> 

The YAML bare sequence is problematic; I’ll explain below.
 
>> The decision to use the YAML anchors and references was more borne out of a desire
>> to sort of match conceptually the labels and references of DTS. It’s not a big
>> issue to switch to something different.
> 
> I think it would be a good idea. One reason for defining a YAML
> encoding is to be able to use existing tools to work with the data.
> Changing the meaning of anchors and aliases breaks those tools
> immediately.
> 

Hmm, I see what you mean. Semantically this mapping is more correct but it does
present problems to non-aware tools.

>> If we were to force YAML/DT file to be regular YAML files parseable by everything
>> we’d have to rethink a lot of conventions that DT files currently use and I’m afraid
>> a machine translation from DTS to YAML might not be feasible then.
>> 
>> OTOH moving to that model might make it possible to use YAML constructs that are not
>> mapped at all to DTS. For instance
>> 
>> - foo: true
>>  bar: “string”
>> 
>> - foo: false
>>  bar: “another-string”
>> 
>> is not possible to be mapped to a DT node/property structure right now.
> 
> I’m not following. Are you saying DT has no way to encode a list of
> nodes? What use case are you imagining?
> 

This threw me off too at first.

The problem is when you try to convert this into a live tree structure.

The YAML encoding is

+SEQ
 +MAP
  =VAL :foo
  =VAL :true
  =VAL :bar
  =VAL "string
 -MAP
 +MAP
  =VAL :foo
  =VAL :false
  =VAL :bar
  =VAL "another-string
 -MAP
-SEQ

So it’s a sequence of MAPs.

In a live tree DTS form would be

noname-property = { foo=true; bar = “string”; }, { foo = false; bar=“another-string”; };

We don’t have the capability as right now to express something like this.

Namely properties containing nodes, and the root of the live tree not being a node.

>> 
>>> 
>> 
>>> Plus, it would be useful to use normal YAML anchors/aliases for
>>> creating node templates. For example:
>>> 
>>> serial-template: &acme-uart . # The anchor for the template
>>>   compatible: acme,uart
>>>   interrupt-parent: *intc
>>> 
>>> root:
>>>   serial@...00:
>>>       <<: *acme-uart   # Alias node merged into serial@...00
>>>       interrupts: 5
>>>       reg: [0x20000, 0x1000]
>>>   serial@...00:
>>>       <<: *acme-uart   # Alias node merged into serial@...00
>>>       interrupts: 5
>>>       reg: [0x30000, 0x1000]
>>> 
>> 
>> Yes, I’m aware of this and I planned to talk to you about it :)
>> It can be a very powerful way to cut down the churn of DT right now.
>> 
>> It’s not going to be a problem for yamldt to support it (perhaps
>> adding a new editing tags to allow more fine grained control of the
>> substitions).
>> 
>>> Another problem with anchors/references is YAML seems to require the
>>> anchor to be defined before the reference, or at least that's what
>>> pyyaml and ruamel both expect. Regardless, The chosen YAML encoding
>>> should be readily consumable by existing yaml implementations without
>>> having to do a lot of customization.
>>> 
>> 
>> I think this is only true for streamable YAML implementations; yamldt
>> can leave references unresolved until the emit phase (or even leave them
>> be as object files).
> 
> it is possible (probable) that I’m just not familiar with how to
> change that behaviour in pyyaml and ruamel. However, the point is moot
> if we can agree to preserve pure YAML semantics.
> 

Agreed.

>>> I'm slightly concerned about using & anchors for labels because it
>>> seems only one anchor can be defined per node, but DTC allows multiple
>>> labels for a single node. This might not be an issue in practice
>>> though. Another implementation issue related to using & anchors is the
>>> YAML spec defines them as an encoding artifact, and parsers can
>>> discard the anchor names after parsing the YAML structure, which is a
>>> problem if we use something like $name to reference an anchor. The
>>> solution might just be that labels need to go into a special property
>>> so they don't disappear from the data stream.
>>> 
>> 
>> I do support multiple labels to the same node in yamldt with the
>> following trick, that uses an empty tree.
>> 
>> foo: bar: baz { frob; };
>> 
>> bar: &foo
>>  frob: true
>> 
>> bar: &bar
>>  ~: ~
>> 
>> Special properties should work, albeit being a little bit cludgy.
> 
> With special properties, it might look like this:
> 
> baz:
>  $labels: [foo, bar]
>  frob:
> 

Yeah, that would work.

>> 
>>> There appears to be no place to put metadata. The root of the tree is
>>> the top level of the YAML structure. There isn't any provision for
>>> having a top level object to hold things like the memreserve map. We
>>> may need a namespace to use for special properties that aren't nodes
>>> or properties.
>>> 
>> 
>> I don’t think that’s going to be necessary. Metadata are artifacts that
>> only have meaning when emitting a serialized form like dtb.
>> 
>> yamldt fully supports /memreserve/ by simply treating a top level
>> /memreserve/ property as special and constructing the memreserve
>> entries at the emit phase.
>> 
>> so
>> 
>> /memreserve/ 1000 10;
>> 
>> is simply:
>> 
>> /memreserve/: [ 1000 10 ]
>> 
>> It’s a matter of definition what properties are encoded as metadata for the
>> serialized form.
>> 
>>> The encoding differentiates between nodes and properties implicitly
>>> base on whether the contents are a map, or a scalar/list. This does
>>> mean any parser needs to do a bit more work and it may impact what can
>>> be done with validation (I'm not going to talk about validation in
>>> this email though. We'll talk next week.)
>> 
>> Hmm, I’m not sure I follow this. Is that related to the metadata problem.
> 
> It has to do with iterating over properties or nodes. Nodes and props
> are mixed together, and any code processing them has to explicitly
> check the type to differentiate between the two. The alternative would
> be to collect all properties under a single key:
> 
> parent:
>  $props:
>    foo: hello
>    bar: world
>  child1: {}
>  child2: {}
> 

That’s no problem. It does look weird though. I am using the same parser
for parsing bindings and this might make the files a little bit weird.

Keep in mind that we have two YAML formats to parse; the hardware description
and the bindings describing them.

>> 
>> BTW, a blog post explaining the rationale behind yamldt is going to come
>> up soon at our website, I’ll post the link when it does.
> 
> Cool. I see it got posted, but didn’t get a chance to read it before I
> got on this airplane. Will try to read this evening.
> 
> (BTW, I’m arriving at about 11:00 tonight)
> 

Have a safe flight, you’ll get to beat me up tomorrow :)

> Cheers,
> g.

Regards

— Pantelis

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ