From variable to frame: design idea #174

@SylvainCorlay

We have some ideas on how to implement data frames supporting

  • an unknown number of variables at runtime
  • heterogeneous types
  • runtime expressions

General syntax with operator|

Suppose you have three xvariables v1, v2, v3 (of any data type, including expressions). A "frame" can be obtained with, e.g.,

auto frame = ("age", v1) | ("weight", v2) | ("gender", v3);

The type of frame is parameterized by the closure types of all the xvariables it holds; it is essentially a map from names to variants over these types. To augment the frame with a new unevaluated variable, e.g. v4 = v1 + v2 * v3, you can write

auto frame2 = frame | ("formula", v1 + v2 * v3);

frame2 is again an xframe. The variant over the variable types held by frame2 is the union of the variant of frame and the type of the expression v1 + v2 * v3.
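To make the "map of variants" idea concrete, here is a minimal C++17 sketch, not the actual xframe design. All names in it (int_variable, double_variable, sum_expression, named_variable, named, frame) are hypothetical stand-ins. Note that in plain C++ the expression ("age", v1) is the comma operator and simply yields v1, so the sketch assumes a helper named(...) to build the name/variable pair.

```cpp
#include <map>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-ins for xvariable types and for an unevaluated expression.
struct int_variable    { std::vector<int> data; };
struct double_variable { std::vector<double> data; };
struct sum_expression  { double_variable lhs; double_variable rhs; };

// A variable together with its column name (assumed helper type).
template <class V>
struct named_variable { std::string name; V value; };

template <class V>
named_variable<std::decay_t<V>> named(std::string name, V&& v)
{
    return {std::move(name), std::forward<V>(v)};
}

// A frame maps names to a variant over the variable types it holds.
template <class... Ts>
struct frame
{
    using variant_type = std::variant<Ts...>;
    std::map<std::string, variant_type> columns;
};

// frame | named variable: the result type gains V only if V is not already listed.
template <class... Ts, class V>
auto operator|(frame<Ts...> f, named_variable<V> v)
{
    using result_type = std::conditional_t<
        std::disjunction_v<std::is_same<V, Ts>...>,
        frame<Ts...>, frame<Ts..., V>>;
    using variant_type = typename result_type::variant_type;

    result_type out;
    for (auto& entry : f.columns)
    {
        // Re-wrap each held variable in the (possibly wider) variant.
        std::visit(
            [&](auto& held) {
                using held_type = std::decay_t<decltype(held)>;
                out.columns.emplace(
                    entry.first,
                    variant_type(std::in_place_type<held_type>, std::move(held)));
            },
            entry.second);
    }
    out.columns.insert_or_assign(
        std::move(v.name), variant_type(std::in_place_type<V>, std::move(v.value)));
    return out;
}

// named variable | named variable: starts a frame from the first pair.
template <class A, class B>
auto operator|(named_variable<A> a, named_variable<B> b)
{
    frame<A> f;
    f.columns.emplace(std::move(a.name), std::move(a.value));
    return std::move(f) | std::move(b);
}

// Usage: three variables, then one expression-typed column.
inline auto build_example()
{
    int_variable v1{{30, 41}};
    double_variable v2{{61.5, 80.2}};
    int_variable v3{{0, 1}};
    auto f = named("age", v1) | named("weight", v2) | named("gender", v3);
    // "gender" reuses int_variable, so f is frame<int_variable, double_variable>.
    return std::move(f) | named("formula", sum_expression{v2, v2});
}
```

In this sketch, adding a variable whose type is already present leaves the frame type unchanged, while adding the expression column widens the variant by one alternative.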

Unknown number of variables read from a file without augmenting the type

  • When you do something like reading a file, all variables held in the frame have the same type, so adding a new variable does not create a new frame type.
  • When you add a large number of variables with v1 | v2 | v3 | v4, the expression is evaluated left to right, that is, (((v1 | v2) | v3) | v4). After the first step, the left-hand side is always an rvalue xframe, which can be recycled into the new xframe. In other words, xframe&& | xvariable -> xframe can reuse the input xframe.
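The rvalue-reuse point can be sketched with a pair of overloads. This is only an illustration under simplifying assumptions: homogeneous_frame and named_column are hypothetical types, and every column is a std::vector<double>, as in the "read from a file" case where the frame type never changes.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical single-type frame: all columns share one type.
struct homogeneous_frame
{
    std::map<std::string, std::vector<double>> columns;
};

// Hypothetical name/values pair to be added to a frame.
struct named_column
{
    std::string name;
    std::vector<double> values;
};

// Lvalue frame on the left: the input must stay usable, so it is copied.
inline homogeneous_frame operator|(const homogeneous_frame& f, named_column c)
{
    homogeneous_frame out = f;
    out.columns.insert_or_assign(std::move(c.name), std::move(c.values));
    return out;
}

// Rvalue frame on the left: the temporary is recycled instead of copied.
inline homogeneous_frame operator|(homogeneous_frame&& f, named_column c)
{
    f.columns.insert_or_assign(std::move(c.name), std::move(c.values));
    return std::move(f);
}

// Usage: after the first step, every left-hand side in the chain is an rvalue,
// so the storage of column "v1" is carried through without being copied.
inline bool chain_reuses_storage()
{
    homogeneous_frame f;
    f.columns["v1"] = {1.0, 2.0};
    const double* before = f.columns["v1"].data();
    homogeneous_frame g =
        std::move(f) | named_column{"v2", {3.0}} | named_column{"v3", {4.0}};
    return g.columns.size() == 3 && g.columns.at("v1").data() == before;
}
```

The const& overload keeps the original frame intact when it is still named, while the && overload is selected for the intermediate temporaries produced by a left-to-right chain.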
