A fold
transformation reshapes the data by unpivoting it.
Suppose a table with variables, column1
, column2
, category1
, and category2
.
Folding column1
and column2
by category1
results in key
, value
, category1
, and category2
like below.
Before
column1 | column2 | category1 | category2 |
---|---|---|---|
1 | 2 | ‘A’ | ‘a’ |
3 | 5 | ‘B’ | ‘b’ |
4 | 6 | ‘C’ | ‘c’ |
After
key | value | category1 | category2 |
---|---|---|---|
column1 | 1 | ‘A’ | ‘a’ |
column2 | 2 | ‘A’ | ‘a’ |
column1 | 3 | ‘B’ | ‘b’ |
column2 | 5 | ‘B’ | ‘b’ |
column1 | 4 | ‘C’ | ‘c’ |
column2 | 6 | ‘C’ | ‘c’ |
Fold
properties
Property | Type | Description |
---|---|---|
fold |
Array[String] |
(Required) An array of field names to fold. |
by |
String |
(Required) A nominal field to group by. |
exclude |
Boolean |
(Optional, default: false ) Whether to drop other fields (not specified by fold and by ). |
as |
Array[String, length=2] |
(Optional, default: ['key', 'value'] ) New field names for folded variables. |
Usage pattern
JSON
{
...
"transform" : [
{
"fold": [
"column1", "column2"
],
"by": "category1",
"exclude": true,
"as": [
"measure", "value"
]
},
]
...
}
JavaScript
let stream = new Erie.Stream();
...
let fold = new Erie.Fold(["column1", "column2"], "category1"); // filter expression
// alt) let fold = new Erie.Fold();
// fold.Fold("datum.cost > 30");
// fold.by("category1");
fold.exclude(true);
fold.as(["measure", "value"]);
stream.transform.add(fold);
...
Extended pattern with a repeat
channel
Using fold
with a repeat
channel enables expressing intervals repeated over a field.
JSON
{
...
"transform" : [
{
"aggregate": [
{
"op": "mean",
"field": "Miles_per_Gallon",
"as": "Miles_per_Gallon_mean"
},
{
"op": "stdevp",
"field": "Miles_per_Gallon",
"as": "Miles_per_Gallon_stdevp"
}
],
"groupby": [
"Origin"
]
},
{
"calculate": "datum.Miles_per_Gallon_mean - datum.Miles_per_Gallon_stdevp",
"as": "Miles_per_Gallon_lower"
},
{
"calculate": "datum.Miles_per_Gallon_mean + datum.Miles_per_Gallon_stdevp",
"as": "Miles_per_Gallon_upper"
},
{
"fold": [
"Miles_per_Gallon_lower",
"Miles_per_Gallon_mean",
"Miles_per_Gallon_upper"
],
"by": "Origin",
"exclude": true,
"as": [
"measure",
"statistics"
]
}
],
...
"encoding": {
"time": {
"field": "measure",
"type": "nominal",
"scale": {
"band": 0.5,
"order": [
"Miles_per_Gallon_lower",
"Miles_per_Gallon_mean",
"Miles_per_Gallon_upper"
]
}
},
...
"repeat": {
"field": "Origin",
...
}
...
},
...
}
JavaScript
let stream = new Erie.Stream();
...
let aggregate = new Erie.Aggregate();
aggregate.add("mean", "Miles_per_Gallon", "Miles_per_Gallon_mean");
aggregate.add("stdevp", "Miles_per_Gallon", "Miles_per_Gallon_stdevp");
aggregate.groupby(["Origin"]);
stream.transform.add(aggregate);
let calc1 = new Erie.Calculate("datum.Miles_per_Gallon_mean - datum.Miles_per_Gallon_stdevp")
calc1.as("Miles_per_Gallon_lower");
stream.transform.add(calc1);
let calc2 = new Erie.Calculate("datum.Miles_per_Gallon_mean + datum.Miles_per_Gallon_stdevp")
calc2.as("Miles_per_Gallon_upper");
stream.transform.add(calc2);
let fold = new Erie.Fold([
"Miles_per_Gallon_lower",
"Miles_per_Gallon_mean",
"Miles_per_Gallon_upper"
], "Origin");
fold.exclude(true);
fold.as(["measure", "statistics"]);
stream.transform.add(fold);
...
stream.encoding.time.field("measure", "nominal");
stream.encoding.time.scale("timing", "relative");
stream.encoding.time.scale("band", 0.5);
stream.encoding.time.scale("order", [
"Miles_per_Gallon_lower",
"Miles_per_Gallon_mean",
"Miles_per_Gallon_upper"
]);
...
stream.encoding.repeat.field("Origin");
...