EconomicModelInput
Bases: AnnotatedExtensionTypeWithShape, ExtensionType
The input class used to prepare batches of data.
Attributes:
Name | Type | Description |
---|---|---|
no_prediction_mask | Tensor | Mask for places we don't want to predict or train on. |
no_train_mask | Tensor | Mask for places we don't want to train on. We also won't train anywhere we don't do predictions. |
feature_masks | Tensor | Stacked masks for places where we don't want to train, but think that due to unforeseen externalities we should perfectly predict. Each mask will attribute the prediction error to a different driver. |
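The descriptions above imply that training is skipped wherever either `no_train_mask` or `no_prediction_mask` is set. A minimal NumPy sketch of that combination follows. It assumes boolean masks where `True` marks an excluded position and a `(batch, time)` shape; neither is stated in the source. `effective_no_train` and `train_positions` are hypothetical names.

```python
import numpy as np

# Assumption: True means "exclude this position"; shapes are (batch, time).
no_prediction_mask = np.array([[False, True, False, False]])
no_train_mask = np.array([[False, False, True, False]])

# Training is skipped wherever we do not predict OR explicitly do not train.
effective_no_train = no_train_mask | no_prediction_mask

# Positions that would still contribute to a training loss.
train_positions = ~effective_no_train
print(train_positions)
```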
Source code in wt_ml/dataset/data_pipeline.py
dataset_to_dataframe(data, attr_name, encodings, axis_types=None, index_types=('state', 'wholesaler', 'brand', 'product'), do_sort=True)
Create a dataframe for the specific attribute of the batched data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | EconomicModelInput | The batched data. | required |
attr_name | str | The attribute to make into a DataFrame. | required |
axis_types | Sequence[Axis] | The axis type labels for each axis of the index of the transposed DataFrame. | None |
encodings | dict[str, dict[str \| int, int]] | The cached data and encodings about the full dataset. | required |
Returns:
Type | Description |
---|---|
DataFrame \| Series | A transformed and transposed DataFrame from the data, with the columns equal to the batch dimension. |
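To illustrate what "transposed, with the columns equal to the batch dimension" can look like, here is a rough pandas sketch. It is not the `wt_ml` implementation. It assumes the attribute is a `(batch, time)` array, that each encoding maps a label to an integer code, and that sorting the batch labels mirrors `do_sort=True`. The sample data and the `decode` helper are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical encodings: label -> integer code, per index type.
encodings = {
    "state": {"CA": 0, "NY": 1},
    "brand": {"alpha": 0, "beta": 1},
}
# Hypothetical batch: one integer code per batch element and index type.
state_index = np.array([1, 0, 1])
brand_index = np.array([0, 0, 1])
values = np.arange(12, dtype=float).reshape(3, 4)  # (batch, time)


def decode(name, codes):
    """Map integer codes back to labels by inverting the encoding dict."""
    inverse = {code: label for label, code in encodings[name].items()}
    return [inverse[int(code)] for code in codes]


batch_labels = pd.MultiIndex.from_arrays(
    [decode("state", state_index), decode("brand", brand_index)],
    names=["state", "brand"],
)
# Rows start as batch elements; sort them, then transpose so that
# each column corresponds to one batch element.
df = pd.DataFrame(values, index=batch_labels).sort_index().T
print(df)
```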
Source code in wt_ml/dataset/data_pipeline.py
denormalize_z_score(batch_df, signal, encodings, normalized_on=None, mean_rescaled=False)
Denormalize z-score transformed batch_df.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch_df | DataFrame | The dataframe that needs to be denormalized. | required |
signal | str | Signal that is being transformed or name of df. | required |
encodings | dict[str, Any] | Encodings to get std and means dict for reverse transformation. | required |
normalized_on | str \| None | The level on which the df is normalized. The given level will be mapped by f"{signal}_means_row_lookup". Defaults to None. | None |
mean_rescaled | bool | Specifies if the data is rescaled by the mean (transformed_mean=1) or centered (transformed_mean=0). Defaults to False. | False |
Returns:
Type | Description |
---|---|
DataFrame | DataFrame after applying the reverse z-score transformation. |
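As a rough illustration of the reverse transformation, the sketch below undoes a z-score on a DataFrame. It is not the library's code: `reverse_z_score` is a hypothetical helper that takes the mean and std directly instead of looking them up in `encodings`. The centered case assumes the forward transform was `z = (x - mean) / std`. The `mean_rescaled` case assumes the forward transform was `z = x / mean`, which is only one reading of "rescaled by the mean (transformed_mean=1)".

```python
import pandas as pd


def reverse_z_score(z, mean, std, mean_rescaled=False):
    """Undo a z-score style transform (hypothetical helper, not wt_ml's code).

    Assumed forward transforms:
      centered:      z = (x - mean) / std   (transformed mean is 0)
      mean-rescaled: z = x / mean           (transformed mean is 1)
    """
    if mean_rescaled:
        # Assumption: data was divided by its mean, so multiply it back.
        return z * mean
    # Standard z-score inverse: scale by std, then shift by the mean.
    return z * std + mean


z = pd.DataFrame({"a": [-1.0, 0.0, 1.0]})
x = reverse_z_score(z, mean=10.0, std=2.0)
print(x)
```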
Source code in wt_ml/dataset/data_pipeline.py
get_data_attr(data, attr_name, encodings)
If data has the attribute attr_name, return it as a numpy (often temporal) object. If attr_name is the name of an index rather than a signal (e.g. wholesaler, product, state), return data.{attr_name}_index instead. For instance, for wholesaler, return data.wholesaler_index.
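The lookup order described above can be sketched as follows. This is a simplified illustration, not the actual function: `get_attr_or_index` is a hypothetical name, it ignores the `encodings` argument, and it skips any conversion to numpy.

```python
from types import SimpleNamespace


def get_attr_or_index(data, attr_name):
    """Return data.<attr_name> if present, else data.<attr_name>_index."""
    if hasattr(data, attr_name):
        return getattr(data, attr_name)
    # Index names such as "wholesaler" are stored as "<name>_index".
    return getattr(data, f"{attr_name}_index")


# Hypothetical stand-in for a batch: one signal and one index.
data = SimpleNamespace(sales=[1.0, 2.0], wholesaler_index=[3, 7])
print(get_attr_or_index(data, "sales"))
print(get_attr_or_index(data, "wholesaler"))
```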
Source code in wt_ml/dataset/data_pipeline.py