consecutive_positive_lengths(column)
Calculate the lengths of consecutive positive values in the input column.
:param column: A pandas Series or NumPy array representing a single column.
:return: A NumPy array containing the lengths of consecutive positive values.
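The source listing is not reproduced on this page, so the following is only a minimal sketch of one way to compute run lengths of positive values. The name `consecutive_positive_lengths_sketch` is hypothetical, and the actual implementation in `wt_ml/dataset/data_utils.py` may differ, for example in how it treats missing values.

```python
import numpy as np
import pandas as pd


def consecutive_positive_lengths_sketch(column):
    """Hypothetical sketch: length of each run of consecutive positive values."""
    # 1 where the value is strictly positive, 0 otherwise (NaN compares as not positive).
    positive = (np.asarray(column) > 0).astype(int)
    # Pad with zeros so runs touching either end of the column are still delimited.
    padded = np.concatenate(([0], positive, [0]))
    changes = np.diff(padded)
    starts = np.flatnonzero(changes == 1)   # positions where a run begins
    ends = np.flatnonzero(changes == -1)    # positions just past the end of a run
    return ends - starts


lengths = consecutive_positive_lengths_sketch(pd.Series([0, 2, 3, 0, -1, 5, 1, 4]))
```

Here `lengths` holds one entry per run, in order of appearance. A column with no positive values yields an empty array.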
Source code in wt_ml/dataset/data_utils.py
get_expected_yearly_aos_vehicle_totals()
Read the yearly_aos_vehicle_totals data from the cache folder and load it into a DataFrame.
Source code in wt_ml/dataset/data_utils.py
mean_positive_sequence_length(column)
Calculate the mean length of consecutive positive values in the input column.
:param column: A pandas Series or NumPy array representing a single column.
:return: The mean length of consecutive positive values as a float.
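As a rough illustration, the mean run length can be obtained by averaging the run lengths of positive values. This is a sketch under assumptions: the name `mean_positive_sequence_length_sketch` is hypothetical, and returning `0.0` when the column has no positive values is a choice made here, not something the documentation states.

```python
import numpy as np


def mean_positive_sequence_length_sketch(column):
    """Hypothetical sketch: mean length of runs of consecutive positive values."""
    positive = (np.asarray(column) > 0).astype(int)
    # Zero padding delimits runs that touch the start or end of the column.
    padded = np.concatenate(([0], positive, [0]))
    changes = np.diff(padded)
    lengths = np.flatnonzero(changes == -1) - np.flatnonzero(changes == 1)
    # Assumption: report 0.0 when there are no positive runs at all.
    return float(lengths.mean()) if lengths.size else 0.0


# Runs of positives here have lengths 2, 1 and 3.
mean_len = mean_positive_sequence_length_sketch([1, 1, 0, 3, 0, 2, 2, 2])
```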
Source code in wt_ml/dataset/data_utils.py
revenue_spread_national_media_across_geos(lnm_df, all_revenue, all_wholesaler_brand_df, geom_mean=True, rev_freq='Y')
Take national media investments at the week x brand x vehicle level and spread them geographically according to yearly, monthly, or weekly brand x wholesaler revenue. Finally, ensure that the total national investment at the week x brand x vehicle level is unchanged.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| lnm_df | DataFrame | Dataframe of media investments broken down into local, national, or other media. | required |
| all_revenue | DataFrame | Revenue for all wholesalers and brands, combined in a single dataframe. | required |
| all_wholesaler_brand_df | DataFrame | Investments for all wholesalers and brands, combined in a single dataframe. | required |
| geom_mean | bool | Whether to take the geometric mean of the revenue-based spreading and the existing population-based spreading. Defaults to True. | True |
| rev_freq | RevAggUnit | Temporal aggregation of revenue used for spreading. Can be yearly (Y), monthly (M), or weekly (W). Defaults to "Y". | 'Y' |
Returns:

| Type | Description |
|---|---|
| DataFrame | pd.DataFrame: dataframe with national media investments spread geographically. |
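The spreading idea described above can be sketched in a much simplified form. Everything in this example is an assumption made for illustration: the function name, the column names (`week`, `brand`, `vehicle`, `wholesaler`, `investment`, `revenue`, `pop_share`), and the use of a single revenue total per wholesaler and brand rather than the yearly, monthly, or weekly aggregation that `rev_freq` selects. It is not the implementation in `wt_ml/dataset/data_utils.py`.

```python
import pandas as pd


def spread_national_by_revenue_sketch(national, revenue, population_share=None):
    """Hypothetical sketch: split national investments across wholesalers.

    national: columns week, brand, vehicle, investment
    revenue: columns wholesaler, brand, revenue
    population_share: optional, columns wholesaler, brand, pop_share
    """
    shares = revenue.copy()
    # Each wholesaler's share of its brand's total revenue.
    brand_total = shares.groupby("brand")["revenue"].transform("sum")
    shares["weight"] = shares["revenue"] / brand_total
    if population_share is not None:
        # Blend revenue-based and population-based shares with a geometric mean.
        shares = shares.merge(population_share, on=["wholesaler", "brand"])
        shares["weight"] = (shares["weight"] * shares["pop_share"]) ** 0.5
        # Geometric means of shares need not sum to 1, so renormalise per brand
        # to keep the national total unchanged.
        weight_total = shares.groupby("brand")["weight"].transform("sum")
        shares["weight"] = shares["weight"] / weight_total
    out = national.merge(shares[["wholesaler", "brand", "weight"]], on="brand")
    out["investment"] = out["investment"] * out["weight"]
    return out.drop(columns="weight")


national = pd.DataFrame(
    {"week": [1], "brand": ["A"], "vehicle": ["tv"], "investment": [100.0]}
)
revenue = pd.DataFrame(
    {"wholesaler": ["w1", "w2"], "brand": ["A", "A"], "revenue": [30.0, 70.0]}
)
pop = pd.DataFrame(
    {"wholesaler": ["w1", "w2"], "brand": ["A", "A"], "pop_share": [0.5, 0.5]}
)
by_revenue = spread_national_by_revenue_sketch(national, revenue)
blended = spread_national_by_revenue_sketch(national, revenue, pop)
```

In both calls the per-wholesaler investments add back up to the original national figure for each week x brand x vehicle combination, which is the invariant the description asks for.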
Source code in wt_ml/dataset/data_utils.py
surrounding_rolling_average(series, weeks_surrounding)
Calculate the centered rolling average, excluding the current point.
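One possible way to compute such an average with pandas is sketched below. The name `surrounding_rolling_average_sketch` is hypothetical, and several details are assumptions: that `weeks_surrounding` counts the weeks on each side of the current point, that windows are allowed to shrink at the edges of the series, and that the series has no missing values.

```python
import pandas as pd


def surrounding_rolling_average_sketch(series, weeks_surrounding):
    """Hypothetical sketch: centered rolling mean of neighbours, excluding the point itself."""
    # Assumption: weeks_surrounding neighbours on each side, plus the current point.
    window = 2 * weeks_surrounding + 1
    # min_periods=1 lets the window shrink at the edges instead of yielding NaN.
    rolling = series.rolling(window, center=True, min_periods=1)
    # Remove the current point from both the sum and the count.
    neighbour_sum = rolling.sum() - series
    neighbour_count = rolling.count() - 1
    return neighbour_sum / neighbour_count


s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
smoothed = surrounding_rolling_average_sketch(s, weeks_surrounding=1)
```

With one week on each side, an interior point is replaced by the mean of its two neighbours, while the first and last points take the value of their single neighbour.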
Source code in wt_ml/dataset/data_utils.py