Example 02¶
While E01 demonstrates a Python-based approach, we demonstrate the command-line interface (CLI) option of vaxstats
here.
DataFrame¶
We will assume you already have example.xlsx
in your current work directory; if not, you can get it by running the following bash command.
Verifying format¶
Excel often displays automatically formatted cells when it comes to dates and times.
First, we will use vaxstats peak
to examine how polars will read our file.
For example, if you open example.xlsx
with Excel or LibreOffice Calc, you will see the first and second columns contain 8/10/2023
and 12:31:34 PM
, respectively.
Command:
Output:
INFO | VaxStats v0.0.0 by OASCI <us@oasci.org>
INFO | Loading file from: `example.xlsx`
shape: (3, 10)
┌──────────┬─────────────┬──────────────────┬────────────────┬───┬──────────────────┬──────────────────┬─────┬─────────────────┐
│ Date ┆ Time ┆ TimeZone ┆ ElapsedTime ┆ … ┆ T_NPMN(Celsius): ┆ A_NPMN(Counts):A ┆ 2 ┆ Time_duplicated │
│ --- ┆ --- ┆ --- ┆ --- ┆ ┆ Temperat ┆ ctivity ┆ --- ┆ _0 │
│ str ┆ str ┆ str ┆ str ┆ ┆ --- ┆ --- ┆ i64 ┆ --- │
│ ┆ ┆ ┆ ┆ ┆ f64 ┆ f64 ┆ ┆ str │
╞══════════╪═════════════╪══════════════════╪════════════════╪═══╪══════════════════╪══════════════════╪═════╪═════════════════╡
│ 08-10-23 ┆ 12:31:34 PM ┆ (Eastern ┆ 0000:15:00.000 ┆ … ┆ 39.096 ┆ 361.78 ┆ 98 ┆ 01:01:33 PM │
│ ┆ ┆ Standard Time) ┆ ┆ ┆ ┆ ┆ ┆ │
│ 08-10-23 ┆ 12:46:34 PM ┆ (Eastern ┆ 0000:30:00.000 ┆ … ┆ 38.907 ┆ 400.52 ┆ 194 ┆ 01:01:33 PM │
│ ┆ ┆ Standard Time) ┆ ┆ ┆ ┆ ┆ ┆ │
│ 08-10-23 ┆ 01:01:33 PM ┆ (Eastern ┆ 0000:45:00.000 ┆ … ┆ 38.822 ┆ 263.7 ┆ 290 ┆ 01:01:33 PM │
│ ┆ ┆ Standard Time) ┆ ┆ ┆ ┆ ┆ ┆ │
└──────────┴─────────────┴──────────────────┴────────────────┴───┴──────────────────┴──────────────────┴─────┴─────────────────┘
Saving this as a CSV file will show 8/10/2023
and 12:31:34 PM
if you select the "Save as shown" option.
Command:
Output:
INFO | VaxStats v0.0.0 by OASCI <us@oasci.org>
INFO | Loading file from: `example-as-shown.csv`
shape: (3, 12)
┌───────────┬─────────────┬─────────────────────────┬────────────────┬───┬──────┬───────────────┬─────┬───────────────────┐
│ Date ┆ Time ┆ TimeZone ┆ ElapsedTime ┆ … ┆ ┆ _duplicated_0 ┆ 2 ┆ Time_duplicated_0 │
│ --- ┆ --- ┆ --- ┆ --- ┆ ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str ┆ str ┆ ┆ str ┆ str ┆ i64 ┆ str │
╞═══════════╪═════════════╪═════════════════════════╪════════════════╪═══╪══════╪═══════════════╪═════╪═══════════════════╡
│ 8/10/2023 ┆ 12:31:34 PM ┆ (Eastern Standard Time) ┆ 0000:15:00.000 ┆ … ┆ null ┆ null ┆ 98 ┆ 1:01:34 PM │
│ 8/10/2023 ┆ 12:46:34 PM ┆ (Eastern Standard Time) ┆ 0000:30:00.000 ┆ … ┆ null ┆ null ┆ 194 ┆ 1:01:34 PM │
│ 8/10/2023 ┆ 1:01:34 PM ┆ (Eastern Standard Time) ┆ 0000:45:00.000 ┆ … ┆ null ┆ null ┆ 290 ┆ 1:01:34 PM │
└───────────┴─────────────┴─────────────────────────┴────────────────┴───┴──────┴───────────────┴─────┴───────────────────┘
However, if you do not "Save as shown", then you would get
Command:
Output:
INFO | VaxStats v0.0.0 by OASCI <us@oasci.org>
INFO | Loading file from: `example-raw.csv`
shape: (3, 12)
┌────────────┬─────────────────────┬───────────────────┬────────────────┬───┬──────┬───────────────┬─────┬─────────────────────┐
│ Date ┆ Time ┆ TimeZone ┆ ElapsedTime ┆ … ┆ ┆ _duplicated_0 ┆ 2 ┆ Time_duplicated_0 │
│ --- ┆ --- ┆ --- ┆ --- ┆ ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str ┆ str ┆ ┆ str ┆ str ┆ i64 ┆ str │
╞════════════╪═════════════════════╪═══════════════════╪════════════════╪═══╪══════╪═══════════════╪═════╪═════════════════════╡
│ 08/10/2023 ┆ 08/10/2023 12:31:34 ┆ (Eastern Standard ┆ 0000:15:00.000 ┆ … ┆ null ┆ null ┆ 98 ┆ 08/11/2023 13:01:34 │
│ 12:31:34 ┆ ┆ Time) ┆ ┆ ┆ ┆ ┆ ┆ │
│ 08/10/2023 ┆ 08/10/2023 12:46:34 ┆ (Eastern Standard ┆ 0000:30:00.000 ┆ … ┆ null ┆ null ┆ 194 ┆ 08/12/2023 13:01:34 │
│ 12:46:34 ┆ ┆ Time) ┆ ┆ ┆ ┆ ┆ ┆ │
│ 08/10/2023 ┆ 08/10/2023 13:01:34 ┆ (Eastern Standard ┆ 0000:45:00.000 ┆ … ┆ null ┆ null ┆ 290 ┆ 08/13/2023 13:01:34 │
│ 13:01:34 ┆ ┆ Time) ┆ ┆ ┆ ┆ ┆ ┆ │
└────────────┴─────────────────────┴───────────────────┴────────────────┴───┴──────┴───────────────┴─────┴─────────────────────┘
At the moment, we only support individual date and time columns; thus, let's just use the original Excel file.
Preparation¶
TODO:
vaxstats prep example.xlsx --date_idx 0 --time_idx 1 --y_idx 6 --input_date_fmt "%m-%d-%y" --input_time_fmt "%I:%M:%S %p" --output example_prepped.csv
INFO | VaxStats v0.0.0 by OASCI <us@oasci.org>
INFO | User selected `prep` command
INFO | Loading file from: `example.xlsx`
INFO | Cleaning DataFrame
INFO | Preparing DataFrame for statsforecast
INFO | Writing prepared CSV file to: `example_prepped.csv`
Forecast¶
TODO:
vaxstats forecast example_prepped.csv "statsforecast.models.ARIMA" \
--baseline_hours 48 --sf_model_args "()" \
--sf_model_kwargs "{'order': (0, 0, 10), 'seasonal_order': (0, 1, 1), 'season_length': 96, 'method': 'CSS-ML'}" \
--output_path forecasted.csv
INFO | VaxStats v0.0.0 by OASCI <us@oasci.org>
INFO | User selected `forecast` command
INFO | Loading file from: `example_prepped.csv`
INFO | Splitting DataFrame into 2
INFO | Fitting data
INFO | Elapsed time: 5 seconds
INFO | Elapsed time: 10 seconds
INFO | Elapsed time: 33 seconds
INFO | Finished in 33.43 seconds
INFO | VaxStats v0.0.0 by OASCI <us@oasci.org>
INFO | Loading file from: `forecasted.csv`
shape: (3, 4)
┌───────────┬─────────────────────┬────────┬───────────┬──────────┐
│ unique_id ┆ ds ┆ y ┆ y_hat ┆ residual │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ i32 ┆ str ┆ f64 ┆ f64 ┆ f64 │
╞═══════════╪═════════════════════╪════════╪═══════════╪══════════╡
│ 0 ┆ 2023-08-10 12:31:34 ┆ 39.096 ┆ 39.056904 ┆ 0.039096 │
│ 0 ┆ 2023-08-10 12:46:34 ┆ 38.907 ┆ 38.868093 ┆ 0.038907 │
│ 0 ┆ 2023-08-10 13:01:33 ┆ 38.822 ┆ 38.783178 ┆ 0.038822 │
└───────────┴─────────────────────┴────────┴───────────┘──────────┘