Skip to content

Example 02

While E01 demonstrates a Python-based approach, we demonstrate the command-line interface (CLI) option of vaxstats here.

DataFrame

We will assume you already have example.xlsx in your current work directory; if not, you can get it by running the following bash command.

wget https://github.com/oasci/vaxstats/raw/main/tests/files/example.xlsx

Verifying format

Excel often displays automatically formatted cells when it comes to dates and times. First, we will use vaxstats peak to examine how polars will read our file.

For example, if you open example.xlsx with Excel or LibreOffice Calc, you will see the first and second columns contain 8/10/2023 and 12:31:34 PM, respectively.

Command:

vaxstats peak example.xlsx

Output:

INFO     | VaxStats v0.0.0 by OASCI <us@oasci.org>
INFO     | Loading file from: `example.xlsx`
shape: (3, 10)
┌──────────┬─────────────┬──────────────────┬────────────────┬───┬──────────────────┬──────────────────┬─────┬─────────────────┐
│ Date     ┆ Time        ┆ TimeZone         ┆ ElapsedTime    ┆ … ┆ T_NPMN(Celsius): ┆ A_NPMN(Counts):A ┆ 2   ┆ Time_duplicated │
│ ---      ┆ ---         ┆ ---              ┆ ---            ┆   ┆ Temperat         ┆ ctivity          ┆ --- ┆ _0              │
│ str      ┆ str         ┆ str              ┆ str            ┆   ┆ ---              ┆ ---              ┆ i64 ┆ ---             │
│          ┆             ┆                  ┆                ┆   ┆ f64              ┆ f64              ┆     ┆ str             │
╞══════════╪═════════════╪══════════════════╪════════════════╪═══╪══════════════════╪══════════════════╪═════╪═════════════════╡
│ 08-10-23 ┆ 12:31:34 PM ┆ (Eastern         ┆ 0000:15:00.000 ┆ … ┆ 39.096           ┆ 361.78           ┆ 98  ┆ 01:01:33 PM     │
│          ┆             ┆ Standard Time)   ┆                ┆   ┆                  ┆                  ┆     ┆                 │
│ 08-10-23 ┆ 12:46:34 PM ┆ (Eastern         ┆ 0000:30:00.000 ┆ … ┆ 38.907           ┆ 400.52           ┆ 194 ┆ 01:01:33 PM     │
│          ┆             ┆ Standard Time)   ┆                ┆   ┆                  ┆                  ┆     ┆                 │
│ 08-10-23 ┆ 01:01:33 PM ┆ (Eastern         ┆ 0000:45:00.000 ┆ … ┆ 38.822           ┆ 263.7            ┆ 290 ┆ 01:01:33 PM     │
│          ┆             ┆ Standard Time)   ┆                ┆   ┆                  ┆                  ┆     ┆                 │
└──────────┴─────────────┴──────────────────┴────────────────┴───┴──────────────────┴──────────────────┴─────┴─────────────────┘

Saving this as a CSV file will show 8/10/2023 and 12:31:34 PM if you select the "Save as shown" option.

Command:

vaxstats peak example-as-shown.csv

Output:

INFO     | VaxStats v0.0.0 by OASCI <us@oasci.org>
INFO     | Loading file from: `example-as-shown.csv`
shape: (3, 12)
┌───────────┬─────────────┬─────────────────────────┬────────────────┬───┬──────┬───────────────┬─────┬───────────────────┐
│ Date      ┆ Time        ┆ TimeZone                ┆ ElapsedTime    ┆ … ┆      ┆ _duplicated_0 ┆ 2   ┆ Time_duplicated_0 │
│ ---       ┆ ---         ┆ ---                     ┆ ---            ┆   ┆ ---  ┆ ---           ┆ --- ┆ ---               │
│ str       ┆ str         ┆ str                     ┆ str            ┆   ┆ str  ┆ str           ┆ i64 ┆ str               │
╞═══════════╪═════════════╪═════════════════════════╪════════════════╪═══╪══════╪═══════════════╪═════╪═══════════════════╡
│ 8/10/2023 ┆ 12:31:34 PM ┆ (Eastern Standard Time) ┆ 0000:15:00.000 ┆ … ┆ null ┆ null          ┆ 98  ┆ 1:01:34 PM        │
│ 8/10/2023 ┆ 12:46:34 PM ┆ (Eastern Standard Time) ┆ 0000:30:00.000 ┆ … ┆ null ┆ null          ┆ 194 ┆ 1:01:34 PM        │
│ 8/10/2023 ┆ 1:01:34 PM  ┆ (Eastern Standard Time) ┆ 0000:45:00.000 ┆ … ┆ null ┆ null          ┆ 290 ┆ 1:01:34 PM        │
└───────────┴─────────────┴─────────────────────────┴────────────────┴───┴──────┴───────────────┴─────┴───────────────────┘

However, if you do not "Save as shown", then you would get

Command:

vaxstats peak example-raw.csv

Output:

INFO     | VaxStats v0.0.0 by OASCI <us@oasci.org>
INFO     | Loading file from: `example-raw.csv`
shape: (3, 12)
┌────────────┬─────────────────────┬───────────────────┬────────────────┬───┬──────┬───────────────┬─────┬─────────────────────┐
│ Date       ┆ Time                ┆ TimeZone          ┆ ElapsedTime    ┆ … ┆      ┆ _duplicated_0 ┆ 2   ┆ Time_duplicated_0   │
│ ---        ┆ ---                 ┆ ---               ┆ ---            ┆   ┆ ---  ┆ ---           ┆ --- ┆ ---                 │
│ str        ┆ str                 ┆ str               ┆ str            ┆   ┆ str  ┆ str           ┆ i64 ┆ str                 │
╞════════════╪═════════════════════╪═══════════════════╪════════════════╪═══╪══════╪═══════════════╪═════╪═════════════════════╡
│ 08/10/2023 ┆ 08/10/2023 12:31:34 ┆ (Eastern Standard ┆ 0000:15:00.000 ┆ … ┆ null ┆ null          ┆ 98  ┆ 08/11/2023 13:01:34 │
│ 12:31:34   ┆                     ┆ Time)             ┆                ┆   ┆      ┆               ┆     ┆                     │
│ 08/10/2023 ┆ 08/10/2023 12:46:34 ┆ (Eastern Standard ┆ 0000:30:00.000 ┆ … ┆ null ┆ null          ┆ 194 ┆ 08/12/2023 13:01:34 │
│ 12:46:34   ┆                     ┆ Time)             ┆                ┆   ┆      ┆               ┆     ┆                     │
│ 08/10/2023 ┆ 08/10/2023 13:01:34 ┆ (Eastern Standard ┆ 0000:45:00.000 ┆ … ┆ null ┆ null          ┆ 290 ┆ 08/13/2023 13:01:34 │
│ 13:01:34   ┆                     ┆ Time)             ┆                ┆   ┆      ┆               ┆     ┆                     │
└────────────┴─────────────────────┴───────────────────┴────────────────┴───┴──────┴───────────────┴─────┴─────────────────────┘

At the moment, we only support individual date and time columns; thus, let's just use the original Excel file.

Preparation

TODO:

vaxstats prep example.xlsx --date_idx 0 --time_idx 1 --y_idx 6 --input_date_fmt "%m-%d-%y" --input_time_fmt "%I:%M:%S %p" --output example_prepped.csv
INFO     | VaxStats v0.0.0 by OASCI <us@oasci.org>
INFO     | User selected `prep` command
INFO     | Loading file from: `example.xlsx`
INFO     | Cleaning DataFrame
INFO     | Preparing DataFrame for statsforecast
INFO     | Writing prepared CSV file to: `example_prepped.csv`

Forecast

TODO:

vaxstats forecast example_prepped.csv "statsforecast.models.ARIMA" \
--baseline_hours 48 --sf_model_args "()" \
--sf_model_kwargs "{'order': (0, 0, 10), 'seasonal_order': (0, 1, 1), 'season_length': 96, 'method': 'CSS-ML'}" \
--output_path forecasted.csv
INFO     | VaxStats v0.0.0 by OASCI <us@oasci.org>
INFO     | User selected `forecast` command
INFO     | Loading file from: `example_prepped.csv`
INFO     | Splitting DataFrame into 2
INFO     | Fitting data
INFO     | Elapsed time: 5 seconds
INFO     | Elapsed time: 10 seconds
INFO     | Elapsed time: 33 seconds
INFO     | Finished in 33.43 seconds
vaxstats peak forecasted.csv
INFO     | VaxStats v0.0.0 by OASCI <us@oasci.org>
INFO     | Loading file from: `forecasted.csv`
shape: (3, 4)
┌───────────┬─────────────────────┬────────┬───────────┬──────────┐
│ unique_id ┆ ds                  ┆ y      ┆ y_hat     ┆ residual │
│ ---       ┆ ---                 ┆ ---    ┆ ---       ┆ ---      │
│ i32       ┆ str                 ┆ f64    ┆ f64       ┆ f64      │
╞═══════════╪═════════════════════╪════════╪═══════════╪══════════╡
│ 0         ┆ 2023-08-10 12:31:34 ┆ 39.096 ┆ 39.056904 ┆ 0.039096 │
│ 0         ┆ 2023-08-10 12:46:34 ┆ 38.907 ┆ 38.868093 ┆ 0.038907 │
│ 0         ┆ 2023-08-10 13:01:33 ┆ 38.822 ┆ 38.783178 ┆ 0.038822 │
└───────────┴─────────────────────┴────────┴───────────┘──────────┘