Io

`clean_df(df, col_idx=0)` ¶

Cleans a DataFrame by dropping rows with null values in the specified column.

PARAMETER	DESCRIPTION
`df`	The input DataFrame. TYPE: `DataFrame`
`col_idx`	The index of the column to check for null values. Defaults to 0. TYPE: `int` DEFAULT: `0`

RETURNS	DESCRIPTION
`DataFrame`	A DataFrame with rows containing null values in the specified column removed.

RAISES	DESCRIPTION
`IndexError`	If col_idx is out of range of the DataFrame's columns.

Examples:

>>> import polars as pl
>>> data = {'a': [1, 2, None], 'b': [4, None, 6]}
>>> df = pl.DataFrame(data)
>>> clean_df(df)
shape: (2, 2)
┌─────┬──────┐
│ a   ┆ b    │
│ --- ┆ ---  │
│ i64 ┆ i64  │
╞═════╪══════╡
│ 1   ┆ 4    │
│ 2   ┆ null │
└─────┴──────┘

`cli_peak(args)` ¶

`cli_prep(args)` ¶

Prepare data for analysis using command-line arguments.

This function loads a file, cleans the DataFrame, prepares it for forecasting, and saves the result to a CSV file.

PARAMETER	DESCRIPTION
`args`	Command-line arguments parsed by argparse. TYPE: `Namespace`

Expected attributes of `args`:

`file_path` (str): Path to the input file.
`date_idx` (int): Index of the date column.
`time_idx` (int): Index of the time column.
`y_idx` (int): Index of the target variable column.
`input_date_fmt` (str): Format of the input date strings.
`input_time_fmt` (str): Format of the input time strings.
`output_fmt` (str): Format of the output datetime strings.
`output` (str): Name of the output CSV file.

RETURNS	DESCRIPTION
`None`	
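For orientation, a parser that yields a `Namespace` with the attributes listed above might look like the following. This is a sketch under assumptions: the flag spellings, the program name `prep`, and the `output.csv` default are hypothetical, and the format defaults are borrowed from `prep_forecast_df`. The real command-line interface may differ.

```python
import argparse


def build_prep_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the attributes that cli_prep expects."""
    # Program name and flag spellings are illustrative assumptions.
    parser = argparse.ArgumentParser(prog="prep")
    parser.add_argument("file_path", type=str)
    parser.add_argument("--date-idx", dest="date_idx", type=int, default=0)
    parser.add_argument("--time-idx", dest="time_idx", type=int, default=1)
    parser.add_argument("--y-idx", dest="y_idx", type=int, default=2)
    # Format defaults mirror the documented defaults of prep_forecast_df.
    parser.add_argument("--input-date-fmt", dest="input_date_fmt", default="%m-%d-%y")
    parser.add_argument("--input-time-fmt", dest="input_time_fmt", default="%I:%M:%S %p")
    parser.add_argument("--output-fmt", dest="output_fmt", default="%Y-%m-%d %H:%M:%S")
    # Hypothetical default output file name.
    parser.add_argument("--output", dest="output", default="output.csv")
    return parser


args = build_prep_parser().parse_args(["data.csv", "--y-idx", "3"])
print(args.file_path, args.date_idx, args.time_idx, args.y_idx)
```

A `Namespace` built this way could also be constructed directly with `argparse.Namespace(...)` when calling `cli_prep` from tests.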

`load_file(file_path, file_type=None, *args, **kwargs)` ¶

Loads a file into a Polars DataFrame.

PARAMETER	DESCRIPTION
`file_path`	The path to the file. TYPE: `str`
`file_type`	The type of file to load. Supported types are Excel and CSV. TYPE: `str` DEFAULT: `None`
`*args`	Additional positional arguments to pass to the Polars file reading function. TYPE: `Any` DEFAULT: `()`
`**kwargs`	Additional keyword arguments to pass to the Polars file reading function. TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`DataFrame`	The loaded DataFrame.

RAISES	DESCRIPTION
`TypeError`	If the `file_type` is not supported.

Examples:

>>> import polars as pl
>>> df = load_file("data.xlsx")
>>> df = load_file("data.csv", file_type="csv")

Notes

Files are read with either `read_excel` or `read_csv` from Polars, and `*args` and `**kwargs` are forwarded to that reader. Refer to the Polars documentation of the respective function for the arguments it accepts.

`prep_forecast_df(df, date_idx, time_idx, y_idx, input_date_fmt='%m-%d-%y', input_time_fmt='%I:%M:%S %p', output_fmt='%Y-%m-%d %H:%M:%S')` ¶

Prepares a DataFrame for forecasting by combining date and time columns, and formatting them.

PARAMETER	DESCRIPTION
`df`	The input DataFrame. TYPE: `DataFrame`
`date_idx`	The index of the date column. TYPE: `int`
`time_idx`	The index of the time column. TYPE: `int`
`y_idx`	The index of the target variable column. TYPE: `int`
`input_date_fmt`	The format of the input date strings. Defaults to "%m-%d-%y". TYPE: `str` DEFAULT: `'%m-%d-%y'`
`input_time_fmt`	The format of the input time strings. Defaults to "%I:%M:%S %p". TYPE: `str` DEFAULT: `'%I:%M:%S %p'`
`output_fmt`	The format of the output datetime strings. Defaults to "%Y-%m-%d %H:%M:%S". TYPE: `str` DEFAULT: `'%Y-%m-%d %H:%M:%S'`

RETURNS	DESCRIPTION
`DataFrame`	A DataFrame with a combined and formatted datetime column ready for forecasting.

RAISES	DESCRIPTION
`IndexError`	If any of `date_idx`, `time_idx`, or `y_idx` is out of range of the DataFrame's columns.
`ValueError`	If the date and time strings do not match the specified formats.

Notes

If `date_idx` and `time_idx` are the same, `input_date_fmt` and `input_time_fmt` are combined into a single format and the datetime is parsed from that one column.
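The format handling can be illustrated on single values with the standard library. This is a sketch only: the helper names are hypothetical, the real function operates on DataFrame columns, and joining the two formats with a single space is an assumption about how they are combined.

```python
from datetime import datetime

DATE_FMT = "%m-%d-%y"
TIME_FMT = "%I:%M:%S %p"
OUTPUT_FMT = "%Y-%m-%d %H:%M:%S"


def combine_datetime(date_str, time_str, input_date_fmt=DATE_FMT,
                     input_time_fmt=TIME_FMT, output_fmt=OUTPUT_FMT):
    """Parse separate date and time strings and reformat them as one string."""
    # Assumption: the two formats are joined with a single space.
    combined_fmt = f"{input_date_fmt} {input_time_fmt}"
    # strptime raises ValueError when the text does not match the format.
    parsed = datetime.strptime(f"{date_str} {time_str}", combined_fmt)
    return parsed.strftime(output_fmt)


def reformat_single_column(value, input_date_fmt=DATE_FMT,
                           input_time_fmt=TIME_FMT, output_fmt=OUTPUT_FMT):
    """Handle the case where date and time live in the same column."""
    combined_fmt = f"{input_date_fmt} {input_time_fmt}"
    return datetime.strptime(value, combined_fmt).strftime(output_fmt)


print(combine_datetime("01-01-23", "01:00:00 PM"))        # 2023-01-01 13:00:00
print(reformat_single_column("01-02-23 02:00:00 PM"))     # 2023-01-02 14:00:00
```

With the default formats, the 12-hour input time `01:00:00 PM` becomes `13:00:00` in the 24-hour output.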

Examples:

>>> import polars as pl
>>> data = {'date': ["01-01-23", "01-02-23"], 'time': ["01:00:00 PM", "02:00:00 PM"], 'y': [10, 20]}
>>> df = pl.DataFrame(data)
>>> prep_forecast_df(df, date_idx=0, time_idx=1, y_idx=2)
shape: (2, 3)
┌─────────────────────┬───────┬─────────────┐
│ ds                  ┆ y     ┆ unique_id   │
│ ---                 ┆ ---   ┆ ---         │
│ str                 ┆ i64   ┆ i64         │
╞═════════════════════╪═══════╪═════════════╡
│ 2023-01-01 13:00:00 ┆ 10    ┆ 0           │
│ 2023-01-02 14:00:00 ┆ 20    ┆ 0           │
└─────────────────────┴───────┴─────────────┘
