
Dynamically partitions a dataset into a series of training and validation subsets with progressively larger training sets, so that machine learning model performance can be evaluated across varying sample sizes.

Usage

progressive_splits(data, assessment_size = 0.2, start_size = 2)

Arguments

data

A data frame containing the dataset to be split.

assessment_size

A numeric value between 0 and 1 indicating the proportion of the dataset to be used as the validation set. Default is 0.2.

start_size

An integer indicating the initial size of the training set. Must be at least 1 and less than the number of rows in the dataset minus the size of the validation set. Default is 2.

Value

An object of class 'rset' containing one training/validation split for each successively larger training set size.
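
A minimal sketch of inspecting the returned object, assuming it follows the usual rsample convention of a tibble with a splits list-column, so that rsample's analysis() and assessment() accessors apply to each element (the mtcars call is purely illustrative):

library(rsample)
res <- progressive_splits(mtcars, assessment_size = 0.2, start_size = 5)
nrow(analysis(res$splits[[1]]))    # rows in the first (smallest) training set
nrow(assessment(res$splits[[1]]))  # rows held out for validation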

See also

vfold_cv for details on the underlying cross-validation method.

Examples

# Create progressive splits of iris: hold out 20% of the rows for
# validation and start the training set at 10 rows.
data(iris)
splits <- progressive_splits(iris, assessment_size = 0.2, start_size = 10)
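
# Building on the example above, a hedged sketch of a learning-curve loop:
# fit a model on each successively larger training set and record validation
# error. This assumes the returned rset stores its splits in a splits
# list-column and that rsample's analysis() and assessment() accessors work
# on each element.
library(rsample)
results <- lapply(splits$splits, function(s) {
  train <- analysis(s)     # growing training set for this split
  test  <- assessment(s)   # held-out validation set for this split
  fit   <- lm(Sepal.Length ~ Petal.Length, data = train)
  rmse  <- sqrt(mean((test$Sepal.Length - predict(fit, test))^2))
  data.frame(n_train = nrow(train), rmse = rmse)
})
do.call(rbind, results)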