Cleaning a Dataset
cleanDS.Rd
Arguments
- x
the dataset to be cleaned
- cleanDF
a data.frame of instructions for cleaning x. It should include at least the following columns:
Name: names of the variables to be extracted.
CurrentVal: the list of valid values.
NewLevel: level names to be used for categorical variables.
NA_Vals: the values or levels to be replaced with NA.
Rename: suggested new variable names, if needed.
If CurrentVal is given, all values not listed in it will be replaced by NA.
- Rename
a logical value indicating whether the variable names should be replaced with values in the Rename column, if given.
- addCols
list of additional columns to be included without any change.
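The column rules above can be illustrated with a minimal base R sketch. This is not the implementation of cleanDS. The helper clean_one and its arguments are hypothetical names introduced here. The sketch also assumes that each instruction is a comma-separated string (as in the example below) and that the i-th entry of NewLevel labels the i-th entry of CurrentVal.

```r
# Hypothetical helper, NOT part of the package: applies one row of cleaning
# instructions to a single vector, following the rules described above.
clean_one <- function(values, current = "", new_level = "", na_vals = "") {
  # Split a comma-separated instruction string; "" means "no instruction".
  split_csv <- function(s) {
    if (nzchar(s)) trimws(strsplit(s, ",", fixed = TRUE)[[1]]) else character(0)
  }
  cur <- split_csv(current)
  lev <- split_csv(new_level)
  nas <- split_csv(na_vals)

  out <- as.character(values)
  out[out %in% nas] <- NA                          # NA_Vals are replaced by NA
  if (length(cur) > 0) out[!(out %in% cur)] <- NA  # values not in CurrentVal become NA

  if (length(cur) > 0 && length(lev) == length(cur)) {
    # Assumed pairing: the i-th valid value receives the i-th new level name.
    return(factor(out, levels = cur, labels = lev))
  }
  # No relabelling requested: convert back to a numeric type where possible.
  utils::type.convert(out, as.is = TRUE)
}

vs_clean   <- clean_one(mtcars$vs, current = "0,1", new_level = "V-shape,Straight")
gear_clean <- clean_one(mtcars$gear, na_vals = "5")
```

Here vs_clean is a factor with the two new level names, and gear_clean keeps the original gear counts except that every 5 is replaced by NA.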
Examples
# Creating a sample cleanDF data frame for the `mtcars` dataset:
cleanMTcars <- data.frame(
Name=c('vs', 'am', 'gear'),
CurrentVal=c('0,1','0,1',''),
NewLevel=c('V-shape,Straight','Manual,Automatic',''),
NA_Vals=c('','','5'),
Rename=c('Engine.Shape','Transmission', 'Forward.Gears'))
# Extracting Data
cleanDS(mtcars, cleanDF=cleanMTcars, Rename=TRUE) |> tail(10)
#> Engine.Shape Transmission Forward.Gears
#> AMC Javelin V-shape Manual 3
#> Camaro Z28 V-shape Manual 3
#> Pontiac Firebird V-shape Manual 3
#> Fiat X1-9 Straight Automatic 4
#> Porsche 914-2 V-shape Automatic NA
#> Lotus Europa Straight Automatic NA
#> Ford Pantera L V-shape Automatic NA
#> Ferrari Dino V-shape Automatic NA
#> Maserati Bora V-shape Automatic NA
#> Volvo 142E Straight Automatic 4
cleanDS(mtcars, cleanDF=cleanMTcars, Rename=FALSE, addCols=colnames(mtcars)) |> head(10)
#> vs am gear mpg cyl disp hp drat wt qsec
#> Mazda RX4 V-shape Automatic 4 21.0 6 160.0 110 3.90 2.620 16.46
#> Mazda RX4 Wag V-shape Automatic 4 21.0 6 160.0 110 3.90 2.875 17.02
#> Datsun 710 Straight Automatic 4 22.8 4 108.0 93 3.85 2.320 18.61
#> Hornet 4 Drive Straight Manual 3 21.4 6 258.0 110 3.08 3.215 19.44
#> Hornet Sportabout V-shape Manual 3 18.7 8 360.0 175 3.15 3.440 17.02
#> Valiant Straight Manual 3 18.1 6 225.0 105 2.76 3.460 20.22
#> Duster 360 V-shape Manual 3 14.3 8 360.0 245 3.21 3.570 15.84
#> Merc 240D Straight Manual 4 24.4 4 146.7 62 3.69 3.190 20.00
#> Merc 230 Straight Manual 4 22.8 4 140.8 95 3.92 3.150 22.90
#> Merc 280 Straight Manual 4 19.2 6 167.6 123 3.92 3.440 18.30
#> carb
#> Mazda RX4 4
#> Mazda RX4 Wag 4
#> Datsun 710 1
#> Hornet 4 Drive 1
#> Hornet Sportabout 2
#> Valiant 1
#> Duster 360 4
#> Merc 240D 2
#> Merc 230 2
#> Merc 280 4