The data set PimaIndiansDiabetes2 contains a corrected version of the original data set. While the UCI repository index claims that there are no missing values, closer inspection of the data shows several physical impossibilities, e.g., blood pressure or body mass index of 0. In PimaIndiansDiabetes2, all zero values of glucose, pressure, triceps, insulin and mass have been set to NA; see also Wahba et al. (1995) and Ripley (1996).
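The zero-to-NA recoding described above can be illustrated in base R. This is a minimal sketch on a made-up toy data frame, not the code used to build PimaIndiansDiabetes2; the object names toy, toy_na and zero_is_missing are hypothetical.

```r
# Toy data with made-up values; 0 stands in for an unrecorded measurement.
toy <- data.frame(
  pregnant = c(0, 2, 5),
  glucose  = c(89, 0, 166),
  pressure = c(66, 70, 0),
  mass     = c(28.1, 0, 25.8)
)

# Columns where a value of 0 is physically impossible.
# pregnant is left alone because 0 pregnancies is a valid value.
zero_is_missing <- c("glucose", "pressure", "mass")

toy_na <- toy
toy_na[zero_is_missing] <- lapply(toy_na[zero_is_missing], function(x) {
  x[x == 0] <- NA  # recode impossible zeros as missing
  x
})

# Number of missing values per column after recoding.
colSums(is.na(toy_na))
```

Restricting the recoding to a named set of columns keeps legitimate zeros, such as a pregnancy count of 0, intact.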

PimaIndiansDiabetes_wide

Format

A data frame with 392 observations of 8 numeric variables and the target factor diabetes.

  • pregnant, Number of times pregnant

  • glucose, Plasma glucose concentration (glucose tolerance test)

  • pressure, Diastolic blood pressure (mm Hg)

  • triceps, Triceps skin fold thickness (mm)

  • insulin, 2-Hour serum insulin (mu U/ml)

  • mass, Body mass index (weight in kg/(height in m)^2)

  • pedigree, Diabetes pedigree function

  • age, Age (years)

  • diabetes, Class variable (test for diabetes), either "pos" or "neg"

Details

This is the subset of mlbench's PimaIndiansDiabetes2 that contains only complete rows (no missing values). See help(PimaIndiansDiabetes2, package = "mlbench").

Replicating this dataset:

library(mlbench) ## Errors immediately if mlbench is not installed
data(PimaIndiansDiabetes2)

d <- PimaIndiansDiabetes2
d <- d[complete.cases(d), ] ## Drop the 376 rows containing any NA (768 -> 392 rows)
PimaIndiansDiabetes_wide <- d
## save(PimaIndiansDiabetes_wide, file = "./data/PimaIndiansDiabetes_wide.rda")
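
To see what the complete.cases() filter does, and to count how many rows it removes, here is a small self-contained sketch. The data frame toy and its values are made up for illustration and are not taken from the Pima data.

```r
# Toy data frame with made-up values; NA marks a missing measurement.
toy <- data.frame(
  glucose = c(89, NA, 166, 118),
  insulin = c(94, 168, NA, NA),
  mass    = c(28.1, 43.1, 25.8, 45.8)
)

keep <- complete.cases(toy)    # TRUE only where every column is observed
toy_complete <- toy[keep, ]    # same row filter as in the replication code
n_dropped <- sum(!keep)        # rows removed because of at least one NA

n_dropped
nrow(toy_complete)
```

Storing the logical vector before subsetting makes it easy to report the number of dropped rows alongside the filtered data.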

Examples

library(spinifex)
str(PimaIndiansDiabetes_wide)
#> 'data.frame':	392 obs. of  9 variables:
#>  $ pregnant: num  1 0 3 2 1 5 0 1 1 3 ...
#>  $ glucose : num  89 137 78 197 189 166 118 103 115 126 ...
#>  $ pressure: num  66 40 50 70 60 72 84 30 70 88 ...
#>  $ triceps : num  23 35 32 45 23 19 47 38 30 41 ...
#>  $ insulin : num  94 168 88 543 846 175 230 83 96 235 ...
#>  $ mass    : num  28.1 43.1 31 30.5 30.1 25.8 45.8 43.3 34.6 39.3 ...
#>  $ pedigree: num  0.167 2.288 0.248 0.158 0.398 ...
#>  $ age     : num  21 33 26 53 59 51 31 33 32 27 ...
#>  $ diabetes: Factor w/ 2 levels "neg","pos": 1 2 2 2 2 2 2 1 2 1 ...
dat  <- scale_sd(PimaIndiansDiabetes_wide[, 1:8])
clas <- PimaIndiansDiabetes_wide$diabetes

bas <- basis_pca(dat)
mv  <- manip_var_of(bas)
mt  <- manual_tour(bas, mv)

ggt <- ggtour(mt, dat, angle = .2) +
  proto_default(aes_args = list(color = clas, shape = clas))
# \donttest{
animate_plotly(ggt)
# }
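
The scaling and PCA steps in the example can be approximated in base R. This sketch assumes that scale_sd() centers each column and divides it by its standard deviation, and that basis_pca() returns the leading principal component loadings as a basis; neither behavior is confirmed here. The data are simulated, and the names x, z, pca and bas2d are hypothetical.

```r
set.seed(1)

# Simulated numeric data: 50 rows, 4 variables (not the Pima data).
x <- matrix(rnorm(50 * 4), ncol = 4,
            dimnames = list(NULL, paste0("v", 1:4)))

# Standardize: each column gets mean 0 and standard deviation 1.
z <- scale(x, center = TRUE, scale = TRUE)

# PCA on the standardized data; the rotation matrix holds the loadings.
pca <- prcomp(z)

# First two loading vectors form an orthonormal 4 x 2 basis.
bas2d <- pca$rotation[, 1:2]

# Should be (numerically) the 2 x 2 identity matrix.
crossprod(bas2d)
```

Standardizing first keeps variables measured on large scales, such as insulin, from dominating the principal components.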