Title: Functions, apps, exercises and other R-related stuff used in "AI - Aalborg Intelligence"
Description: Functions, apps, exercises and other R-related stuff used in "AI - Aalborg Intelligence". The project (2020-2026) is supported by the Novo Nordisk Foundation to develop teaching material for Danish high schools, strengthening the understanding of AI while explaining how basic maths is used in some popular AI methods.
Authors: Ege Rubak, Torben Tvedebrink, Mikkel Meyer Andersen, Lisbeth Fajstrup
Maintainer: Ege Rubak <[email protected]>
License: MIT + file LICENSE
Version: 0.2.0
Built: 2024-11-12 04:56:41 UTC
Source: https://github.com/aalborg-intelligence/aai
A data frame containing the length and weight of SOMETHING. In 'classification_train_data' the class is also given.
classification_test_data
A data frame with 10 rows and 2 variables:
Length
Weight
A data frame containing the length and weight of SOMETHING. In 'classification_train_data' the class is also given.
classification_train_data
A data frame with 150 rows and 3 variables:
Length
Weight
Class
Simple function for DT output
dt_simple(tab, ...)
tab | The table to format
... | Arguments passed to 'DT::datatable'
Short function for DT output
dt_table(tab, ...)
tab | The table to format
... | Arguments passed to 'DT::datatable'
Simple function for kable output
kable_(tab, ...)
tab | The table to format
... | Arguments passed to 'knitr::kable'
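Since the arguments are passed straight on to 'knitr::kable', the underlying call can be sketched directly (shown on the built-in 'iris' data; how 'kable_' post-processes the result is not documented here and is not assumed):

```r
library(knitr)

# kable() renders a data frame as a plain-text/markup table;
# kable_(tab, ...) forwards 'tab' and '...' to a call like this one.
out <- kable(head(iris, 3), format = "pipe")
cat(out, sep = "\n")
```

The return value is a character vector of table lines, which knits directly into a document.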
Data for plotting a grid based on the mean of the K nearest neighbors
kMD_plot(K = 3, .train, response = "Type", grid = 100)
K | Number of neighbors
.train | Training data
response | Name of the class variable
grid | Resolution of the grid - higher values give a finer grid
Wrapper around 'class::knn'
kNN(K = 3, .test, .train, response = "Type")
K | Number of nearest neighbors to use
.test | Data that should be classified based on the training data
.train | Annotated training data used to classify the test data
response | Name of the response variable
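As 'kNN' is a thin wrapper, the underlying 'class::knn' call can be sketched on the built-in 'iris' data (an illustration of the wrapped function only; the wrapper's own column handling via 'response' is assumed, not shown):

```r
library(class)

# Interleaved training/test split of iris (75 rows each)
train <- iris[seq(1, 150, by = 2), ]
test  <- iris[seq(2, 150, by = 2), ]

# class::knn classifies each test row by majority vote
# among its K nearest training rows (Euclidean distance)
pred <- knn(train = train[, 1:4], test = test[, 1:4],
            cl = train$Species, k = 3)
acc <- mean(pred == test$Species)
```

The result 'pred' is a factor with one predicted class per test row.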
Visualise a kNN trainer
kNN_plot(K = 3, .train, response = "Type", grid = 100)
K | Number of neighbors to use
.train | The training data
response | The name of the response/class variable
grid | The resolution of the grid. Larger numbers give higher resolution (and slower performance).
k <- 3
kNN_plot(.train = classification_train_data, K = k) %>%
  ggplot() +
  labs(title = paste("K =", k)) +
  geom_rect(aes(xmin = Længde_0, xmax = Længde_1,
                ymin = Vægt_0, ymax = Vægt_1, fill = Type),
            alpha = 0.3) +
  geom_point(data = classification_train_data,
             aes(x = Længde, y = Vægt, colour = Type))
Actual cross-validation function for kNN.
kNN.cv(K = 3, .train, response = "Type", fold = 10)
K | Vector of nearest neighbor values (the k in kNN)
.train | The data to use kNN on
response | The variable name of the response
fold | The number of folds to use in cross-validation
data(classification_train_data)
K_LOO <- tibble(
  K = 1:15,
  LOO = kNN.loo(K, .train = classification_train_data)
) %>%
  rowwise() %>%
  mutate(CV = list(kNN.cv(K, .train = classification_train_data)))
K_LOO %>%
  ggplot(aes(x = factor(K))) +
  geom_boxplot(data = unnest(K_LOO, CV), aes(y = CV)) +
  geom_point(aes(y = LOO), colour = "#999999") +
  labs(x = "K", y = "Accuracy")
Wrapper around 'class::knn.cv' which does Leave-One-Out (LOO) cross-validation
kNN.loo(K = 3, .train, response = "Type")
K | Number of nearest neighbors to use (can be a vector)
.train | Annotated training data to cross-validate
response | Name of the response variable
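The wrapped 'class::knn.cv' call itself can be sketched on the built-in 'iris' data (illustrating only the underlying function; the wrapper's handling of a vector-valued K is assumed, not shown):

```r
library(class)

# class::knn.cv classifies every row using all *other* rows
# as training data - i.e. leave-one-out cross-validation
loo <- knn.cv(train = iris[, 1:4], cl = iris$Species, k = 3)
loo_acc <- mean(loo == iris$Species)
```

'loo' holds one leave-one-out prediction per training row, so 'loo_acc' is the LOO accuracy.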
Wrapper around 'class::knn1'
kNN1(.test, .train, response = "Type")
.test | Data that should be classified based on the training data
.train | Annotated training data used to classify the test data
response | Name of the response variable
Mean distance to k nearest
meandist_to_k_nearest(K = 3, .test, .train, response = "Type", dist = FALSE, info = TRUE)
K | Number of nearest neighbors
.train | The training data
return_all | Logical. Should the distance to each of the nearest K be returned, or just their mean distance?
If 'return_all = FALSE' a data frame of the mean distance to each class of 'response' is returned. If 'return_all = TRUE' a list is returned - 'top_K' is as above, 'all' contains the closest neighbors from each class.
data(classification_train_data)
meandist_to_k_nearest_(K = 3, .train = classification_train_data) %>%
  mutate(same_Type = ifelse(obs_Type == Type, "Y", "N")) %>%
  ggplot(aes(x = obs_Type, y = Distance, fill = Type, colour = same_Type)) +
  labs(x = "Type of the observation", fill = "Type of the nearest points") +
  theme(legend.position = "top") +
  guides(colour = "none") +
  scale_colour_manual(values = c("Y" = "#666666", "N" = "#000000")) +
  geom_boxplot() +
  coord_flip()
Mean distance to k nearest
meandist_to_k_nearest_(K = 5, .train, response = "Type", return_all = FALSE)
K | Number of nearest neighbors
.train | The training data
return_all | Logical. Should the distance to each of the nearest K be returned, or just their mean distance?
If 'return_all = FALSE' a data frame of the mean distance to each class of 'response' is returned. If 'return_all = TRUE' a list is returned - 'top_K' is as above, 'all' contains the closest neighbors from each class.
A data frame containing the responses to two fictitious questions on the scale -2, -1, 0, 1, 2 together with a classification color.
perceptron31
A data frame with 31 rows and 3 variables:
Answer to first question.
Answer to second question.
Class
Helper function for making a predictive grid
pred_grid(data, step = 10, response = "Type", pred_var = "Prediction", center = 0)
data | Dataset
step | Step size in each data variable
response | The name of the response variable
center | If not through zero, then through 'center'
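The kind of grid such a helper produces can be sketched with base R's 'expand.grid()' (hypothetical variable names and ranges; 'pred_grid' itself derives these from 'data' and may label cells differently):

```r
# A 2-D prediction grid over hypothetical Length/Weight ranges,
# discretised into 'step' cells per variable; each row is one
# grid cell at which a class prediction can later be attached.
step <- 10
grd <- expand.grid(
  Length = seq(0, 30, length.out = step),
  Weight = seq(0, 500, length.out = step)
)
nrow(grd)  # one row per grid cell: step * step = 100
```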
Method for predicting the majority vote, or "?" in case of ties
pred_max(n, x)
n | Counts
x | Data vector
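The majority-vote-with-ties idea can be sketched in base R (a hypothetical re-implementation for illustration, not the package's own code, which works from precomputed counts):

```r
# Return the most frequent value in x, or "?" when the top count is tied
majority_vote <- function(x) {
  counts <- table(x)
  winners <- names(counts)[counts == max(counts)]
  if (length(winners) > 1) "?" else winners
}

majority_vote(c("a", "a", "b"))       # "a"
majority_vote(c("a", "a", "b", "b"))  # "?"
```

Returning "?" instead of breaking ties at random keeps the grid plots honest about ambiguous cells.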
Helper function for plotting a predictive grid
pred_plot_grid(pred_grd, pred_var = "Prediction", remove = TRUE)
pred_grd | Output from the YY function
remove | Is passed to the 'remove' argument of 'tidyr::separate()'
Create grid for new data
predict_grid(pred_grd, newdata)
pred_grd | Returned from ?
newdata | New data to be used in prediction
Makes print return all rows in a tibble
Print(...)
... | Arguments passed to 'knitr::kable'
Create a discretised version with some pretty labels
seq_cut(x, step, center, breaks = FALSE)
x | Data variable
step | Step size
center | If not through zero, then through 'center'
breaks | Logical. Should break labels be returned?
seq_cut(rnorm(100), step = 2, center = 0, breaks = TRUE)
Create breaks for 'seq_cut'
seq_zero(x, step, center)
x | Data variable
step | Step size
center | If not through zero, then through 'center'
seq_zero(rnorm(100), step = 2, center = 0)
Plot of data for an exercise by Jan B Sørensen on classification
xy_plot(train, x, y, colour, test = NULL, selected = NULL)
train | Training data set
x, y, colour | Parameters controlling the x and y axes and point colours
test | Test data set
selected | Points to highlight
data(classification_train_data)
data(classification_test_data)
type_cols <- c("1" = "#E41A1C", "2" = "#377EB8", "3" = "#4DAF4A", "?" = "#444444")
xy_plot(train = classification_train_data, x = Længde, y = Vægt, colour = Type) +
  scale_colour_manual(values = type_cols)