% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/agree.R
\name{agree_tab}
\alias{agree_tab}
\title{Agreement for multiple items}
\usage{
agree_tab(
  data,
  cols,
  coders,
  ids = NULL,
  category = NULL,
  method = "reliability",
  labels = TRUE,
  clean = TRUE,
  ...
)
}
\arguments{
\item{data}{A tibble containing item measures, coders and case IDs.}

\item{cols}{A tidy selection of item variables with ratings (e.g. \code{starts_with(...)}).}

\item{coders}{The column holding coders or methods to compare.}

\item{ids}{The column with case IDs.}

\item{category}{Applies to the classification performance indicators.
If no category is provided, macro statistics are returned
(along with the number of categories in the output).
Provide a category to get the statistics for this category only.
If values are boolean (TRUE / FALSE) and no category is provided,
the category is always assumed to be "TRUE".}

\item{method}{The output metrics, one of \code{reliability} or \code{classification}.
You can abbreviate it, e.g. \code{reli} or \code{class}.}

\item{labels}{If TRUE (default), extracts labels from the attributes, see \link{codebook}.}

\item{clean}{Prepare data by \link{data_clean}.}

\item{...}{Placeholder to allow calling the method with unused parameters from \link{report_counts}.}
}
\value{
A volker tibble with one row for each item.
The item name is returned in the first column.
For the reliability method, the following columns are returned:
\itemize{
\item \strong{n}: Number of cases (each case ID is only counted once).
\item \strong{Coders}: Number of coders.
\item \strong{Categories}: Number of categories.
\item \strong{Holsti}: Percent agreement (same as accuracy).
\item \strong{Krippendorff's Alpha}: Chance-corrected reliability score.
\item \strong{Kappa}: Cohen's Kappa for two coders, Fleiss' Kappa for more than two coders.
\item \strong{Gwet's AC1}: Gwet's agreement coefficient.
}

For the classification method, the following columns are returned:
\itemize{
\item \strong{n}: Number of cases (each case ID is only counted once).
\item \strong{Categories}: Number of categories.
\item \strong{Accuracy}: Share of correct classifications.
\item \strong{Precision}: Share of true cases among all cases detected as true.
\item \strong{Recall}: Share of detected true cases among all true cases.
\item \strong{F1}: Harmonic mean of precision and recall.
}
}
\description{
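Compares the codings of multiple items between coders or methods.
Two types of comparison are provided: reliability and classification (see Details).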
}
\details{
\itemize{
\item Reliability: Compare codings of two or more raters in content analysis.
Common reliability measures are percent agreement (also known as Holsti),
Fleiss' or Cohen's Kappa, Krippendorff's Alpha and Gwet's AC1.
\item Classification: Compare true and predicted categories from classification methods.
Common performance metrics include accuracy, precision, recall and F1.
}
}
\examples{
library(dplyr)
library(volker)

data <- volker::chatgpt

# Prepare example data.
# First, recode the activity items to TRUE/FALSE for the first coder's sample:
# marked cells become TRUE, missing cells become FALSE.
data_coder1 <- data |>
  mutate(across(starts_with("cg_act_"), ~ !is.na(.x))) |>
  mutate(coder = "coder one")

# Second, recode using a dictionary approach for the second coder's sample:
# cg_act_write is TRUE if the open answer mentions one of the keywords.
data_coder2 <- data |>
  mutate(across(starts_with("cg_act_"), ~ !is.na(.x))) |>
  mutate(cg_act_write = grepl("write|text|translate", tolower(cg_activities))) |>
  mutate(coder = "coder two")

data_coded <- bind_rows(
  data_coder1,
  data_coder2
)

# Strictly speaking, reliability coefficients are only appropriate for manual codings
agree_tab(data_coded, cg_act_write, coder, case, method = "reli")
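
# For intuition, a minimal base R sketch of what percent agreement and
# Cohen's Kappa measure for two coders. The toy_* vectors are made-up
# values, and this is an assumption-laden illustration of the textbook
# formulas, not necessarily how agree_tab() computes its results.
toy_one <- c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE)
toy_two <- c(TRUE, FALSE, FALSE, FALSE, TRUE, TRUE)

# Observed agreement: share of cases both coders coded identically
toy_agree <- mean(toy_one == toy_two)

# Agreement expected by chance, derived from each coder's marginal shares
toy_chance <- mean(toy_one) * mean(toy_two) +
  mean(!toy_one) * mean(!toy_two)

# Cohen's Kappa: agreement beyond chance relative to the maximum possible
toy_kappa <- (toy_agree - toy_chance) / (1 - toy_chance)
toy_kappa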

# To compare the dictionary approach with human coding,
# classification performance indicators are the better choice
agree_tab(data_coded, cg_act_write, coder, case, method = "class")
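
# For intuition, a minimal base R sketch of the classification metrics for
# the category TRUE. The toy_* vectors are made-up values; treating the
# first vector as the ground truth is an assumption of this sketch and may
# differ from how agree_tab() assigns the roles of the coders.
toy_truth <- c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE)
toy_pred <- c(TRUE, FALSE, FALSE, FALSE, TRUE, TRUE)

# Confusion counts for the category TRUE
toy_tp <- sum(toy_pred & toy_truth)   # detected and actually true
toy_fp <- sum(toy_pred & !toy_truth)  # detected but actually false
toy_fn <- sum(!toy_pred & toy_truth)  # missed true cases

# Accuracy: share of correct classifications
toy_accuracy <- mean(toy_pred == toy_truth)

# Precision: share of true cases among all cases detected as true
toy_precision <- toy_tp / (toy_tp + toy_fp)

# Recall: share of detected true cases among all true cases
toy_recall <- toy_tp / (toy_tp + toy_fn)

# F1: harmonic mean of precision and recall
toy_f1 <- 2 * toy_precision * toy_recall / (toy_precision + toy_recall)
c(accuracy = toy_accuracy, precision = toy_precision,
  recall = toy_recall, f1 = toy_f1)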

}
\keyword{internal}
