Training Rounds

Planning document for the training rounds feature — admin-defined practice/training sessions that reviewers must complete before accessing the full study set for a stage.


Problem Statement

Project admins need to verify that reviewers understand the annotation questions and apply them consistently before those reviewers begin annotating real studies. Currently, there is no mechanism to:

  1. Designate a subset of studies as "training" studies with known expected answers
  2. Require reviewers to complete training before accessing the full study set
  3. Evaluate reviewer performance against expected answers or inter-rater agreement
  4. Gate access to the full study set based on training outcomes

Without training rounds, admins must rely on external processes (email instructions, separate training documents, manual checks) to verify reviewer readiness. These processes are error-prone and do not scale as the number of reviewers grows.


Core Concepts

Training Round

A training round is a configuration on a stage that defines:

  • A set of training studies (selected by the admin from the stage's study pool)
  • Expected answers (optional) — the admin's gold-standard answers for each training study, used to score reviewer performance
  • Pass criteria — the threshold(s) a reviewer must meet to pass (e.g., agreement percentage, specific questions that must match)
  • Attempt policy — whether reviewers can retry if they fail, and how many attempts are allowed
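
As an illustration only, the four parts of a training round could be modelled roughly as follows. All class and field names here (`TrainingRound`, `PassCriteria`, `AttemptPolicy`, `must_match_question_ids`, and so on) are hypothetical, and the 0.8 default is taken from the 80% example used elsewhere in this document rather than from any agreed requirement.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class PassCriteria:
    # Minimum fraction of answers that must match the expected answers (0.0 to 1.0).
    agreement_threshold: float = 0.8
    # Question IDs whose answers must match exactly (hypothetical field).
    must_match_question_ids: frozenset = frozenset()


@dataclass(frozen=True)
class AttemptPolicy:
    # None is assumed to mean unlimited retries; 1 means a single attempt.
    max_attempts: Optional[int] = None


@dataclass
class TrainingRound:
    stage_id: str
    training_study_ids: list
    pass_criteria: PassCriteria
    attempt_policy: AttemptPolicy
    # Optional gold-standard answers: study_id -> {question_id: answer}.
    expected_answers: dict = field(default_factory=dict)

    def has_expected_answers(self) -> bool:
        """True when the admin has supplied at least one expected answer."""
        return bool(self.expected_answers)
```

Keeping expected answers optional in the model mirrors the configuration above: a round without them can still gate access, but cannot be scored against a gold standard.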

Training Session

A training session is a reviewer's attempt at the training round. It is a regular annotation session (using the same questions, same form, same SQS version), except that it is:

  • Scoped to the training studies only
  • Marked as a training session (distinct from regular annotation sessions)
  • Evaluated against the pass criteria on completion
  • Excluded from the stage's regular annotation data

Reviewer Eligibility

A reviewer's eligibility to annotate the full study set for a stage depends on:

  • Whether a training round is configured for the stage
  • Whether the reviewer has a passing training session
  • Whether the admin has manually overridden eligibility (e.g., for experienced reviewers)
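
The three factors above could be combined into a single check along these lines. This is a minimal sketch: the function and parameter names are hypothetical, and it assumes that a stage with no training round imposes no gate and that an admin override only ever grants access.

```python
def is_eligible_for_full_study_set(
    training_round_configured: bool,
    has_passing_session: bool,
    admin_granted_access: bool = False,
) -> bool:
    """Decide whether a reviewer may annotate the full study set for a stage."""
    # Assumption: a manual override grants access regardless of training outcome.
    if admin_granted_access:
        return True
    # Assumption: with no training round configured, there is nothing to gate on.
    if not training_round_configured:
        return True
    return has_passing_session
```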

Relationship to Question Management v2

Training rounds use the same question set as the stage. Key intersections:

SQS Version Pinning

A training round should be pinned to a specific SQS version. If the admin updates questions on the stage (publishes a new SQS version), the training round has two options:

  1. Stay pinned — reviewers who started training continue with the version they started on. New reviewers starting training after the publish also use the pinned version. The admin must explicitly update the training round to use the new SQS version.

  2. Auto-update — the training round always uses the latest SQS version. This means in-progress training sessions may be affected by question changes (same breaking change logic as regular sessions).

Recommended default: Stay pinned. Training rounds are calibration exercises — changing the questions mid-training invalidates the calibration. The admin explicitly updates when ready.
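
A rough sketch of how the two modes might resolve to a concrete version when a training session starts. It assumes, purely for illustration, that SQS versions are integers and that auto-update mode is encoded as an absent pin; neither is decided by this document.

```python
from typing import Optional


def resolve_training_sqs_version(pinned_version: Optional[int], latest_version: int) -> int:
    """Return the SQS version a newly started training session should use.

    pinned_version is None when the round is in auto-update mode
    (an assumed encoding, not a settled design).
    """
    if pinned_version is None:
        # Auto-update: always follow the latest published version.
        return latest_version
    # Stay pinned: ignore newer versions until the admin updates the pin.
    return pinned_version
```

For example, `resolve_training_sqs_version(3, 5)` keeps a pinned round on version 3 even though version 5 has been published.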

Expected Answers and Question Versioning

Expected answers are tied to a specific AQ version. If the admin publishes a new version of a question (e.g., adds an option), the expected answer may need updating. The system should flag this: "Training round expected answers reference an older version of 2 questions. [Update expected answers →]"
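
Stale expected answers could be detected with a comparison like the one below. The sketch assumes AQ versions are comparable integers and that both inputs are simple mappings from question ID to version; the function names are hypothetical.

```python
def stale_expected_answer_questions(expected_versions: dict, current_versions: dict) -> list:
    """Return question IDs whose expected answer targets an older AQ version.

    expected_versions: question_id -> version the expected answer was written against.
    current_versions:  question_id -> latest published version of that question.
    """
    return sorted(
        qid
        for qid, version in expected_versions.items()
        if qid in current_versions and current_versions[qid] > version
    )


def stale_warning(expected_versions: dict, current_versions: dict):
    """Build the admin-facing warning text, or None when nothing is stale."""
    stale = stale_expected_answer_questions(expected_versions, current_versions)
    if not stale:
        return None
    noun = "question" if len(stale) == 1 else "questions"
    return f"Training round expected answers reference an older version of {len(stale)} {noun}."
```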

Training Annotations and Version Tracking

Training annotations follow the same versioning model as regular annotations:

  • Each training annotation records QuestionVersionId
  • Training sessions have session versions (ASVs) when Phase 8 is active
  • Training data is kept separate from regular annotation data (flagged as training)
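
One simple way to keep training data separate is a boolean flag on each annotation record, filtered out wherever regular data is read. The record shape below is an assumption for illustration; only `QuestionVersionId` (shown here as `question_version_id`) comes from the versioning model described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationRecord:
    study_id: str
    question_id: str
    question_version_id: str  # mirrors QuestionVersionId
    answer: str
    is_training: bool  # hypothetical flag marking training data


def regular_annotations(records: list) -> list:
    """Records that count toward the stage's regular annotation data."""
    return [r for r in records if not r.is_training]


def training_annotations(records: list) -> list:
    """Records produced by training sessions only."""
    return [r for r in records if r.is_training]
```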

User Workflows

Admin: Configure Training Round

  1. Navigate to stage settings → Training
  2. Select training studies from the stage's study pool
  3. Optionally provide expected answers for each study (the admin completes the annotation form for each training study as the "gold standard")
  4. Set pass criteria:
       • Agreement threshold (e.g., 80% match with expected answers)
       • Or: specific questions that must match exactly
       • Or: minimum number of studies that must pass
  5. Set attempt policy: unlimited retries, N retries, or single attempt
  6. Activate the training round
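
To make the pass criteria concrete, here is a minimal scoring sketch covering the agreement threshold and the must-match questions. It assumes answers are compared by simple equality and that a missing answer counts as a mismatch; real question types (multi-select, free text) would likely need their own comparison rules. All names are hypothetical.

```python
def score_session(expected: dict, actual: dict):
    """Compare reviewer answers with expected answers.

    Both arguments map study_id -> {question_id: answer}.
    Returns (agreement fraction, list of (study_id, question_id) mismatches).
    """
    total = 0
    mismatches = []
    for study_id, expected_for_study in expected.items():
        given = actual.get(study_id, {})
        for question_id, expected_answer in expected_for_study.items():
            total += 1
            # A missing answer is treated as a mismatch (assumption).
            if question_id not in given or given[question_id] != expected_answer:
                mismatches.append((study_id, question_id))
    score = (total - len(mismatches)) / total if total else 0.0
    return score, mismatches


def passes_training(expected: dict, actual: dict, threshold: float = 0.8, must_match=frozenset()) -> bool:
    """Apply the agreement threshold and any must-match questions."""
    score, mismatches = score_session(expected, actual)
    if any(question_id in must_match for _, question_id in mismatches):
        return False
    return score >= threshold
```

With two studies of two questions each and one wrong answer, the score is 0.75, which fails an 80% threshold but would pass a 75% one.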

Admin: Review Training Results

  1. Navigate to stage settings → Training → Results
  2. See a table of reviewers with their training status:
       • Not started / In progress / Passed / Failed (attempt N of M)
  3. Drill into a reviewer's training session to see:
       • Per-study, per-question comparison of their answer vs expected answer
       • Overall score
       • Which questions they got wrong
  4. Override eligibility manually if needed (e.g., grant access despite failing)
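
The status column in the results table could be derived from a few per-reviewer facts, as sketched below. The inputs and the handling of unlimited retries (where there is no "M" to show) are assumptions for illustration.

```python
from typing import Optional


def training_status(
    attempts_started: int,
    in_progress: bool,
    passed: bool,
    max_attempts: Optional[int],
) -> str:
    """Return the status label shown for one reviewer in the results table."""
    if passed:
        return "Passed"
    if in_progress:
        return "In progress"
    if attempts_started == 0:
        return "Not started"
    if max_attempts is None:
        # Unlimited retries: no upper bound to display (assumed wording).
        return f"Failed (attempt {attempts_started})"
    return f"Failed (attempt {attempts_started} of {max_attempts})"
```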

Reviewer: Complete Training

  1. Open a stage that has a training round configured
  2. See a message: "This stage requires training before you can begin reviewing. Complete the training studies to proceed."
  3. The annotation form shows training studies only (same questions, same form)
  4. Complete all training studies
  5. On completion, see results: "You scored 85% — Training passed!" or "You scored 65% — Training not passed. You can try again."
  6. If passed: the full study set becomes accessible
  7. If failed: retry (if allowed) or contact admin
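
Steps 5 to 7 depend on both the score and the attempt policy. A small sketch of that decision follows; it assumes `max_attempts=None` means unlimited retries, and the message wording is illustrative rather than final UI copy.

```python
from typing import Optional


def can_retry(attempts_used: int, max_attempts: Optional[int]) -> bool:
    """True when the attempt policy still allows another training session."""
    return max_attempts is None or attempts_used < max_attempts


def completion_message(
    score: float,
    threshold: float,
    attempts_used: int,
    max_attempts: Optional[int],
) -> str:
    """Build the message shown to the reviewer when a training session ends."""
    percent = round(score * 100)
    if score >= threshold:
        return f"You scored {percent}%. Training passed!"
    if can_retry(attempts_used, max_attempts):
        return f"You scored {percent}%. Training not passed. You can try again."
    return f"You scored {percent}%. Training not passed. Please contact your admin."
```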

Scope and Phasing

Training rounds are independent of the QM v2 implementation phases but should be designed with version awareness:

  • Can be implemented before Phase 8: Training sessions use the same annotation storage as regular sessions. No dependency on annotation versioning.
  • Benefits from Phase 8: With annotation versioning, training session retries are preserved in history. Without it, retries overwrite previous attempts (same tree-shaking limitation as regular annotations).
  • Recommended timing: After Phases 1-7 (QM v2 core), before or alongside Phase 8. Could be a parallel workstream.

SQS Version Pinning Dependency

Training rounds need to reference a specific SQS version. This requires:

  • SQS versions to exist (Phase 1)
  • The ability to pin a training round to a version (new field on training round config)
  • Detection of stale expected answers when questions are updated

These are lightweight additions to the Phase 1 data model.


Open Questions

  1. Should training data be included in data exports?
     Notes: Probably not by default, but should be available as a separate export for inter-rater reliability analysis.
  2. Can a stage have multiple training rounds?
     Notes: Use case: different training rounds for different reviewer cohorts, or a recalibration round mid-study. Probably yes, but v1 could support one per stage.
  3. Should training rounds support screening as well as annotation?
     Notes: The problem statement mentions screening. If yes, the training round needs to support both screening decisions and annotation forms.
  4. How does reconciliation interact with training?
     Notes: Training is individual, so there is no reconciliation phase. But the admin might want to see inter-rater agreement across training sessions.
  5. Should training studies be real studies or synthetic?
     Notes: Real studies (subset of the pool) are simpler. Synthetic studies (admin-created) offer more control but require a study creation workflow. Start with real studies.
  6. What happens to training data if the training round is reconfigured?
     Notes: Previous training sessions should be preserved (historical record) but may be invalidated if questions change.