Training Rounds

Planning document for the training rounds feature — admin-defined practice/training sessions that reviewers must complete before accessing the full study set for a stage.

Problem Statement

Project admins need to verify that reviewers understand the annotation questions and apply them consistently before those reviewers begin annotating real studies. Currently, there is no mechanism to:

Designate a subset of studies as "training" studies with known expected answers
Require reviewers to complete training before accessing the full study set
Evaluate reviewer performance against expected answers or inter-rater agreement
Gate access to the full study set based on training outcomes

Without training rounds, admins must rely on external processes (email instructions, separate training documents, manual checks) to verify reviewer readiness. These processes are error-prone and do not scale as the number of reviewers grows.

Core Concepts

Training Round

A training round is a configuration on a stage that defines:

A set of training studies (selected by the admin from the stage's study pool)
Expected answers (optional) — the admin's gold-standard answers for each training study, used to score reviewer performance
Pass criteria — the threshold(s) a reviewer must meet to pass (e.g., agreement percentage, specific questions that must match)
Attempt policy — whether reviewers can retry if they fail, and how many attempts are allowed
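The four parts of a training round listed above could be modelled roughly as follows. This is a minimal sketch, not the planned schema: the class and field names (`TrainingRound`, `PassCriteria`, `AttemptPolicy`, `min_agreement`, `max_attempts`) and the use of string IDs are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class PassCriteria:
    # Hypothetical: fraction of answers that must match the expected answers.
    min_agreement: float = 0.8
    # Hypothetical: question IDs whose answers must match exactly.
    must_match_question_ids: frozenset = frozenset()


@dataclass(frozen=True)
class AttemptPolicy:
    # None is used here to mean "unlimited retries" (an assumption).
    max_attempts: Optional[int] = None


@dataclass
class TrainingRound:
    stage_id: str
    study_ids: list
    # expected_answers[study_id][question_id] -> answer.
    # Left empty when the admin provides no gold-standard answers.
    expected_answers: dict = field(default_factory=dict)
    pass_criteria: PassCriteria = field(default_factory=PassCriteria)
    attempt_policy: AttemptPolicy = field(default_factory=AttemptPolicy)

    def has_expected_answers(self) -> bool:
        return bool(self.expected_answers)
```

Keeping expected answers optional in the model mirrors the definition above, where scoring against a gold standard is only one way to evaluate reviewers.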

Training Session

A training session is a reviewer's attempt at the training round. It is a regular annotation session (using the same questions, same form, same SQS version) but:

Scoped to the training studies only
Marked as a training session (distinct from regular annotation sessions)
Evaluated against the pass criteria on completion
Excluded from the stage's regular annotation data

Reviewer Eligibility

A reviewer's eligibility to annotate the full study set for a stage depends on:

Whether a training round is configured for the stage
Whether the reviewer has a passing training session
Whether the admin has manually overridden eligibility (e.g., for experienced reviewers)
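The eligibility rule can be expressed as a small decision function. This is a sketch under stated assumptions: the function name `is_eligible` and its boolean parameters are hypothetical, and the override is modelled only as a grant of access, which is the case the document describes.

```python
def is_eligible(
    training_round_configured: bool,
    has_passing_session: bool,
    admin_override: bool = False,
) -> bool:
    """Decide whether a reviewer may annotate the full study set."""
    # An admin override grants access regardless of training outcome.
    if admin_override:
        return True
    # Stages without a training round are not gated at all.
    if not training_round_configured:
        return True
    # Otherwise the reviewer needs at least one passing training session.
    return has_passing_session
```

For example, `is_eligible(True, False)` returns `False`, while `is_eligible(True, False, admin_override=True)` returns `True`.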

Relationship to Question Management v2

Training rounds use the same question set as the stage. Key intersections:

SQS Version Pinning

A training round should be pinned to a specific SQS version. If the admin updates questions on the stage (publishes a new SQS version), the training round can respond in one of two ways:

Stay pinned — reviewers who started training continue with the version they started on. New reviewers starting training after the publish also use the pinned version. The admin must explicitly update the training round to use the new SQS version.
Auto-update — the training round always uses the latest SQS version. This means in-progress training sessions may be affected by question changes (same breaking change logic as regular sessions).

Recommended default: Stay pinned. Training rounds are calibration exercises — changing the questions mid-training invalidates the calibration. The admin explicitly updates when ready.
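The difference between the two options comes down to which version a training session resolves to when it starts. The sketch below assumes SQS versions are identified by increasing integers and uses hypothetical names (`VersionPolicy`, `resolve_sqs_version`); neither is specified in this document.

```python
from enum import Enum


class VersionPolicy(Enum):
    PINNED = "pinned"            # recommended default
    AUTO_UPDATE = "auto_update"


def resolve_sqs_version(
    policy: VersionPolicy, pinned_version: int, latest_version: int
) -> int:
    """Return the SQS version a training session should use."""
    if policy is VersionPolicy.PINNED:
        # Publishing a new version has no effect until the admin
        # explicitly re-pins the training round.
        return pinned_version
    # Auto-update always follows the latest published version.
    return latest_version
```

With a round pinned to version 3 and version 5 published on the stage, `resolve_sqs_version(VersionPolicy.PINNED, 3, 5)` returns `3`, so reviewers keep calibrating against the questions the admin prepared.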

Expected Answers and Question Versioning

Expected answers are tied to a specific AQ version. If the admin publishes a new version of a question (e.g., adds an option), the expected answer may need updating. The system should flag this: "Training round expected answers reference an older version of 2 questions. [Update expected answers →]"
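Detecting stale expected answers amounts to comparing the question version each expected answer was recorded against with the currently published version. The following is a minimal sketch, assuming integer version numbers keyed by question ID; the function names are hypothetical.

```python
from typing import Optional


def find_stale_questions(
    expected_answer_versions: dict, current_versions: dict
) -> list:
    """Return IDs of questions whose expected answer targets an older version."""
    return sorted(
        question_id
        for question_id, version in expected_answer_versions.items()
        # Questions missing from current_versions are treated as unchanged.
        if current_versions.get(question_id, version) > version
    )


def stale_warning(stale_question_ids: list) -> Optional[str]:
    """Build the admin-facing warning, or None when nothing is stale."""
    if not stale_question_ids:
        return None
    count = len(stale_question_ids)
    noun = "question" if count == 1 else "questions"
    return (
        "Training round expected answers reference an older version "
        f"of {count} {noun}."
    )
```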

Training Annotations and Version Tracking

Training annotations follow the same versioning model as regular annotations:

Each training annotation records QuestionVersionId
Training sessions have session versions (ASVs) when Phase 8 is active
Training data is kept separate from regular annotation data (flagged as training)
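One way to keep training data separate while sharing the same storage is a flag on each annotation record. The sketch below is illustrative only: `AnnotationRecord`, its fields other than the question version ID, and the `is_training` flag are assumed names, not the actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationRecord:
    study_id: str
    question_id: str
    question_version_id: str   # corresponds to QuestionVersionId above
    answer: str
    is_training: bool = False  # hypothetical flag marking training data


def split_training(records: list) -> tuple:
    """Return (regular, training) lists, preserving input order."""
    regular = [r for r in records if not r.is_training]
    training = [r for r in records if r.is_training]
    return regular, training
```

Anything that reads regular annotation data would then use only the first list, which keeps training answers out of the stage's results.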

User Workflows

Admin: Configure Training Round

Navigate to stage settings → Training
Select training studies from the stage's study pool
Optionally provide expected answers for each study (the admin completes the annotation form for each training study as the "gold standard")
Set pass criteria:
Agreement threshold (e.g., 80% match with expected answers)
Or: specific questions that must match exactly
Or: minimum number of studies that must pass
Set attempt policy: unlimited retries, N retries, or single attempt
Activate the training round
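The pass criteria configured in this workflow could be evaluated along the following lines. This sketch covers the agreement threshold and the must-match questions only, and it assumes that every configured criterion must hold. The workflow above lists the criteria as alternatives, so how they combine is still a design decision. All names are hypothetical.

```python
def agreement_score(expected: dict, given: dict) -> float:
    """Fraction of expected answers that the reviewer matched exactly.

    Both arguments map study_id -> {question_id: answer}.
    """
    total = 0
    matched = 0
    for study_id, answers in expected.items():
        for question_id, expected_answer in answers.items():
            total += 1
            if given.get(study_id, {}).get(question_id) == expected_answer:
                matched += 1
    return matched / total if total else 0.0


def passes_training(
    expected: dict,
    given: dict,
    threshold: float = 0.8,
    must_match: frozenset = frozenset(),
) -> bool:
    """Apply the must-match questions first, then the agreement threshold."""
    for study_id, answers in expected.items():
        for question_id in must_match:
            if question_id not in answers:
                continue
            if given.get(study_id, {}).get(question_id) != answers[question_id]:
                return False
    return agreement_score(expected, given) >= threshold
```

With two studies of two questions each and one mismatch, `agreement_score` returns `0.75`, which fails the default 80% threshold.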

Admin: Review Training Results

Navigate to stage settings → Training → Results
See a table of reviewers with their training status:
Not started / In progress / Passed / Failed (attempt N of M)
Drill into a reviewer's training session to see:
Per-study, per-question comparison of their answer vs expected answer
Overall score
Which questions they got wrong
Override eligibility manually if needed (e.g., grant access despite failing)
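The status column in the results table can be derived from a reviewer's attempt history. The sketch below assumes each attempt is recorded as one of the strings `"in_progress"`, `"passed"`, or `"failed"`, in chronological order. That representation, the function name, and the label used for unlimited attempts are assumptions.

```python
from typing import Optional


def training_status(outcomes: list, max_attempts: Optional[int] = None) -> str:
    """Summarise a reviewer's training status for the results table."""
    if not outcomes:
        return "Not started"
    latest = outcomes[-1]
    if latest == "in_progress":
        return "In progress"
    if latest == "passed":
        return "Passed"
    attempt_number = len(outcomes)
    if max_attempts is None:
        # With unlimited retries there is no "of M" to report.
        return f"Failed (attempt {attempt_number})"
    return f"Failed (attempt {attempt_number} of {max_attempts})"
```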

Reviewer: Complete Training

Open a stage that has a training round configured
See a message: "This stage requires training before you can begin reviewing. Complete the training studies to proceed."
The annotation form shows training studies only (same questions, same form)
Complete all training studies
On completion, see results: "You scored 85% — Training passed!" or "You scored 65% — Training not passed. You can try again."
If passed: the full study set becomes accessible
If failed: retry (if allowed) or contact admin
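What the reviewer can do after completing a training session depends on the outcome and the attempt policy. The sketch below assumes `max_attempts=None` means unlimited retries and counts the session just completed in `attempts_used`. The function name and return values are illustrative.

```python
from typing import Optional


def next_step(passed: bool, attempts_used: int, max_attempts: Optional[int]) -> str:
    """Return what the reviewer can do after finishing a training session."""
    if passed:
        return "full_study_set"
    # A retry is available while attempts remain, or always when unlimited.
    if max_attempts is None or attempts_used < max_attempts:
        return "retry"
    return "contact_admin"
```

Under a single-attempt policy, `next_step(False, 1, 1)` returns `"contact_admin"`, which is the point where the admin's manual override becomes relevant.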

Scope and Phasing

Training rounds are independent of the QM v2 implementation phases but should be designed with version awareness:

Can be implemented before Phase 8: Training sessions use the same annotation storage as regular sessions. No dependency on annotation versioning.
Benefits from Phase 8: With annotation versioning, training session retries are preserved in history. Without it, retries overwrite previous attempts (same tree-shaking limitation as regular annotations).
Recommended timing: After Phases 1-7 (QM v2 core), before or alongside Phase 8. Could be a parallel workstream.

SQS Version Pinning Dependency

Training rounds need to reference a specific SQS version. This requires:

SQS versions to exist (Phase 1)
The ability to pin a training round to a version (new field on training round config)
Detection of stale expected answers when questions are updated

These are lightweight additions to the Phase 1 data model.

Open Questions

#	Question	Notes
1	Should training data be included in data exports?	Probably not by default, but should be available as a separate export for inter-rater reliability analysis.
2	Can a stage have multiple training rounds?	Use case: different training rounds for different reviewer cohorts, or a recalibration round mid-study. Probably yes, but v1 could support one per stage.
3	Should training rounds support screening as well as annotation?	The Problem Statement above refers only to annotation, so screening support is undecided. If yes, the training round needs to support both screening decisions and annotation forms.
4	How does reconciliation interact with training?	Training is individual — no reconciliation phase. But the admin might want to see inter-rater agreement across training sessions.
5	Should training studies be real studies or synthetic?	Real studies (subset of the pool) are simpler. Synthetic studies (admin-created) offer more control but require a study creation workflow. Start with real studies.
6	What happens to training data if the training round is reconfigured?	Previous training sessions should be preserved (historical record) but may be invalidated if questions change.