IRT Assessment System

3-Parameter Logistic Model · SAT / UTBK-SNBT Compatible · Score 0–1000
How it works: Upload your binary response matrix (examinees × items), configure scoring options, and click Run Analysis. The system fits the 3PL IRT model with a simplified EM-based algorithm, estimates each examinee's ability θ by EAP/MLE, scales scores to 0–1000, and produces charts and downloadable CSV reports, all in the browser.
3PL IRT Model

P(θ) = c + (1−c) / (1 + e^(−a(θ−b)))
a = discrimination   b = difficulty   c = guessing
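
As a rough illustration, the response function above can be written in a few lines of Python. The function name icc_3pl is hypothetical and not part of the tool; only the formula comes from the model definition above.

```python
import math

def icc_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# At theta == b the logistic term equals 0.5, so P = c + (1 - c) / 2.
p_mid = icc_3pl(theta=0.0, a=1.2, b=0.0, c=0.2)
print(round(p_mid, 3))  # 0.6
```

For very low θ the probability approaches the guessing floor c, and for very high θ it approaches 1.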

Question Weighting

w = a · (1 + |b|/3) · (1 − 2c)
Weights are normalized to sum to 1000.
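More discriminating items, and items farther from average difficulty (larger |b|), get higher weight; a higher guessing parameter c lowers the weight.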
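
A minimal sketch of the weighting step, assuming item parameters are available as (a, b, c) tuples. The function name item_weights and the sample parameters are made up for illustration.

```python
def item_weights(params, total=1000.0):
    """Return IRT-based weights for (a, b, c) tuples, normalized to sum to `total`."""
    raw = [a * (1.0 + abs(b) / 3.0) * (1.0 - 2.0 * c) for a, b, c in params]
    raw_sum = sum(raw)
    if raw_sum <= 0:
        # Can happen if guessing parameters exceed 0.5; the formula then goes negative.
        raise ValueError("raw weights must sum to a positive value")
    return [total * w / raw_sum for w in raw]

# Three hypothetical items: average, hard, and very easy.
weights = item_weights([(1.0, 0.0, 0.2), (1.5, 1.5, 0.2), (0.8, -3.0, 0.25)])
print([round(w, 2) for w in weights])  # [218.18, 490.91, 290.91]
```

Because the formula uses |b|, the very easy third item (b = −3) also receives a difficulty boost, and an item with c above 0.5 would get a negative raw weight.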

Score Scaling
  • Perfect (all correct) → 1000
  • Near-perfect (1 wrong) → separated band below 1000
  • Others → linear map of θ to [0, minNearPerfect−1]
  • Zero raw score → lowest observed IRT score
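
The four rules above could be sketched as follows. This is one possible reading, not the tool's actual code: it assumes complete data (no omits), maps θ linearly from the clamped range [−bound, bound], spaces near-perfect examinees evenly by θ rank, and takes "lowest observed IRT score" to mean the minimum θ-mapped score in the sample. The function name scale_scores and the defaults min_np=900, gap=30, bound=3 are assumptions.

```python
def scale_scores(thetas, n_correct, n_items, min_np=900.0, gap=30.0, bound=3.0):
    """Scale thetas to 0-1000 using the four rules listed above (assumed reading)."""
    top = min_np - 1.0  # ceiling for examinees with two or more wrong answers

    # Others: clamp theta to [-bound, bound], then map linearly onto [0, top].
    scores = [(max(-bound, min(bound, t)) + bound) / (2.0 * bound) * top
              for t in thetas]
    floor = min(scores)  # lowest observed score, reused for zero raw scores

    # Near-perfect: exactly one wrong, spread over [min_np, 1000 - gap] by theta rank.
    near = sorted((i for i, k in enumerate(n_correct) if k == n_items - 1),
                  key=lambda i: thetas[i])
    for rank, i in enumerate(near):
        frac = rank / (len(near) - 1) if len(near) > 1 else 1.0
        scores[i] = min_np + frac * (1000.0 - gap - min_np)

    for i, k in enumerate(n_correct):
        if k == n_items:
            scores[i] = 1000.0  # perfect: all items correct
        elif k == 0:
            scores[i] = floor   # zero raw score: lowest observed score
    return scores

scores = scale_scores([-2.5, -2.0, 0.0, 2.4, 2.6, 3.0], [1, 0, 5, 9, 9, 10], n_items=10)
print([round(s, 2) for s in scores])
```

In this example the two examinees with nine correct land at 900 and 970, the perfect examinee at exactly 1000, and the zero-score examinee shares the lowest θ-mapped score.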
Features
  • CSV upload or synthetic demo data
  • ICC, TIF, ability distribution charts
  • Per-question score matrix
  • Cronbach's alpha reliability
  • CSV download of all tables

Data Input

Upload Response File
Drop CSV here or click to browse
Rows = examinees · Columns = items (binary 0/1)
Expected format: first row = header (item names), first column optional ID. Values: 0 / 1 / blank (omit) / "e" (omit).
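
To make the expected format concrete, here is a small parsing sketch. It assumes the caller states whether an ID column is present (the has_id flag is hypothetical; how the tool detects the ID column is not specified) and represents omitted responses as None.

```python
import csv
import io

def _cell(raw):
    """Convert one CSV cell to 1, 0, or None (omitted)."""
    value = raw.strip().lower()
    if value in ("", "e"):
        return None
    if value in ("0", "1"):
        return int(value)
    raise ValueError(f"unexpected response value: {raw!r}")

def parse_responses(text, has_id=True):
    """Parse CSV text into (item_names, examinee_ids, response_matrix)."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    start = 1 if has_id else 0
    items = header[start:]
    ids, matrix = [], []
    for n, row in enumerate(body, 1):
        if not any(cell.strip() for cell in row):
            continue  # skip blank lines
        ids.append(row[0] if has_id else f"E{n}")
        cells = row[start:start + len(items)]
        cells += [""] * (len(items) - len(cells))  # pad short rows as omitted
        matrix.append([_cell(c) for c in cells])
    return items, ids, matrix

sample = "id,Q1,Q2,Q3\nS1,1,0,\nS2,e,1,1\n"
items, ids, matrix = parse_responses(sample)
print(matrix)  # [[1, 0, None], [None, 1, 1]]
```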
Generate Synthetic Demo Data
200 students
30 items
Simulates UTBK/SAT-style responses using true 3PL parameters.
About 5% of responses are randomly omitted (NA). Examinee abilities are drawn from N(0,1).
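
A demo dataset of this kind could be generated roughly as below. The item parameter ranges (a in [0.5, 2.0], b from N(0,1), c in [0.1, 0.3]) and the function name simulate_responses are assumptions chosen for illustration; the source does not state which "true" parameters the tool uses.

```python
import math
import random

def simulate_responses(n_students=200, n_items=30, omit_rate=0.05, seed=0):
    """Simulate 3PL responses; omitted cells are None."""
    rng = random.Random(seed)
    # Hypothetical "true" item parameters (a, b, c), for illustration only.
    items = [(rng.uniform(0.5, 2.0), rng.gauss(0.0, 1.0), rng.uniform(0.1, 0.3))
             for _ in range(n_items)]
    thetas = [rng.gauss(0.0, 1.0) for _ in range(n_students)]  # abilities ~ N(0, 1)
    matrix = []
    for theta in thetas:
        row = []
        for a, b, c in items:
            if rng.random() < omit_rate:
                row.append(None)  # randomly omitted response
                continue
            p = c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
            row.append(1 if rng.random() < p else 0)
        matrix.append(row)
    return items, thetas, matrix

items, thetas, matrix = simulate_responses()
print(len(matrix), len(matrix[0]))  # 200 30
```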

Analysis Configuration

Scoring Parameters
2 decimal places
Enable Near-Perfect Score Separation
Minimum near-perfect score (minNP): 900
Gap below 1000: 30 pts
Calculate Per-Question Scores & Weights
Parameter Descriptions
Decimal Precision
Controls decimal places in all output scores. 2 = SAT-style (e.g. 756.34).
Theta Bounds
Clamps the latent ability θ to a fixed range. The default of ±3 covers 99.7% of a standard normal distribution. Widen to ±4 to allow more extreme ability estimates.
Near-Perfect Separation
Examinees with exactly 1 wrong answer receive distinct scores in [minNP, 1000 − gap], ranked by θ. Prevents near-perfect bunching.
Per-Question Weights
IRT weight formula: w = a·(1+|b|/3)·(1−2c). Weights are normalized to sum to 1000. A correct answer earns the full weight; a wrong answer earns partial credit proportional to the guessing parameter.
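
To show how the Theta Bounds setting can interact with ability estimation, here is a minimal EAP sketch. It assumes a standard normal prior, known item parameters, and a uniform grid restricted to [−bound, bound], so the estimate cannot leave the bounds; omitted responses (None) are skipped in the likelihood. The function name eap_theta and the grid size are hypothetical, and the tool's own EAP/MLE approximation may differ.

```python
import math

def eap_theta(responses, items, bound=3.0, n_points=61):
    """EAP estimate of theta on a grid over [-bound, bound] with an N(0, 1) prior."""
    step = 2.0 * bound / (n_points - 1)
    grid = [-bound + k * step for k in range(n_points)]
    log_post = []
    for theta in grid:
        lp = -0.5 * theta * theta  # log N(0, 1) prior, up to a constant
        for x, (a, b, c) in zip(responses, items):
            if x is None:
                continue  # omitted responses do not enter the likelihood
            p = c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
            lp += math.log(p if x == 1 else 1.0 - p)
        log_post.append(lp)
    peak = max(log_post)  # subtract the maximum to avoid underflow on long tests
    weights = [math.exp(lp - peak) for lp in log_post]
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)

demo_items = [(1.0, -1.0, 0.2), (1.2, 0.0, 0.2), (1.5, 1.0, 0.2)]
print(eap_theta([1, None, 1], demo_items) > eap_theta([0, None, 0], demo_items))  # True
```

Unlike a pure MLE, the EAP estimate stays finite for all-correct and all-wrong response patterns, because the prior pulls it toward 0.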

Score Results

Mean Score
Perfect (1000)
Near-Perfect
Cronbach α
Student Score Table
Rank · Examinee · Scaled Score · Raw % · θ Ability · Percentile · Status

Visualizations

Score Distribution
Raw vs Scaled
θ Distribution
Test Info (TIF)
ICC Browser
Score Compare
IRT Scaled Score vs Question-Based Weighted Score (diagonal = perfect agreement)
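
For reference, the quantity behind the Test Info (TIF) chart can be sketched with the standard 3PL item information formula, I(θ) = a² · ((P − c)/(1 − c))² · (1 − P)/P, summed over items. The function names below are hypothetical, and the tool may compute or scale the curve differently.

```python
import math

def item_information(theta, a, b, c):
    """Fisher information of one 3PL item at ability theta."""
    p = c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
    return a * a * ((p - c) / (1.0 - c)) ** 2 * (1.0 - p) / p

def tif(theta, items):
    """Test information: the sum of item informations at theta."""
    return sum(item_information(theta, a, b, c) for a, b, c in items)

# With c = 0 the formula reduces to a^2 * P * (1 - P); at theta == b that is a^2 / 4.
print(item_information(0.0, 2.0, 0.0, 0.0))  # 1.0
```

The standard error of a θ estimate is approximately 1 / sqrt(TIF(θ)), so the curve shows where on the ability scale the test measures most precisely.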

Item Analysis (3PL Parameters)

Item Parameter Estimates (a, b, c)
Item · a (Discrimination) · b (Difficulty) · c (Guessing) · p-value · N Answered · ICC
Difficulty (b) Distribution
Discrimination (a) Distribution

Question Weight Analysis

IRT-based weight: w = a·(1+|b|/3)·(1−2c), normalized so all weights sum to 1000. Questions with higher discrimination, or with difficulty farther from average (larger |b|), are worth more; a higher guessing parameter reduces the weight.

Question Statistics & Weights
Item · Weight · Weight % · p-value · a (disc.) · b (diff.) · c (guess.) · Mean Score
Weights by Item (sorted)
Weight vs Difficulty

Export Data

CSV Downloads
Methodology Reference
Scoring Method
3PL IRT with Constrained Scaling
IRT Model
3-Parameter Logistic (a, b, c)
Score Range
0 – 1000
Perfect Score
Exactly 1000 (all correct)
Near-Perfect Handling
Separated in [minNP, 1000−gap] by θ rank
Theta Estimation
EAP / MLE approximation (browser JS)
Missing Data
Omitted (NA) — ignored in likelihood
Reliability
Cronbach's α (covariance formula)
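
The covariance form of Cronbach's α is α = k/(k−1) · (1 − trace(C)/sum(C)), where C is the k × k item covariance matrix. A minimal sketch follows; it assumes omitted responses are scored as 0 before computing covariances, which is an assumption of this example and may not match how the tool treats omits for reliability.

```python
def cronbach_alpha(matrix):
    """Cronbach's alpha from the item covariance matrix (omits scored as 0)."""
    rows = [[0 if x is None else x for x in row] for row in matrix]
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]

    def cov(i, j):
        # Sample covariance between items i and j.
        return sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)

    trace = sum(cov(j, j) for j in range(k))                     # sum of item variances
    total = sum(cov(i, j) for i in range(k) for j in range(k))   # variance of total score
    return k / (k - 1) * (1.0 - trace / total)

# Four examinees, three items in a Guttman-style pattern.
alpha = cronbach_alpha([[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]])
print(round(alpha, 2))  # 0.75
```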
Note: Because this runs in the browser without the R mirt package, item parameters are estimated using a simplified EM-based 3PL algorithm. Results match mirt closely for well-conditioned datasets. For very small samples (<50) or items with extreme p-values, results may differ from full mirt output.