RC Research
Can You Game the Reading Comprehension?
We tested four popular answer-choice heuristics against 10,340 real LSAT answer choices. The exam won.
LSAT students trade test-taking "tricks" like folk remedies: pick the longest answer. Avoid extreme language. The correct answer paraphrases; traps copy verbatim. These heuristics feel plausible. Some tutors teach them as legitimate strategies.
We ran four empirical analyses across every Reading Comprehension question in our database to find out if any of them actually work.
The short answer: no. But the details are interesting.
Analysis I
"Pick the longest answer"
The most persistent myth in standardized testing. Does it hold up?
Correct and incorrect answers are virtually identical in length: the mean difference is 0.2 words. The LSAT's item writers clearly control for answer length.
The shortest answer is slightly less likely to be correct (17.8% vs 20% expected). But the longest? Dead average. This heuristic gives you nothing.
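For readers who want to replicate Analysis I, the comparison is only a few lines of code. The sketch below uses toy data; `choices` and its contents are illustrative stand-ins for the real database, not our actual pipeline.

```python
# Sketch: mean word count of correct vs. incorrect answer choices.
# `choices` holds (text, is_correct) pairs; the two below are toy examples.
import statistics

def mean_word_count(choices, correct):
    """Mean word count over choices whose correctness flag matches `correct`."""
    return statistics.mean(
        len(text.split()) for text, flag in choices if flag == correct
    )

choices = [
    ("The author endorses some, but not all, of the critics' claims.", True),
    ("The author rejects every claim the critics make.", False),
]
print(mean_word_count(choices, True) - mean_word_count(choices, False))
```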
Analysis II
"Pick the hedged, conservative answer"
Correct answers use "some" and "may." Wrong ones say "all" and "never." Right?
This one is technically true: correct answers do use slightly more hedging language. But the effect is so small it's useless in practice. Picking the "most conservative" answer gives you 23.1% accuracy — barely above the 20% random baseline.
One sub-type stands out: Local Inference questions show d = 0.25 (p < 0.001), the only question type where the hedging signal clears the "small effect" threshold.
Analysis III
"Correct answers paraphrase; traps copy verbatim"
Perhaps the most widely taught LSAT heuristic. If a choice lifts exact words from the passage, it's a trap.
The paraphrase heuristic is not supported by the data. Correct and incorrect answers share almost exactly the same proportion of words with the passage. If anything, the direction runs opposite to the myth: correct answers have marginally more overlap, not less.
Analysis IV
Semantic similarity to the passage
Beyond raw words: do correct answers live in a distinct semantic neighborhood?
We built TF-IDF vectors for all 10,650 documents (310 passages + 10,340 choices) and computed cosine similarity between each answer choice and its passage.
Overall: no signal. But broken down by question type, one standout emerges.
The one real finding
Global Main Point questions (n = 252) show a small-to-medium effect: correct answers are significantly more semantically similar to the passage (d = 0.37, p < 0.001). This makes intuitive sense — the main point should closely mirror the passage's vocabulary. But this is one question type out of twelve.
The full picture
| Heuristic | Cohen's d | p-value | Verdict |
|---|---|---|---|
| "Pick the longest" | 0.02 | 0.42 | Myth |
| "Pick the hedged/conservative" | 0.15 | < 0.001 | Negligible |
| "Paraphrase = correct" | 0.03 | 0.17 | Not confirmed |
| "Most similar to passage" | 0.06 | 0.01 | Negligible |
The LSAT's answer-choice engineering is remarkably good at neutralizing surface-level shortcuts.
None of the four heuristics produces an overall effect size above the 0.20 threshold for a "small" effect. The best performer, conservativeness, delivers 23.1% accuracy, barely clearing the 20% random-guess rate.
The only reliable approach is understanding the passage.
Methodology
How we measured this
Dataset: 310 LSAT RC passages containing 2,068 questions with 5 answer choices each (10,340 total choices).
Conservativeness score: (hedge word frequency − extreme word frequency) / total words. Hedge words include some, may, might, could, often, generally, tends. Extreme words include all, every, always, never, none, only, must.
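Translated directly into code, the score might look like the sketch below. The word lists are exactly those given above; the regex tokenizer is a simplifying assumption, not necessarily the one the analysis used.

```python
# Sketch: conservativeness = (hedge count - extreme count) / total words.
import re

HEDGES = {"some", "may", "might", "could", "often", "generally", "tends"}
EXTREMES = {"all", "every", "always", "never", "none", "only", "must"}

def conservativeness(text: str) -> float:
    """Positive scores mean hedged language; negative scores mean extreme language."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hedges = sum(w in HEDGES for w in words)
    extremes = sum(w in EXTREMES for w in words)
    return (hedges - extremes) / len(words)

print(conservativeness("Some readers may disagree."))    #  0.5 (hedged)
print(conservativeness("All readers always disagree."))  # -0.5 (extreme)
```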
Word overlap: Overlap ratio = (choice content words found in passage) / (total choice content words). Jaccard similarity also measured. Stop words excluded.
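Both metrics reduce to set operations on content words. In this sketch, the stop-word list is a tiny stand-in for whatever list the analysis actually used.

```python
# Sketch: word overlap between an answer choice and its passage.
import re

STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "that"}  # stand-in

def content_words(text: str) -> set[str]:
    """Lowercase tokens minus stop words."""
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS}

def overlap_ratio(choice: str, passage: str) -> float:
    """(choice content words found in passage) / (total choice content words)."""
    c, p = content_words(choice), content_words(passage)
    return len(c & p) / len(c) if c else 0.0

def jaccard(choice: str, passage: str) -> float:
    """Shared content words over the union of both word sets."""
    c, p = content_words(choice), content_words(passage)
    return len(c & p) / len(c | p) if (c | p) else 0.0
```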
TF-IDF similarity: Custom TF-IDF vectors across all 10,650 documents, L2-normalized, cosine similarity computed per choice-passage pair.
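The vectors in the study were custom-built; the sketch below reproduces the same computation with scikit-learn's `TfidfVectorizer` as a substitution, using two toy documents in place of the real 10,650.

```python
# Sketch: TF-IDF cosine similarity between each choice and its passage.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

passages = ["The author argues that reef ecosystems recover slowly after damage."]
choices = ["Reef ecosystems tend to recover slowly once damaged."]

# Fit one vocabulary over all documents, as in the analysis.
vec = TfidfVectorizer()  # L2 normalization is the default
X = vec.fit_transform(passages + choices)

passage_vecs = X[: len(passages)]
choice_vecs = X[len(passages):]
print(round(cosine_similarity(choice_vecs[0], passage_vecs[0])[0, 0], 3))
```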
Statistical tests: Welch's t-test, Cohen's d. Thresholds: negligible (< 0.20), small (0.20–0.50), medium (0.50–0.80), large (> 0.80).
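SciPy provides Welch's t-test directly via `equal_var=False`; Cohen's d is not built in, so this sketch implements the standard pooled-standard-deviation form, which we assume matches the convention used here. The data is synthetic.

```python
# Sketch: Welch's t-test plus Cohen's d on synthetic score distributions.
import numpy as np
from scipy import stats

def cohens_d(a, b):
    """Cohen's d using the pooled-standard-deviation convention."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

rng = np.random.default_rng(0)
correct = rng.normal(0.02, 0.10, 2_068)    # toy: one score per correct choice
incorrect = rng.normal(0.00, 0.10, 8_272)  # toy: one score per incorrect choice

t, p = stats.ttest_ind(correct, incorrect, equal_var=False)  # Welch's t-test
print(f"t = {t:.2f}, p = {p:.4f}, d = {cohens_d(correct, incorrect):.2f}")
```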