Measures and constructs
The construct measured is chronotype: an individual's circadian preference for activity at a particular time of day. Chronotype is operationalized in chronobiology research as a continuous trait spanning an evening–morning axis, with categorical labels typically applied at the tails for descriptive convenience. The trait is moderately heritable (40–50% in twin studies), strongly modulated by age (shifting toward morningness with increasing age in adulthood), and only partially modifiable through behavioral intervention.
Chronotype is measured using self-report instruments. The validated gold standard is the Morningness-Eveningness Questionnaire (MEQ-19), developed by Horne and Östberg in 1976. The MEQ-19 has been cited in over 4,000 scientific publications and has been validated against objective circadian markers including dim light melatonin onset (DLMO), core body temperature minimum, and cortisol peak. Burgess et al. 2015 (PMC4580371) reported correlations between MEQ score and DLMO ranging from r = -0.41 to -0.76 across studies. The negative sign reflects that higher (more morning-type) scores go with earlier melatonin onset. This is solid convergent validity for a self-report instrument compared with a marker that requires serial melatonin sampling under controlled dim-light conditions.
The Munich ChronoType Questionnaire (MCTQ; Roenneberg, Wirz-Justice & Merrow, 2003) is the second-most-cited chronotype instrument and uses a different conceptualization — it measures mid-sleep on free days as the chronotype proxy rather than asking about preferences directly. The MCTQ has its own strengths (notably, more direct connection to circadian phase) but the MEQ-19 framework remains the field standard for cross-study comparability and is the basis for this implementation.
Instrument structure
The instrument consists of 19 items covering five functional domains of chronotype, with a single demographic input (age) used solely to determine which categorization to display as primary.
Domain coverage of the 19 items
| Domain | Items | Construct |
|---|---|---|
| Preferred timing | Q1, Q2, Q8, Q10, Q17 | When user would naturally sleep, wake, peak, and tire |
| Morning function | Q3, Q4, Q5, Q6, Q7 | Morning alertness, ease of waking, hunger, tiredness |
| Peak performance | Q11, Q15, Q18 | Mental peak, physical peak, absolute best time |
| Behavioral preference under choice | Q9, Q13, Q14, Q16 | Exercise timing, late-night party adaptation, forced-wake response |
| Self-identification | Q12, Q19 | 11 PM tiredness, lark/owl self-label |
Response option count by item
Items have either 4 or 5 response options. The distribution is calibrated to match the original MEQ-19: items related to preferred timing typically have 5 options (covering the full timing range), items related to function and behavior typically have 4 options (Likert-style preference scales). Each option carries an integer point value; the sum of points across all 19 items produces the total chronotype score.
Demographic input
A single demographic input (age, integer 14–110) is collected. Age is used only to determine whether the result displays the Horne & Östberg (1976) classification only (for users under 40) or includes the Taillard et al. (2004) middle-age classification as a secondary view (for users 40+). Age does not affect the score itself or the primary classification. The age input is not stored.
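The age rule described above can be sketched as a small helper. This is a minimal illustration, not the tool's actual code: the function name `classificationsToShow` and the shape of the returned object are assumptions; only the 14–110 input range and the age-40 threshold come from the text.

```javascript
// Hypothetical helper: decides which classification systems to display.
// Only the 14-110 range and the age-40 threshold come from the text;
// the function name and return shape are illustrative assumptions.
function classificationsToShow(age) {
  if (!Number.isInteger(age) || age < 14 || age > 110) {
    throw new RangeError('age must be an integer from 14 to 110');
  }
  // Horne & Östberg (1976) is always primary; Taillard et al. (2004)
  // is added as a secondary view for users aged 40 and over.
  return age >= 40
    ? { primary: 'Horne & Östberg 1976', secondary: 'Taillard et al. 2004' }
    : { primary: 'Horne & Östberg 1976', secondary: null };
}

console.log(classificationsToShow(39).secondary); // prints: null
console.log(classificationsToShow(40).secondary); // prints: Taillard et al. 2004
```

Because age only selects which views are shown, the helper never touches the score, and nothing in it needs to persist the age value.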
Scoring algorithm — full pseudocode
Item-level point values
// Point values per response option (zero-indexed by option position)
const ITEM_OPTIONS = {
  1: [5, 4, 3, 2, 1],  // Preferred wake time (5 options, 1-5)
  2: [5, 4, 3, 2, 1],  // Preferred bed time (5 options, 1-5)
  3: [4, 3, 2, 1],     // Alarm dependence (4 options, 1-4)
  4: [4, 3, 2, 1],     // Morning ease
  5: [4, 3, 2, 1],     // Morning alertness
  6: [4, 3, 2, 1],     // Morning hunger
  7: [1, 2, 3, 4],     // Morning tiredness (REVERSED)
  8: [4, 3, 2, 1],     // Bedtime free-day
  9: [4, 3, 2, 1],     // Morning exercise preference
  10: [5, 4, 3, 2, 1], // Evening tiredness onset (5 options)
  11: [6, 4, 2, 0],    // Mental peak (4 options, 0-6)
  12: [5, 3, 2, 0],    // 11 PM tiredness (4 options, 0-5)
  13: [4, 3, 2, 1],    // Late-night party
  14: [1, 2, 3, 4],    // Forced wake (REVERSED)
  15: [4, 3, 2, 1],    // Heavy physical work timing
  16: [1, 2, 3, 4],    // Evening exercise (REVERSED)
  17: [5, 4, 3, 2, 1], // 5-hour work block (5 options)
  18: [5, 4, 3, 2, 1], // Time of feeling best (5 options)
  19: [6, 4, 2, 0]     // Self-identification (4 options, 0-6)
};
Score computation
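function computeScore(responses) {
  // responses is an object mapping question number (1-19)
  // to chosen option index (0-based).
  let total = 0;
  for (let q = 1; q <= 19; q++) {
    const points = ITEM_OPTIONS[q][responses[q]];
    // A missing or out-of-range answer would otherwise turn the total into NaN.
    if (points === undefined) {
      throw new RangeError(`Missing or invalid response for item ${q}`);
    }
    total += points;
  }
  return total; // integer in range [16, 86]
}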
Score range verification
Sum of minimum points across all items: 1+1+1+1+1+1+1+1+1+1+0+0+1+1+1+1+1+1+0 = 16. Sum of maximum points: 5+5+4+4+4+4+4+4+4+5+6+5+4+4+4+4+5+5+6 = 86. The 16–86 range matches the established MEQ-19 scoring system exactly.
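The hand-computed sums above can be checked mechanically. The sketch below copies the point values from `ITEM_OPTIONS` into a plain array (the name `POINTS` and the helper `sum` are introduced here for illustration only) and derives the minimum and maximum totals.

```javascript
// Point values copied from ITEM_OPTIONS in this document, items 1-19 in order.
// POINTS and sum are illustrative names, not part of the tool's code.
const POINTS = [
  [5, 4, 3, 2, 1], [5, 4, 3, 2, 1], [4, 3, 2, 1], [4, 3, 2, 1], // Q1-Q4
  [4, 3, 2, 1], [4, 3, 2, 1], [1, 2, 3, 4], [4, 3, 2, 1],       // Q5-Q8
  [4, 3, 2, 1], [5, 4, 3, 2, 1], [6, 4, 2, 0], [5, 3, 2, 0],    // Q9-Q12
  [4, 3, 2, 1], [1, 2, 3, 4], [4, 3, 2, 1], [1, 2, 3, 4],       // Q13-Q16
  [5, 4, 3, 2, 1], [5, 4, 3, 2, 1], [6, 4, 2, 0],               // Q17-Q19
];

const sum = (values) => values.reduce((a, b) => a + b, 0);

// Lowest and highest reachable totals: take each item's extreme option.
const minScore = sum(POINTS.map((options) => Math.min(...options)));
const maxScore = sum(POINTS.map((options) => Math.max(...options)));

console.log(POINTS.length, minScore, maxScore); // prints: 19 16 86
```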
Categorization functions
function getBandHorne(score) {
  if (score <= 30) return 'Definitely Evening';
  if (score <= 41) return 'Moderately Evening';
  if (score <= 58) return 'Intermediate';
  if (score <= 69) return 'Moderately Morning';
  return 'Definitely Morning';
}

function getBandTaillard(score) {
  // Used when user age >= 40
  if (score >= 65) return 'Morning type';
  if (score >= 53) return 'Intermediate type';
  return 'Evening type';
}
Note on cutoff exactness: The Horne & Östberg (1976) cutoffs are 30/31, 41/42, 58/59, 69/70, which are exact integer boundaries. Scores on either side of a boundary (e.g., 30 vs. 31) place the user in different categories, which is by design in the original instrument. Because scores are integers, testing score ≤ 30 is equivalent to testing score < 31; we write the comparisons with ≤ for consistency with the original tabulation.
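The boundary behavior can be made explicit with a table-driven lookup. The sketch below is one possible equivalent of the Horne categorization, not the tool's own code: `HORNE_BANDS` and `bandHorne` are hypothetical names, and the range check is an added assumption. The cutoffs themselves come from the text.

```javascript
// Inclusive upper bound of each Horne & Östberg (1976) band, from the text.
// HORNE_BANDS and bandHorne are hypothetical names used for illustration.
const HORNE_BANDS = [
  [30, 'Definitely Evening'],
  [41, 'Moderately Evening'],
  [58, 'Intermediate'],
  [69, 'Moderately Morning'],
  [86, 'Definitely Morning'],
];

function bandHorne(score) {
  // Assumption: reject anything outside the reachable integer range.
  if (!Number.isInteger(score) || score < 16 || score > 86) {
    throw new RangeError('score must be an integer from 16 to 86');
  }
  // First band whose upper bound is at or above the score.
  return HORNE_BANDS.find(([upper]) => score <= upper)[1];
}

// Each published boundary pair falls into two adjacent bands.
for (const [lower, higher] of [[30, 31], [41, 42], [58, 59], [69, 70]]) {
  console.log(`${lower}: ${bandHorne(lower)} | ${higher}: ${bandHorne(higher)}`);
}
```

Keeping the cutoffs in one table makes it easy to audit them against the published values in a single place.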
Validation strategy
This is an original implementation in the framework of an externally validated questionnaire (the MEQ-19) rather than a verbatim implementation of it. Validation is staged across four levels:
Level 1 — Construct validity through framework adoption (complete)
The 19-item structure, scoring range (16–86), and category thresholds (30/31, 41/42, 58/59, 69/70) match the published MEQ-19 framework with citations to Horne & Östberg (1976) and Taillard et al. (2004). The five chronotype categories (Definitely Evening, Moderately Evening, Intermediate, Moderately Morning, Definitely Morning) and the five-category-to-three-category collapse for middle-aged populations are established in the literature.
Level 2 — Face validity through synthetic-profile testing (complete)
Four synthetic profiles spanning the chronotype spectrum were tested during development:
| Profile | Score | Horne & Östberg | Taillard |
|---|---|---|---|
| A. Strong night owl (last option each) | 25 | Definitely Evening | Evening type |
| B. Moderate evening (mixed evening-leaning) | 38 | Moderately Evening | Evening type |
| C. Intermediate (middle responses) | 59 | Moderately Morning | Intermediate type |
| D. Strong morning lark (first option each) | 77 | Definitely Morning | Morning type |
Distribution check passes (A=25 < B=38 < C=59 < D=77). Full 16–86 range is reachable. The Profile C result (score 59 mapping to "Moderately Morning" under Horne but "Intermediate type" under Taillard) is correct behavior demonstrating the dual-cutoff value: a 50-year-old user with this score sees a more accurate classification under Taillard.
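Profiles A and D can be reproduced directly from the point table. The sketch below is a sanity check under the assumption that "last option each" and "first option each" mean the same option position on all 19 items; `POINTS` and `scoreUniform` are names introduced here for illustration.

```javascript
// Point values copied from ITEM_OPTIONS in this document, items 1-19 in order.
// POINTS and scoreUniform are illustrative names, not part of the tool's code.
const POINTS = [
  [5, 4, 3, 2, 1], [5, 4, 3, 2, 1], [4, 3, 2, 1], [4, 3, 2, 1], // Q1-Q4
  [4, 3, 2, 1], [4, 3, 2, 1], [1, 2, 3, 4], [4, 3, 2, 1],       // Q5-Q8
  [4, 3, 2, 1], [5, 4, 3, 2, 1], [6, 4, 2, 0], [5, 3, 2, 0],    // Q9-Q12
  [4, 3, 2, 1], [1, 2, 3, 4], [4, 3, 2, 1], [1, 2, 3, 4],       // Q13-Q16
  [5, 4, 3, 2, 1], [5, 4, 3, 2, 1], [6, 4, 2, 0],               // Q17-Q19
];

// Score a profile that applies the same option-picking rule to every item.
const scoreUniform = (pickIndex) =>
  POINTS.reduce((total, options) => total + options[pickIndex(options)], 0);

const profileA = scoreUniform((options) => options.length - 1); // last option each
const profileD = scoreUniform(() => 0);                         // first option each

console.log(profileA, profileD); // prints: 25 77
```

Neither profile reaches the 16 or 86 extremes because the three reversed items (Q7, Q14, Q16) score in the opposite direction: each contributes 4 points to Profile A and 1 point to Profile D, a 9-point offset from the true minimum and maximum.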
Level 3 — Convergent validity against the original MEQ-19 (planned)
Planned for v2.0: administer this implementation alongside the original MEQ-19 in an n=400 sample, hypothesize Pearson r ≥ 0.85 based on framework parallelism and identical category-threshold structure. This validation has not been conducted.
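For reference, the planned Level 3 comparison reduces to a Pearson correlation between paired scores. The sketch below shows that computation only; the function name `pearson` and the two short arrays are placeholders invented for illustration and are not study data or results.

```javascript
// Pearson correlation between two equal-length numeric arrays.
function pearson(x, y) {
  if (x.length !== y.length || x.length < 2) {
    throw new Error('need two arrays of equal length, at least 2 values each');
  }
  const mean = (values) => values.reduce((a, b) => a + b, 0) / values.length;
  const meanX = mean(x);
  const meanY = mean(y);
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < x.length; i++) {
    const dx = x[i] - meanX;
    const dy = y[i] - meanY;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  return sxy / Math.sqrt(sxx * syy); // NaN if either array is constant
}

// PLACEHOLDER values invented for illustration only; NOT study data.
const placeholderThisTool = [25, 38, 47, 59, 77];
const placeholderOriginalMeq = [27, 36, 50, 57, 74];

const r = pearson(placeholderThisTool, placeholderOriginalMeq);
console.log(r.toFixed(3), r >= 0.85 ? 'meets' : 'misses', 'the hypothesized threshold');
```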
Level 4 — Validation against objective circadian markers (future research direction)
The original MEQ-19 has been validated against DLMO (r = -0.41 to -0.76 across studies; Burgess 2015), core body temperature minimum (r ≈ -0.5; Horne & Östberg 1976), and cortisol peak. We do not claim this validation transfers automatically to our implementation. Such validation is a future research direction, not a current claim.
Score-band derivation
Primary classification — Horne & Östberg 1976
The five Horne & Östberg (1976) chronotype categories were derived from the original 1976 sample of 150 university students. The category thresholds were not derived from population percentiles but from sample-internal cluster boundaries that produced descriptive labels distributing along the morningness-eveningness continuum.
| Score range | Category | Description |
|---|---|---|
| 16–30 | Definitely Evening | Strong evening preference |
| 31–41 | Moderately Evening | Mild evening preference |
| 42–58 | Intermediate | No strong directional preference |
| 59–69 | Moderately Morning | Mild morning preference |
| 70–86 | Definitely Morning | Strong morning preference |
Secondary classification — Taillard et al. 2004
The Taillard 2004 cutoffs were derived from a sample of 566 middle-aged French workers (mean age 51.2 years, no shift workers, no sleep disorders) using multiple correspondence analysis followed by ascending hierarchical classification on the principal components. The analysis produced three clusters in the middle-aged sample with the following cutoffs:
| Score range | Category (Taillard) | Distribution in 2004 sample |
|---|---|---|
| ≥ 65 | Morning type | 28.1% |
| 53–64 | Intermediate type | 51.7% |
| < 53 | Evening type | 20.2% |
The Taillard cutoffs resolve a known systematic misclassification: when the original 1976 cutoffs are applied to the middle-aged sample, 62.1% classify as morning, 36.6% as intermediate, and only 2.2% as evening — implausibly skewed toward morningness. The recalibrated cutoffs produce a balanced distribution that better reflects true chronotype variation in middle-aged populations.
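To see where the two systems diverge, the sketch below enumerates every reachable score and compares the Taillard category with a three-way collapse of the Horne bands. The collapse rule (merging "Definitely" and "Moderately" on each side) is an assumption made for this comparison, and `horne3` and `taillard3` are illustrative names.

```javascript
// ASSUMPTION: collapse the five Horne & Östberg bands to three by merging
// "Definitely" and "Moderately" on each side. Names are illustrative.
function horne3(score) {
  if (score <= 41) return 'Evening';
  if (score <= 58) return 'Intermediate';
  return 'Morning';
}

// Taillard et al. (2004) cutoffs as given in the table above.
function taillard3(score) {
  if (score >= 65) return 'Morning';
  if (score >= 53) return 'Intermediate';
  return 'Evening';
}

// Collect every score in the 16-86 range where the two systems disagree.
const disagreements = [];
for (let score = 16; score <= 86; score++) {
  if (horne3(score) !== taillard3(score)) disagreements.push(score);
}

console.log(disagreements.length); // prints: 17
console.log(disagreements[0], disagreements[disagreements.length - 1]); // prints: 42 64
```

Under this collapse the two systems disagree on two stretches: 42–52 (Intermediate under Horne, Evening under Taillard) and 59–64 (Morning under Horne, Intermediate under Taillard). In both cases the Taillard cutoffs move the label one step toward eveningness.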
Why we display both
We display the Horne & Östberg classification as primary (because it is the most widely-recognized standard with the most cross-study comparability) and the Taillard classification as secondary when the user is 40 or older (because the Taillard cutoffs are calibrated for the user's age group and resolve the demographic-shift misclassification). This dual display lets the user see both their score on the standard scale and their age-appropriate categorization, without forcing either interpretation.
Limitations
Self-report limitations
Like all questionnaire-based chronotype instruments, this tool relies on self-report. Self-report is subject to recall bias (asking about preferences requires accessing semantic memory of typical patterns), social-desirability bias (cultural framing of morningness as virtuous may inflate scores in some users), and edge-case misclassification (users whose preferences shift across domains may produce composite scores that don't match any single category neatly). The 2013 Di Milia et al. review noted these limitations apply to the entire chronotype questionnaire literature.
Population-specific limitations
The 1976 cutoffs were derived from a 150-person sample of university students aged 18–32 — a relatively small N and a narrow demographic. The Taillard 2004 cutoffs were derived from 566 middle-aged French workers — larger N but cultural specificity that may not generalize cleanly to non-French, non-worker populations. Both cutoff systems are descriptive rather than population-normed for any specific demographic.
Implementation-specific limitations
Because this implementation uses original wording rather than verbatim MEQ-19 items, convergent validity with the original instrument has not been formally tested. We hypothesize r ≥ 0.85 based on framework parallelism and identical scoring structure, but this is unconfirmed. Additionally, this implementation has not been validated against objective circadian markers (DLMO, core body temperature minimum) directly; we infer reasonable convergent validity from the framework's prior validation but cannot claim it as tested.
Categorical labels are not population percentiles
Both the Horne & Östberg and Taillard cutoffs are descriptive thresholds, not empirically-derived percentiles for the general population. A user in the "Definitely Morning" range should not infer that they are in the top X% of the population for morningness — that inference would require population calibration that has not been conducted for the original instruments and certainly not for this implementation.
Test-retest reliability not established for this implementation
The original MEQ-19 has documented test-retest reliability with correlation coefficients typically r > 0.85 over 1-month intervals (Adan 2012 review). We have not separately tested test-retest reliability for this implementation. We expect comparable reliability based on framework parallelism but cannot claim it as established.
Not a diagnostic instrument
This is a measurement of natural circadian preference, not a diagnostic of any sleep disorder. Persistent inability to sleep at chronotype-appropriate times, daytime impairment, or extreme schedule misalignment may indicate a circadian rhythm sleep-wake disorder (Delayed Sleep Phase Disorder, Advanced Sleep Phase Disorder, Non-24-Hour Sleep-Wake Disorder, etc.) and warrants clinical evaluation by a sleep medicine specialist.
Independent review
This implementation has not undergone independent peer review. The framework on which it is based (the MEQ-19) has extensive peer-reviewed validation across multiple populations and decades. We welcome methodological critique from researchers in chronobiology and sleep medicine. Corrections can be submitted through the corrections form.
The methodology page itself, the score derivation logic, and the dual-cutoff system are documented openly so reviewers can examine the implementation without needing to reverse-engineer it.
Version log
- v1.0 (May 4, 2026): Initial release. 19 items implementing the MEQ-19 framework. 16–86 scoring range. Five Horne & Östberg (1976) chronotype categories. Taillard et al. (2004) middle-age secondary classification displayed for users aged 40+. Circadian-arc visualization showing peak hours by category. Synthetic-profile sanity testing across four chronotype profiles passed during development, with all profiles matching expected category placements exactly.
Key terms
Cross-references to the LifeByLogic glossary for the technical terms used throughout this methodology page:
- Chronotype — the construct being measured
- Self-report — the data collection method used
- Validated instrument — what makes a measurement tool research-grade
- Effect size — statistical metrics used in the chronotype-health correlate literature
- Sleep cycle — 90-minute architecture of human sleep, complementary to chronotype for sleep-schedule design
Spanish translation status
Spanish translation is in scope under the LifeByLogic Spanish localization protocol but has not yet been initiated for the Chronotype Test. The Wave 2 Spanish kickoff schedule places Wave 2 tools (including this Chronotype Test) in the Wave 3 Spanish translation cycle.
The MEQ-19 has been validated in Spanish samples (Adan & Almirall 1991; subsequent Spanish-language validation studies). Spanish translation of this implementation should preserve the framework structure, point values, and category thresholds while adapting wording to natural Spanish expressions. A Spanish-native reviewer with chronobiology background will be required for the translation review.
Methodology FAQ
Recurring methodology questions, with answers grounded in the chronotype literature.
How were the point values for each item calibrated?
Point values follow the structure of the original MEQ-19. Most items are 4-option items scored 1–4 and contribute up to 4 points. The five 5-option items (Q1, Q2, Q10, Q17, Q18) are scored 1–5 and contribute up to 5 points. Three 4-option items use wider spacing and a zero floor: Q11 (mental peak) and Q19 (self-identification) are scored 6/4/2/0 and contribute up to 6 points, while Q12 (11 PM tiredness) is scored 5/3/2/0 and contributes up to 5 points, reflecting the original instrument's treatment of these as higher-information items. The minimum sum is exactly 16 and the maximum is exactly 86, matching the established MEQ-19 range.
Why does the tool use two sets of cutoffs?
The original Horne & Östberg (1976) cutoffs were derived from a 150-student sample aged 18–32. They produce sensible distributions in young adults but systematically misclassify middle-aged adults — Taillard 2004 found 62% morning, 2% evening using the 1976 cutoffs in a middle-aged sample. The Taillard cutoffs (Evening <53, Intermediate 53–64, Morning ≥65) are calibrated for middle-aged populations and produce a balanced distribution there. We display both classifications when the user is 40+ so the result is correctly contextualized for life stage.
Why use the MEQ-19 framework rather than the MCTQ or a different instrument?
The MEQ-19 is the most-cited chronotype instrument in chronobiology (4,000+ citations vs. ~3,000 for MCTQ). It has been validated across more languages and populations. The MCTQ uses a different conceptualization (mid-sleep on free days as proxy), which has its own strengths but is less directly comparable to the established categorical chronotype literature. We chose the MEQ-19 framework for four reasons: (a) maximum citation alignment with existing research, (b) widest cross-cultural validation, (c) most stable categorical thresholds, and (d) availability of the Taillard middle-age cutoff system (no equivalent for MCTQ).
What is the validation status of this implementation?
Level 1 (construct validity through framework adoption): complete. Level 2 (face validity through synthetic profile sanity testing): complete with all 4 profiles matching expected category placements exactly. Level 3 (convergent validity against the original MEQ-19): planned for v2.0 with hypothesized r ≥ 0.85, not yet conducted. Level 4 (validation against objective circadian markers): future research direction. Users should treat the score as a structured self-assessment of circadian preference, not as a clinically validated diagnostic.
How does this differ from the Sleep-Cognition Optimizer's chronotype assessment?
The SCO uses a 7-item short-form chronotype assessment based on the Adan & Almirall (1991) reduced MEQ scale, used as one input among many to a sleep schedule generation algorithm. This Chronotype Test is the standalone, comprehensive 19-item assessment built on the framework of the MEQ-19, the gold-standard chronotype measurement used in research. It offers higher information content, the full 16–86 scoring range, and the dual-cutoff classification system. Users can use either tool standalone or take the Chronotype Test result as input to the SCO's broader sleep schedule generation.
Why does Profile C (intermediate test profile) score 59 rather than truly intermediate?
Profile C selects the second option (option index 1) for each item. On most items the options run from most morning-leaning to most evening-leaning, so index 1 sits above the midpoint of the item's point range (for example, 4 of 5 points on the 5-option items and 3 of 4 points on most 4-option items), and the total leans morning. A test profile that lands near the middle of the Horne intermediate band (42–58) would need to mix option indices 1 and 2. This is not a scoring bug: it reflects where the original MEQ-19 places the upper edge of the intermediate band. The Taillard 2004 cutoffs classify this case as "Intermediate" (Profile C falls in 53–64), which is exactly why we display both classifications: the dual display contextualizes edge-case scores.
What if a user repeatedly scores in different categories on retake?
Test-retest variability of 1–5 points is normal and expected — chronotype self-report depends on which week's pattern the user had in mind, recent sleep quality, and similar contextual factors. A user whose retake scores all fall within ±5 points of each other has a stable chronotype assessment. Larger swings (10+ points across retakes) suggest either the user is genuinely intermediate (where small variations cross category boundaries) or that situational factors (recent travel, illness, schedule disruption) are influencing answers. We recommend taking the test in a typical week, not during atypical periods.
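The retake guidance above can be expressed as a simple spread check. This is a minimal sketch, not part of the tool: `retakeStability` is a hypothetical helper, the "stable" and "large swing" thresholds follow the text, and the "moderate" label for spreads in between is our assumption.

```javascript
// Hypothetical helper. Thresholds follow the text: all scores within
// 5 points of each other = stable, 10+ points apart = large swing.
// ASSUMPTION: spreads of 6-9 points are labeled 'moderate'.
function retakeStability(scores) {
  if (!Array.isArray(scores) || scores.length < 2) {
    throw new Error('need at least two retake scores');
  }
  const spread = Math.max(...scores) - Math.min(...scores);
  if (spread <= 5) return { spread, label: 'stable' };
  if (spread >= 10) return { spread, label: 'large swing' };
  return { spread, label: 'moderate' };
}

console.log(retakeStability([56, 59, 58])); // spread 3, label 'stable'
console.log(retakeStability([48, 61]));     // spread 13, label 'large swing'
```

The first example is stable even though its scores straddle the 58/59 boundary between Intermediate and Moderately Morning, which illustrates how a category label can change on retake without any meaningful change in the underlying score.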
How does chronotype interact with sleep need?
Chronotype and sleep need are independent dimensions. Chronotype determines when you naturally sleep; sleep need determines how much you naturally need. A strong morning type can have either short or long sleep need (some larks need 6 hours, others need 9). The same is true for evening types. Both dimensions matter for designing a sleep schedule, which is why the Sleep-Cognition Optimizer takes both as inputs. The Chronotype Test addresses only the when dimension.
Why is age the only demographic input?
Age is the only demographic with established alternative cutoff systems (Taillard 2004 for middle-aged populations). Other demographics (gender, geographic latitude, occupation) have effects on chronotype distribution but the effects are smaller, less consistently replicated, and don't have established alternative cutoff systems analogous to Taillard. Adding more demographic inputs without clear value-add would create unnecessary friction without improving classification accuracy.
Last reviewed
Last reviewed: May 4, 2026.
Next scheduled review: November 4, 2026.
Triggers for unscheduled review: publication of major new validation studies of the MEQ-19 in additional populations; identification of methodological errors via corrections; updates to the Horne & Östberg or Taillard cutoff systems; publication of objective-marker validation studies for this specific implementation.
For corrections, methodological critique, or research collaboration inquiries: submit via the corrections form.
How to cite this methodology
If you reference this methodology page in academic work, journalism, blog posts, or other publications, please cite it. The corporate author is LifeByLogic; the current version is 1.0 (2026-05-04). Choose the citation style appropriate for your venue.
@misc{lbl_chronotype_tool_2026,
  author       = {{LifeByLogic}},
  title        = {{Chronotype Test Methodology}},
  year         = {2026},
  version      = {1.0},
  publisher    = {{LifeByLogic}},
  howpublished = {Interactive web tool},
  url          = {https://lifebylogic.com/brain-lab/chronotype-tool/methodology/},
  note         = {Accessed: May 4, 2026}
}
Note on authorship: LifeByLogic is the corporate author. Individual contributors are credited on the about page: this methodology was written by Abiot Y. Derbie, PhD, and reviewed by Eskezeia Y. Dessie, PhD. For non-academic citations (journalism, blog posts), citing “LifeByLogic” is appropriate; for academic citations, the formats above are the recommended structure.