Hybrid Writer — Full Factory Map

Condensed prose, not code. This page shows every major LLM call, what goes in, the actual prompt shape, what tool/client is used, what must come out, and then every major processing step after generation. Goal: make the full factory legible enough to debug where quality is slipping.

Source of truth: content_cli.py, content_orchestrator.py, core/dispatcher.py, core/prose_runner.py, core/assembler.py, core/humanizer.py, content_evaluator.py, agent briefs, and prose workflows.

0. Factory Overview

| Layer | What it does | Main code |
| --- | --- | --- |
| Preflight | Optional research bootstrap + audience persona generation | content_cli.py, research_integration.py, persona_generator.py |
| Phase 1 | 9 writing agents generate the article ingredients | core/dispatcher.py + core/prose_runner.py |
| Phase 2 | Assembler turns components into article text; humanizer modifies it | core/assembler.py, core/humanizer.py |
| Phase 3 | Audience/persona evaluation scores the result | content_evaluator.py, core/pipeline.py |
| Review | Slop audit, manual R/Y/G review, publishing decision | core/slop_auditor.py + review site |

1. Preflight Calls

1A. Perplexity research bundle

LLM/Web call · Tool/client: PerplexityClient.research_query()

Input: the content brief, derived audience, and a planned 3-cone query set: macro, mezzo, micro.

Actual prompt shape:

System: “You are a research analyst. Search the open web and return a concise JSON with: sources (array of {title,url}), keyFindings (array), and 140-char summary.”
User: “Research request (focus): query”

Output requirement: cached Perplexity response, stored as assistant_text plus raw response. This is not yet article prose; it is research fuel.

What changes after this step: the run gets external research context and suggested queries. If quota is exhausted, the system degrades to a minimal response instead of failing hard.
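The three-cone plan and the quota degradation path can be sketched as follows. This is a hypothetical sketch: `PerplexityClient`, the query wording, and `RuntimeError` as the quota signal are stand-ins for the real internals of research_integration.py.

```python
# Hypothetical sketch of the three-cone research bootstrap with quota fallback.
# The real client and error types live in research_integration.py.

def build_cone_queries(topic: str) -> dict:
    """Plan the macro / mezzo / micro query set for one topic."""
    return {
        "macro": f"seminal work on {topic}",
        "mezzo": f"recent developments in {topic}",
        "micro": f"practitioner discussions about {topic}",
    }

def research_with_fallback(client, topic: str) -> dict:
    """Run the cone queries; degrade to a minimal bundle instead of failing hard."""
    try:
        queries = build_cone_queries(topic)
        return {cone: client.research_query(q) for cone, q in queries.items()}
    except RuntimeError:  # e.g. quota exhausted
        return {"sources": [], "keyFindings": [],
                "summary": f"No research available for {topic}"}
```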

1B. Persona generation

LLM call · Tool/client: OpenRouterClient via PersonaGenerator.generate()

Input: brief, primary audience label, and a compact summary of the research bundle.

Actual prompt shape:

“You are designing audience personas to evaluate an article… Create 12 personas grouped as core (3), expanded (6), adjacent (3)… Each persona should include name, role, goals, pains, content preferences, evaluation criteria, success signal. Return a single JSON object.”

Tool(s) called: OpenRouter only. Then an anti-persona generator creates contrarian readers for stress tests.

Output requirement: JSON personas file in workspace/current/evaluation/personas.json.

What changes after this step: the evaluator gets real-ish readers instead of static rubric-only scoring.
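A minimal shape-check for the personas file might look like this. The snake_case field names and the group keys are assumptions about the actual JSON schema in workspace/current/evaluation/personas.json.

```python
# Hypothetical shape-check for the personas.json described above:
# 12 personas grouped core (3) / expanded (6) / adjacent (3), each with the
# fields the prompt asks for. Field and group names are assumptions.
REQUIRED_FIELDS = {"name", "role", "goals", "pains", "content_preferences",
                   "evaluation_criteria", "success_signal"}
GROUP_SIZES = {"core": 3, "expanded": 6, "adjacent": 3}

def validate_personas(doc: dict) -> list[str]:
    """Return a list of problems; empty means the file matches the expected shape."""
    problems = []
    for group, expected in GROUP_SIZES.items():
        personas = doc.get(group, [])
        if len(personas) != expected:
            problems.append(f"{group}: expected {expected}, got {len(personas)}")
        for p in personas:
            missing = REQUIRED_FIELDS - p.keys()
            if missing:
                problems.append(f"{group}/{p.get('name', '?')}: missing {sorted(missing)}")
    return problems
```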

2. Phase 1 — The 9 Writing Calls

All nine agents are fed a readable context block, not just raw JSON. The prompt builder puts the content brief at the top, then product/audience metadata, SEO queries, previous output, and critic feedback where relevant. Soul text is inserted before the agent prompt when available.

2.1 Goal Definition

LLM call · Model: gpt-5 · Rounds: 1 · Code: workflows/simple/goal-definition.prose

Input: content brief + topic metadata.

Actual prompt:

“Define a clear, testable goal for the article based on the brief. Use the framework: When [Audience] reads this on [Platform] they will [Feel/Do] because [Payoff]. Apply Brooks' Warm Heart / Cool Head test. Propose a one-sentence success metric.”

Output requirement: JSON with goalStatement, warmHeart, coolHead, successMetric, context.

2.2 Audience X-Ray

LLM call · Model: gpt-5 · Rounds: 1

Input: brief + audience description.

Actual prompt:

“Synthesize a human audience model from the topic + niche… Produce 20 jobs-to-be-done, 3 top drivers, and 4 avatars with lived context, trigger event, shower-thought question, rewards, and hates. Avoid generic personas. Make them feel real.”

Output requirement: JSON only: JTBD list, top drivers, 4 avatars.

2.3 Authority Casting

LLM call · Model: gpt-5 · Rounds: 1

Input: brief + audience + any product context.

Actual prompt:

“Select ideal authorial voices and synthesize a style profile. List 5–7 apex writers; apply King's honesty and Adams' attention filters. Select top-3 with 30-word style synopses. Produce a concise voiceProfile capturing diction, pacing, tone, and POV guidance.”

Output requirement: JSON with apex writers, top 3, and voice profile.

2.4 Title & Hook Forge

Generator + critic loop · Generator model: gpt-5 · Critic model: x-ai/grok-4-fast · Max rounds: 3

Input: brief + audience model + soul/voice profile.

Generator prompt:

“Generate titles + hooks that match the audience and the writer SOUL… Produce 10 titles and 10 hooks… Optimize for signal, taste, edge… Avoid generic scaffolding like ‘Have you ever wondered…’ or ‘In today’s world…’”

Critic prompt:

“Score each title/hook combo on CRAP: Clear, Relevant, At-a-glance, Provocative. Return JSON: {score, top3, improvements}.”

Output requirement: top 3 title/hook combos plus critic scores. Loop stops at score ≥ 85 or round 3.
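The generator→critic pattern used here (and by the fragment, draft, and editor agents) can be sketched as a small driver. This is a sketch, not the repo's loop: `generate` and `critique` stand in for the real OpenRouter-backed prose-workflow calls.

```python
# Minimal sketch of the shared generator -> critic loop: stop at score >= target
# (85 here) or after max_rounds. Critic feedback is fed back to the generator.

def run_critic_loop(generate, critique, max_rounds: int = 3, target: int = 85):
    feedback = None
    for round_no in range(1, max_rounds + 1):
        candidate = generate(feedback)
        review = critique(candidate)      # e.g. {"score": int, "improvements": [...]}
        if review["score"] >= target:
            return candidate, review, round_no
        feedback = review                 # critic notes drive the next round
    return candidate, review, max_rounds
```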

2.5 Fragment Generator

Generator + critic loop · Generator model: gpt-5 · Critic model: grok-4-fast

Input: brief, audience, voice, any research already available.

Generator prompt:

“Produce 40–50 fragments (anecdotes, stats, quotes, steps) aligned to the goal. Label Story / Science / Steps. Score each for relevance/impact and dedupe near-duplicates.”

Critic prompt:

“Check fragment set for diversity, specificity, and voice match. Penalize repetition and generic phrasing. Return JSON: {score, repetitions, gaps, fix_brief}.”

Output requirement: fragment set plus critic notes. Loop stops at score ≥ 85 or round 3.

2.6 Research & Gaps

LLM call + external research hook · Model: gpt-5 · Rounds: 1

Input: brief, audience, and the external research bundle if Perplexity was available.

Actual prompt:

“Plan and perform research; highlight gaps to fill. Three-cone research: 3 seminal (macro), 5 recent (mezzo), 15 social (micro). Each with 140-char summaries and links when available. Identify gaps; propose targeted queries.”

Output requirement: JSON with macro / mezzo / micro research arrays, gaps, and context.

2.7 Structure Synthesizer

LLM call · Model: gpt-5 · Rounds: 1

Input: all previous planning ingredients.

Actual prompt:

“Propose and rank 3 outline structures best suited to goal and audience. Consider templates like Listicle, Problem–Solution, Myth–Buster (or better fits). Build outlines with H2/H3 headings and bullet summaries. Rank by fit and explain ranking.”

Output requirement: JSON with 3 ranked structures and rationale.

2.8 Draft Creator

Writer + critic loop · Writer model: gpt-5 · Critic model: grok-4-fast · Max rounds: 3

Input: topic assignment, audience summary, soul, selected title/hook, research bullets, prior audience notes if revising.

Writer prompt (core excerpt):

“Write the canonical longform blog post… Length must land 1800–2800 words… Open with a scene or concrete claim. Name the promise. Build an argument spine. Use one memorable framework. Include counterpressure. End with a landing… Return JSON only with title, dek, article_markdown, word_count, section_outline, notes_for_next_iteration.”

Critic prompt:

“Review the draft for clarity, structure, accuracy, and voice match. Score 0–100 with penalties for AI-smell. Return JSON: {score, must_fix, should_fix, notes, revision_brief}.”

Output requirement: canonical markdown draft + revision notes. Loop stops at score ≥ 85 or round 3.

2.9 Editor

Editor + critic loop · Editor model: gpt-5 · Critic model: grok-4-fast · Max rounds: 3

Input: draft plus all earlier context.

Editor prompt:

“Apply persuasion overlay, emotional arc, and readability proofing. Persuasion: contrast, repetition, vivid examples; mark placements. Emotional arc: map curiosity → insight → resolution. Readability: clarity, flow, jargon trimming; suggest cuts/additions.”

Critic prompt:

“Evaluate edited output for flow, voice consistency, and AI-smell. Score 0–100 and flag any robotic phrasing. Return JSON: {score, robotic_lines, must_fix, notes}.”

Output requirement: polished markdown plus editor notes. Loop stops at score ≥ 88 or round 3.

3. Phase 2 — Post-LLM Processing

3.1 Article assembly

Processing step · Code: core/assembler.py

What changes: picks the chosen title, pulls article_markdown from draft_creator, and writes the initial article text.

Current rules: editor telemetry is no longer appended to the article text. CTA, persuasion overlay, emotional arc, readability, timestamps, and editor suggestions are written to an article_meta.json sidecar instead.

Known struggle: duplicate H1 can still happen if the draft already contains the title and the assembler prepends another top title.
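A hypothetical guard for that failure mode, assuming the assembler controls title prepending, would check the draft's first line before adding a top title:

```python
# Sketch of a duplicate-H1 guard: only prepend the chosen title when the draft
# does not already open with an H1. This is illustrative, not assembler.py code.
import re

def prepend_title(draft_markdown: str, title: str) -> str:
    first_line = draft_markdown.lstrip().splitlines()[0] if draft_markdown.strip() else ""
    if re.match(r"#\s", first_line):
        return draft_markdown            # draft already carries its own H1
    return f"# {title}\n\n{draft_markdown}"
```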

3.2 Citation resolver

Processing step · Code: Humanizer.resolve_citations()

What changes: swaps inline source IDs like [ABC-123] for titled markdown links, then appends a references section if references were used.
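A sketch of the resolver, under the assumed `[ABC-123]` ID format and a `{title, url}` source map; the real implementation in Humanizer.resolve_citations() may differ:

```python
# Illustrative citation resolver: inline [ABC-123] IDs become titled markdown
# links, and a references section is appended only if anything was resolved.
import re

def resolve_citations(text: str, sources: dict) -> str:
    used = []
    def swap(match):
        sid = match.group(1)
        src = sources.get(sid)
        if not src:
            return match.group(0)        # unknown ID: leave untouched
        used.append(src)
        return f"[{src['title']}]({src['url']})"
    body = re.sub(r"\[([A-Z]+-\d+)\]", swap, text)
    if used:
        refs = "\n".join(f"- [{s['title']}]({s['url']})" for s in used)
        body += f"\n\n## References\n\n{refs}"
    return body
```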

3.3 Jargon trimmer

LLM-assisted processing · Code: Humanizer.apply_jargon_trimmer()

Input: the assembled article plus the brief.

Actual prompt shape:

“You extract likely jargon for an article and recommend concise alternatives. Return JSON only following the schema: likely_jargon [{term, reason, suggested_alternatives}], forbidden_rewrites, borderline_terms… Brief: … Article excerpt (first 1200 chars): …”

Tool(s) called: OpenRouter via JSON guard, plus heuristic word list fallback.

Output requirement: list of suspect terms and simpler replacements; then the processor rewrites matching terms in the article.
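The rewrite step, with the heuristic word-list fallback, might look like this; the terms and replacements are illustrative, not the repo's actual list:

```python
# Sketch of the jargon rewrite: use LLM suggestions when available, otherwise
# fall back to a static word list. FALLBACK_JARGON is invented for illustration.
import re

FALLBACK_JARGON = {"leverage": "use", "utilize": "use", "synergize": "combine"}

def trim_jargon(article: str, suggestions: dict = None) -> str:
    replacements = suggestions or FALLBACK_JARGON
    for term, simpler in replacements.items():
        article = re.sub(rf"\b{re.escape(term)}\b", simpler, article,
                         flags=re.IGNORECASE)
    return article
```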

3.4 Voice stylizer

LLM processing · Code: Humanizer.apply_voice_stylizer()

Input: the full article, voice signature, optional quantified profile deltas.

Actual system prompt shape:

“You are a careful style editor making TARGETED adjustments to match a quantitative voice profile… Apply ONLY the specific edits listed below… Do NOT change meaning, facts, section headings, list structure, or formatting… Do NOT compress, summarize, or shorten.”

Output requirement: full adjusted text only. If output compresses too much, the original text is kept.

Known struggle: this pass has been implicated in over-compression and weird connective damage in some runs.
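The keep-original guard could be as simple as a length-ratio check; the 0.9 threshold below is an assumption, not the repo's value:

```python
# Sketch of the over-compression guard: reject stylizer output that shrinks
# the article beyond a tolerated ratio and keep the original instead.

def accept_stylized(original: str, stylized: str, min_ratio: float = 0.9) -> str:
    """Keep the stylized text only if it retains most of the original length."""
    if len(original) == 0:
        return stylized
    return stylized if len(stylized) / len(original) >= min_ratio else original
```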

3.5 AI detection filters

Rule-based processing · Code: Humanizer.apply_ai_detection_filters()

What changes: regex substitutions strip banned patterns (“As an AI”, “In conclusion”, “Let’s dive in”, etc.), then sentence variety/voice checks try to reduce robotic rhythm.

Known struggle: this sentence-variety logic likely contributes to the broken “X, and Y. Z, and …” connective damage in bad Kenro runs.
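A minimal sketch of the banned-pattern pass; the pattern list here is illustrative, and the real list (plus the sentence-variety logic) lives in the humanizer:

```python
# Illustrative banned-pattern substitutions for the AI-detection filter.
# Patterns are examples only, not the humanizer's actual regexes.
import re

BANNED = [
    r"^As an AI[^.]*\.\s*",
    r"\bIn conclusion,?\s*",
    r"\bLet'?s dive in\.?\s*",
]

def strip_banned(text: str) -> str:
    for pattern in BANNED:
        text = re.sub(pattern, "", text, flags=re.IGNORECASE | re.MULTILINE)
    return text
```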

3.6 Human feedback gate (optional)

Human-in-the-loop step · Code: HumanFeedbackCollector.collect_iteration_review()

What changes: if human feedback is enabled, the terminal asks for a review signal before continuing. This was the original justification for some editor-note plumbing.

3.7 Voice Guardian overlay (optional)

Processing / optional rewrite · Code: Humanizer.run_voice_guardian()

What changes: if enabled, compares the article to voice samples and can harmonize style toward them. It is usually disabled in production runs.

4. Phase 3 — Audience Evaluation

4.1 Persona evaluation call

LLM call · Code: content_evaluator.py · Model: defaults to config.default_model (currently often grok-4-fast)

Input: full article plus agent contributions, persona profile, human context, voice context.

Actual prompt excerpt:

“You are reading a content article after iteration X of development. Your job is to react in-character as this persona (first-person voice). This is not a generic rubric… Tell us what you felt as you read: where you leaned in, where you zoned out, where you rolled your eyes… Return JSON with overall_score, detailed_scores, reaction, strengths, weaknesses, recommendation.”

Output requirement: persona-by-persona JSON evaluations, aggregated into overall score, gate score, trend analysis, and per-agent feedback history.

4.2 Aggregation and gate

Processing step · Code: ScoreAggregator + core/pipeline.py

What changes: rolls up scores, improvement trend, cumulative averages, and narrative metrics. Phase 3 currently passes if quality passes and audience score is at least 20. This is one reason internal final scores can look over-optimistic.
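The gate reduces to a one-liner, which makes the permissiveness easy to see; the function name and signature are illustrative:

```python
# Sketch of the phase-3 gate described above: pass when quality passed and
# the aggregated audience score clears a low bar (20).

def phase3_passes(quality_ok: bool, audience_score: float,
                  threshold: float = 20.0) -> bool:
    return quality_ok and audience_score >= threshold
```

With a ceiling of 100, a threshold of 20 means almost any coherent article clears the audience gate, which is why the internal final score overstates quality.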

5. Slop Audit and Release Review

5.1 Slop auditor

Rule engine · Code: core/slop_auditor.py

What changes: scores the article against 100+ slop rules and remnant patterns. It can run in audit mode, clean mode, and JSON mode.

Rules summary: LLM Tell, Corporate, Filler, AI Phrase, Overwrought, Remnant. Hard-fail on any pipeline leakage. Also checks duplicate H1, target drift, and trailing scaffolding.

Important current state: this exists, but is not yet fully baked into the pipeline loop. It is a backstop, not the primary gate.
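An illustrative rule check showing the hard-fail semantics: category rules accumulate points, but any Remnant (pipeline leakage) match fails the article regardless of score. The rules below are invented for the sketch, not taken from core/slop_auditor.py.

```python
# Illustrative slop-rule engine: scored category hits plus a hard fail on any
# "Remnant" (pipeline leakage) match. Rules are examples only.
import re

RULES = [
    ("LLM Tell", r"\bdelve\b", 5),
    ("Filler", r"\bat the end of the day\b", 3),
    ("Remnant", r"notes_for_next_iteration", 0),   # pipeline leakage
]

def audit(text: str) -> dict:
    hits, score, hard_fail = [], 0, False
    for category, pattern, points in RULES:
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(category)
            score += points
            if category == "Remnant":
                hard_fail = True
    return {"score": score, "hits": hits, "hard_fail": hard_fail}
```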

6. Where the Factory Is Struggling Right Now

Collected from the "known struggle" notes above: duplicate H1s escaping assembly; over-compression and connective damage from the voice stylizer and sentence-variety passes; a permissive phase-3 gate (audience score ≥ 20) inflating internal final scores; and a slop auditor that exists but is not yet wired into the pipeline loop as a hard gate.

7. Fastest Debug Order

  1. Wire slop auditor into the production loop as a hard gate.
  2. Fix duplicate H1 at assembly.
  3. Isolate which humanizer pass creates connective corruption (likely sentence-variety or stylizer path).
  4. Separate real editorial score from permissive internal final score.
  5. Keep the review dashboard showing version, module outputs, slop, and progression.

This is the condensed factory map. It is intentionally prose-first, but every step above is grounded in the current repo prompts and orchestration code.