
refactor(query): extract SOCIAL_NOISE and VIRAL_NOISE shared sets #437

Open
iliaal wants to merge 1 commit into mvanhorn:main from iliaal:refactor/extract-core-subject-wrappers

iliaal (Contributor) commented May 19, 2026

Summary

Six adapters defined near-identical noise frozensets inline inside their _extract_core_subject wrappers. The architecture review surfaced this in the _extract_core_subject consolidation finding (Architecture F5). Move the shared sets to lib/query.py as SOCIAL_NOISE and VIRAL_NOISE; have the adapters reference them.

Mapping

  • SOCIAL_NOISE (18 words): Bluesky, Threads, Truth Social. Short-form micro-social platforms where research/meta words rarely appear in post bodies.
  • VIRAL_NOISE (25 words = SOCIAL_NOISE + 7 extras): TikTok, Instagram, Pinterest. Adds killer, the prompt-meta cluster, and the methodology cluster.
  • YouTube extends VIRAL_NOISE with temporal/meta tokens (months, recent year strings, etc.) that the planner emits but YouTube titles don't carry. Now composed as VIRAL_NOISE | _YT_EXTRA.

What's NOT in scope

  • Wrappers stay. Each wrapper is a 3-line shim that documents its adapter's noise choice; removing them would force every callsite to know the right noise set. They can be removed in a future cleanup if desired.
  • Polymarket's _extract_core_subject is a custom prefix-stripper that doesn't go through query.extract_core_subject at all. Out of scope.
  • Reddit uses the module-level default NOISE_WORDS. Out of scope.

Test plan

  • New TestSharedAdapterNoiseSets pins SOCIAL_NOISE membership (18 words), VIRAL_NOISE = SOCIAL_NOISE + 7 specific extras, and verifies extract_core_subject behavior with both sets.
  • Set arithmetic verified: old _YT_NOISE (52 items) = new VIRAL_NOISE | _YT_EXTRA (25 + 27 = 52). Zero behavior change for YouTube.
  • Existing adapter callsites unchanged; the wrappers continue to return the same value.
  • Full suite: pytest tests/ --ignore=tests/test_exa_search.py: 1556 passed, 4 skipped.
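
The count check above (25 + 27 = 52) only proves equivalence when the two operands are disjoint and the union matches the old inline set exactly. A small sketch of that check follows; `check_refactor` is a hypothetical helper for illustration, not part of the test suite in this PR.

```python
def check_refactor(old_inline: frozenset, shared: frozenset, extra: frozenset) -> list:
    """Return a list of problems; an empty list means the refactor is exact."""
    problems = []
    composed = shared | extra
    # Membership must match the pre-refactor inline set exactly.
    if composed != old_inline:
        problems.append(f"membership drift: {sorted(composed ^ old_inline)}")
    # Sizes only add up (len(shared) + len(extra)) when there is no overlap.
    overlap = shared & extra
    if overlap:
        problems.append(f"overlap: {sorted(overlap)}")
    return problems
```

With disjoint operands, `len(shared | extra) == len(shared) + len(extra)` holds, which is the arithmetic the test plan relies on.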

greptile-apps Bot (Contributor) commented May 19, 2026

Greptile Summary

This PR extracts six near-identical inline noise frozensets from adapter _extract_core_subject wrappers into two shared module-level constants — SOCIAL_NOISE (18 words, used by Bluesky/Threads/Truth Social) and VIRAL_NOISE (SOCIAL_NOISE + 7 extras, used by TikTok/Instagram/Pinterest) — in lib/query.py, and composes YouTube's set as VIRAL_NOISE | _YT_EXTRA. The arithmetic is verified: all seven adapters produce frozensets identical in content to their pre-refactor inline definitions.

  • lib/query.py: Two new public constants (SOCIAL_NOISE, VIRAL_NOISE) added; no changes to any existing function or constant.
  • Six adapters (bluesky, threads, truthsocial, tiktok, instagram, pinterest): inline frozenset definitions replaced by a single import; youtube_yt replaces its full inline set with VIRAL_NOISE | _YT_EXTRA.
  • tests/test_query.py: New TestSharedAdapterNoiseSets class pins exact membership of both constants and exercises extract_core_subject with each.

Confidence Score: 5/5

Pure mechanical refactoring — every adapter's effective noise set is byte-for-byte identical before and after, verified by tests and manual count. Safe to merge.

All seven adapters produce frozensets identical in content to their pre-refactor inline definitions; the new shared constants are pinned by membership-equality tests; no logic, API surface, or call sites were modified.

No files require special attention; the only open nit is the inline frozenset in TestCustomNoise.test_custom_noise_keeps_tips that duplicates VIRAL_NOISE.

Important Files Changed

| Filename | Overview |
| --- | --- |
| skills/last30days/scripts/lib/query.py | Added SOCIAL_NOISE (18 words) and VIRAL_NOISE (SOCIAL_NOISE + 7 extras) as module-level frozenset constants; no changes to existing logic. |
| skills/last30days/scripts/lib/bluesky.py | Replaced inline _BSKY_NOISE (18 words) with SOCIAL_NOISE import; identical frozenset content, zero behavioral change. |
| skills/last30days/scripts/lib/instagram.py | Replaced inline _INSTAGRAM_NOISE (25 words) with VIRAL_NOISE import; identical frozenset content. |
| skills/last30days/scripts/lib/youtube_yt.py | Replaced inline _YT_NOISE (52 items) with VIRAL_NOISE |
| tests/test_query.py | Added TestSharedAdapterNoiseSets with 4 tests pinning SOCIAL_NOISE membership, VIRAL_NOISE superset relationship, and extract_core_subject behavior; existing TestCustomNoise still carries an inline frozenset identical to VIRAL_NOISE. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    SN["SOCIAL_NOISE\n18 words\nBluesky · Threads · Truth Social"]
    VN["VIRAL_NOISE\nSOCIAL_NOISE + 7 extras\nTikTok · Instagram · Pinterest"]
    YT["VIRAL_NOISE | _YT_EXTRA\n25 + 27 = 52 words\nYouTube"]

    SN -->|"+ killer, prompt*,\nmethods, strategies,\napproaches"| VN
    VN -->|"+ temporal/meta tokens\n(months, years, etc.)"| YT

    subgraph query.py
        SN
        VN
    end
    subgraph youtube_yt.py
        YT
    end

Reviews (2): Last reviewed commit: "refactor(query): extract SOCIAL_NOISE an..."

tmchow (Owner) commented May 22, 2026

@iliaal fix conflicts please

Six adapters defined near-identical noise frozensets inline inside
their _extract_core_subject wrapper. Move the shared sets to
lib/query.py as SOCIAL_NOISE (18 words, used by Bluesky/Threads/Truth
Social) and VIRAL_NOISE (25 words = SOCIAL_NOISE + 7 extras, used by
TikTok/Instagram/Pinterest); have the adapters reference them.

YouTube extends VIRAL_NOISE with temporal/meta tokens (months, recent
year strings, etc.) that the planner emits but YouTube titles don't
carry. Now composed as VIRAL_NOISE | _YT_EXTRA.

Wrappers stay; they document each adapter's noise choice and avoid
forcing callsites to know the right set. Polymarket's prefix-stripping
_extract_core_subject and reddit's NOISE_WORDS default are out of scope.

Set arithmetic verified: old _YT_NOISE (52 items) = new
VIRAL_NOISE | _YT_EXTRA (25 + 27 = 52). Zero behavior change.

iliaal force-pushed the refactor/extract-core-subject-wrappers branch from 6332b6a to 08f16b3 on May 22, 2026 15:35

iliaal (Contributor, Author) commented May 22, 2026

@tmchow rebased on top of main. Conflicts (all in the test files touched by the conftest.py centralization that merged earlier today) are resolved.
