@Hoder-zyf Hoder-zyf commented Jul 22, 2025

Description

This PR introduces an automated “Critique → Rewrite” stage into the data‑science proposal flow:

  • Prompt additions

    • Adds hypothesis_critique and hypothesis_rewrite blocks in prompts_v2.yaml, complete with JSON schemas and guidance to avoid over‑specification and historical overfit.
  • Pipeline implementation

    • hypothesis_critique() – scores each generated hypothesis on feasibility, alignment, and improvement direction, returning constructive feedback.
    • hypothesis_rewrite() – produces a single, decisive, testable hypothesis per item, integrating critique insights while preserving innovation.
    • Robust validation & fallback logic ensure all hypotheses receive usable critiques even if the model output is partially malformed.
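
The validation and fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `parse_critiques`, the field names in `REQUIRED_FIELDS`, the fallback texts, and the assumption that the model returns a JSON object keyed by hypothesis name are all hypothetical. The actual schema is defined in prompts_v2.yaml.

```python
import json
from typing import Any

# Hypothetical field names, chosen to mirror the three critique dimensions
# named in the description; the real schema lives in prompts_v2.yaml.
REQUIRED_FIELDS = ("feasibility", "alignment", "improvement_direction")

# Hypothetical neutral placeholder used when the model output is unusable.
FALLBACK_CRITIQUE = {
    "feasibility": "No critique returned; feasibility is unverified.",
    "alignment": "No critique returned; alignment is unverified.",
    "improvement_direction": "No critique returned; keep the hypothesis as is.",
}


def parse_critiques(raw_response: str, hypothesis_names: list[str]) -> dict[str, dict[str, Any]]:
    """Return one complete critique per hypothesis, even for malformed output."""
    try:
        parsed = json.loads(raw_response)
    except json.JSONDecodeError:
        parsed = {}
    if not isinstance(parsed, dict):
        parsed = {}

    critiques: dict[str, dict[str, Any]] = {}
    for name in hypothesis_names:
        entry = parsed.get(name)
        if not isinstance(entry, dict):
            entry = {}  # missing or malformed entry: fall back on every field
        critique = {}
        for field in REQUIRED_FIELDS:
            value = entry.get(field)
            if isinstance(value, str) and value.strip():
                critique[field] = value
            else:
                critique[field] = FALLBACK_CRITIQUE[field]
        critiques[name] = critique
    return critiques
```

With this shape, a response that is only partly valid still yields usable feedback for every hypothesis: valid fields are kept, and only the missing or malformed ones are replaced by the placeholder.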

Motivation and Context

Raw, auto‑generated hypotheses are often vague or too risky to implement as written.
By forcing a critique → guided rewrite loop, we expect:

  1. Higher alignment with real challenge pain‑points.
  2. More actionable, test‑ready experiments.
  3. Better balance between safe incremental gains and bold innovation.

This should cut wasted GPU hours on un‑implementable ideas and boost leaderboard progress.

How Has This Been Tested?

  • If you are adding a new feature, test on your own test scripts.

Screenshots of Test Results (if appropriate):

  1. Your own tests:

Types of changes

  • Fix bugs
  • Add new feature
  • Update documentation

📚 Documentation preview 📚: https://RDAgent--1106.org.readthedocs.build/en/1106/

@Hoder-zyf Hoder-zyf merged commit 71440f6 into main Jul 29, 2025
9 checks passed
@Hoder-zyf Hoder-zyf deleted the hypo_ce branch July 29, 2025 06:20
licong01-cloud pushed a commit to licong01-cloud/RD-Agent that referenced this pull request Dec 13, 2025
add hypo_critique and hypo_rewrite in proposal.py, controlled by `DS_RD_SETTING.enable_hypo_critique_rewrite` (default True)
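
A rough sketch of how such a flag might gate the stage, for readers unfamiliar with the pattern. Only the flag name `enable_hypo_critique_rewrite` and its default of True come from the commit message above. The `Settings` dataclass, the `refine_hypotheses` function, and the callable signatures are hypothetical stand-ins, not the code in proposal.py.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Settings:
    """Hypothetical stand-in for DS_RD_SETTING; only the flag name is from the source."""
    enable_hypo_critique_rewrite: bool = True


def refine_hypotheses(
    hypotheses: dict[str, str],
    settings: Settings,
    critique: Callable[[dict[str, str]], dict[str, str]],
    rewrite: Callable[[dict[str, str], dict[str, str]], dict[str, str]],
) -> dict[str, str]:
    """Run critique then rewrite when the flag is on; otherwise pass through."""
    if not settings.enable_hypo_critique_rewrite:
        return hypotheses  # stage disabled: hypotheses are used unchanged
    critiques = critique(hypotheses)
    return rewrite(hypotheses, critiques)
```

Passing the critique and rewrite steps in as callables keeps the sketch independent of any particular model client.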