Prompt Details

Guides an LLM to evaluate its output thoroughly using weighted criteria focused on accuracy, relevance, and quality to identify improvement opportunities.

Final Prompt

Have the LLM perform a deep, multi-dimensional review of its previous output. Define clear evaluation criteria emphasizing accuracy and relevance, along with additional quality factors. Implement a weighted scoring system that prioritizes these criteria according to their impact. Provide a tiered evaluation structure with explicit weighting and rationale to determine if and how the output could be improved.


David Hanley

Sep 16, 2025

Auto-generated with GPT-4.1 Mini (No Web Access). Try this prompt in your own model with web search enabled for best results.

System Result

Deep Multi-Dimensional Review of Previous Output

1. Evaluation Criteria

  • Accuracy (40%): The correctness and factual precision of the content.
  • Relevance (30%): The extent to which the output addresses the specific prompt and user intent.
  • Clarity (10%): How clearly and understandably the information is conveyed.
  • Completeness (10%): Whether the output covers all necessary aspects of the prompt.
  • Conciseness (5%): The efficiency of the response without unnecessary elaboration.
  • Engagement (5%): How compelling or interesting the output is to read.

2. Weighted Scoring System

Criterion    | Weight | Rationale
Accuracy     | 40%    | Most critical to ensure trustworthy and valid information.
Relevance    | 30%    | Ensures the response directly addresses the prompt’s need.
Clarity      | 10%    | Important for user comprehension and ease of use.
Completeness | 10%    | Covers the breadth of the topic to avoid gaps in the response.
Conciseness  | 5%     | Avoids verbosity, maintaining focus and efficiency.
Engagement   | 5%     | Keeps the user interested, improving overall experience.
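
As a rough illustration of how these weights combine into a single score, here is a minimal Python sketch. The weights come from the table above; the 0-100 scale for each criterion, the function name weighted_score, and the example scores are assumptions made for illustration, not part of the original result.

```python
# Weights (in percent) taken from the scoring table; they sum to 100.
WEIGHTS = {
    "accuracy": 40,
    "relevance": 30,
    "clarity": 10,
    "completeness": 10,
    "conciseness": 5,
    "engagement": 5,
}


def weighted_score(scores):
    """Return the weighted average of per-criterion scores.

    Assumes each score is on a 0-100 scale (an assumption; the source
    does not specify a per-criterion scale).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / total_weight


# Hypothetical scores for one reviewed output.
example_scores = {
    "accuracy": 90,
    "relevance": 80,
    "clarity": 70,
    "completeness": 60,
    "conciseness": 100,
    "engagement": 50,
}

print(weighted_score(example_scores))  # 80.5
```

Because accuracy and relevance carry 70% of the total weight between them, a weak score on either moves the result far more than a weak score on conciseness or engagement.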

3. Tiered Evaluation Structure

  • Tier 1 (85-100%): Output is high quality; no substantial improvement needed.
  • Tier 2 (70-84%): Good quality; minor improvements recommended (e.g., clarity or conciseness).
  • Tier 3 (50-69%): Moderate quality; significant revisions needed to improve accuracy or relevance.
  • Tier 4 (Below 50%): Poor quality; substantial rewriting required to meet standards.
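
The tier boundaries can be expressed as a simple threshold lookup. The sketch below is one possible reading: the tiers are listed as whole-number ranges, so treating each lower bound as an inclusive cutoff (so that 84.5 falls in Tier 2) is an assumption, and the function name tier_for is hypothetical.

```python
def tier_for(score):
    """Map a weighted score (0-100) to a tier number from 1 to 4.

    Lower bounds are treated as inclusive cutoffs, which is an assumption
    for fractional scores that fall between the listed integer ranges.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 85:
        return 1  # high quality; no substantial improvement needed
    if score >= 70:
        return 2  # good quality; minor improvements recommended
    if score >= 50:
        return 3  # moderate quality; significant revisions needed
    return 4      # poor quality; substantial rewriting required


print(tier_for(80.5))  # 2
```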

4. Improvement Determination

  • If the weighted score falls below Tier 1 (under 85%), identify the criteria with the lowest scores.
  • Prioritize revisions on accuracy and relevance due to their higher impact.
  • Address clarity and completeness to enhance user understanding.
  • Refine conciseness and engagement to polish the output further.
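
The prioritization steps above could be sketched as follows. This is only one interpretation: the per-criterion target of 85, the 0-100 score scale, and the name revision_plan are assumptions, since the result does not say what counts as a low score for an individual criterion.

```python
# Weights (in percent) from the scoring system in section 2.
WEIGHTS = {
    "accuracy": 40,
    "relevance": 30,
    "clarity": 10,
    "completeness": 10,
    "conciseness": 5,
    "engagement": 5,
}

# Revision order from section 4: high-impact criteria first, polish last.
PRIORITY_GROUPS = [
    ("accuracy", "relevance"),
    ("clarity", "completeness"),
    ("conciseness", "engagement"),
]

TIER_1_MIN = 85  # lower bound of Tier 1


def revision_plan(scores, target=TIER_1_MIN):
    """Return criteria to revise, in priority order.

    Returns an empty list when the weighted score already reaches Tier 1.
    Using the Tier 1 bound as the per-criterion target is an assumption.
    """
    overall = sum(WEIGHTS[c] * scores[c] for c in WEIGHTS) / sum(WEIGHTS.values())
    if overall >= TIER_1_MIN:
        return []
    plan = []
    for group in PRIORITY_GROUPS:
        weak = [c for c in group if scores[c] < target]
        # Within a group, address the lowest-scoring criterion first.
        plan.extend(sorted(weak, key=lambda c: scores[c]))
    return plan


# Hypothetical scores for one reviewed output.
scores = {
    "accuracy": 90,
    "relevance": 80,
    "clarity": 70,
    "completeness": 60,
    "conciseness": 100,
    "engagement": 50,
}

print(revision_plan(scores))
# ['relevance', 'completeness', 'clarity', 'engagement']
```

Engagement has the lowest raw score in this example, yet it is listed last because the grouping follows the stated priority order and does not rank by raw score alone.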

Note: No personally identifiable information (PII) was found in the original prompt or output; thus, no substitutions were necessary.

Original Prompt

i want to have an llm review their previous output deeply and determine if they could have done a better job
