Create an Image Evaluation Report Template in ChatGPT

Build your own structured framework for systematically assessing AI-generated images.

It’s hard to say objectively how well a prompt performed when generating an image in ChatGPT or other AI tools. Today’s prompt gives you an Image Evaluation Report template, complete with a scoring table and rubric guidelines, so that no aspect of the image goes unreviewed.

Today's Prompt: Create an Image Evaluation Report for ChatGPT image creation

Image Evaluation Report

1. Prompt Details

  • Original Prompt Text:
    [PROMPT THAT CREATED IMAGE]

2. Generated Image Reference 

  • ATTACHED TO CHAT

3. Evaluation Criteria

Below is a table for quick scoring. For each criterion, circle or record a numeric score and then provide comments explaining the rationale. If certain criteria don’t apply (e.g., “Style Consistency” when no style was specified), mark as N/A.

| Criterion | Definition | Score (1–5) | Comments (strengths / issues) |
|---|---|---|---|
| Relevance | How closely the image content matches the core elements of the prompt. | [ ] | [e.g., “Sunset included, but missing foreground silhouette.”] |
| Creativity / Originality | How novel or imaginative the interpretation is, while still fitting the prompt. | [ ] | [e.g., “Interesting color palette, added surreal elements nicely.”] |
| Clarity of Representation | How clearly the intended subjects/elements are depicted (recognizable, unambiguous). | [ ] | [e.g., “Figure is a bit blurred; hard to tell it’s a dragon.”] |
| Style & Aesthetic Fit | If a style or mood was specified, how well does the image adopt that style? | [ ] | [e.g., “Cartoon style achieved, but line work is rough.”] |
| Composition & Framing | Quality of layout: balance, focal point, guiding the viewer’s eye. | [ ] | [e.g., “Main subject is center-aligned; could use rule-of-thirds.”] |
| Color & Lighting | Appropriateness of color palette, lighting, contrast for mood and readability. | [ ] | [e.g., “Colors are vivid but clash with intended calm mood.”] |
| Technical Quality | Resolution, sharpness, artifacts, noise, rendering glitches. | [ ] | [e.g., “Some artifacts around edges; overall resolution is decent.”] |
| Emotional Impact | Does the image evoke the intended emotion or reaction? | [ ] | [e.g., “Feels serene, aligns with prompt’s tranquil vibe.”] |
| Text Elements (if present) | If the prompt required text (e.g., logo, banner), is it legible and well-integrated? | [ ] | [e.g., “Text is pixelated and hard to read.”] |
| Accessibility Considerations (optional) | Contrast and clarity for accessibility (e.g., avoid too-low contrast). | [ ] | [e.g., “Contrast low for visually impaired users.”] |

Scoring scale guide (1–5):

  • 1 (Very Poor): Criterion is mostly missing or problematic.

  • 2 (Poor): Some attempt, but major issues that hinder desired result.

  • 3 (Fair/Adequate): Basic match, but noticeable shortcomings.

  • 4 (Good): Strong match with minor issues.

  • 5 (Excellent): Exemplary match, meets or exceeds expectations.

4. Criterion-by-Criterion Rubric Guidelines (Optional, for consistency among evaluators)

For teams or multiple evaluators, you may define more detailed rubrics. For example:

  • Relevance:

    • 5: All key elements mentioned in prompt appear correctly and prominently.

    • 3: Some key elements appear, but one or more important parts missing or inaccurate.

    • 1: Prompt elements largely missing or misinterpreted.

  • Creativity / Originality:

    • 5: Adds unexpected but fitting elements that enhance the concept.

    • 3: Literal interpretation, minimal novelty.

    • 1: Feels generic or cliché, no creative spark.

5. Overall Assessment & Feedback

  • Overall Score (average or weighted average): [e.g., “(Sum of scores ÷ number of applicable criteria) = 3.6” or weighted sum]

  • Strengths (what works well):

    • [e.g., “Striking color choices, clear main subject”]

  • Areas for Improvement:

    • [e.g., “Improve clarity of background elements; reduce artifacts around edges”]

  • Suggested Adjustments / Next Steps:

    • [e.g., “Specify ‘high detail’ or sharper focus in the prompt; in tools that expose sampler settings such as CFG scale or steps, try raising them for sharper rendering”]

  • Final Recommendation:

    • [e.g., “Accept as-is for a rough draft; regenerate with refined prompt for final version.”]

Why we like this prompt:

  • Defined Evaluation Task: Clearly establishes the goal—critically evaluating a generated image based on a known prompt—ensuring focus and relevance.

  • Structured Criteria-Based Format: Breaks down evaluation into specific, measurable categories with definitions, guiding the AI to provide consistent and thoughtful assessments.

  • Contextual Scoring System: Uses a clear 1–5 scale with qualitative anchors and example comments, making feedback both quantitative and insightful.

  • Flexible and Adaptable: Accounts for N/A responses and optional rubric guidelines, supporting both individual and team evaluations across diverse image types.

  • Action-Oriented Conclusion: Requests overall assessment, improvement suggestions, and a final recommendation, ensuring the output is not just reflective but also forward-looking and useful.
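
If you collect scores from several reports and want to compute the Overall Score yourself, the "sum of scores ÷ number of applicable criteria" rule from section 5 can be sketched in a few lines of Python. This is only an illustration, not part of the prompt: the function name `overall_score`, the use of `None` to stand for N/A, the default weight of 1.0, and the sample scores are all assumptions made for the example.

```python
from typing import Dict, Optional


def overall_score(
    scores: Dict[str, Optional[int]],
    weights: Optional[Dict[str, float]] = None,
) -> Optional[float]:
    """Return the (weighted) mean of applicable scores, or None if all are N/A.

    A score of None stands for N/A and is excluded from the calculation.
    Criteria missing from `weights` default to a weight of 1.0 (an assumption).
    """
    # Drop N/A criteria so they do not drag the average down.
    applicable = {name: s for name, s in scores.items() if s is not None}
    for name, s in applicable.items():
        if not 1 <= s <= 5:
            raise ValueError(f"{name}: score {s} is outside the 1-5 scale")
    if not applicable:
        return None

    weights = weights or {}
    total_weight = sum(weights.get(name, 1.0) for name in applicable)
    if total_weight <= 0:
        raise ValueError("weights of applicable criteria must sum to a positive value")
    weighted_sum = sum(s * weights.get(name, 1.0) for name, s in applicable.items())
    return weighted_sum / total_weight


# Hypothetical scores for one image; None marks a criterion as N/A.
example = {
    "Relevance": 4,
    "Creativity / Originality": 3,
    "Clarity of Representation": 4,
    "Style & Aesthetic Fit": None,  # no style was specified in the prompt
    "Technical Quality": 3,
}

print(round(overall_score(example), 2))                      # 3.5 (plain average)
print(round(overall_score(example, {"Relevance": 2.0}), 2))  # 3.6 (Relevance counts double)
```

Excluding N/A criteria from both the sum and the count keeps an unspecified style from lowering the average. Weights are a judgment call; a team would normally agree on them before scoring.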
