Evaluator-Optimizer (The Writer and the Demanding Editor Analogy)

Imagine writing an important article for a magazine. The first draft is rarely perfect, so the work generally follows a cycle of creation and review.
- The Writer (Optimizer/Generator): Receives a task ("Write an article about the vineyards of San Rafael") and produces a first draft.
- The Editor (Evaluator): Receives that draft. Their job is not to write from scratch, but to evaluate the text based on clear criteria: Is the style correct? Is it easy to read? Does it cover all the requested points? Is the title appealing?
- The Feedback: The editor doesn't just say "I don't like it." They give specific and constructive feedback: "The content is good, but the second paragraph is confusing. Rework it. Also, the title is boring; find something more catchy."
- The Cycle (Loop): The writer takes that feedback, improves the draft, and returns it to the editor. This cycle repeats until the editor says, "Perfect! Publish it."
The Evaluator-Optimizer pattern works exactly like this:
- An "Optimizer" LLM (the Writer) generates an initial response to a request.
- A second "Evaluator" LLM (the Editor), armed with a prompt containing very clear evaluation criteria (a kind of checklist), reviews the first one's work.
- The Evaluator generates feedback to improve the response.
- The Optimizer uses that feedback to refine its work in a loop, until the response meets the defined quality standard.
This method is ideal when the final quality is critical and you have a clear way to measure how "good" a response is.
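The loop described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: `refine`, `Review`, `generate`, `evaluate`, and `max_rounds` are hypothetical names, and the two callables stand in for whatever LLM calls you actually use. The round cap is an added assumption so the loop cannot run forever if the evaluator never approves.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Review:
    approved: bool   # True when the draft meets every criterion
    feedback: str    # specific, actionable notes for the next revision


def refine(
    task: str,
    generate: Callable[[str, Optional[str], Optional[str]], str],
    evaluate: Callable[[str, str], Review],
    max_rounds: int = 3,
) -> str:
    """Run the generate -> evaluate -> revise cycle until approval or the cap."""
    draft = generate(task, None, None)  # first draft: no previous draft or feedback
    for _ in range(max_rounds):
        review = evaluate(task, draft)
        if review.approved:
            break
        # Hand the previous draft and the editor's notes back to the writer.
        draft = generate(task, draft, review.feedback)
    return draft


# Stand-in "LLMs" so the sketch runs without any API (purely illustrative):
# the writer adds a greeting once it gets feedback, and the editor approves
# any draft that starts with one.
def toy_writer(task, draft, feedback):
    if draft is None:
        return "Can we move the meeting?"
    return "Dear client, " + draft[0].lower() + draft[1:]


def toy_editor(task, draft):
    if draft.startswith("Dear"):
        return Review(True, "Approved.")
    return Review(False, "Add a professional greeting.")


final = refine("Reschedule the meeting", toy_writer, toy_editor)
print(final)  # Dear client, can we move the meeting?
```

Note that if the cap is reached, the last revision is returned without a final evaluation; a production version might want to flag that case to the user.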
Real-World Example: "Professional Email Writing Assistant"
Imagine an AI tool that helps you draft important emails, ensuring they sound professional and courteous.
The User's Request:
"Help me write an email to an important client to reschedule next week's meeting. I want to move it to two weeks from now."
Refinement Cycle (Loop):
Iteration #1
- Optimizer LLM (The Rushed Writer):
  - Receives the request and generates a first draft that is functional but unpolished.
  - Output v1 (too direct):
    "Hi [Client], I won't be able to make it next week. Should we move the meeting to two weeks from now? Regards, Edgardo."
- Evaluator LLM (The Communications Editor):
  - Receives "Output v1." Its prompt contains a quality checklist:
    1. Is the tone courteous and professional?
    2. Does it apologize for the inconvenience?
    3. Does it actively propose solutions (specific dates/times)?
    4. Is the closing formal?
  - Generated Feedback:
    "The draft is too informal and abrupt. You should add a professional greeting, offer a brief apology, propose specific times to facilitate coordination, and use a more cordial closing."
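One way to make the editor's checklist machine-checkable is to embed the criteria in the evaluator prompt and ask for a fixed reply format. The sketch below rests on assumptions: the `CHECKLIST` wording follows the four questions above, but `build_evaluator_prompt`, `parse_verdict`, and the `VERDICT:` / `FEEDBACK:` reply format are hypothetical conventions, not part of any particular LLM API.

```python
from typing import Tuple

CHECKLIST = [
    "Is the tone courteous and professional?",
    "Does it apologize for the inconvenience?",
    "Does it actively propose solutions (specific dates/times)?",
    "Is the closing formal?",
]


def build_evaluator_prompt(draft: str) -> str:
    """Embed the numbered checklist and the draft in a single evaluator prompt."""
    criteria = "\n".join(f"{i}. {q}" for i, q in enumerate(CHECKLIST, start=1))
    return (
        "You are a communications editor. Review the email draft against "
        "each criterion.\n"
        f"Criteria:\n{criteria}\n\n"
        f"Draft:\n{draft}\n\n"
        "Reply with 'VERDICT: APPROVED' or 'VERDICT: REVISE' on the first line, "
        "then 'FEEDBACK: <specific suggestions>' on the second."
    )


def parse_verdict(reply: str) -> Tuple[bool, str]:
    """Split an evaluator reply into (approved, feedback).

    Simplification: only single-line feedback is captured, and a reply
    without a VERDICT line is treated as not approved.
    """
    approved = False
    feedback = ""
    for line in reply.splitlines():
        if line.upper().startswith("VERDICT:"):
            approved = line.split(":", 1)[1].strip().upper() == "APPROVED"
        elif line.upper().startswith("FEEDBACK:"):
            feedback = line.split(":", 1)[1].strip()
    return approved, feedback


reply = "VERDICT: REVISE\nFEEDBACK: The draft is too informal and abrupt."
print(parse_verdict(reply))  # (False, 'The draft is too informal and abrupt.')
```

Treating a missing verdict as "not approved" is a deliberate design choice here: when the evaluator's reply is malformed, it seems safer to request another revision than to publish an unchecked draft.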
Iteration #2
- Optimizer LLM (receiving the feedback):
  - Receives a new instruction: "Improve this email draft [Output v1] based on the following feedback: [Generated Feedback]."
  - Output v2 (polished and professional):
    "Dear [Client's Name],
    I hope this email finds you well.
    I am writing to ask whether it would be possible to reschedule the meeting we have planned for next week, due to an unavoidable last-minute matter.
    I would like to propose moving it to two weeks from now. I have availability on Tuesday, October 28th at 10:00 AM or Thursday, October 30th at 2:00 PM. Please let me know if either of those times works for you or if you prefer another time.
    I sincerely apologize for any inconvenience this may cause.
    Kind regards, Edgardo"
- Evaluator LLM (reviewing again):
  - Receives "Output v2" and runs it through its checklist again.
  - Final Feedback:
    "Excellent. This draft meets all criteria for professionalism, courtesy, and clarity. Approved."
Final Result: The tool doesn't give the user the first draft but rather the final, polished, and approved version, ensuring the communication with their client is impeccable.
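The instruction the Optimizer receives in Iteration #2 is simply the previous draft and the editor's feedback slotted into a template. A small sketch of that step follows; `build_refinement_prompt` is a hypothetical helper name, the opening sentence mirrors the instruction quoted in the example, and the closing "Return only the revised email." line is an added assumption to keep the model from wrapping the email in commentary.

```python
def build_refinement_prompt(draft: str, feedback: str) -> str:
    """Combine the previous draft and the evaluator's feedback into one instruction."""
    return (
        "Improve this email draft based on the following feedback.\n\n"
        f"Draft:\n{draft}\n\n"
        f"Feedback:\n{feedback}\n\n"
        "Return only the revised email."  # assumption: not in the original example
    )


output_v1 = (
    "Hi [Client], I won't be able to make it next week. "
    "Should we move the meeting to two weeks from now? Regards, Edgardo."
)
generated_feedback = (
    "The draft is too informal and abrupt. You should add a professional "
    "greeting, offer a brief apology, propose specific times to facilitate "
    "coordination, and use a more cordial closing."
)

print(build_refinement_prompt(output_v1, generated_feedback))
```

Passing the full previous draft along with the feedback, rather than the feedback alone, lets the Optimizer revise in place instead of starting over.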