
Evaluator-Optimizer (The Writer and the Demanding Editor Analogy)


Imagine writing an important article for a magazine. The first draft is rarely perfect, so the work follows a cycle of creation and review.

  1. The Writer (Optimizer/Generator): Receives a task ("Write an article about the vineyards of San Rafael") and produces a first draft.
  2. The Editor (Evaluator): Receives that draft. Their job is not to write from scratch, but to evaluate the text based on clear criteria: Is the style correct? Is it easy to read? Does it cover all the requested points? Is the title appealing?
  3. The Feedback: The editor doesn't just say "I don't like it." They give specific and constructive feedback: "The content is good, but the second paragraph is confusing. Rework it. Also, the title is boring; find something more catchy."
  4. The Cycle (Loop): The writer takes that feedback, improves the draft, and returns it to the editor. This cycle repeats until the editor says, "Perfect! Publish it."

The Evaluator-Optimizer pattern follows the same cycle:

  • An "Optimizer" LLM (the Writer) generates an initial response to a request.
  • A second "Evaluator" LLM (the Editor), armed with a prompt containing very clear evaluation criteria (a kind of checklist), reviews the Optimizer's output.
  • The Evaluator generates feedback to improve the response.
  • The Optimizer uses that feedback to refine its work in a loop, until the response meets the defined quality standard.

This method is ideal when the final quality is critical and you have a clear way to measure how "good" a response is.
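The loop described above can be sketched in a few lines of Python. This is a minimal sketch, not a definitive implementation: `generate` and `evaluate` are hypothetical callables standing in for the two LLM calls, `Review` and `max_rounds` are names introduced here for illustration, and the cap on rounds is an added assumption so the loop always ends even if the Evaluator never approves.

```python
from typing import Callable, NamedTuple


class Review(NamedTuple):
    approved: bool  # True once the draft meets every criterion
    feedback: str   # actionable notes for the next revision


def evaluator_optimizer(
    task: str,
    generate: Callable[[str, str, str], str],
    evaluate: Callable[[str, str], Review],
    max_rounds: int = 3,  # assumed safety cap, not part of the pattern itself
) -> tuple[str, int]:
    """Run the write/review cycle and return (final draft, rounds used)."""
    draft, feedback = "", ""
    for round_number in range(1, max_rounds + 1):
        # Optimizer: first draft on round 1, a revision on later rounds.
        draft = generate(task, draft, feedback)
        # Evaluator: check the draft against the criteria.
        review = evaluate(task, draft)
        if review.approved:
            return draft, round_number
        feedback = review.feedback
    # Cap reached without approval: return the latest draft anyway.
    return draft, max_rounds


# Stand-ins for real LLM calls, so the sketch runs offline.
def fake_generate(task: str, previous_draft: str, feedback: str) -> str:
    if not feedback:
        return "short draft"
    return previous_draft + " with a catchy title"


def fake_evaluate(task: str, draft: str) -> Review:
    if "catchy title" in draft:
        return Review(True, "")
    return Review(False, "The title is boring; find something more catchy.")


final_draft, rounds = evaluator_optimizer(
    "Write an article about the vineyards of San Rafael",
    fake_generate,
    fake_evaluate,
)
print(rounds, final_draft)
```

With these stand-ins the Evaluator rejects the first draft and approves the second. In a real system each stand-in would wrap a call to a model, and the caller would decide what to do when the cap is reached without approval.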


Real-World Example: "Professional Email Writing Assistant"

Imagine an AI tool that helps you draft important emails, ensuring they sound professional and courteous.

The User's Request:

"Help me write an email to an important client to reschedule next week's meeting. I want to move it to two weeks from now."

Refinement Cycle (Loop):

Iteration #1

  1. Optimizer LLM (The Rushed Writer):

    • Receives the request and generates a first draft that is functional but unpolished.
    • Output v1 (too direct):

      "Hi [Client], I won't be able to make it next week. Should we move the meeting to two weeks from now? Regards, Edgardo."

  2. Evaluator LLM (The Communications Editor):

    • Receives "Output v1." Its prompt contains a quality checklist:

      1. Is the tone courteous and professional?
      2. Does it apologize for the inconvenience?
      3. Does it actively propose solutions (specific dates/times)?
      4. Is the closing formal?

    • Generated Feedback: "The draft is too informal and abrupt. You should add a professional greeting, offer a brief apology, propose specific times to facilitate coordination, and use a more cordial closing."
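One way to hand that checklist to the Evaluator is to keep it as data and render it into the prompt. The sketch below assumes a hypothetical `build_evaluator_prompt` helper. The prompt wording, including the closing APPROVED/REVISE convention, is an illustration and not something the example prescribes.

```python
# The four criteria from the Evaluator's checklist, kept as plain data.
CHECKLIST = [
    "Is the tone courteous and professional?",
    "Does it apologize for the inconvenience?",
    "Does it actively propose solutions (specific dates/times)?",
    "Is the closing formal?",
]


def build_evaluator_prompt(draft: str, checklist: list[str]) -> str:
    """Assemble the Evaluator's instructions from the checklist and the draft."""
    numbered = "\n".join(
        f"{number}. {item}" for number, item in enumerate(checklist, start=1)
    )
    return (
        "You are a communications editor. Review the email draft below "
        "against each criterion and give specific, constructive feedback.\n\n"
        f"Criteria:\n{numbered}\n\n"
        f"Draft:\n{draft}\n\n"
        # Assumed convention so the calling code can detect approval.
        "End your reply with the single word APPROVED if every criterion "
        "is met, or REVISE if not."
    )


output_v1 = (
    "Hi [Client], I won't be able to make it next week. "
    "Should we move the meeting to two weeks from now? Regards, Edgardo."
)
evaluator_prompt = build_evaluator_prompt(output_v1, CHECKLIST)
print(evaluator_prompt)
```

Keeping the criteria in a list means the quality standard can change without touching the code that runs the loop.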

Iteration #2

  1. Optimizer LLM (receiving the feedback):

    • Receives a new instruction: "Improve this email draft [Output v1] based on the following feedback: [Generated Feedback]."
    • Output v2 (polished and professional):

      "Dear [Client's Name],

      I hope this email finds you well.

      I am writing to ask whether it would be possible to reschedule our meeting planned for next week, due to an unavoidable last-minute matter.

      I would like to propose moving it to two weeks from now. I have availability on Tuesday, October 28th at 10:00 AM or Thursday, October 30th at 2:00 PM. Please let me know if either of those times works for you or if you prefer another time.

      I sincerely apologize for any inconvenience this may cause.

      Kind regards, Edgardo"

  2. Evaluator LLM (reviewing again):

    • Receives "Output v2" and runs it through its checklist again.
    • Final Feedback: "Excellent. This draft meets all criteria for professionalism, courtesy, and clarity. Approved."
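The two hand-offs in this iteration can be sketched as a pair of small helpers: one builds the instruction the Optimizer receives, the other decides whether the Evaluator's reply counts as approval. Both names (`build_refinement_prompt`, `parse_verdict`) are hypothetical. The approval check assumes the Evaluator ends its reply with the word "Approved", as in the final feedback above. This is a deliberately naive convention (a reply ending in "not approved" would fool it), and a structured reply format would be sturdier.

```python
import string


def build_refinement_prompt(draft: str, feedback: str) -> str:
    """Instruction sent to the Optimizer on every round after the first."""
    return (
        "Improve this email draft based on the following feedback.\n\n"
        f"Draft:\n{draft}\n\n"
        f"Feedback:\n{feedback}"
    )


def parse_verdict(reply: str) -> tuple[bool, str]:
    """Split an Evaluator reply into (approved, feedback).

    Assumption: the reply's last word is "Approved" when the draft passes.
    Any other ending is treated as a request for another revision.
    """
    words = reply.split()
    last_word = words[-1].strip(string.punctuation).upper() if words else ""
    approved = last_word == "APPROVED"
    return approved, ("" if approved else reply.strip())


feedback_v1 = (
    "The draft is too informal and abrupt. You should add a professional "
    "greeting, offer a brief apology, propose specific times to facilitate "
    "coordination, and use a more cordial closing."
)
final_feedback = (
    "Excellent. This draft meets all criteria for professionalism, "
    "courtesy, and clarity. Approved."
)

needs_revision = parse_verdict(feedback_v1)   # not approved, feedback kept
is_done = parse_verdict(final_feedback)       # approved, loop can stop
refinement_prompt = build_refinement_prompt("[Output v1]", needs_revision[1])
print(refinement_prompt)
```

Treating anything other than an explicit approval as "revise" makes the loop err toward one more pass instead of shipping an unreviewed draft.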

Final Result: The tool gives the user not the first draft but the final, polished, and approved version, which has passed every item on the Evaluator's checklist.