Natural Agentic "hooks" or "executors" for use within sims and gens #1733

@jlnav

As drafted by Claude:

Agent Ensemble: Orchestrating LLM Agents with libEnsemble

Goal

Brainstorm and plan how libEnsemble can orchestrate ensembles of LLM agents, with generators as planning/reasoning agents and simulators as tool-using/execution agents.

Constraints & Preferences

  • Use modern gest-api Generator class and dataclass specs (SimSpecs, GenSpecs, etc.), not legacy dicts.
  • Prefer zero or minimal changes to libEnsemble core for initial implementation.
  • VOCS from gest_api.vocs, not xopt.
  • Keep it as simple as possible.

Architecture Mapping

| libEnsemble Concept | Agent Concept | Notes |
|---|---|---|
| Generator | Planning/reasoning agent | suggest() produces tasks, ingest() learns from results |
| Simulator | Execution/tool-using agent | Receives task, performs work, returns results |
| Manager | Orchestrator | Routes tasks, maintains history, enforces exit criteria |
| VOCS | Task schema | Defines variables, objectives, constraints |
| History | Shared memory | NumPy structured array tracking all inputs/outputs |
| Allocation function | Scheduling policy | Decides which workers get which tasks |

Levels of Ambition

Level 1: LLM as Generator

The generator is an LLM that produces candidate configurations (e.g., hyperparameters, prompts, code variants). The simulator is a traditional function that evaluates them. The LLM replaces a numerical optimization algorithm.
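
A minimal sketch of Level 1, assuming a hypothetical `fake_llm` callable in place of a real chat client and a plain dict of bounds in place of a real VOCS object. Only the `suggest`/`ingest` shape comes from the description in this issue; the class name, prompt wording, and clamping step are illustrative.

```python
import json
import random


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; returns a JSON object as text."""
    rng = random.Random(len(prompt))  # deterministic for a given prompt
    return json.dumps({"x1": rng.uniform(0.0, 1.0), "x2": rng.uniform(0.0, 10.0)})


class LLMGenerator:
    """Sketch of a generator whose proposals come from an LLM."""

    def __init__(self, bounds, llm=fake_llm):
        self.bounds = bounds  # e.g. {"x1": (0.0, 1.0)}; stands in for VOCS variables
        self.llm = llm
        self.history = []     # results seen so far, fed back into the prompt

    def _prompt(self):
        return (
            f"Variables and bounds: {json.dumps(self.bounds)}\n"
            f"Previous results: {json.dumps(self.history)}\n"
            "Reply with one JSON object giving a value for each variable."
        )

    def suggest(self, num_points):
        points = []
        for _ in range(num_points):
            raw = json.loads(self.llm(self._prompt()))
            # Clamp to bounds so an out-of-range reply stays in the search space.
            point = {
                name: min(max(float(raw[name]), low), high)
                for name, (low, high) in self.bounds.items()
            }
            points.append(point)
        return points

    def ingest(self, results):
        self.history.extend(results)


gen = LLMGenerator({"x1": (0.0, 1.0), "x2": (0.0, 10.0)})
batch = gen.suggest(2)
gen.ingest([{**p, "y1": p["x1"] + p["x2"]} for p in batch])
```

A real client would replace `fake_llm`, and replies that fail to parse as JSON would need a retry or fallback path, which this sketch omits.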

Level 2: LLM as Simulator

The simulator is an LLM that evaluates or processes inputs (e.g., code review, text analysis, classification). The generator can be a traditional sampling function. Workers run LLM inference calls in parallel.
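
A rough sketch of Level 2, assuming a hypothetical `fake_llm_score` function in place of a real inference call. The `llm_simulator(input_dict, **kwargs) -> dict` shape follows the modern simulator signature noted under Key Findings; the thread pool only stands in for libEnsemble workers so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_llm_score(prompt: str) -> float:
    """Hypothetical stand-in for an LLM judging call; returns a score in [0, 1)."""
    return (len(prompt) % 10) / 10.0


def llm_simulator(input_dict: dict, **kwargs) -> dict:
    """Evaluate one point by prompting the (stubbed) LLM and returning objectives."""
    prompt = f"Rate the configuration x1={input_dict['x1']}, x2={input_dict['x2']}."
    return {"y1": fake_llm_score(prompt)}


points = [{"x1": 0.1, "x2": 2.0}, {"x1": 0.5, "x2": 7.5}, {"x1": 0.9, "x2": 1.0}]

# Threads stand in for workers here; LLM calls are I/O-bound, so they overlap well.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(llm_simulator, points))
```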

Level 3: Full Agent Loop

Both generator and simulator are LLM-backed. The generator is a planner that reasons about what to try next, and the simulator is an executor that carries out tasks using tools. This is a multi-agent system orchestrated by libEnsemble's manager.
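
The planner/executor loop can be illustrated with a toy, single-process stand-in for the manager. Everything here is hypothetical (`PlannerStub`, `executor_stub`, `run_loop`); neither stub calls an LLM, and the real manager handles routing, History, and exit criteria quite differently.

```python
import random


class PlannerStub:
    """Stands in for an LLM-backed planner; it perturbs the best point so far."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.best = None

    def suggest(self, num_points):
        center = self.best["x"] if self.best else 0.5
        return [
            {"x": min(max(center + self.rng.uniform(-0.2, 0.2), 0.0), 1.0)}
            for _ in range(num_points)
        ]

    def ingest(self, results):
        for r in results:
            if self.best is None or r["score"] > self.best["score"]:
                self.best = r


def executor_stub(task: dict) -> dict:
    """Stands in for a tool-using agent; scores how close x is to 0.8."""
    return {**task, "score": 1.0 - abs(task["x"] - 0.8)}


def run_loop(planner, executor, batch_size=4, max_evals=20):
    """Toy manager: route tasks, record history, stop at max_evals."""
    history = []
    while len(history) < max_evals:
        tasks = planner.suggest(min(batch_size, max_evals - len(history)))
        results = [executor(t) for t in tasks]
        history.extend(results)
        planner.ingest(results)
    return history


planner = PlannerStub()
history = run_loop(planner, executor_stub)
```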

Key Findings

Natural Fit

  • The modern gest-api interface is already list[dict]-based, which maps naturally to LLM structured output and function-calling features.
  • Generator.suggest(num_points) -> list[dict]: each dict has VOCS variable names as keys, scalar values.
  • Generator.ingest(results: list[dict]) -> None: receives all VOCS fields back. Default is no-op.
  • Generator.returns_id: bool = False: if True, suggest includes _id key, ingest receives it back.
  • Modern simulator: def sim_f(input_dict: dict, **kwargs) -> dict — auto-wrapped via gest_api_sim.
  • libEnsemble supports async returns, persistent workers, active_recv, multiple comm backends (local/threads, MPI, TCP) — all helpful for slow LLM calls.
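
To make the `returns_id` round trip concrete, here is a duck-typed sketch. It deliberately does not import gest-api; `IdTrackingGenerator` and its bookkeeping dicts are hypothetical, and only the `suggest`/`ingest`/`returns_id` names and the `_id` key come from the list above.

```python
import itertools


class IdTrackingGenerator:
    """Duck-typed sketch of the suggest/ingest shape; not the real base class."""

    returns_id = True  # suggest() attaches "_id"; ingest() expects it back

    def __init__(self):
        self._next_id = itertools.count()
        self.pending = {}    # _id -> point awaiting a result
        self.completed = {}  # _id -> point merged with its result

    def suggest(self, num_points):
        points = []
        for _ in range(num_points):
            _id = next(self._next_id)
            point = {"_id": _id, "x1": 0.1 * _id}
            self.pending[_id] = point
            points.append(point)
        return points

    def ingest(self, results):
        for r in results:
            point = self.pending.pop(r["_id"])
            self.completed[r["_id"]] = {**point, **r}


gen = IdTrackingGenerator()
pts = gen.suggest(3)
# Results may come back out of order when evaluations finish at different times.
gen.ingest([{"_id": p["_id"], "y1": p["x1"] ** 2} for p in reversed(pts)])
```

Matching on `_id` matters for slow LLM calls, where results are unlikely to return in the order they were suggested.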

Friction Points

  • VOCS variables are numeric with bounds. ContinuousVariable has domain=[low, high], DiscreteVariable is set-based. There is no free-text/string variable type.
  • Workarounds for unstructured data:
    1. Task-ID indexing: use a numeric task_id in VOCS; generator maintains an internal mapping of task_id -> task_description.
    2. Extend VOCS with StringVariable (upstream gest-api change).
    3. Use user/constants side channels for metadata.
  • Context window limits: ingest() receives results but LLMs have finite context. May need summarization or RAG over History.
  • Token/cost management: No built-in mechanism; could use VOCS constraints or History tracking.
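
The task-ID indexing workaround might look like the following sketch. `TaskRegistry` and the placeholder metric are hypothetical. In a real ensemble the simulator may run in another process, so the mapping would have to be shared somehow (for example through a side channel); this sketch sidesteps that by running everything in one process.

```python
class TaskRegistry:
    """Maps numeric task IDs (VOCS-friendly) to free-text task descriptions."""

    def __init__(self):
        self._tasks = []

    def register(self, description: str) -> int:
        self._tasks.append(description)
        return len(self._tasks) - 1

    def describe(self, task_id) -> str:
        # IDs may come back as floats after passing through a numeric array.
        return self._tasks[int(task_id)]

    def __len__(self):
        return len(self._tasks)


registry = TaskRegistry()
ids = [
    registry.register("Write a CUDA kernel for vector addition"),
    registry.register("Port the kernel to HIP"),
]

# Only the numeric ID travels through suggest(); the text stays on the side.
suggested = [{"task_id": i} for i in ids]


def simulator(input_dict: dict, lookup=registry.describe) -> dict:
    text = lookup(input_dict["task_id"])
    return {"quality_score": float(len(text))}  # placeholder metric


results = [simulator(p) for p in suggested]
```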

Key Decisions

  • Start with Level 1 + Level 2 (LLM as Generator + LLM as Simulator) requiring zero core changes, using existing gest-api interface.
  • Use numeric task_id indexing initially to avoid needing string VOCS variables.
  • Generator maintains internal mapping of task_id -> task_description.
  • LLM structured output / function-calling features map naturally to VOCS-derived JSON schemas.
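
As a sketch of what a VOCS-derived JSON schema could look like, the helper below works on a plain dict shaped like the `variables` argument, not on a real VOCS object. `schema_from_variables` is a hypothetical name, and treating sets as enumerations is an assumption made for illustration.

```python
import json


def schema_from_variables(variables: dict) -> dict:
    """Build a JSON Schema object from {name: [low, high]} or {name: set_of_values}."""
    properties = {}
    for name, domain in variables.items():
        if isinstance(domain, (set, frozenset)):
            # Discrete variable: restrict the reply to the listed values.
            properties[name] = {"enum": sorted(domain)}
        else:
            # Continuous variable: numeric with inclusive bounds.
            low, high = domain
            properties[name] = {"type": "number", "minimum": low, "maximum": high}
    return {
        "type": "object",
        "properties": properties,
        "required": list(properties),
        "additionalProperties": False,
    }


schema = schema_from_variables({"x1": [0, 1.0], "x2": [0, 10.0], "task_id": {0, 1, 2}})
print(json.dumps(schema, indent=2))
```

A schema like this could be passed to an LLM's structured-output or function-calling feature so that each reply is already a dict keyed by variable name.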

Next Steps

  • Decide on a concrete use case (code generation, hyperparameter search, research, etc.)
  • Sketch end-to-end example with LLMGenerator (Generator subclass) and llm_simulator function
  • Determine whether to pursue VOCS extension for string/unstructured variables (upstream gest-api)
  • Consider token/cost management (VOCS constraints or History tracking)
  • Address LLM context window limits in ingest() (summarization/RAG over History)
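
One possible shape for the summarization step in `ingest()`, assuming a minimized objective and using character count as a crude proxy for tokens. `summarize_history` and its parameters are hypothetical, and it is only one of several reasonable selection policies.

```python
def summarize_history(history, objective="y1", top_k=3, recent_n=2, max_chars=400):
    """Pick a small, informative subset of results to show the LLM.

    Keeps the top_k best (lowest objective) and the recent_n latest results,
    then drops entries from the end until the text fits max_chars.
    """
    best = sorted(history, key=lambda r: r[objective])[:top_k]
    recent = history[-recent_n:] if recent_n else []
    selected = best + [r for r in recent if r not in best]

    lines = [", ".join(f"{k}={v:.3g}" for k, v in r.items()) for r in selected]
    while lines and len("\n".join(lines)) > max_chars:
        lines.pop()
    return "\n".join(lines)


history = [{"x1": i / 10, "y1": (i / 10 - 0.3) ** 2} for i in range(10)]
summary = summarize_history(history)
```

The same budget check could double as a simple guard on token cost, since the prompt never grows past `max_chars` regardless of how long the run lasts.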

Relevant Files

| File | Description |
|---|---|
| libensemble/generators.py | LibensembleGenerator, PersistentGenInterfacer base classes |
| libensemble/gen_classes/external/sampling.py | Pure gest-api UniformSample, UniformSampleArray |
| libensemble/gen_classes/sampling.py | UniformSample via LibensembleGenerator |
| libensemble/gen_classes/gpCAM.py | GP_CAM, GP_CAM_Covar (complex generator example with ingest) |
| libensemble/gen_classes/aposmm.py | APOSMM via PersistentGenInterfacer |
| libensemble/specs.py | SimSpecs, GenSpecs, AllocSpecs, ExitCriteria, LibeSpecs dataclasses |
| libensemble/ensemble.py | Primary Ensemble interface |
| libensemble/worker.py | Worker execution loop (recv -> handle -> send) |
| libensemble/manager.py | Manager coordination, History updates, allocation calls |
| libensemble/comms/ | Communication backends (local, MPI, TCP) |
| libensemble/tests/regression_tests/test_1d_sampling.py | Minimal example |

from gest_api.vocs import VOCS
import numpy as np

typical_vocs = VOCS(
    variables={"x1": [0, 1.0], "x2": [0, 10.0]},
    objectives={"y1": "MINIMIZE"},
    constraints={"c1": ["GREATER_THAN", 0.5]},
    constants={"constant1": 1.0},
)

...

def typical_simulator(Input, persis_info, sim_specs, libE_info):
    Output = np.zeros(Input.shape, dtype=sim_specs.output_dtype)

    for i in range(Input.shape[0]):
        # application_output() is a placeholder for the real application call.
        Output["y1"][i] = Input["x1"][i] * application_output()

    return Output

...

class TypicalGenerator:

    def __init__(self, vocs):
        self.vocs = vocs
        self.model = init_model(self.vocs)

    def suggest(self, num_points):
        return self.model.suggest(num_points)

    def ingest(self, points):
        self.model.ingest(points)

...

agentic_vocs = VOCS(
    variables={"possibilities": ["AMD example", "CUDA example"], 
               "strategies": ["single gpu", "multi gpu"]},
    objectives={"quality_score": "MAXIMIZE"},
    constraints={"cost": ["LESS_THAN", 100]},
)

...

def agentic_simulator(Input, persis_info, sim_specs, libE_info):
    Output = np.zeros(Input.shape, dtype=sim_specs.output_dtype)

    prompt_base = "Run the {} using {}"

    for i in range(Input.shape[0]):
        prompt = prompt_base.format(Input[i]["possibilities"], Input[i]["strategies"])
        Output["quality_score"][i] = call_llm(prompt)

    return Output

...

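class AgenticGenerator:

    def __init__(self, vocs):
        self.vocs = vocs
        # init_llm() is a placeholder for whichever LLM client is used.
        self.model = init_llm("You're an agent that proposes GPU strategies.")
        # List the allowed values for each VOCS variable in the prompt.
        self.prompt = ""
        for var, obj in vocs.variables.items():
            self.prompt += f"{var}: {', '.join(obj)}\n"
        self.prompt += "You must choose from the above possibilities and strategies, but explain why you've made your choice."

    def suggest(self, num_points):
        # One chat call per requested point. The replies are free text and would
        # still need parsing into dicts keyed by the VOCS variable names.
        return [self.model.chat(self.prompt + "\nPlease provide a strategy") for _ in range(num_points)]

    def ingest(self, points):
        self.model.chat("Here are the results: " + str(points))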
