localLLM

LLMQP Drops! A New Queue Dispatcher. Let your LLM CODE ALL NIGHT.

LLM Queue Dispatcher. A Powerful Harness Drop will queue your localLLM all night and keep it working!

thinkmelt@protonmail.com

May 18, 2026 • 37 min read

We built a tool to let your LLM code All Night!

Everyone is raving about the houseLLM revolution. Turboquant made large contexts possible, and now MTP (Multi-Token Prediction) has increased speeds considerably (and has been accepted into mainline llama.cpp). But with all this local compute power, a good harness that lets your LLM code all night was in order.

You prompt, it works, you come back, you prompt again, and you end up sitting there all night bolted to the output. What if you could set up 20 prompts, queue them, and have them execute in order, even if one takes 5 minutes and the next takes an hour? This harness is designed to do exactly that!
You want a web GUI because you don't want to wrangle JSON back-ends by hand.
It's already dockerized: pull it and run it!

LLM Queue Dispatcher (LLMQP) is easy to run and completely Docker-ready:

docker pull docker.io/cnmcdee/llmqueue:latest
docker run -d --name mcp-llmqueue --restart unless-stopped -p 0.0.0.0:5012:5012 cnmcdee/llmqueue:latest

As soon as it is running, it offers you some powerful options. Let's go over them.

It binds to port 5012 and is designed to run locally, controlling and monitoring multiple local LLMs plus API LLMs at the same time.

http://192.168.1.<your ip>:5012

Set your MCP Agents

It fully recognizes and uses CORS polling to read MCP Tools
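To make tool discovery concrete, here is a minimal sketch of the JSON-RPC exchange that the scanner in app.py performs: an `initialize` request followed by `tools/list`, with the reply arriving either as plain JSON or as an SSE body. The helper names, the `example-scanner` client name, and the `get_weather` tool in the sample reply are made up for illustration.

```python
import json


def build_initialize_payload(request_id: int = 1) -> dict:
    """JSON-RPC request that opens an MCP session."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": "example-scanner", "version": "1.0"},
        },
    }


def build_tools_list_payload(request_id: int = 2) -> dict:
    """JSON-RPC request that asks the server for its tool catalogue."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list", "params": {}}


def extract_tool_names(response_text: str) -> list:
    """Return tool names from a plain JSON body or from the first SSE 'data:' line."""
    payload = response_text.strip()
    for line in payload.splitlines():
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            break
    data = json.loads(payload)
    return [tool.get("name", "Unnamed") for tool in data.get("result", {}).get("tools", [])]


# Made-up sample reply in SSE form, as some MCP servers send it.
SAMPLE_SSE = (
    'event: message\n'
    'data: {"jsonrpc": "2.0", "id": 2, "result": {"tools": '
    '[{"name": "get_weather", "inputSchema": {"type": "object", "properties": {}}}]}}'
)

print(extract_tool_names(SAMPLE_SSE))
```

In the real scanner the second request also carries the `Mcp-Session-Id` header returned by the first one.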

Set your LLM Stack

One or many, it's up to you.
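If you would rather script the stack than click through the GUI, the sketch below builds the POST request for the `/api/llm_servers` route that appears in app.py further down. The field names come from that route; the dispatcher address, the server name, the llama.cpp address, and the model name are placeholders to replace with your own.

```python
import json
import urllib.request

DISPATCHER_URL = "http://localhost:5012"  # assumption: adjust to your host


def server_registration_request(name, base_url, api_key="", default_model="gpt-4o-mini"):
    """Build (but do not send) the request that registers one LLM server."""
    body = {
        "name": name,
        "base_url": base_url,
        "api_key": api_key,
        "default_model": default_model,
    }
    return urllib.request.Request(
        f"{DISPATCHER_URL}/api/llm_servers",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def register_server(**kwargs) -> dict:
    """Send the request; needs a running dispatcher."""
    with urllib.request.urlopen(server_registration_request(**kwargs), timeout=15) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Placeholder values for a local llama.cpp server.
req = server_registration_request(
    "local-llama", "http://192.168.1.50:8080/v1", default_model="my-local-model"
)
print(req.get_method(), req.full_url)
```

The client in app.py appends `/chat/completions` to `base_url`, so the base URL should normally end in `/v1`.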

Done-Walk Prompting

You can queue as many prompts as you like. It will build your prompts and dispatch them to your LLMs while monitoring the output!
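For reference, this is roughly the JSON body the dashboard sends when you hit dispatch, sketched from the `/api/distribute_prompts` route in app.py. The keys and the two mode names are taken from that listing; the `build_dispatch_payload` helper and the server ID `a1b2c3d4` are hypothetical.

```python
import json

VALID_MODES = ("ALL_GET_ALL", "DONE_WALK")  # mode names used in app.py


def build_dispatch_payload(prompts, server_ids, agent_ids=None,
                           mode="ALL_GET_ALL", max_tokens=256000):
    """Assemble the JSON body for POST /api/distribute_prompts."""
    if not prompts:
        raise ValueError("queue at least one prompt")
    if not server_ids:
        raise ValueError("select at least one LLM server")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown distribution mode: {mode}")
    return {
        "prompts": list(prompts),
        "selected_servers": list(server_ids),
        "selected_agents": list(agent_ids or []),
        "distribution_mode": mode,
        "max_tokens": max_tokens,
    }


payload = build_dispatch_payload(
    ["Write unit tests for parser.py", "Fix the failing tests"],
    server_ids=["a1b2c3d4"],  # hypothetical server ID
    mode="DONE_WALK",
)
print(json.dumps(payload, indent=2))
```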

It will show you prompt dispatches.

You can monitor the prompts as they run.
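The same information the dashboard shows is available from the `/api/status` route in app.py, so you can watch progress from a script too. Below is a small sketch that turns one status snapshot into one line per request. The response layout is read off the listing; the sample snapshot and the `summarize_status` helper are made up.

```python
def summarize_status(status: dict) -> list:
    """Return one human-readable line per request in an /api/status snapshot."""
    lines = []
    for key, value in status.items():
        # Per-client entries are dicts holding a "requests" list;
        # the other top-level keys ("mcp_agents", "llm_servers", ...) hold lists.
        if not isinstance(value, dict) or "requests" not in value:
            continue
        for req in value["requests"]:
            lines.append(
                f"{key} {req['request_id']}: {req['status']}, "
                f"~{req.get('estimated_tokens', 0)} tokens, {req.get('duration', 0)}s"
            )
    return lines


# Made-up snapshot in the shape that /api/status returns.
SAMPLE_STATUS = {
    "mcp_agents": [],
    "llm_servers": [],
    "scheduled_tasks": [],
    "local-llama (a1b2c3d4)": {
        "active_requests": 1,
        "requests": [
            {"request_id": "req_0", "status": "monitoring",
             "estimated_tokens": 120, "duration": 4.2},
        ],
    },
}

for line in summarize_status(SAMPLE_STATUS):
    print(line)
```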

Understanding DONE-WALK.

Each prompt finishes with a unique 10-character serial. The software monitors the output for this serial. Upon seeing it, it automatically starts the next prompt for you! Simply select that type of prompt and watch it work!

Done-Walk will walk your prompts one by one.
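Here is a minimal sketch of the Done-Walk idea, not the code that ships in the container: tag each prompt with a fresh 10-character serial, watch the streamed output for it, and only then move to the next prompt. The wording of the tag line, the `SerialWatcher` class, and the `fake_llm` stand-in are all assumptions made for this example.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits


def make_serial(length: int = 10) -> str:
    """Generate a random serial that is unlikely to appear in normal output."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def tag_prompt(prompt: str, serial: str) -> str:
    """Ask the model to print the serial once the work is finished."""
    return f"{prompt}\n\nWhen you are completely finished, print this code on its own line: {serial}"


class SerialWatcher:
    """Detects the serial even when it is split across streamed chunks."""

    def __init__(self, serial: str):
        self.serial = serial
        self._tail = ""

    def feed(self, chunk: str) -> bool:
        window = self._tail + chunk
        if self.serial in window:
            return True
        keep = len(self.serial) - 1  # enough overlap to catch a split serial
        self._tail = window[-keep:] if keep else ""
        return False


def done_walk(prompts, run_prompt):
    """Run prompts one by one; run_prompt(text) must yield output chunks."""
    finished = []
    for prompt in prompts:
        serial = make_serial()
        watcher = SerialWatcher(serial)
        chunks = run_prompt(tag_prompt(prompt, serial))
        finished.append(any(watcher.feed(chunk) for chunk in chunks))
    return finished


def fake_llm(text):
    """Stand-in for a real LLM: echoes the serial back in two pieces."""
    serial = text.rsplit(" ", 1)[-1]
    yield "working... "
    yield serial[:4]
    yield serial[4:] + "\n"


print(done_walk(["Write tests", "Fix lint errors"], fake_llm))
```

Keeping a short tail of the previous chunk matters because streamed tokens rarely line up with the serial's boundaries.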

Code Drop. A POWERFUL MCP Agent

If you want your LLM to work across contexts and pick up where it left off, add this agent. You literally tell your LLM, 'Use the Process Manager to create a task to save your work point.' Then, in the next prompt, another LLM (or the next context) can pick up where it left off: 'Using the Process Manager, load the following task and keep working on it!'
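To illustrate the save-and-resume idea only, here is a purely hypothetical sketch of a task store: one call saves a work point, a later call loads it by ID. This is not the Code Drop agent's actual interface, which this post does not show. The `TaskStore` class, its method names, and the JSON file layout are all invented for the example.

```python
import json
import tempfile
import uuid
from pathlib import Path


class TaskStore:
    """Hypothetical checkpoint store: one JSON file holding tasks keyed by ID."""

    def __init__(self, path):
        self.path = Path(path)

    def _read(self) -> dict:
        return json.loads(self.path.read_text()) if self.path.exists() else {}

    def create_task(self, title: str, work_point: str) -> str:
        """Save a work point and return the ID to quote in the next prompt."""
        tasks = self._read()
        task_id = uuid.uuid4().hex[:8]
        tasks[task_id] = {"title": title, "work_point": work_point}
        self.path.write_text(json.dumps(tasks, indent=2))
        return task_id

    def load_task(self, task_id: str) -> dict:
        """Fetch a saved work point; raises KeyError for unknown IDs."""
        return self._read()[task_id]


with tempfile.TemporaryDirectory() as tmp:
    store = TaskStore(Path(tmp) / "tasks.json")
    # The first context saves where it stopped...
    task_id = store.create_task("Refactor parser", "Tokenizer done; next: AST builder")
    # ...and a later context (or another LLM) resumes from the same ID.
    resumed = store.load_task(task_id)
    print(resumed["work_point"])
```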

OpenSource

This is fully open source!
Create an app.py containing the code below. You will need a venv with flask, aiohttp, and requests installed.

import asyncio
import json
import os
import pickle
import threading
import time
import uuid
from typing import Any, Dict, List

import aiohttp
import requests
from flask import Flask, jsonify, render_template, request

# ====================== CORE LLM CLIENT ======================
class AsyncLLMClient:
    def __init__(self, api_key: str, base_url: str = "https://api.openai.com/v1"):
        self.api_key = api_key
        self.base_url = base_url
        self.requests: Dict[str, Any] = {}
        self.monitor_tasks = []
        self._session = None
    async def __aenter__(self):
        self._session = aiohttp.ClientSession()
        return self
    async def __aexit__(self, exc_type, exc_val, exc_tb):
        if self._session:
            await self._session.close()
        for task in self.monitor_tasks:
            task.cancel()
        await asyncio.gather(*self.monitor_tasks, return_exceptions=True)
    async def send_request(self, prompt: str, **kwargs):
        request_id = f"req_{len(self.requests)}"
        request = {
            "prompt": prompt,
            "model": kwargs.get("model", "gpt-4o-mini"),
            "max_tokens": kwargs.get("max_tokens", 1024),
            "temperature": kwargs.get("temperature", 0.7),
            "response_chunks": [],
            "total_bytes": 0,
            "estimated_tokens": 0,
            "status": "idle",
            "error": None,
            "start_time": None,
            "end_time": None,
            "full_response": "",
            "request_id": request_id
        }
        self.requests[request_id] = request
        self.monitor_tasks.append(asyncio.create_task(self._monitor_bytes(request)))
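        # Keep a reference so the running task cannot be garbage-collected mid-flight
        self.monitor_tasks.append(asyncio.create_task(self._execute_request(request)))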
        return request
    async def _monitor_bytes(self, request):
        request["status"] = "monitoring"
        request["start_time"] = time.time()
        last = 0
        try:
            while request["status"] == "monitoring":
                if len(request["response_chunks"]) > last:
                    new_data = b''.join(request["response_chunks"][last:])
                    request["total_bytes"] += len(new_data)
                    last = len(request["response_chunks"])
                    # Rough heuristic: full_response holds the raw streamed body (SSE lines with
                    # JSON envelopes), so the divisor is far larger than characters-per-token.
                    request["estimated_tokens"] = round(len(request["full_response"]) / 150)
                await asyncio.sleep(0.1)
        except asyncio.CancelledError:
            pass
        except Exception as e:
            request["status"] = "error"
            request["error"] = str(e)
        finally:
            request["end_time"] = time.time()
            if request["status"] != "error":
                request["status"] = "completed"
    async def _execute_request(self, request):
        url = f"{self.base_url}/chat/completions"
        headers = {"Authorization": f"Bearer {self.api_key}", "Content-Type": "application/json"}
        payload = {
            "model": request["model"],
            "messages": [{"role": "user", "content": request["prompt"]}],
            "max_tokens": request["max_tokens"],
            "temperature": request["temperature"],
            "stream": True
        }
        try:
            async with self._session.post(url, headers=headers, json=payload) as resp:
                if resp.status != 200:
                    request["status"] = "error"
                    request["error"] = await resp.text()
                    return
                async for chunk in resp.content.iter_chunked(1024):
                    if request["status"] == "error": break
                    request["response_chunks"].append(chunk)
                    request["full_response"] += chunk.decode('utf-8', errors='replace')
                request["status"] = "completed"
        except Exception as e:
            request["status"] = "error"
            request["error"] = str(e)
    async def _wait_for_completion(self, request):
        # A fresh request is "idle" until its monitor task first runs, so wait on both states
        while request["status"] in ("idle", "monitoring"):
            await asyncio.sleep(0.1)
# ====================== BASE MANAGER ======================
class BaseManager:
    def __init__(self, state_file: str):
        self.state_file = state_file
        self.items: List[Dict] = []
        self._save_task = None
    def _get_serializable(self):
        return [{k: v for k, v in item.items() if k != "client"} for item in self.items]
    def save_state(self):
        try:
            with open(self.state_file, "wb") as f:
                pickle.dump(self._get_serializable(), f)
            print(f"State saved: {self.state_file}")
        except Exception as e:
            print(f"Save failed: {e}")
    def load_state(self):
        if os.path.exists(self.state_file):
            try:
                with open(self.state_file, "rb") as f:
                    self.items = pickle.load(f)
                print(f"Loaded {len(self.items)} items from {self.state_file}")
            except Exception:
                self.items = []
    def _save_after_change(self):
        self.save_state()
# ====================== MCP AGENT MANAGER ======================
class AsyncMCPAgentManager(BaseManager):
    def __init__(self):
        super().__init__("mcp_agents_state.pkl")
        self.scheduled_tasks = []
        self.load_state()
    async def __aenter__(self):
        for agent in self.items:
            if not agent.get("client"):
                client = AsyncLLMClient(agent["api_key"], agent["base_url"])
                agent["client"] = client
                await client.__aenter__()
        if self._save_task is None:
            self._save_task = asyncio.create_task(self._autosave_loop())
        return self
    async def __aexit__(self, *args):
        if self._save_task:
            self._save_task.cancel()
        self.save_state()
        for agent in self.items:
            if agent.get("client"):
                await agent["client"].__aexit__(None, None, None)
    async def _autosave_loop(self):
        while True:
            await asyncio.sleep(30)
            self.save_state()
    def list_mcp_agents(self):
        result = []
        for agent in self.items:
            client = agent.get("client")
            active = sum(1 for r in (client.requests.values() if client else {}) if r.get("status") == "monitoring")
            result.append({
                "agent_id": agent["agent_id"],
                "name": agent["name"],
                "description": agent.get("description", ""),
                "base_url": agent["base_url"],
                "enabled": agent.get("enabled", True),
                "enabled_tools": agent.get("enabled_tools", []),
                "active_requests": active,
                "total_requests": len(client.requests) if client else 0
            })
        return result
    async def create_mcp_agent(self, name, base_url, api_key, description="", enabled_tools=None, enabled=True):
        agent_id = str(uuid.uuid4())[:8]
        client = AsyncLLMClient(api_key, base_url)
        agent = {"agent_id": agent_id, "name": name, "description": description,
                 "base_url": base_url, "api_key": api_key, "enabled": enabled,
                 "enabled_tools": enabled_tools or [], "client": client}
        self.items.append(agent)
        await client.__aenter__()
        self._save_after_change()
        return agent
    async def update_mcp_agent(self, agent_id, **kwargs):
        agent = next((a for a in self.items if a["agent_id"] == agent_id), None)
        if not agent: return None
        for key in ("name", "description", "enabled", "enabled_tools"):
            if key in kwargs and kwargs[key] is not None:
                agent[key] = kwargs[key]
        if "base_url" in kwargs or "api_key" in kwargs:
            if agent.get("client"):
                await agent["client"].__aexit__(None, None, None)
            new_client = AsyncLLMClient(kwargs.get("api_key", agent["api_key"]),
                                        kwargs.get("base_url", agent["base_url"]))
            agent["base_url"] = kwargs.get("base_url", agent["base_url"])
            agent["api_key"] = kwargs.get("api_key", agent["api_key"])
            agent["client"] = new_client
            await new_client.__aenter__()
        self._save_after_change()
        return agent
    async def delete_mcp_agent(self, agent_id):
        for i, agent in enumerate(self.items):
            if agent["agent_id"] == agent_id:
                if agent.get("client"):
                    await agent["client"].__aexit__(None, None, None)
                del self.items[i]
                self._save_after_change()
                return True
        return False
    async def send_request(self, agent_id, prompt, **kwargs):
        agent = next((a for a in self.items if a["agent_id"] == agent_id), None)
        if not agent or not agent.get("enabled", False):
            raise ValueError(f"MCP Agent {agent_id} not found or disabled")
        return await agent["client"].send_request(prompt, **kwargs)
    async def distribute_prompts(self, prompts, agent_ids, mode="ALL_GET_ALL", **kwargs):
        results = []
        max_tokens = kwargs.get("max_tokens", 1024)
        temperature = kwargs.get("temperature", 0.7)
        model = kwargs.get("model", "gpt-4o-mini")

        if mode == "DONE_WALK":
            for i, prompt in enumerate(prompts):
                agent_id = agent_ids[i % len(agent_ids)]
                try:
                    req = await self.send_request(agent_id, prompt, max_tokens=max_tokens,
                                                  temperature=temperature, model=model)
                    agent = next((a for a in self.items if a["agent_id"] == agent_id), None)
                    if agent and agent.get("client"):
                        await agent["client"]._wait_for_completion(req)
                    results.append({"prompt_index": i, "agent_id": agent_id, "status": "completed"})
                except Exception as e:
                    results.append({"prompt_index": i, "agent_id": agent_id, "error": str(e)})
            return {"status": "success", "mode": "DONE_WALK", "details": results}

        # ALL_GET_ALL
        for prompt in prompts:
            for aid in agent_ids:
                try:
                    await self.send_request(aid, prompt, max_tokens=max_tokens,
                                            temperature=temperature, model=model)
                    results.append(f"Sent to {aid}")
                except Exception as e:
                    results.append(f"Error: {e}")
        return {"status": "success", "details": results}
    def decode_mcp_tools_list(self, json_data):
        """
        Decodes an MCP tools/list JSON-RPC response and returns clean structured data.
        """
        # Parse if input is a string
        if isinstance(json_data, str):
            try:
                data = json.loads(json_data)
            except json.JSONDecodeError as e:
                return {"success": False, "error": "Invalid JSON: " + str(e)}
        else:
            data = json_data
        tools = data.get("result", {}).get("tools", [])
        if not tools:
            return {"success": False, "error": "No tools found in the response."}
        decoded_tools = []
        def parse_input_schema(schema):
            props = schema.get("properties", {})
            required = set(schema.get("required", []))
            params = []
            for name, info in props.items():
                params.append({
                    "name": name,
                    "type": info.get("type", "any"),
                    "required": name in required,
                    "default": info.get("default")
                })
            return params
        for tool in tools:
            name = tool.get("name", "Unnamed")
            description = tool.get("description", "No description provided.")
            # Input parameters
            input_schema = tool.get("inputSchema", {})
            parameters = parse_input_schema(input_schema)
            # Output type
            output_schema = tool.get("outputSchema", {})
            output_type = output_schema.get("properties", {}) \
                .get("result", {}).get("type", "unknown")
            decoded_tools.append({
                "name": name,
                "description": description,
                "parameters": parameters,
                "output_type": output_type
            })
        return {
            "success": True,
            "tool_count": len(decoded_tools),
            "tools": decoded_tools
        }
    def parse_sse_mcp_response(self, sse_string):
        """
        Extracts the JSON payload from an SSE response (e.g. "event: message\ndata: {...}").
        """
        lines = [line.strip() for line in sse_string.strip().split("\n")]
        json_str = None
        for line in lines:
            if line.startswith("data:"):
                json_str = line[5:].strip()
                break
        if not json_str:
            return {"success": False, "error": "No 'data:' field found in SSE response."}
        try:
            return json.loads(json_str)
        except json.JSONDecodeError as e:
            return {"success": False, "error": "Invalid JSON in SSE data: " + str(e)}
    def scan_mcp_server(self, url, api_key=None):
        """
        Connects to an MCP server, performs initialize + tools/list,
        handles SSE responses, decodes the result, and returns clean structured data.
        """
        url = url.rstrip("/")
        if not url.endswith('/mcp'):
            url += '/mcp'
        headers = {
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream"
        }
        if api_key:
            headers["Authorization"] = f"Bearer {api_key}"

        # Step 1: Initialize session
        init_payload = {
            "jsonrpc": "2.0",
            "id": 99,
            "method": "initialize",
            "params": {
                "protocolVersion": "2024-11-05",
                "capabilities": {},
                "clientInfo": {"name": "flask-mcp-scanner", "version": "1.0"}
            }
        }
        init_response = requests.post(url, headers=headers, json=init_payload, timeout=15)
        if init_response.status_code != 200:
            return {"success": False, "error": f"Initialize failed: HTTP {init_response.status_code}"}

        server_name = ""
        try:
            init_text = init_response.text.strip()
            if "event:" in init_text:
                # SSE format
                json_rpc = self.parse_sse_mcp_response(init_text)
            else:
                json_rpc = init_response.json()

            if isinstance(json_rpc, dict):
                result = json_rpc.get("result", {})
                server_name = result.get("serverInfo", {}).get("name", "") or \
                              result.get("name", "")
        except Exception:
            pass  # fallback to empty name (will use URL hostname later)

        session_id = init_response.headers.get("Mcp-Session-Id") or \
                     init_response.headers.get("mcp-session-id")
        if not session_id:
            return {"success": False, "error": "No Mcp-Session-Id received from server."}

        # Step 2: Get tools list
        tools_headers = headers.copy()
        tools_headers["Mcp-Session-Id"] = session_id
        tools_payload = {
            "jsonrpc": "2.0",
            "id": 1,
            "method": "tools/list",
            "params": {}
        }
        tools_response = requests.post(url, headers=tools_headers, json=tools_payload, timeout=15)
        if tools_response.status_code != 200:
            return {"success": False, "error": f"Tools/list failed: HTTP {tools_response.status_code}"}

        # Step 3: Handle SSE or normal JSON
        raw_text = tools_response.text
        json_rpc = self.parse_sse_mcp_response(raw_text) if "event:" in raw_text else tools_response.json()

        if isinstance(json_rpc, dict) and "success" in json_rpc and not json_rpc.get("success"):
            return json_rpc

        # Step 4: Decode and return clean result
        retset =  self.decode_mcp_tools_list(json_rpc)

        if isinstance(retset, dict) and retset.get("success"):
            retset["server_name"] = server_name
        return retset


# ====================== LLM SERVER MANAGER ======================
class AsyncLLMServerManager(BaseManager):
    def __init__(self):
        super().__init__("llm_servers_state.pkl")
        self.load_state()
    async def __aenter__(self):
        for server in self.items:
            if not server.get("client"):
                client = AsyncLLMClient(server["api_key"], server["base_url"])
                server["client"] = client
                await client.__aenter__()
        if self._save_task is None:
            self._save_task = asyncio.create_task(self._autosave_loop())
        return self
    async def __aexit__(self, *args):
        if self._save_task:
            self._save_task.cancel()
        self.save_state()
        for server in self.items:
            if server.get("client"):
                await server["client"].__aexit__(None, None, None)
    async def _autosave_loop(self):
        while True:
            await asyncio.sleep(30)
            self.save_state()
    def list_llm_servers(self):
        result = []
        for server in self.items:
            client = server.get("client")
            active = sum(1 for r in (client.requests.values() if client else {}) if r.get("status") == "monitoring")
            result.append({
                "server_id": server.get("server_id"),
                "name": server["name"],
                "base_url": server["base_url"],
                "default_model": server.get("default_model", "gpt-4o-mini"),
                "active_requests": active
            })
        return result
    async def update_llm_server(self, server_id, **kwargs):
        server = next((s for s in self.items if s["server_id"] == server_id), None)
        if not server:
            return None
        for key in ("name", "default_model"):
            if key in kwargs and kwargs[key] is not None:
                server[key] = kwargs[key]
        if "base_url" in kwargs or "api_key" in kwargs:
            if server.get("client"):
                await server["client"].__aexit__(None, None, None)
            new_client = AsyncLLMClient(kwargs.get("api_key", server["api_key"]),
                                        kwargs.get("base_url", server["base_url"]))
            server["base_url"] = kwargs.get("base_url", server["base_url"])
            server["api_key"] = kwargs.get("api_key", server["api_key"])
            server["client"] = new_client
            await new_client.__aenter__()
        self._save_after_change()
        return server
    async def delete_llm_server(self, server_id):
        for i, server in enumerate(self.items):
            if server["server_id"] == server_id:
                if server.get("client"):
                    await server["client"].__aexit__(None, None, None)
                del self.items[i]
                self._save_after_change()
                return True
        return False
    async def create_llm_server(self, name, base_url, api_key, default_model="gpt-4o-mini", description=""):
        server_id = str(uuid.uuid4())[:8]
        client = AsyncLLMClient(api_key, base_url)
        server = {"server_id": server_id, "name": name, "description": description,
                  "base_url": base_url, "api_key": api_key, "default_model": default_model, "client": client}
        self.items.append(server)
        await client.__aenter__()
        self._save_after_change()
        return server
    async def distribute_prompts(self, prompts, server_ids, mcp_agent_ids=None, mcp_items=None, mode="ALL_GET_ALL",
                                 **kwargs):
        results = []
        max_tokens = kwargs.get("max_tokens", 1024)
        temperature = kwargs.get("temperature", 0.7)
        model = kwargs.get("model", "gpt-4o-mini")

        tool_context = ""
        if mcp_agent_ids and mcp_items:
            tool_names = []
            for agent in mcp_items:
                if agent.get("agent_id") in mcp_agent_ids:
                    tool_names.extend(agent.get("enabled_tools", []))
            if tool_names:
                tool_context = f"You have access to the following MCP agentic tools: {', '.join(tool_names)}.\nUse them when appropriate to solve the task.\n\n"

        # Note: `mode` is accepted but not consulted here; every prompt goes to every selected server
        for prompt in prompts:
            augmented_prompt = tool_context + prompt
            for sid in server_ids:
                try:
                    server = next((s for s in self.items if s["server_id"] == sid), None)
                    if not server or not server.get("client"):
                        continue
                    await server["client"].send_request(augmented_prompt,
                                                        max_tokens=max_tokens,
                                                        temperature=temperature,
                                                        model=model)
                    results.append(f"Sent to LLM server {sid}")
                except Exception as e:
                    results.append(f"Error sending to server {sid}: {e}")
        return {"status": "success", "details": results}
# ====================== MAIN MANAGER ======================
class AsyncMultiClientManager:
    def __init__(self):
        self.mcp_manager = AsyncMCPAgentManager()
        self.llm_manager = AsyncLLMServerManager()

    async def __aenter__(self):
        await self.mcp_manager.__aenter__()
        await self.llm_manager.__aenter__()
        return self

    async def __aexit__(self, *args):
        await self.mcp_manager.__aexit__(*args)
        await self.llm_manager.__aexit__(*args)
# ====================== FLASK APPLICATION ======================
async def main():
    manager = AsyncMultiClientManager()
    async with manager:
        app = Flask(__name__, template_folder='templates')
        app.config['SEND_FILE_MAX_AGE_DEFAULT'] = 0
        app.manager = manager
        app.loop = asyncio.get_running_loop()

        @app.route('/')
        def dashboard():
            return render_template('dashboard.html')

        @app.route('/api/status')
        def api_status():
            data = {
                "mcp_agents": manager.mcp_manager.list_mcp_agents(),
                "llm_servers": manager.llm_manager.list_llm_servers(),
                "scheduled_tasks": manager.mcp_manager.scheduled_tasks
            }

            # Active requests per client (MCP Agents + LLM Servers) – full prompt + live response
            for agent in manager.mcp_manager.items:
                key = f"{agent['name']} ({agent['agent_id']})"
                client = agent.get("client")
                if client:
                    data[key] = {
                        "active_requests": sum(1 for r in client.requests.values() if r["status"] == "monitoring"),
                        "requests": [{
                            "request_id": rid,
                            "prompt_preview": r.get("prompt", ""),
                            "full_response": r.get("full_response", ""),
                            "status": r["status"],
                            "total_bytes": r["total_bytes"],
                            "estimated_tokens": r.get("estimated_tokens", 0),
                            "duration": round((r.get("end_time") or time.time()) - (r.get("start_time") or time.time()), 2)
                        } for rid, r in client.requests.items()]
                    }

            for server in manager.llm_manager.items:
                key = f"{server['name']} ({server['server_id']})"
                client = server.get("client")
                if client:
                    data[key] = {
                        "active_requests": sum(1 for r in client.requests.values() if r["status"] == "monitoring"),
                        "requests": [{
                            "request_id": rid,
                            "prompt_preview": r.get("prompt", ""),
                            "full_response": r.get("full_response", ""),
                            "status": r["status"],
                            "total_bytes": r["total_bytes"],
                            "estimated_tokens": r.get("estimated_tokens", 0),
                            "duration": round((r.get("end_time") or time.time()) - (r.get("start_time") or time.time()), 2)
                        } for rid, r in client.requests.items()]
                    }

            return jsonify(data)

        @app.route('/api/mcp_agents', methods=['GET', 'POST'])
        def api_mcp_agents():
            if request.method == 'GET':
                return jsonify({"mcp_agents": manager.mcp_manager.list_mcp_agents()})
            data = request.get_json() or {}
            future = asyncio.run_coroutine_threadsafe(
                manager.mcp_manager.create_mcp_agent(
                    name=data.get("name"),
                    base_url=data.get("base_url"),
                    api_key=data.get("api_key", ""),
                    description=data.get("description", ""),
                    enabled_tools=data.get("enabled_tools")
                ), app.loop)
            agent = future.result()
            return jsonify({"status": "created", "agent": manager.mcp_manager._get_serializable()[-1]}), 201

        @app.route('/api/mcp_agents/<agent_id>', methods=['PUT', 'DELETE'])
        def api_mcp_agent(agent_id):
            if request.method == 'DELETE':
                success = asyncio.run_coroutine_threadsafe(
                    manager.mcp_manager.delete_mcp_agent(agent_id), app.loop).result()
                return jsonify({"status": "deleted" if success else "failed"})

            # === PUT: Update existing agent ===
            data = request.get_json() or {}
            future = asyncio.run_coroutine_threadsafe(
                manager.mcp_manager.update_mcp_agent(agent_id, **data), app.loop)
            updated_agent = future.result()

            if updated_agent is None:
                return jsonify({"status": "failed", "error": "Agent not found"}), 404

            # Return ONLY serializable data (matches list_mcp_agents format)
            return jsonify({
                "status": "updated",
                "agent": {
                    "agent_id": updated_agent["agent_id"],
                    "name": updated_agent["name"],
                    "description": updated_agent.get("description", ""),
                    "base_url": updated_agent["base_url"],
                    "enabled": updated_agent.get("enabled", True),
                    "enabled_tools": updated_agent.get("enabled_tools", []),
                    "active_requests": 0,  # will be recalculated on next dashboard refresh
                    "total_requests": 0
                }
            })

        @app.route('/api/llm_servers', methods=['GET', 'POST'])
        def api_llm_servers():
            if request.method == 'GET':
                return jsonify({"llm_servers": manager.llm_manager.list_llm_servers()})
            data = request.get_json() or {}
            future = asyncio.run_coroutine_threadsafe(
                manager.llm_manager.create_llm_server(
                    name=data.get("name"),
                    base_url=data.get("base_url"),
                    api_key=data.get("api_key", ""),
                    default_model=data.get("default_model", "gpt-4o-mini")
                ), app.loop)
            server = future.result()
            return jsonify({"status": "created", "server": manager.llm_manager._get_serializable()[-1]}), 201

        @app.route('/api/llm_servers/<server_id>', methods=['PUT', 'DELETE'])
        def api_llm_server(server_id):
            if request.method == 'DELETE':
                success = asyncio.run_coroutine_threadsafe(
                    manager.llm_manager.delete_llm_server(server_id), app.loop).result()
                return jsonify({"status": "deleted" if success else "failed"})
            data = request.get_json() or {}
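            future = asyncio.run_coroutine_threadsafe(
                manager.llm_manager.update_llm_server(server_id, **data), app.loop)
            # Wait for the update so a missing server is reported to the caller
            if future.result() is None:
                return jsonify({"status": "failed", "error": "Server not found"}), 404
            return jsonify({"status": "updated"})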

        @app.route('/api/distribute_prompts', methods=['POST'])
        def api_distribute_prompts():
            data = request.get_json() or {}
            future = asyncio.run_coroutine_threadsafe(
                manager.llm_manager.distribute_prompts(
                    prompts=data.get("prompts", []),
                    server_ids=data.get("selected_servers", []),
                    mcp_agent_ids=data.get("selected_agents", []),
                    mcp_items=manager.mcp_manager.items,
                    mode=data.get("distribution_mode", "ALL_GET_ALL"),
                    max_tokens=data.get("max_tokens", 256000)
                ), app.loop)
            result = future.result()
            return jsonify(result)

        @app.route('/api/mcp_scan', methods=['POST'])
        def api_mcp_scan():
            data = request.get_json() or {}
            http_address = data.get("http_address")
            api_key = data.get("api_key")
            if not http_address:
                return jsonify({"success": False, "error": "Missing http_address"}), 400

            # Call directly - no asyncio wrapper needed for this synchronous method
            result = manager.mcp_manager.scan_mcp_server(http_address, api_key=api_key)
            return jsonify(result)

        def run_flask():
            app.run(host="0.0.0.0", port=5012, debug=False, use_reloader=False)

        threading.Thread(target=run_flask, daemon=True).start()
        print("✅ Flask dashboard started → http://localhost:5012")
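        # Keep the event loop (and the daemon Flask thread) alive indefinitely;
        # a single one-hour sleep would end the program after 60 minutes.
        await asyncio.Event().wait()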


if __name__ == "__main__":
    asyncio.run(main())
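
The routes above share one pattern: Flask handles each request on its own thread, hands the coroutine to the main asyncio loop with asyncio.run_coroutine_threadsafe, then blocks on future.result(). Here is a minimal, self-contained sketch of that hand-off. The create_server coroutine and the call_from_worker_thread helper are hypothetical stand-ins for illustration, not part of LLMQP, and the timeout is an assumption the code above does not use.

```python
import asyncio
import threading


async def create_server(name: str) -> dict:
    """Hypothetical stand-in for an async manager method."""
    await asyncio.sleep(0.01)  # pretend to do some async I/O
    return {"status": "created", "name": name}


def call_from_worker_thread(loop: asyncio.AbstractEventLoop, name: str,
                            timeout: float = 5.0) -> dict:
    """Submit a coroutine to a loop running in another thread and wait for it.

    The timeout (an assumption, not in the original code) keeps a stuck
    coroutine from blocking the calling thread forever.
    """
    future = asyncio.run_coroutine_threadsafe(create_server(name), loop)
    return future.result(timeout=timeout)


async def demo() -> dict:
    loop = asyncio.get_running_loop()
    results = {}

    def worker():
        # Runs outside the event loop, the way a Flask request handler would.
        results["value"] = call_from_worker_thread(loop, "local-llama")

    thread = threading.Thread(target=worker)
    thread.start()
    # Join in an executor so the loop stays free to run the submitted coroutine;
    # calling thread.join() directly here would deadlock.
    await asyncio.to_thread(thread.join)
    return results["value"]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The important detail is that the loop must keep running while the worker thread waits, which is why main() has to stay alive for as long as the dashboard is serving requests.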

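If you would rather queue prompts from a script than from the dashboard, the /api/distribute_prompts route above accepts a JSON body with the keys prompts, selected_servers, selected_agents, distribution_mode and max_tokens. The sketch below only builds such a request. The helper names and the "server-1" ID are hypothetical placeholders, and splitting on ----- assumes the script should mirror the separator the Prompt Generator textarea asks for.

```python
import json
import urllib.request


def split_prompts(text: str, separator: str = "-----") -> list[str]:
    """Split a block of text into prompts on the separator, dropping empties."""
    return [part.strip() for part in text.split(separator) if part.strip()]


def build_distribute_request(prompts, server_ids, agent_ids=(), mode="ALL_GET_ALL",
                             max_tokens=256000, base_url="http://localhost:5012"):
    """Build (but do not send) a POST request for /api/distribute_prompts."""
    payload = {
        "prompts": list(prompts),
        "selected_servers": list(server_ids),
        "selected_agents": list(agent_ids),
        "distribution_mode": mode,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{base_url}/api/distribute_prompts",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    queue = split_prompts("Write the parser.\n-----\nAdd unit tests.\n-----\n")
    # "server-1" is a placeholder; use an ID returned by GET /api/llm_servers.
    request = build_distribute_request(queue, ["server-1"], mode="DONE_WALK")
    print(request.full_url)
    print(request.data.decode("utf-8"))
    # To actually dispatch the queue: urllib.request.urlopen(request)
```

The shape of the response is not shown here because it depends on what distribute_prompts returns.
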
Next, create a templates folder, add a file named dashboard.html inside it, and paste in the following:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>LLM Queue Dispatcher</title>
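    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css" rel="stylesheet">
    <!-- Bootstrap Icons stylesheet: required for the "bi bi-..." icon classes used on the buttons below -->
    <link href="https://cdn.jsdelivr.net/npm/bootstrap-icons@1.11.3/font/bootstrap-icons.min.css" rel="stylesheet">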
    <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/js/bootstrap.bundle.min.js"></script>
    <script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
    <style>
        body { padding: 30px; font-family: system-ui, -apple-system, sans-serif; background-color: #f8f9fa; }
        .card { margin-bottom: 25px; box-shadow: 0 4px 12px rgba(0,0,0,0.1); }
        .request-card { border-left: 5px solid #0d6efd; margin-bottom: 20px; }
        .status-monitoring { color: #0d6efd; font-weight: bold; animation: pulse 2s infinite; }
        .status-completed { color: #198754; }
        .status-error { color: #dc3545; }
        .thinking-box { background-color: #f8f9fa; border: 1px dashed #0d6efd; border-radius: 8px; padding: 18px; max-height: 420px; overflow-y: auto; font-size: 0.98rem; line-height: 1.6; }
        .progress-bar { transition: width 0.6s ease-in-out; }
        @keyframes pulse { 0%, 100% { opacity: 1; } 50% { opacity: 0.6; } }
        #last-updated { font-size: 0.9rem; color: #6c757d; }
        .countdown { font-family: monospace; font-weight: 700; letter-spacing: 1px; }
        .countdown.urgent { color: #dc3545; animation: pulse 1s infinite; }
        /* Pronounced ringed code blocks – entire block only, no per-line borders */
        .markdown-body pre {
            background-color: #e3f0ff !important;   /* more pronounced cool blue-gray */
            border: 3px solid #0d6efd !important;   /* strong blue ring */
            border-radius: 8px;
            padding: 16px !important;
            margin: 12px 0;
            overflow-x: auto;
        }
        .markdown-body pre code {
            background-color: transparent !important;
            border: none !important;
            padding: 0;
        }
    </style>
</head>
<body>
    <div class="container">
        <div class="d-flex justify-content-between align-items-center mb-4">
            <h1 class="mb-0">🤖 LLM Queue Dispatcher</h1>
            <div><span id="last-updated" class="text-muted">Last updated: just now</span></div>
        </div>
        <ul class="nav nav-tabs mb-4" role="tablist">
            <li class="nav-item"><button class="nav-link active" data-bs-toggle="tab" data-bs-target="#active-prompts">Active Prompts</button></li>
            <li class="nav-item"><button class="nav-link" data-bs-toggle="tab" data-bs-target="#done-walk-queue">DONE WALK QUEUE</button></li>
            <li class="nav-item"><button class="nav-link" data-bs-toggle="tab" data-bs-target="#mcp-agents">🔧 MCP Agents</button></li>
            <li class="nav-item"><button class="nav-link" data-bs-toggle="tab" data-bs-target="#llm-servers">🌐 LLM Servers</button></li>
            <li class="nav-item"><button class="nav-link" data-bs-toggle="tab" data-bs-target="#prompt-generator">PROMPT GENERATOR</button></li>
        </ul>

        <div class="tab-content">
            <!-- Active Prompts -->
            <div class="tab-pane fade show active" id="active-prompts">
                <div id="active-content">
                    <div class="text-center py-5">
                        <div class="spinner-border text-primary" role="status"></div>
                        <p class="mt-3 text-muted">Loading active prompts...</p>
                    </div>
                </div>
            </div>

            <!-- DONE WALK QUEUE -->
            <div class="tab-pane fade" id="done-walk-queue">
                <div id="done-walk-content">
                    <div class="text-center py-5">
                        <div class="spinner-border text-primary" role="status"></div>
                        <p class="mt-3 text-muted">Loading Done-Walk queue...</p>
                    </div>
                </div>
            </div>

            <!-- MCP Agents -->
            <div class="tab-pane fade" id="mcp-agents">
                <div class="card">
                    <div class="card-header bg-white border-0 d-flex justify-content-between align-items-center">
                        <h5 class="mb-0"><span class="badge bg-primary me-2">🔧</span>MCP Agents</h5>
                        <button class="btn btn-success" onclick="showAddAgentModal()"><i class="bi bi-plus-circle"></i> Add New MCP Agent</button>
                    </div>
                    <div class="card-body p-0">
                        <div class="table-responsive">
                            <table class="table table-hover mb-0" id="agents-table">
                                <thead class="table-light">
                                    <tr>
                                        <th>Name</th><th>Description</th><th>Base URL</th>
                                        <th>Enabled Tools</th><th class="text-end">Actions</th>
                                    </tr>
                                </thead>
                                <tbody id="agents-tbody"></tbody>
                            </table>
                        </div>
                    </div>
                </div>
            </div>

            <!-- LLM Servers -->
            <div class="tab-pane fade" id="llm-servers">
                <div class="card">
                    <div class="card-header bg-white border-0 d-flex justify-content-between align-items-center">
                        <h5 class="mb-0"><span class="badge bg-success me-2">🌐</span>LLM Servers</h5>
                        <button class="btn btn-success" onclick="showAddServerModal()"><i class="bi bi-plus-circle"></i> Add New LLM Server</button>
                    </div>
                    <div class="card-body p-0">
                        <div class="table-responsive">
                            <table class="table table-hover mb-0" id="servers-table">
                                <thead class="table-light">
                                    <tr>
                                        <th>Name</th><th>Base URL</th><th>Default Model</th>
                                        <th>Active Requests</th><th class="text-end">Actions</th>
                                    </tr>
                                </thead>
                                <tbody id="servers-tbody"></tbody>
                            </table>
                        </div>
                    </div>
                </div>
            </div>

            <!-- Prompt Generator -->
            <div class="tab-pane fade" id="prompt-generator">
                <div class="card-body">
                    <div class="mt-4">
                        <label class="form-label fw-bold">Generated Prompts (separate prompts, single-line or multi-line, with -----)</label>
                        <textarea id="generated-prompts" class="form-control" rows="10" style="font-family: monospace;"></textarea>
                    </div>

                    <div class="mt-4 border-top pt-4">
                        <h6 class="mb-3">Select MCP Agents <small class="text-muted">(Optional - Support Tools)</small></h6>
                        <div id="agent-selection" class="row"></div>

                        <h6 class="mb-3 mt-4">Select LLM Servers <span class="text-danger">*</span> <small class="text-muted">(Required)</small></h6>
                        <div id="server-selection" class="row"></div>

                        <div class="mt-4">
                            <label class="form-label fw-bold">Distribution Mode</label>
                            <div class="btn-group w-100" role="group">
                                <input type="radio" class="btn-check" name="dist-mode" id="mode-all" value="ALL_GET_ALL" checked>
                                <label class="btn btn-outline-primary" for="mode-all">All Selected Get All Prompts</label>
                                <input type="radio" class="btn-check" name="dist-mode" id="mode-donewalk" value="DONE_WALK">
                                <label class="btn btn-outline-primary" for="mode-donewalk">Done-Walk (Sequential)</label>
                            </div>
                        </div>

                        <button onclick="applyPrompts()" class="btn btn-success mt-4 w-100">Apply &amp; Distribute Prompts</button>
                    </div>
                </div>
            </div>
        </div>
    </div>
    <!-- ==================== UPDATED MCP AGENT MODAL ==================== -->
    <div class="modal fade" id="agentModal" tabindex="-1">
        <div class="modal-dialog modal-lg">
            <div class="modal-content">
                <div class="modal-header">
                    <h5 class="modal-title" id="agentModalLabel">Add New MCP Agent</h5>
                    <button type="button" class="btn-close" data-bs-dismiss="modal"></button>
                </div>
                <div class="modal-body">
                    <form id="agentForm">
                        <input type="hidden" id="agent-id" name="id">

                        <div class="row">
                            <div class="col-md-6">
                                <div class="mb-3">
                                    <label class="form-label">Name <span class="text-danger">*</span></label>
                                    <input type="text" id="agent-name" name="name" class="form-control" required>
                                </div>
                            </div>
                            <div class="col-md-6">
                                <div class="mb-3">
                                    <label class="form-label">Base URL <span class="text-danger">*</span></label>
                                    <input type="text" id="agent-baseurl" name="baseurl" class="form-control"
                                           placeholder="http://192.168.1.3:5000/mcp" required>
                                </div>
                            </div>
                        </div>

                        <!-- Description – now with better guidance -->
                        <div class="mb-3">
                            <label class="form-label">Description</label>
                            <textarea id="agent-description" name="description" class="form-control" rows="4"
                                      placeholder="Enter agent description (spaces and newlines are fully supported)"></textarea>
                            <small class="text-muted">Spaces, punctuation, and multi-line text are preserved.</small>
                        </div>
                        <div class="mb-3">
                            <label class="form-label">API Key</label>
                            <input type="password" id="agent-apikey" name="apikey" class="form-control">
                        </div>
                        <!-- Tools Section -->
                        <div class="mb-3">
                            <div class="d-flex justify-content-between align-items-center mb-2">
                                <label class="form-label mb-0">Enabled Tools</label>
                                <button type="button" class="btn btn-sm btn-info" onclick="scanMCP()">
                                    <i class="bi bi-broadcast"></i> SCAN MCP Tools
                                </button>
                            </div>
                            <select id="agent-tools-select" name="tools[]" class="form-select" multiple size="8"
                                    style="max-height: 280px;">
                                <!-- Populated by scanMCP() -->
                            </select>
                            <small class="text-muted">Hold Ctrl (Windows) or Cmd (Mac) to select multiple tools.<br>
                            Full tool descriptions appear on hover.</small>
                        </div>
                        <!-- Legacy text field (kept for backward compatibility) -->
                        <div class="mb-3">
                            <label class="form-label">Enabled Tools (Text) – Legacy</label>
                            <input type="text" id="agent-tools" name="tools_text" class="form-control"
                                   placeholder="add, subtract, evaluate_math">
                            <small class="text-muted">Comma-separated list (spaces around commas are ignored).</small>
                        </div>
                    </form>
                </div>
                <div class="modal-footer">
                    <button type="button" class="btn btn-secondary" data-bs-dismiss="modal">Cancel</button>
                    <button type="button" class="btn btn-primary" onclick="saveAgent()">
                        <i class="bi bi-save"></i> Save Agent
                    </button>
                </div>
            </div>
        </div>
    </div>

    <!-- LLM Server Modal (kept for completeness) -->
    <div class="modal fade" id="serverModal" tabindex="-1">
        <div class="modal-dialog modal-lg">
            <div class="modal-content">
                <div class="modal-header">
                    <h5 class="modal-title" id="serverModalLabel">Add New LLM Server</h5>
                    <button type="button" class="btn-close" data-bs-dismiss="modal"></button>
                </div>
                <div class="modal-body">
                    <form id="serverForm">
                        <input type="hidden" id="server-id">
                        <div class="mb-3"><label class="form-label">Server Name</label><input type="text" id="server-name" class="form-control" required></div>
                        <div class="mb-3"><label class="form-label">Base URL</label><input type="text" id="server-baseurl" class="form-control" required></div>
                        <div class="mb-3"><label class="form-label">API Key</label><input type="password" id="server-apikey" class="form-control"></div>
                        <div class="mb-3"><label class="form-label">Default Model</label><input type="text" id="server-model" class="form-control" value="gpt-4o-mini"></div>
                    </form>
                </div>
                <div class="modal-footer">
                    <button type="button" class="btn btn-secondary" data-bs-dismiss="modal">Cancel</button>
                    <button type="button" class="btn btn-primary" onclick="saveServer()">Save Server</button>
                </div>
            </div>
        </div>
    </div>

    <script>

        let currentAgents = [];
        let currentServers = [];
        let editingAgentId = null;
        let editingServerId = null;

        // NEW: Remember which MCP agents the user has selected/deselected in Prompt Generator
        let savedSelectedAgentIds = new Set();
        let savedSelectedServerIds = new Set();

        // Done-Walk sequential queue
        let doneWalkQueue = [];
        let isDoneWalkRunning = false;

        // ====================== MCP SCAN (ALL tools enabled by default) =======================
        // Shared handler, defined once at script scope. addEventListener ignores a
        // listener reference that is already registered, so reopening the modal
        // never stacks duplicate handlers.
        function handleBaseUrlEnter(e) {
            if (e.key === 'Enter') {
                e.preventDefault();   // prevent form submission
                scanMCP();            // same function used by the SCAN button
            }
        }

        function showAddAgentModal() {
            editingAgentId = null;
            document.getElementById('agentModalLabel').textContent = 'Add New MCP Agent';
            document.getElementById('agentForm').reset();
            document.getElementById('agent-tools-select').innerHTML =
                '<option disabled>Click SCAN MCP Tools or press Enter in the Base URL field...</option>';

            bootstrap.Modal.getOrCreateInstance(document.getElementById('agentModal')).show();

            // Auto-trigger scan when user presses Enter in the Base URL field
            document.getElementById('agent-baseurl').addEventListener('keydown', handleBaseUrlEnter);
        }

        // ====================== UPDATED: scanMCP (full JSON diagnostic + smart name extraction) =======================
        async function scanMCP() {
            const baseUrl = document.getElementById('agent-baseurl').value.trim();
            const apiKey = document.getElementById('agent-apikey').value.trim();
            if (!baseUrl) return alert("Please enter a Base URL first.");

            const scanBtn = document.querySelector('#agentModal .btn-info');
            const originalHTML = scanBtn.innerHTML;
            scanBtn.disabled = true;
            scanBtn.innerHTML = `<span class="spinner-border spinner-border-sm"></span> Scanning...`;



            try {
                const res = await fetch('/api/mcp_scan', {
                    method: 'POST',
                    headers: {'Content-Type': 'application/json'},
                    body: JSON.stringify({ http_address: baseUrl, api_key: apiKey || null })
                });

                const result = await res.json();


                if (!result.success) {
                    console.error('❌ [scanMCP] Backend reported error:', result.error);
                    return alert(`❌ ${result.error || 'Scan failed'}`);
                }

                const tools = result.tools || [];

                // ── SMART NAME EXTRACTION (tries many common MCP fields) ──
                let serverName = '';

                // Direct fields the backend might return
                if (result.server_name) serverName = result.server_name;
                else if (result.name) serverName = result.name;
                else if (result.serverInfo && result.serverInfo.name) serverName = result.serverInfo.name;
                else if (result.result && result.result.name) serverName = result.result.name;
                else if (result.result && result.result.server_name) serverName = result.result.server_name;

                // Fallback: clean hostname from URL
                if (!serverName) {
                    try {
                        const urlObj = new URL(baseUrl.startsWith('http') ? baseUrl : 'http://' + baseUrl);
                        serverName = urlObj.hostname.replace(/^www\./, '').toUpperCase() + ' MCP';
                    } catch (e) {
                        serverName = 'MCP Agent';
                    }
                }

                // Auto-fill the Name field
                document.getElementById('agent-name').value = serverName;

                // Populate tools dropdown
                const select = document.getElementById('agent-tools-select');
                select.innerHTML = '';

                tools.forEach(tool => {
                    const name = tool.name || tool.tool_name || '';
                    if (!name) return;

                    const opt = document.createElement('option');
                    opt.value = name;
                    opt.textContent = name;
                    opt.title = tool.description
                        ? tool.description.substring(0, 300) + (tool.description.length > 300 ? '...' : '')
                        : 'No description provided.';
                    opt.selected = true;
                    select.appendChild(opt);
                });

                const toolNames = tools.map(t => t.name || t.tool_name || '').filter(Boolean);
                document.getElementById('agent-tools').value = toolNames.join(', ');

                alert(`✅ Success! Discovered ${toolNames.length} tools.\n\nName field auto-filled with: "${serverName}" `);

            } catch (e) {
                console.error('🚨 [scanMCP] Exception:', e);
                alert("Failed to reach backend scan service.");
            } finally {
                scanBtn.disabled = false;
                scanBtn.innerHTML = originalHTML;
            }
        }

        // ====================== loadMCPTools (kept clean for Edit mode) =======================
        async function loadMCPTools(selectElement, baseUrl) {
            if (!selectElement || !baseUrl) return;
            selectElement.innerHTML = '<option disabled>Loading tools...</option>';

            try {
                const apiKey = document.getElementById('agent-apikey')?.value.trim() || null;

                const response = await fetch('/api/mcp_scan', {
                    method: 'POST',
                    headers: { 'Content-Type': 'application/json' },
                    body: JSON.stringify({ http_address: baseUrl, api_key: apiKey })
                });

                if (!response.ok) throw new Error(`HTTP ${response.status}`);

                const result = await response.json();
                selectElement.innerHTML = '';

                if (result.success && Array.isArray(result.tools) && result.tools.length > 0) {
                    result.tools.forEach(tool => {
                        const name = tool.name || tool.tool_name || '';
                        if (!name) return;
                        const opt = document.createElement('option');
                        opt.value = name;
                        opt.textContent = name;
                        opt.title = tool.description
                            ? tool.description.substring(0, 300) + (tool.description.length > 300 ? '...' : '')
                            : 'No description provided.';
                        selectElement.appendChild(opt);
                    });
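                } else {
                    // Build the option via textContent so an error string from a
                    // remote server cannot inject markup into the page
                    const opt = document.createElement('option');
                    opt.disabled = true;
                    opt.textContent = result.error ? String(result.error).substring(0, 60) : 'No tools found';
                    selectElement.appendChild(opt);
                }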
            } catch (err) {
                console.error(err);
                selectElement.innerHTML = '<option disabled>Failed to load tools</option>';
            }
        }
        async function saveAgent() {
            const select = document.getElementById('agent-tools-select');
            const enabledTools = Array.from(select.selectedOptions).map(opt => opt.value);

            const payload = {
                name: document.getElementById('agent-name').value.trim(),
                description: document.getElementById('agent-description').value.trim(),
                base_url: document.getElementById('agent-baseurl').value.trim(),
                api_key: document.getElementById('agent-apikey').value.trim(),
                enabled_tools: enabledTools
            };

            if (!payload.name || !payload.base_url) {
                return alert("Name and Base URL are required.");
            }

            let url = '/api/mcp_agents';
            let method = 'POST';
            if (editingAgentId) {
                url = `/api/mcp_agents/${editingAgentId}`;
                method = 'PUT';
            }

            try {
                const res = await fetch(url, {
                    method: method,
                    headers: { 'Content-Type': 'application/json' },
                    body: JSON.stringify(payload)
                });

                if (res.ok) {
                    bootstrap.Modal.getInstance(document.getElementById('agentModal')).hide();
                    editingAgentId = null;
                    updateDashboard();
                } else {
                    const errorText = await res.text().catch(() => 'Unknown error');
                    alert(`Failed to save agent: ${res.status} - ${errorText}`);
                }
            } catch (e) {
                console.error(e);
                alert("Error saving agent.");
            }
        }
        function editAgent(id) {
            const agent = currentAgents.find(a => a.agent_id === id);
            if (!agent) return;

            editingAgentId = id;
            document.getElementById('agentModalLabel').textContent = 'Edit MCP Agent';
            document.getElementById('agent-id').value = agent.agent_id || '';
            document.getElementById('agent-name').value = agent.name || '';
            document.getElementById('agent-description').value = agent.description || '';
            document.getElementById('agent-baseurl').value = agent.base_url || '';
            document.getElementById('agent-apikey').value = agent.api_key || '';

            const modal = new bootstrap.Modal(document.getElementById('agentModal'));
            modal.show();

            const toolsSelect = document.getElementById('agent-tools-select');
            if (agent.base_url) {
                loadMCPTools(toolsSelect, agent.base_url).then(() => {
                    const enabledSet = new Set((agent.enabled_tools || []));
                    Array.from(toolsSelect.options).forEach(opt => {
                        opt.selected = enabledSet.has(opt.value);
                    });
                });
            }
        }
        function renderAgentsTable() {
            const tbody = document.getElementById('agents-tbody');
            tbody.innerHTML = currentAgents.length === 0 ?
                `<tr><td colspan="5" class="text-center py-4 text-muted">No MCP agents yet.</td></tr>` : '';

            currentAgents.forEach(agent => {
                const count = (agent.enabled_tools || []).length;
                const toolList = (agent.enabled_tools || []).join(', ') || 'None';
                const row = document.createElement('tr');
                // Static markup only; agent-supplied values are filled in below via
                // textContent so names or descriptions containing HTML are shown as text.
                row.innerHTML = `
                    <td><strong></strong></td>
                    <td style="white-space: pre-wrap; word-break: break-word;"></td>
                    <td><code></code></td>
                    <td><span class="badge bg-primary">${count} tool${count !== 1 ? 's' : ''}</span></td>
                    <td class="text-end">
                        <button class="btn btn-sm btn-outline-primary me-1">Edit</button>
                        <button class="btn btn-sm btn-outline-danger">Delete</button>
                    </td>`;
                row.querySelector('strong').textContent = agent.name;
                row.cells[1].textContent = agent.description || '—';
                row.querySelector('code').textContent = agent.base_url;
                row.querySelector('.badge').title = toolList;
                row.querySelector('.btn-outline-primary').addEventListener('click', () => editAgent(agent.agent_id));
                row.querySelector('.btn-outline-danger').addEventListener('click', () => deleteAgent(agent.agent_id));
                tbody.appendChild(row);
            });
        }
        async function loadMCPTools(selectElement, baseUrl) {
            if (!selectElement || !baseUrl) return;

            selectElement.innerHTML = '<option disabled>Loading tools...</option>';

            try {
                const apiKey = document.getElementById('agent-apikey')?.value.trim() || null;

                const response = await fetch('/api/mcp_scan', {
                    method: 'POST',
                    headers: { 'Content-Type': 'application/json' },
                    body: JSON.stringify({
                        http_address: baseUrl,
                        api_key: apiKey
                    })
                });

                if (!response.ok) throw new Error(`HTTP ${response.status}`);

                const result = await response.json();

                selectElement.innerHTML = '';

                if (result.success && Array.isArray(result.tools) && result.tools.length > 0) {
                    result.tools.forEach(tool => {
                        const toolName = tool.name || tool.tool_name || tool || '';
                        if (!toolName) return;

                        const opt = document.createElement('option');
                        opt.value = toolName;
                        opt.textContent = toolName;
                        selectElement.appendChild(opt);
                    });
                } else {
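                    const msg = result.error ? String(result.error).substring(0, 60) : 'No tools found';
                    const placeholder = document.createElement('option');
                    placeholder.disabled = true;
                    placeholder.textContent = msg;   // textContent keeps HTML in the error string from being injected
                    selectElement.appendChild(placeholder);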
                }
            } catch (err) {
                console.error(err);
                selectElement.innerHTML = '<option disabled>Failed to load tools</option>';
            }
        }
        function renderServersTable() {
            const tbody = document.getElementById('servers-tbody');
            tbody.innerHTML = currentServers.length === 0 ?
                `<tr><td colspan="5" class="text-center py-4 text-muted">No LLM servers yet.</td></tr>` : '';

            currentServers.forEach(server => {
                const row = document.createElement('tr');
                row.innerHTML = `
                    <td><strong>${server.name}</strong></td>
                    <td><code>${server.base_url}</code></td>
                    <td>${server.default_model || '—'}</td>
                    <td>${server.active_requests || 0}</td>
                    <td class="text-end">
                        <button class="btn btn-sm btn-outline-primary me-1" onclick="editServer('${server.server_id}')">Edit</button>
                        <button class="btn btn-sm btn-outline-danger" onclick="deleteServer('${server.server_id}')">Delete</button>
                    </td>`;
                tbody.appendChild(row);
            });
        }
        async function deleteAgent(id) {
            if (!confirm("Delete this MCP agent permanently?")) return;
            try {
                const res = await fetch(`/api/mcp_agents/${id}`, { method: 'DELETE' });
                if (res.ok) updateDashboard();
            } catch (e) { console.error(e); }
        }
        // ====================== LLM SERVER CRUD ======================
        function showAddServerModal() {
            editingServerId = null;
            document.getElementById('serverModalLabel').textContent = 'Add New LLM Server';
            document.getElementById('serverForm').reset();
            new bootstrap.Modal(document.getElementById('serverModal')).show();
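            // These are the MCP agent form fields; attach the blur handler only once,
            // otherwise every call to this function stacks another listener
            const baseUrlInput = document.getElementById('agent-baseurl');
            const toolsSelect = document.getElementById('agent-tools-select');

            if (baseUrlInput && toolsSelect && !baseUrlInput.dataset.toolScanBound) {
                baseUrlInput.dataset.toolScanBound = 'true';
                baseUrlInput.addEventListener('blur', async () => {
                    const url = baseUrlInput.value.trim();
                    if (url) {
                        await loadMCPTools(toolsSelect, url);
                    }
                });
            }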
        }
        function editServer(id) {
            const server = currentServers.find(s => s.server_id === id);
            if (!server) return;
            editingServerId = id;
            document.getElementById('serverModalLabel').textContent = 'Edit LLM Server';
            document.getElementById('server-id').value = server.server_id;
            document.getElementById('server-name').value = server.name || '';
            document.getElementById('server-baseurl').value = server.base_url || '';
            document.getElementById('server-apikey').value = server.api_key || '';
            document.getElementById('server-model').value = server.default_model || 'gpt-4o-mini';
            new bootstrap.Modal(document.getElementById('serverModal')).show();
        }
        async function saveServer() {
            const payload = {
                name: document.getElementById('server-name').value.trim(),
                base_url: document.getElementById('server-baseurl').value.trim(),
                api_key: document.getElementById('server-apikey').value.trim(),
                default_model: document.getElementById('server-model').value.trim()
            };

            if (!payload.name || !payload.base_url) return alert("Name and Base URL required.");

            const url = editingServerId ? `/api/llm_servers/${editingServerId}` : '/api/llm_servers';
            const method = editingServerId ? 'PUT' : 'POST';

            try {
                const res = await fetch(url, {
                    method: method,
                    headers: { 'Content-Type': 'application/json' },
                    body: JSON.stringify(payload)
                });
                if (res.ok) {
                    bootstrap.Modal.getInstance(document.getElementById('serverModal')).hide();
                    updateDashboard();
                } else alert("Failed to save server.");
            } catch (e) {
                console.error(e);   // keep the underlying error visible for debugging
                alert("Error saving server.");
            }
        }
        async function deleteServer(id) {
            if (!confirm("Delete this LLM server permanently?")) return;
            try {
                const res = await fetch(`/api/llm_servers/${id}`, { method: 'DELETE' });
                if (res.ok) updateDashboard();
            } catch (e) { console.error(e); }
        }
        // Helper to generate a unique 10-character trigger (used by applyPrompts)
        function generate10CharTrigger() {
            const chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
            let result = '';
            for (let i = 0; i < 10; i++) {
                result += chars.charAt(Math.floor(Math.random() * chars.length));
            }
            return result;
        }

        // ====================== INITIALIZATION ======================
        function formatCountdown(seconds) {
            if (seconds <= 0) return "Now";
            const h = Math.floor(seconds / 3600);
            const m = Math.floor((seconds % 3600) / 60);
            const s = Math.floor(seconds % 60);
            return h > 0 ? `${h}h ${m}m` : `${m}m ${s}s`;
        }
        function generatePrompts() {
            const base = document.getElementById('base-prompt').value.trim();
            const c1 = document.getElementById('col1').value.split(',').map(s => s.trim()).filter(Boolean);
            const c2 = document.getElementById('col2').value.split(',').map(s => s.trim()).filter(Boolean);
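            let out = '';
            if (c1.length && c2.length) {
                // Join with the same "-----" separator that applyPrompts() splits on, so each
                // combination is queued as its own prompt; replaceAll fills every placeholder
                const combos = [];
                c1.forEach(a => c2.forEach(b =>
                    combos.push(base.replaceAll('{col1}', () => a).replaceAll('{col2}', () => b))));
                out = combos.join('\n-----\n');
            } else out = base;
            document.getElementById('generated-prompts').value = out.trim();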
        }
        function renderAgentSelection() {
            const container = document.getElementById('agent-selection');

            // Rebuild the UI using the saved state as the single source of truth
            container.innerHTML = currentAgents.length === 0 ?
                '<div class="col-12 text-muted">No MCP agents available.</div>' : '';

            currentAgents.forEach(agent => {
                const isChecked = savedSelectedAgentIds.has(agent.agent_id);

                const div = document.createElement('div');
                div.className = 'col-md-4 mb-2';
                div.innerHTML = `<div class="form-check">
                    <input class="form-check-input agent-checkbox" type="checkbox"
                           id="ag-${agent.agent_id}" value="${agent.agent_id}"
                           ${isChecked ? 'checked' : ''}>
                    <label class="form-check-label" for="ag-${agent.agent_id}">${agent.name}</label>
                </div>`;
                container.appendChild(div);
            });

            // Attach change listeners so user deselections are immediately saved
            container.querySelectorAll('.agent-checkbox').forEach(cb => {
                cb.addEventListener('change', function () {
                    if (this.checked) {
                        savedSelectedAgentIds.add(this.value);
                    } else {
                        savedSelectedAgentIds.delete(this.value);
                    }
                });
            });
        }
        function renderServerSelection() {
            const container = document.getElementById('server-selection');

            // Capture current user selections before rebuild
            const currentlyChecked = new Set();
            container.querySelectorAll('.server-checkbox:checked').forEach(cb => {
                currentlyChecked.add(cb.value);
            });

            // Update persistent state
            if (currentlyChecked.size > 0) {
                savedSelectedServerIds = currentlyChecked;
            }

            // Rebuild UI
            container.innerHTML = currentServers.length === 0 ?
                '<div class="col-12 text-muted">No LLM servers available.</div>' : '';

            currentServers.forEach(server => {
                const isChecked = savedSelectedServerIds.has(server.server_id);

                const div = document.createElement('div');
                div.className = 'col-md-4 mb-2';
                div.innerHTML = `<div class="form-check">
                    <input class="form-check-input server-checkbox" type="checkbox"
                           id="sv-${server.server_id}" value="${server.server_id}"
                           ${isChecked ? 'checked' : ''}>
                    <label class="form-check-label" for="sv-${server.server_id}">${server.name}</label>
                </div>`;
                container.appendChild(div);
            });

            // Re-attach change listeners
            container.querySelectorAll('.server-checkbox').forEach(cb => {
                cb.addEventListener('change', function () {
                    if (this.checked) {
                        savedSelectedServerIds.add(this.value);
                    } else {
                        savedSelectedServerIds.delete(this.value);
                    }
                });
            });
        }
        async function runDoneWalkSequence() {
            if (doneWalkQueue.length === 0 || isDoneWalkRunning) return;
            isDoneWalkRunning = true;

            const selectedLLMServers = Array.from(document.querySelectorAll('.server-checkbox:checked'))
                                           .map(cb => cb.value);

            console.log('🚀 Starting Done-Walk sequence with', doneWalkQueue.length, 'prompts');

            while (doneWalkQueue.length > 0) {
                const item = doneWalkQueue.shift();
                const currentPrompt = item.prompt.trim();
                const triggerId = item.trigger;

                if (!currentPrompt) continue;

                console.log('📤 Sending prompt with unique trigger:', triggerId);

                // Strong, explicit instruction (proven to make LLM output trigger at the very end)
                const instruction = "\n\n--- FINAL INSTRUCTION ---\n" +
                                    "You have now completed your full and final response to the prompt above.\n" +
                                    "At the very end of your entire reply, you MUST output exactly this 10-character trigger " +
                                    "and NOTHING ELSE after it (no period, no extra text, no newlines after it):\n\n" +
                                    triggerId;

                const augmentedPrompt = currentPrompt + instruction;

                // Send ONLY this one prompt
                try {
                    await fetch('/api/distribute_prompts', {
                        method: 'POST',
                        headers: { 'Content-Type': 'application/json' },
                        body: JSON.stringify({
                            prompts: [augmentedPrompt],
                            selected_agents: [],
                            selected_servers: selectedLLMServers,
                            distribution_mode: "DONE_WALK",
                            max_tokens: 4096
                        })
                    });
                } catch (e) {
                    console.error('❌ Failed to send prompt:', e);
                }

                updateDashboard(); // immediate UI refresh
                await new Promise(r => setTimeout(r, 1200));

                console.log('🔍 Starting poll loop – waiting for trigger in MAIN OUTPUT only:', triggerId);

                let triggerDetected = false;
                let pollCount = 0;

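                // Safety limit: 120 polls x 1.5 s is roughly 3 minutes. Raise the poll cap for prompts that run longer.
                while (!triggerDetected && pollCount < 120) {
                    pollCount++;
                    await new Promise(r => setTimeout(r, 1500));

                    let statusData;
                    try {
                        const statusRes = await fetch('/api/status');
                        if (!statusRes.ok) throw new Error(`HTTP ${statusRes.status}`);
                        statusData = await statusRes.json();
                    } catch (e) {
                        // A failed poll must not abort the whole sequence; try again on the next tick
                        console.error('❌ Status poll failed:', e);
                        continue;
                    }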

                    triggerDetected = Object.values(statusData).some(info => {
                        if (!info || !info.requests) return false;

                        return info.requests.some(r => {
                            const rawResponse = r.full_response || r.response || r.content || '';
                            if (!rawResponse) return false;

                            const parsed = cleanLLMResponse(rawResponse);
                            const foundInMainOutput = parsed.content && parsed.content.includes(triggerId);

                            if (foundInMainOutput) {
                                console.log('✅ TRIGGER FOUND IN MAIN OUTPUT (poll #' + pollCount + ') – request:', r.request_id);
                            } else if (parsed.content) {
                                console.log('❌ Trigger not in main output yet (poll #' + pollCount + '). Last 200 chars of content:',
                                            parsed.content.slice(-200));
                            }
                            return foundInMainOutput;
                        });
                    });

                    if (!triggerDetected) {
                        console.log(`⏳ Poll #${pollCount} – trigger "${triggerId}" still not in main output`);
                    }
                }

                if (triggerDetected) {
                    console.log('✅ Unique trigger', triggerId, 'detected in main output – advancing to next prompt');
                } else {
                    console.warn('⚠️ Timeout waiting for trigger – proceeding anyway');
                }

                updateDashboard(); // live queue update
            }

            isDoneWalkRunning = false;
            console.log('🎉 Done-Walk sequence completed');
            alert("✅ Done-Walk sequence completed.");
            updateDashboard();
        }
        function switchToActivePromptsTab() {
            const tabButton = document.querySelector('button[data-bs-target="#active-prompts"]');
            if (tabButton) {
                const tab = new bootstrap.Tab(tabButton);
                tab.show();
                console.log('🔄 Switched UI to Active Prompts tab');
            }
        }
        async function applyPrompts() {
            const selectedMCPAgents = Array.from(document.querySelectorAll('.agent-checkbox:checked')).map(cb => cb.value);
            const selectedLLMServers = Array.from(document.querySelectorAll('.server-checkbox:checked')).map(cb => cb.value);
            let rawText = document.getElementById('generated-prompts').value.trim();
            if (!rawText) return alert("No prompts generated.");

            // Extract prompts using the ----- separator key
            let prompts = rawText.split('-----')
                                 .map(p => p.trim())
                                 .filter(p => p.length > 0);

            if (selectedLLMServers.length === 0) {
                return alert("Please select at least one LLM Server.");
            }

            const mode = document.querySelector('input[name="dist-mode"]:checked').value;

            if (mode === "DONE_WALK") {
                // Sequential Done-Walk mode (unchanged)
                const queueWithTriggers = prompts.map(prompt => ({
                    prompt: prompt,
                    trigger: generate10CharTrigger()
                }));

                doneWalkQueue = queueWithTriggers;
                console.log('✅ Done-Walk queue initialized with', queueWithTriggers.length, 'prompts and triggers');
                runDoneWalkSequence();
                switchToActivePromptsTab();
                return;
            }

            // === NEW BEHAVIOR: ALL_GET_ALL now distributes prompts EQUALLY across selected servers ===
            console.log('📤 Distributing', prompts.length, 'prompts equally across', selectedLLMServers.length, 'LLM servers');

            const numServers = selectedLLMServers.length;
            const promptsPerServer = Math.ceil(prompts.length / numServers);

            // Split prompts evenly
            for (let i = 0; i < numServers; i++) {
                const start = i * promptsPerServer;
                const end = Math.min(start + promptsPerServer, prompts.length);
                const serverPrompts = prompts.slice(start, end);

                if (serverPrompts.length === 0) break;

                const serverId = selectedLLMServers[i];

                console.log(`   → Server ${i+1}/${numServers} (${serverId}) gets ${serverPrompts.length} prompts`);

                try {
                    await fetch('/api/distribute_prompts', {
                        method: 'POST',
                        headers: { 'Content-Type': 'application/json' },
                        body: JSON.stringify({
                            prompts: serverPrompts,
                            selected_agents: selectedMCPAgents,
                            selected_servers: [serverId],           // one server at a time
                            distribution_mode: "ALL_GET_ALL",
                            max_tokens: 2048
                        })
                    });
                } catch (e) {
                    console.error('❌ Failed to send batch to server', serverId, e);
                }
            }

            alert(`✅ Prompts distributed equally across ${numServers} selected LLM server${numServers > 1 ? 's' : ''}!`);
            switchToActivePromptsTab();
            updateDashboard();
        }
        // Robust real-time SSE cleaner – separates content and thinking
        function cleanLLMResponse(rawText) {
            if (!rawText) return { content: '', thinking: '' };

            let content = '';
            let thinking = '';

            const lines = rawText.split('\n');
            for (let line of lines) {
                line = line.trim();
                if (!line || line === 'data: [DONE]' || line === 'data: {}') continue;

                if (line.startsWith('data: ')) {
                    const jsonStr = line.substring(6).trim();
                    if (!jsonStr) continue;

                    try {
                        const parsed = JSON.parse(jsonStr);

                        const deltaContent = parsed.choices?.[0]?.delta?.content ||
                                            parsed.choices?.[0]?.message?.content || '';
                        if (deltaContent) content += deltaContent;

                        const reasoning = parsed.choices?.[0]?.delta?.reasoning_content ||
                                         parsed.choices?.[0]?.delta?.thinking ||
                                         parsed.thinking || '';
                        if (reasoning) thinking += reasoning;

                    } catch (e) {
                        content += jsonStr + ' ';
                    }
                }
            }

            return {
                content: content.trim(),
                thinking: thinking.trim()
            };
        }
        async function updateDashboard() {
    try {
        const res = await fetch('/api/status');
        if (!res.ok) throw new Error('Failed to fetch status');
        const data = await res.json();

        currentAgents = data.mcp_agents || [];
        currentServers = data.llm_servers || [];
        renderAgentsTable();
        renderServersTable();
        renderAgentSelection();
        renderServerSelection();

        // ==================== ACTIVE PROMPTS RENDERING ====================
        let activeHtml = '';
        let hasPrompts = false;

        // Persistent dropdown states
        if (typeof window.dropdownStates === 'undefined') {
            window.dropdownStates = {};
        }

        document.querySelectorAll('#active-content details').forEach(details => {
            const id = details.getAttribute('data-id');
            if (id) window.dropdownStates[id] = details.open;
        });

        for (const [key, info] of Object.entries(data)) {
            if (['mcp_agents', 'llm_servers', 'scheduled_tasks'].includes(key)) continue;
            if (!info || !info.requests || !info.requests.length) continue;

            hasPrompts = true;
            activeHtml += `<div class="card request-card mb-4">
                <div class="card-header bg-white">
                    <h5>${key} <span class="badge bg-info">${info.active_requests || 0} active</span></h5>
                </div>
                <div class="card-body">`;

            info.requests.forEach((req) => {
                const progress = req.status === 'completed' ? 100 : (req.status === 'monitoring' ? 95 : 0);
                const statusClass = req.status === 'completed' ? 'success' : req.status === 'monitoring' ? 'primary' : 'secondary';

                const parsed = cleanLLMResponse(req.full_response || '');
                const renderedMarkdown = marked.parse(parsed.content || 'Waiting for LLM response...');
                const renderedPrompt = marked.parse(req.prompt_preview || 'No prompt text available');
                const renderedThinking = parsed.thinking ? marked.parse(parsed.thinking) : '';

                const promptId = `prompt-${key}-${req.request_id}`;
                const thinkingId = `thinking-${key}-${req.request_id}`;
                const responseId = `response-${key}-${req.request_id}`;

                const promptOpen = window.dropdownStates[promptId] !== false;
                const thinkingOpen = window.dropdownStates[thinkingId] !== false;
                const responseOpen = window.dropdownStates[responseId] !== false;

                activeHtml += `
                    <div class="mb-4 border rounded">
                        <div class="p-3 d-flex justify-content-between align-items-center">
                            <strong>${req.request_id}</strong>
                            <span class="badge bg-${statusClass}">${(req.status || 'unknown').toUpperCase()}</span>
                        </div>

                        <details class="px-3 pb-3" data-id="${promptId}" ${promptOpen ? 'open' : ''}>
                            <summary class="d-flex justify-content-between align-items-center text-muted small cursor-pointer" style="list-style:none;">
                                <span>Full Prompt Text</span>
                                <span class="badge bg-light text-dark">▼</span>
                            </summary>
                            <div class="mt-2 p-3 bg-light rounded border markdown-body">
                                ${renderedPrompt}
                            </div>
                        </details>

                        ${renderedThinking ? `
                        <details class="px-3 pb-3" data-id="${thinkingId}" ${thinkingOpen ? 'open' : ''}>
                            <summary class="d-flex justify-content-between align-items-center text-muted small cursor-pointer" style="list-style:none;">
                                <span>Thinking / Reasoning</span>
                                <span class="badge bg-light text-dark">▼</span>
                            </summary>
                            <div class="mt-2 p-3 bg-white border rounded markdown-body">
                                ${renderedThinking}
                            </div>
                        </details>` : ''}

                        <details class="px-3 pb-3" data-id="${responseId}" ${responseOpen ? 'open' : ''}>
                            <summary class="d-flex justify-content-between align-items-center text-muted small cursor-pointer" style="list-style:none;">
                                <span>LLM Response (Live)</span>
                                <span class="badge bg-light text-dark">▼</span>
                            </summary>
                            <div class="mt-2 p-3 bg-white border rounded markdown-body" style="min-height: 120px;">
                                ${renderedMarkdown}
                            </div>
                        </details>

                        <div class="px-3 pb-3">
                            <div class="progress mt-2" style="height:10px;">
                                <div class="progress-bar bg-primary" style="width:${progress}%"></div>
                            </div>
                            <div class="row mt-3 text-center small">
                                <div class="col"><strong>${(req.total_bytes || 0).toLocaleString()}</strong><br>Bytes</div>
                                <div class="col"><strong>${(req.estimated_tokens || 0).toLocaleString()}</strong><br>Tokens</div>
                                <div class="col"><strong>${(req.duration || 0).toFixed(1)}s</strong><br>Duration</div>
                            </div>
                        </div>
                    </div>`;
            });

            activeHtml += `</div></div>`;
        }

        if (!hasPrompts) {
            activeHtml = `<div class="text-center py-5 text-muted">
                <i class="bi bi-inbox display-4 mb-3 d-block"></i>
                No active prompts at the moment.
            </div>`;
        }

        document.getElementById('active-content').innerHTML = activeHtml;

        // ==================== DONE WALK QUEUE TAB ====================
        const queueHtml = doneWalkQueue.length > 0 ?
            `<div class="alert alert-info">
                <strong>Done-Walk Queue (${doneWalkQueue.length} remaining)</strong>
                <ul class="list-group mt-2">
                    ${doneWalkQueue.map((item, i) => {
                        const preview = item.prompt.substring(0, 80);
                        return `<li class="list-group-item">
                            <strong>${i+1}.</strong> ${preview}${item.prompt.length > 80 ? '...' : ''}<br>
                            <span class="badge bg-primary">Trigger: <code>${item.trigger}</code></span>
                        </li>`;
                    }).join('')}
                </ul>
            </div>` :
            `<div class="text-center py-5 text-muted">
                <i class="bi bi-inbox display-4 mb-3 d-block"></i>
                No prompts in Done-Walk queue.
            </div>`;

        document.getElementById('done-walk-content').innerHTML = queueHtml;

        document.getElementById('last-updated').textContent = `Last updated: just now`;
    } catch (e) {
        console.error(e);
        document.getElementById('active-content').innerHTML = `<div class="alert alert-danger">Failed to load data from server.</div>`;
    }
}


    document.addEventListener('DOMContentLoaded', function () {
        console.log('🚀 Page loaded – fetching all data from backend');

        // 1. Load everything on first page load
        updateDashboard();

        // 2. Refresh when user switches to MCP Agents or LLM Servers tab
        const mcpTabBtn = document.querySelector('button[data-bs-target="#mcp-agents"]');
        const serversTabBtn = document.querySelector('button[data-bs-target="#llm-servers"]');

        if (mcpTabBtn) {
            mcpTabBtn.addEventListener('shown.bs.tab', () => {
                console.log('MCP Agents tab opened – refreshing data');
                updateDashboard();
            });
        }
        if (serversTabBtn) {
            serversTabBtn.addEventListener('shown.bs.tab', () => {
                console.log('LLM Servers tab opened – refreshing data');
                updateDashboard();
            });
        }

        // Live refresh is handled by the setInterval call below
    });

    // Auto-refresh every 800ms so statistics and LLM answer flow live
    setInterval(updateDashboard, 800);
    </script>
</body>
</html>
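
To make the Done-Walk check in the listing above easier to follow, here is a minimal standalone sketch of the same idea that runs under Node instead of the browser. The function names `splitSseContent` and `hasTrigger` are hypothetical and not part of LLMQP, and the sample stream assumes an OpenAI-style SSE format with `delta.content` and `delta.reasoning_content` fields, as the listing's cleaner does. It only illustrates why the trigger is matched against the main output and not against the thinking text.

```javascript
// Sketch only: mirrors the trigger check from the listing, outside the browser.
// splitSseContent and hasTrigger are hypothetical names, not LLMQP functions.

// Collect the visible answer text from an OpenAI-style SSE dump,
// deliberately skipping reasoning/thinking fields.
function splitSseContent(rawText) {
    let content = '';
    for (const rawLine of (rawText || '').split('\n')) {
        const line = rawLine.trim();
        if (!line.startsWith('data: ')) continue;
        const jsonStr = line.slice(6).trim();
        if (!jsonStr || jsonStr === '[DONE]') continue;
        try {
            const chunk = JSON.parse(jsonStr);
            content += chunk.choices?.[0]?.delta?.content || '';
        } catch (e) {
            // Ignore a partially received chunk; a later poll sees the full line.
        }
    }
    return content;
}

// True once the unique trigger shows up in the main output.
function hasTrigger(rawText, triggerId) {
    return splitSseContent(rawText).includes(triggerId);
}

const trigger = 'A1B2C3D4E5';   // example value; real triggers are random

// The model mentions the trigger while thinking, but has not finished yet.
const stream = [
    'data: {"choices":[{"delta":{"reasoning_content":"I must print A1B2C3D4E5"}}]}',
    'data: {"choices":[{"delta":{"content":"All files written. "}}]}',
].join('\n');

// The same stream after the model emits the trigger as its last output.
const finished = stream + '\n' +
    'data: {"choices":[{"delta":{"content":"A1B2C3D4E5"}}]}\n' +
    'data: [DONE]';

console.log(hasTrigger(stream, trigger));    // false
console.log(hasTrigger(finished, trigger));  // true
```

Because the chunks are concatenated before the search, a trigger that arrives split across two SSE chunks is still found.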

We recommend doing this inside PyCharm. For the less initiated, it will handle your environment setup and so on.

Once it runs, it will show this in the console:

You simply access it at its endpoint and you're golden!
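
If you would rather script the queue than click through the GUI, the sketch below shows how the request bodies could be prepared. It is an illustration under assumptions, not the project's client: `makeTrigger` and `buildDoneWalkRequests` are hypothetical helper names, `server-1` is a made-up server id, and the closing instruction is shortened. The payload fields (`prompts`, `selected_agents`, `selected_servers`, `distribution_mode`, `max_tokens`) and the `-----` separator are copied from the page script above.

```javascript
// Sketch only: hypothetical helpers that prepare Done-Walk request bodies.

// Random 10-character serial, same alphabet as the page script.
function makeTrigger(length = 10) {
    const chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
    let out = '';
    for (let i = 0; i < length; i++) {
        out += chars.charAt(Math.floor(Math.random() * chars.length));
    }
    return out;
}

// Split "-----" separated text into prompts and pair each with its own trigger.
function buildDoneWalkRequests(rawText, serverIds) {
    return rawText
        .split('-----')
        .map(p => p.trim())
        .filter(p => p.length > 0)
        .map(prompt => {
            const trigger = makeTrigger();
            return {
                trigger,
                body: {
                    // Shortened instruction; the page script uses a longer one.
                    prompts: [`${prompt}\n\nEnd your reply with exactly: ${trigger}`],
                    selected_agents: [],
                    selected_servers: serverIds,
                    distribution_mode: 'DONE_WALK',
                    max_tokens: 4096,
                },
            };
        });
}

const requests = buildDoneWalkRequests(
    'Write the parser.\n-----\nWrite tests for the parser.\n-----\n',
    ['server-1']   // assumed id; use an id from your own LLM server list
);

console.log(requests.length);               // 2
console.log(requests[0].body.prompts[0]);   // first prompt plus its trigger line
```

Each `body` would then be sent as JSON in a POST to `/api/distribute_prompts` on port 5012, one at a time, waiting until that request's trigger appears in `/api/status` before sending the next.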