Agentic Server Primer: Llama.cpp MCP Lesson 10: mcp-coder (Cuda Version)

We build an MCP Coding Agent that lets your LLM work on and debug its own code with nvcc, or really any language!

We give our houseLLM an Agentic Coding Function

This is a specialized MCP agent in that it is designed for your LLM to work with, pull, compile and develop its own software, specifically for Nvidia nvcc CUDA! It is effectively its own entire build agent! The potential is amazing: it can effectively have your LLM writing advanced GPU code (aka configuring its own Llama.cpp!)

MTP / TurboQuant Forked Llama.cpp
We hot compile one of the first combo MTP / TurboQuant forks in the world!
  • In the above link we looked at one of the world's first MTP/TurboQuant forks of Llama.cpp but stopped because it was not built for Qwen3.6, which we really wanted to keep because of its advanced nature.
  • Our goal is to see whether a HouseLLM can compile an MTP (Multi-Token Prediction) cross-blend of the TurboQuant-forked Llama.cpp and get it to work with Qwen3.6!
  • This is a very challenging MCP agent that we had to rewrite close to a dozen times. In the end it worked! The issue is that MCP calls can often fail, so in the end every single MCP endpoint had to accept two input formats. This gives higher compatibility with many calling LLMs:

All tools now support both calling styles:

  • Normal parameters: read_file("script.py")
  • Dictionary input: read_file({"file_path": "script.py"})
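
To make the two calling styles concrete, here is a minimal sketch of how a tool could accept either one. In the source listing further down, this unpacking appears inline in run_command; the helper name normalize_args and the read_file_demo wrapper are hypothetical and only illustrate the pattern, they are not part of the published server.

```python
def normalize_args(first_arg, param_names, **given):
    """Return one dict of parameters for either calling style.

    first_arg   -- whatever arrived in the tool's first parameter
    param_names -- ordered parameter names; param_names[0] names first_arg
    given       -- the other parameters as received (used as fallbacks)
    """
    params = dict(given)
    if isinstance(first_arg, dict):
        # Dictionary style: read_file({"file_path": "script.py"})
        for name in param_names:
            if first_arg.get(name) is not None:
                params[name] = first_arg[name]
    else:
        # Normal style: read_file("script.py")
        params[param_names[0]] = first_arg
    return params


def read_file_demo(file_path):
    """Hypothetical stand-in for a tool body; only shows the unpacking step."""
    args = normalize_args(file_path, ["file_path"])
    return args["file_path"]


print(read_file_demo("script.py"))
print(read_file_demo({"file_path": "script.py"}))
```

Both calls resolve to the same file_path, so the rest of the tool body does not need to know which style the calling LLM picked.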

If you want to simply pull and run this mcp-coder:

#!/bin/bash

CONTAINER_NAME="mcp-coder"

# Cleanup previous container
docker stop $CONTAINER_NAME 2>/dev/null
docker rm $CONTAINER_NAME 2>/dev/null

# Ensure workspace directory exists
mkdir -p ~/mcp-workspace
chmod -R 755 ~/mcp-workspace

# Pull latest version
docker pull cnmcdee/mcp-coder:latest

# Start container
docker run -d \
--name $CONTAINER_NAME \
--restart unless-stopped \
-p 5011:5011 \
-v ~/mcp-workspace:/work_path \
--env PYTHONUNBUFFERED=1 \
cnmcdee/mcp-coder:latest

echo "✅ MCP Coder container started successfully!"
echo "🌐 Access URL: http://localhost:5011/mcp"
echo "📋 Logs: docker logs -f $CONTAINER_NAME"
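
For orientation, this is roughly what a tool call posted to that URL looks like. MCP messages are JSON-RPC 2.0, and a tool invocation uses the tools/call method. The sketch below only builds the request body; a real client first performs the initialize handshake and handles the HTTP transport, which is why most people let an MCP client library or their LLM front end do this. The helper name tool_call_payload is hypothetical.

```python
import json


def tool_call_payload(tool_name, arguments, request_id=1):
    """Build the JSON-RPC 2.0 body for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Example: ask the server to list the workspace root.
body = json.dumps(tool_call_payload("list_directory", {"directory": "."}))
print(body)
```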

We used Python as a pass-through, with its own CORS-enabled HTTP endpoint. There are a couple of paths you can take when making these, namely:

  • You give it a number of guided MCP command endpoints representing system operations like 'git push' and 'git pull', but then you need to do that for every operation, or:
  • You simply give it the ability to read, write, and open files and to execute system commands, knowing it has the agentic ability to understand conceptually what it is doing. We chose this path because these LLMs are simply that good.

Source Code

from fastmcp import FastMCP
from starlette.middleware import Middleware
from starlette.middleware.cors import CORSMiddleware
import uvicorn
import subprocess
import os
from pathlib import Path
import re
import json

# =============================================================================
# Enhanced MCP Server - Restricted to Specific Target Directory
# Optimized for reliable tool calling by Qwen3.6 and other LLMs
# =============================================================================

mcp = FastMCP(name="Target Directory MCP Server")

# ── Configuration ────────────────────────────────────────────────────────────

# CHANGE THIS TO YOUR DESIRED TARGET DIRECTORY
BASE_DIRECTORY = "/work_path"

# Security limits
MAX_FILE_SIZE = 10000 * 1024 * 1024      # ~10 GB (10,000 MiB)
MAX_OUTPUT_LENGTH = 50000000000           # 50 billion characters (~50 GB); use 50000 for a ~50 KB limit

# All operations are forced inside BASE_DIRECTORY
ALLOWED_DIRECTORIES = [BASE_DIRECTORY]


# ── Helper Functions ─────────────────────────────────────────────────────────

def validate_path(file_path):
    """Force all paths to be inside the BASE_DIRECTORY."""
    if not os.path.isabs(file_path):
        file_path = os.path.join(BASE_DIRECTORY, file_path)

    path = Path(file_path).resolve()
    base_path = Path(BASE_DIRECTORY).resolve()

    # Compare path components, not string prefixes, so a sibling such as
    # "/work_path_other" is not mistaken for a child of "/work_path".
    if path != base_path and base_path not in path.parents:
        raise ValueError(f"Path must be inside the target directory: {BASE_DIRECTORY}")

    return path


def truncate_output(output, max_length=MAX_OUTPUT_LENGTH):
    """Truncate output if it exceeds the maximum length."""
    if len(output) <= max_length:
        return output

    truncated = output[:max_length]
    last_newline = truncated.rfind('\n')
    if last_newline > max_length * 0.8:
        truncated = truncated[:last_newline + 1]

    return truncated + f"\n\n[Output truncated to {max_length} characters]"


# ── File System Tools ────────────────────────────────────────────────────────

@mcp.tool
def read_file(file_path):
    """
    Read the entire content of a file inside the target directory.

    Parameters:
    - file_path (string): Relative or absolute path to the file.
                         Example: "script.py" or "folder/subfolder/file.txt"

    Returns: The full text content of the file as a string.
    """
    try:
        path = validate_path(file_path)

        if not path.is_file():
            return f"Error: File not found: {file_path}"

        file_size = path.stat().st_size
        if file_size > MAX_FILE_SIZE:
            return f"Error: File too large ({file_size} bytes). Maximum allowed: {MAX_FILE_SIZE} bytes."

        return path.read_text(encoding="utf-8")

    except Exception as e:
        return f"Error reading file: {str(e)}"


@mcp.tool
def write_file(file_path, content, mode="w"):
    """
    Write or append text content to a file.

    Parameters:
    - file_path (string): Path to the file (relative or absolute).
    - content (string): The text you want to write.
    - mode (string): "w" to overwrite (default) or "a" to append.

    Example:
    write_file("notes.txt", "Hello world", "w")
    """
    if mode not in ["w", "a"]:
        return "Error: mode must be 'w' or 'a'"

    try:
        path = validate_path(file_path)
        path.parent.mkdir(parents=True, exist_ok=True)

        with open(path, mode, encoding="utf-8") as f:
            f.write(content)

        action = "overwritten" if mode == "w" else "appended to"
        return f"Successfully {action}: {file_path}"

    except Exception as e:
        return f"Error writing file: {str(e)}"


@mcp.tool
def delete_file(file_path):
    """
    Delete a file.

    Parameters:
    - file_path (string): Path to the file to delete.
    """
    try:
        path = validate_path(file_path)

        if not path.is_file():
            return f"Error: File not found: {file_path}"

        path.unlink()
        return f"Successfully deleted: {file_path}"

    except Exception as e:
        return f"Error deleting file: {str(e)}"


@mcp.tool
def replace_line(file_path, line_number, new_content):
    """
    Replace a specific line in a file by line number.

    Parameters:
    - file_path (string): Path to the file.
    - line_number (integer): Line number to replace (starts at 1).
    - new_content (string): New text for that line.

    Example: replace_line("main.py", 42, "    print('Updated')")
    """
    try:
        path = validate_path(file_path)

        if not path.is_file():
            return f"Error: File not found: {file_path}"

        lines = path.read_text(encoding="utf-8").splitlines(keepends=True)

        if line_number < 1 or line_number > len(lines):
            return f"Error: Line number {line_number} is out of range. File has {len(lines)} lines."

        old_line = lines[line_number - 1]
        # Keep the original line ending (or none on an unterminated last line)
        original_ending = old_line[len(old_line.rstrip("\r\n")):]
        lines[line_number - 1] = new_content.rstrip() + original_ending

        path.write_text(''.join(lines), encoding="utf-8")
        return f"Successfully replaced line {line_number} in {file_path}"

    except Exception as e:
        return f"Error replacing line: {str(e)}"


@mcp.tool
def list_directory(directory="."):
    """
    List files and folders in a directory.

    Parameters:
    - directory (string): Optional. Directory to list. Default is current directory ".".

    Returns: List of strings in format "D/foldername" or "F/filename".
    """
    try:
        path = validate_path(directory)

        if not path.is_dir():
            return ["Error: Not a directory"]

        items = []
        for item in sorted(path.iterdir()):
            prefix = "D/" if item.is_dir() else "F/"
            items.append(f"{prefix}{item.name}")

        return items

    except Exception as e:
        return [f"Error: {str(e)}"]


@mcp.tool
def get_file_info(file_path):
    """
    Get detailed information about a file or directory.

    Parameters:
    - file_path (string): Path to the file or folder.
    """
    try:
        path = validate_path(file_path)

        if not path.exists():
            return {"error": f"Path not found: {file_path}"}

        stat = path.stat()
        return {
            "path": str(path),
            "exists": True,
            "is_file": path.is_file(),
            "is_dir": path.is_dir(),
            "size_bytes": stat.st_size,
            "modified_timestamp": stat.st_mtime,
            "permissions": oct(stat.st_mode)[-3:],
            "name": path.name
        }

    except Exception as e:
        return {"error": str(e)}


@mcp.tool
def search_files(directory=".", pattern="*"):
    """
    Recursively search for files matching a pattern.

    Parameters:
    - directory (string): Starting directory. Default ".".
    - pattern (string): Glob pattern. Examples: "*.py", "*.txt", "config*.json"

    Returns: List of matching file paths.
    """
    try:
        path = validate_path(directory)

        if not path.is_dir():
            return [f"Error: Not a directory: {directory}"]

        matches = [str(item) for item in path.rglob(pattern) if item.is_file()]
        return sorted(matches)

    except Exception as e:
        return [f"Error: {str(e)}"]


# ── Shell Command Tool ───────────────────────────────────────────────────────

@mcp.tool
def run_command(command, cwd=None, timeout=180):
    """
    Execute a shell command inside the restricted /work_path directory.

    Parameters:
    - command: Can be either:
        - A string: "ls -la"
        - Or a dictionary: {"command": "ls -la", "cwd": "subfolder", "timeout": 60}
    - cwd (string, optional): Working directory.
    - timeout (integer, optional): Maximum time in seconds.

    Returns: Detailed output including STDOUT, STDERR, and return code.
    """
    try:
        # === Handle dictionary input (for models that pass one dict) ===
        if isinstance(command, dict):
            data = command
            command = data.get("command") or data.get("cmd")
            cwd = data.get("cwd") or cwd
            timeout = data.get("timeout") or timeout

        # Ensure command is a string
        if not isinstance(command, str):
            return f"Error: 'command' must be a string or a dict containing 'command'. Got: {type(command)}"

        # Set default working directory
        if cwd is None:
            cwd = BASE_DIRECTORY
        else:
            cwd_path = validate_path(cwd)
            cwd = str(cwd_path)

        # Basic safety check
        dangerous = ["&&", ";", "|", ">", "<", ">>", "sudo", "su ", "rm -rf /", "mkfs", "shutdown"]
        for pattern in dangerous:
            if pattern in command.lower():
                return f"Error: Dangerous command pattern detected: '{pattern}'"

        # Execute command
        result = subprocess.run(
            command,
            shell=True,
            cwd=cwd,
            capture_output=True,
            text=True,
            timeout=timeout
        )

        stdout = truncate_output(result.stdout)
        stderr = truncate_output(result.stderr)

        # Build response
        response_parts = [
            f"Command: {command}",
            f"Working directory: {cwd}",
            f"Return code: {result.returncode}"
        ]

        if stdout.strip():
            response_parts.append(f"\nSTDOUT:\n{stdout}")
        else:
            response_parts.append("\nSTDOUT: (no output)")

        if stderr.strip():
            response_parts.append(f"\nSTDERR:\n{stderr}")
        else:
            response_parts.append("\nSTDERR: (no output)")

        return "\n".join(response_parts)

    except subprocess.TimeoutExpired:
        return f"Error: Command timed out after {timeout} seconds."
    except Exception as e:
        return f"Error executing command: {str(e)}"


# ── Server Setup ─────────────────────────────────────────────────────────────

if __name__ == "__main__":
    middleware = [
        Middleware(
            CORSMiddleware,
            allow_origins=["*"],
            allow_credentials=False,
            allow_methods=["GET", "POST", "OPTIONS"],
            allow_headers=["*"],
            expose_headers=["*"],
            max_age=3600,
        )
    ]

    app = mcp.http_app(
        path="/mcp",
        middleware=middleware
    )

    print("🚀 Starting Target Directory Restricted MCP Server")
    print(f"→ All operations restricted to: {BASE_DIRECTORY}")
    print("→ Tools available: read_file, write_file, delete_file, replace_line,")
    print("                   list_directory, get_file_info, search_files, run_command")

    uvicorn.run(app, host="0.0.0.0", port=5011, log_level="info")
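
A note on the run_command safety check: it rejects any command containing shell metacharacters, because the command is handed to a shell. One alternative, sketched here under the assumption that the agent does not need pipes or redirects, is to skip the shell entirely so that ';' and '|' are just ordinary argument text. The function run_without_shell is hypothetical and is not part of the server above.

```python
import shlex
import subprocess
import sys


def run_without_shell(command, cwd=None, timeout=180):
    """Run a command string without a shell.

    shlex.split turns the string into an argument list, and subprocess.run
    with a list (shell=False) passes each argument through untouched, so
    metacharacters are never interpreted.
    """
    argv = shlex.split(command)
    if not argv:
        return {"returncode": None, "stdout": "", "stderr": "empty command"}
    result = subprocess.run(
        argv, cwd=cwd, capture_output=True, text=True, timeout=timeout
    )
    return {
        "returncode": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }


# Example: the quoted ';' reaches the program as a plain argument.
py = shlex.quote(sys.executable)
demo = run_without_shell(f"{py} -c 'import sys; print(sys.argv[1:])' a ';' b")
print(demo["stdout"])
```

The trade-off is that build steps relying on shell features would have to be written as scripts inside the workspace and then executed by name.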

Here are the complete files you need to run the MCP Server in Docker:

1. requirements.txt

fastmcp
uvicorn[standard]
starlette

2. Dockerfile

FROM nvidia/cuda:13.1.2-devel-ubuntu22.04

WORKDIR /app

ENV DEBIAN_FRONTEND=noninteractive \
    TZ=UTC \
    PYTHONUNBUFFERED=1

# System packages + Python
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    software-properties-common sudo git build-essential cmake ninja-build curl wget ca-certificates tzdata && \
    add-apt-repository ppa:deadsnakes/ppa -y && \
    apt-get update && \
    apt-get install -y --no-install-recommends \
    python3.11 python3.11-venv python3.11-dev python3-pip && \
    rm -rf /var/lib/apt/lists/* && \
    update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 1 && \
    update-alternatives --install /usr/bin/python python /usr/bin/python3.11 1 && \
    update-alternatives --install /usr/bin/pip pip /usr/bin/pip3 1

# Create work directory
RUN mkdir -p /work_path && chown -R 1000:1000 /work_path && chmod -R 755 /work_path

COPY requirements.txt .
RUN python -m pip install --no-cache-dir -r requirements.txt

COPY server.py .

RUN useradd -m -u 1000 -s /bin/bash mcpuser && \
    echo "mcpuser ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers && \
    chown -R mcpuser:mcpuser /app

USER mcpuser

EXPOSE 5011

# Start the MCP server directly (exec form, no shell wrapper)
CMD ["python", "server.py"]

3. docker-compose.yml

version: '3.9'

services:
  mcp-server:
    build: .
    container_name: mcp-server
    restart: unless-stopped
    ports:
      - "5011:5011"
    volumes:
      # Mount your target project directory into the container
      - ./target-project:/app/project
    environment:
      - PYTHONUNBUFFERED=1
    # Optional: Add GPU support if you need CUDA/nvcc inside the container
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: all
    #           capabilities: [gpu]

How to Use

Save the files:

  • requirements.txt
  • Dockerfile
  • docker-compose.yml

Rename your server code to server.py (or update the Dockerfile accordingly).

Update the target directory in server.py:

BASE_DIRECTORY = "/app/project"   # This matches the volume mount

Create the project folder (next to docker-compose.yml):

mkdir target-project

Start the server:

docker-compose up --build

Or in detached mode:

docker-compose up -d --build

Once It's Working the Fun Really Starts!

Using the Target Directory Server, explore all its available tools and test their functionality. With it and the Process Manager tool, the goal is to make a combined llama.cpp that combines these two repositories:

TurboQuant Base: https://github.com/TheTom/llama-cpp-turboquant.git (use the feature/turboquant-kv-cache branch if available, otherwise main)
MTP + TurboQuant Combined Variant: https://github.com/AtomicBot-ai/atomic-llama-cpp-turboquant.git (this fork already includes both TurboQuant and Gemma-4-style MTP support)

The objective is specifically to create a Llama.cpp that can do BOTH MTP AND TURBOQUANT for Qwen3.6.

You can install, upgrade, or do whatever you need inside the Target Directory Server, including git pull, and it already has a full build environment with nvcc. As you go, document and save your progress in detail to the Process Manager with frequent updates to your tasks. Make sure the tasks are detailed enough that if you cannot complete this, you can pick it up again on the next task. Make sure it can compile, and fix anything that won't.