Nvidia Driver cuda nvcc Troubleshooting We had so many issues troubleshooting and getting our Nvidia drivers to work with our particular kernel of Linux (ParrotOS latest) that we wrote this guide. Pretty much every LLM needs the full suite of drivers, nvcc, etc., so this guide might help you. You can ask a SOTA level
MTP World First! The Tom Pulls TurboQuant w/MTP (and it Works!) A world first! TurboQuant + MTP support from the same Llama.cpp! What a game changer!
MTP Into the MTP Zone.. A Look at Multi-Token-Prediction - It Rocks! We foray into MTP (Multi-Token Prediction)
benchmarking LocalLLM BenchMaxxing! How to Benchmark llama.cpp and power juicing your localLLMs! We go over benchmaxxing your localLLM in a custom Llama.cpp!
localLLM LLMQP Drops! A New Queue Dispatcher. Let your LLM CODE ALL NIGHT. LLM Queue Dispatcher. A Powerful Harness Drop will queue your localLLM all night and keep it working!
MCP Server Agentic Server Primer: Llama.cpp MCP Lesson 10: mcp-coder (Cuda Version) We build an MCP coding agent that lets your LLM work on and debug its own code with nvcc, or really any language!
MTP MTP / TurboQuant Forked Llama.cpp We hot compile one of the first combo MTP / TurboQuant forks in the world!
docker-compose.yml docker-compose.yml -> docker run Converter This page is a bookmarker. Need a docker-compose.yml converted to a docker run command on the fly? Here you go!
docker Agentic Server Primer: Llama.cpp MCP Lesson 9: Docker Orchestrator In this guide we go over letting your LLM create and manage its own Docker images and stand up its own containers after writing its code. It uses a special docker-compose tool we built for it.
studentLLM StudentLLM - Qwen2.5-coder-7b-instruct-q6-k / Qwen3.5 Agentic on a Ryzen 5-2600/ 3060ti. Production LLM or not? YES! We look at a StudentLLM setup to get as much productivity out of limited hardware as we can.
One-shot Qwen3.6 Drops!- A HouseLLM Production Level Coding Perspective? One-Shot GoAccess We test whether Qwen3.6 is up to your home production standards.
docker PowerChest home Agent Agents MCP Tools to Put your HouseLLM into Turbo. MCP TOOLS 1-9++ Downloads Page for all your MCP tooling needs!
House LLM Agentic Server Primer: Llama.cpp MCP Lesson 8: Process Manager Web Enabled Research Assistant w/ Code Drop.
agentic server Agentic Server Primer: Llama.cpp MCP Lesson 7: Process Manager (Part 1)
docker Agentic Server Primer: Llama.cpp MCP Lesson 6: Adding mysql Database Docker Toolsets. We give our LLM its own database to play with!
javascript Agentic Server Primer: Llama.cpp MCP Lesson 5: Adding javascript via a Python api plugin. We go through a full working example of creating your own MCP tools.
Qwen3-Coder Qwen3-Coder-Next-UD-Q4_K_XL.gguf on a Ryzen 9/4080ti. Run a 48GB SOTA Tensor-Balanced on a $2K set of House Parts. We field test a Qwen3-Coder-Next-UD-Q4_K_XL.gguf
llmfit llmfit - Fast LLM Metric Fitter and Pulling Tool We have a look at a fast fitting tool for comparing our hardware to the LLM market.
Qwen3.5 Qwen3.5-122B-A10B-Q4_K_M.gguf - Run it at 13 Tokens/s with 262,000 Contexts on a Ryzen 9 3900 and a 4080ti. w/128GB RAM.
agentic Agentic Server Primer: Llama.cpp MCP Lesson 4: Weather Polling via api.weather.gov
House LLM Agentic Server Primer: Llama.cpp MCP Lesson 3: Adding Python Tooling Capability To your HouseLLM. Agentic Server Primer: Llama.cpp MCP Lesson 3: Python
docker Agentic Server Primer: Llama.cpp MCP Lesson 2: Dockerization.
agentic Agentic Server Primer: Llama.cpp MCP Lesson 1: A Calculator. Agentic Server Primer: Llama.cpp MCP Lesson 1: A Calculator
TurboQuant The TurboQuant BreakThrough. Running Qwen3.5-35B-A3B-UD-IQ3_XXS.gguf with a Hermes Agent. Get $80K Enterprise Server Performance on a $800 House GPU.