147 GUIDES AND GROWING

Run AI Locally.
Free. Private. Yours.

Step-by-step guides for Mac, Windows & Linux — no cloud, no API bills, and total control over your intelligence.

macOS Windows Linux

Start Here

Where do you want to run AI?

Pick your setup below — we'll take you straight to the right guide.

Mac · Apple Silicon

Apple Silicon (M1 / M2 / M3)

Best-in-class unified memory. Run 7B–13B models blazing-fast with Ollama.

Browse Mac Guides

Windows · NVIDIA

Windows + NVIDIA GPU

CUDA-accelerated inference with LM Studio. RTX 3060+ recommended.

Browse Windows Guides

Linux · Power Users

Linux + Max Performance

Full control with llama.cpp, Ollama, or any backend. CPU & GPU options.

Browse Linux Guides

MAC 9 MIN READ FEATURED

The 8GB Mac Survival Guide for Local AI

Can you run AI on an 8GB M1 or M2? Yes. Here are the best models and settings to avoid swap memory death.

Arjun Mehta

Core Contributor

Read Guide

# Linux install script (on macOS, download the app from ollama.com instead)
$ curl -fsSL https://ollama.com/install.sh | sh
# Downloading Ollama...
# Setting up environment variables...
$ ollama run llama3.1
> Pulling manifest...
> Success
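Wondering whether a model will fit before you pull it? The sketch below is a rough back-of-the-envelope estimate, not a definitive rule: weight size is approximately the parameter count times the bits per weight, divided by 8. The function names, the 3 GB reserved for the OS and open apps, and the 1 GB runtime overhead are illustrative assumptions, not figures from the guides.

```python
def estimate_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 weights * bits / 8 bytes each, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


def fits_in_memory(
    params_billion: float,
    bits_per_weight: float,
    total_ram_gb: float,
    reserved_gb: float = 3.0,   # assumed: memory kept free for the OS and apps
    overhead_gb: float = 1.0,   # assumed: context cache and runtime buffers
) -> bool:
    """Return True if the estimated footprint fits in the remaining memory."""
    needed = estimate_weights_gb(params_billion, bits_per_weight) + overhead_gb
    return needed <= total_ram_gb - reserved_gb


if __name__ == "__main__":
    for name, size_b, bits in [("3B @ 4-bit", 3, 4), ("13B @ 4-bit", 13, 4), ("8B @ 16-bit", 8, 16)]:
        weights = estimate_weights_gb(size_b, bits)
        verdict = "fits" if fits_in_memory(size_b, bits, total_ram_gb=8) else "likely swaps"
        print(f"{name}: ~{weights:.1f} GB of weights, {verdict} on an 8GB machine")
```

Treat the result as a first filter only. Real quantized files often use slightly more than their nominal bit width, and long context windows raise the overhead, so leave extra headroom.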


All Setup Guides

MAC 9 MIN READ

The 8GB Mac Survival Guide for Local AI

Can you run AI on an 8GB M1 or M2? Yes. Here are the best models and settings to avoid swap memory death.

Read Guide

MAC 11 MIN READ

Replace GitHub Copilot: Ollama + Continue.dev

Stop paying $10/month. Set up Ollama and the Continue.dev extension in VS Code on your Mac for completely free, private AI autocomplete.

Read Guide

MAC 8 MIN READ

LM Studio on Mac: The Easiest Offline AI Interface

Install LM Studio on macOS to get a beautiful GUI for downloading and running GGUF models with Metal acceleration.

Read Guide

WINDOWS 6 MIN READ

Run Ollama on Windows Natively

Ollama now runs natively on Windows without WSL. Install, pull models, and chat from PowerShell in under 5 minutes.

Read Guide

WINDOWS 8 MIN READ

Set Up LM Studio on Windows

Learn how to install and configure LM Studio on Windows with NVIDIA/AMD GPU support. Run GGUF models locally with a beautiful chat interface.

Read Guide

MAC 7 MIN READ

System-wide Mac AI: Connect Ollama to Raycast

Integrate your local LLMs directly into Raycast. Highlight text anywhere on your Mac and hit a hotkey to summarize or rewrite it for free.

Read Guide

MAC 14 MIN READ

Llama.cpp on Mac: The Power User's Guide

Compile and run llama.cpp from scratch on macOS. Get maximum performance, zero bloat, and total control over your Metal acceleration parameters.

Read Guide

LINUX 7 MIN READ

Run Ollama on Linux: The Definitive Guide

Deploy Ollama as a background systemd service on Ubuntu/Debian. Full setup for NVIDIA CUDA and AMD ROCm.

Read Guide

MAC 10 MIN READ

Apple's MLX Framework: Maximum AI Speed

How to use Apple's MLX framework to run Llama 3 and Mistral natively on Apple Silicon at blistering speeds.

Read Guide

WINDOWS 14 MIN READ

Llama.cpp on Windows: The CUDA Guide

Compile llama.cpp from source on Windows using CMake and the NVIDIA CUDA toolkit for maximum token generation speed.

Read Guide

MAC 12 MIN READ

The Ultimate Guide: Run Ollama on Mac M3

The definitive masterclass in installing, optimizing, and running Ollama on Apple Silicon. Understand unified memory and model quantization, and learn how to get the most out of your M3 chip.

Read Guide

LINUX 6 MIN READ

Set Up LM Studio on Linux (Ubuntu/Debian)

Install the LM Studio AppImage on Linux to get a beautiful graphical interface for your local AI models.

Read Guide

LINUX 15 MIN READ

Local Llama 3 on Linux

Deploy Meta's Llama 3 model locally on Linux using llama.cpp with full CUDA support. This guide covers compilation, quantization, and running the model from the command line.

Read Guide

LINUX 12 MIN READ

High-Throughput Serving with vLLM on Ubuntu

For enterprise-grade performance, deploy vLLM on Linux to serve models with PagedAttention and maximum token throughput.

Read Guide

ALL 8 MIN READ

Running Open-Source Coding Agents Locally in 2026: A Complete Guide

Learn how to set up and run open-source coding agents locally using Llama 3, Qwen, and Ollama. Keep your code private and avoid API costs.

Read Guide



ChatEzzy Workspace
