System-wide Mac AI: Connect Ollama to Raycast

macOS Sonoma • Intermediate • 8 min read
By Alex Rivera • May 14, 2024

Step 1 Unleashing AI System-Wide on macOS

The modern macOS power user operates in a fragmented AI landscape — browser tabs pinned to ChatGPT, separate apps for writing assistance, and constant context-switching that destroys flow state. What if your AI wasn't siloed in a browser window but woven directly into the fabric of your operating system, available the instant you need it, responding in milliseconds, with zero data leaving your machine?

That's exactly what happens when you connect Ollama to Raycast. The combination is genuinely transformative for how you interact with your Mac.

Why This Architecture Is Different

Most AI integrations follow the same pattern: open an app, type a prompt, wait for a cloud response, copy the output. This workflow introduces cognitive overhead at every step. Raycast + Ollama breaks this pattern entirely by positioning AI as a first-class system primitive.

| Approach | Latency | Privacy | Offline Support | Context Awareness |
|---|---|---|---|---|
| Browser-based ChatGPT | High (network round-trip) | ❌ Data sent to OpenAI | ❌ Requires internet | Limited |
| Native AI apps | Medium | Varies | Sometimes | Minimal |
| Raycast + Ollama | Ultra-low (localhost) | ✅ 100% local | ✅ Fully offline | Deep (system-wide) |

What Ollama Brings to the Equation

Ollama is a lightweight inference server that runs large language models locally on Apple Silicon. It exposes a clean REST API on http://localhost:11434, making it trivially easy for other tools to consume. Models like Llama 3, Mistral, Phi-3, and Gemma 2 run with impressive speed on M-series chips, often matching or exceeding cloud model response times for typical tasks.

Terminal
# Verify Ollama is running and listening
curl http://localhost:11434/api/tags

# Expected output: a JSON list of your locally installed models

Once Ollama is running as a background service, it becomes a persistent AI backbone that any properly configured application can query — including Raycast.
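
For the curious, the request Raycast ultimately issues is just a plain HTTP POST to that API. A minimal sketch of a direct call ("llama3.2" is an example model name, and the request only fires if the server is actually up):

```shell
#!/bin/sh
# Minimal /api/generate request, the same kind of call Raycast makes under
# the hood. "llama3.2" is an example; substitute anything from `ollama list`.
PAYLOAD='{"model":"llama3.2","prompt":"Reply with one word: ready","stream":false}'

# Only fire the request if Ollama is answering, so the script degrades
# gracefully when the server is down.
if curl -s --max-time 2 http://localhost:11434/api/tags >/dev/null 2>&1; then
  curl -s http://localhost:11434/api/generate \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD"
else
  echo "Ollama is not reachable on localhost:11434"
fi
```

Setting `"stream": false` returns one complete JSON object instead of a token-by-token stream, which is easier to parse in shell pipelines.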

What Raycast Brings to the Equation

Raycast is a system-wide command launcher that has effectively replaced Spotlight for hundreds of thousands of Mac users. Its extension ecosystem allows it to integrate with virtually anything, and its AI command framework enables you to pipe selected text, clipboard contents, or free-form prompts directly into any LLM endpoint.

The critical insight here: Raycast commands are available from anywhere in macOS. Whether you're in Xcode reviewing a function, in Notion drafting a document, in Slack composing a message, or in Terminal debugging a script — a single hotkey invocation brings AI to whatever you're working on, without switching apps.

The Power of Composability

What makes this setup genuinely powerful isn't any single feature — it's composability. You can:

  • Select code in any editor → invoke a Raycast AI command → get an explanation injected into your clipboard
  • Highlight a dense paragraph in Safari → summarize it in plain language without leaving the page
  • Grab an error message from Terminal → run it through a debugging prompt → paste the fix back immediately
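
The third bullet can even be sketched as a single pipeline, assuming the `ollama` CLI and the macOS clipboard tools are installed (the model name is an example):

```shell
#!/bin/sh
# Clipboard -> local model -> clipboard, with no app switching.
# Requires macOS (pbpaste/pbcopy) and the ollama CLI; guarded so the
# sketch exits cleanly where those tools are absent.
if command -v pbpaste >/dev/null 2>&1 && command -v ollama >/dev/null 2>&1; then
  { echo "Explain this error and suggest a fix:"; pbpaste; } \
    | ollama run llama3.2 | pbcopy
  STATUS="done"
else
  STATUS="missing-tools"
fi
echo "$STATUS"
```

Raycast packages exactly this kind of round-trip behind a single hotkey, which is what the rest of this guide sets up.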

This is the difference between AI as a tool you visit and AI as a capability you carry. The sections that follow will walk you through assembling this system from scratch, from installing the necessary Raycast extension to configuring hotkeys that make the entire workflow feel like a native OS feature.

Note: Everything in this guide runs entirely on-device. No API keys, no subscriptions, no telemetry. Your prompts and responses never leave your Mac.

Step 2 Prerequisites: Raycast Setup

Before diving into the Ollama integration, you need a properly configured Raycast environment. Skipping this foundation will cause friction later—so let's get this right from the start.

What You'll Need

| Requirement | Version | Notes |
|---|---|---|
| Raycast | 1.50.0+ | Free plan is sufficient for the community Ollama extension (Raycast's built-in AI requires Pro) |
| macOS | 12 Monterey+ | Ventura or Sonoma strongly recommended |
| Ollama | 0.1.20+ | Must be running as a local service |
| RAM | 8GB minimum | 16GB+ recommended for larger models |

Installing Raycast

If you haven't already installed Raycast, it's a straightforward process. Download the latest stable release directly from raycast.com or install via Homebrew:

Terminal
brew install --cask raycast

Once installed, launch Raycast and complete the initial onboarding. Replace Spotlight immediately—this is non-negotiable for the workflow we're building. Navigate to:

Terminal
System Settings → Keyboard → Keyboard Shortcuts → Spotlight

Uncheck Show Spotlight search to free up ⌘ Space, then assign ⌘ Space inside Raycast's preferences under General → Raycast Hotkey.


Verifying Your Raycast Version

The Ollama extension requires Raycast's extension API to support custom model endpoints. Open the Raycast preferences and confirm your build:

Terminal
Raycast → About Raycast → Build Number

Alternatively, run this quick check from your terminal:

Terminal
defaults read /Applications/Raycast.app/Contents/Info CFBundleShortVersionString

If you're behind on versions, the built-in updater will handle this:

Terminal
Raycast → Check for Updates

Enabling Extensions in Raycast

Raycast's extension store is enabled by default, but confirm it's reachable before proceeding. Open Raycast preferences with ⌘, and verify you can navigate to the Extensions tab without error and that the store search responds.

One thing to check before proceeding:

If your Mac is managed by an organization, extension installation can be restricted by device policy. In that case the Ollama extension installation in the next step will fail, so resolve this with your administrator first.


Confirming Ollama Is Running

Raycast needs a live Ollama instance to communicate with. Before configuring anything inside Raycast, verify Ollama is active and responsive:

Terminal
# Check if Ollama is running
curl http://localhost:11434/api/tags

# Expected response (example)
{
  "models": [
    {
      "name": "llama3:latest",
      "modified_at": "2024-01-15T10:30:00Z",
      "size": 4661211584
    }
  ]
}

If the curl command times out or returns a connection error, start Ollama manually:

Terminal
ollama serve

Pro tip: Add Ollama to your macOS login items so it starts automatically. Navigate to System Settings → General → Login Items and add the Ollama application. This ensures Raycast always has a model backend available the moment you wake your machine.
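
If you prefer a script over Login Items (for example, called from a shell profile), a small watchdog can start the server only when nothing is already answering. This is a sketch that assumes the `ollama` binary is on your PATH:

```shell
#!/bin/sh
# Start Ollama in the background only if nothing answers on its default port.
if curl -s --max-time 2 http://localhost:11434/api/tags >/dev/null 2>&1; then
  STATUS="already-running"
elif command -v ollama >/dev/null 2>&1; then
  # nohup keeps the server alive after the shell exits; logs go to /tmp
  nohup ollama serve >/tmp/ollama.log 2>&1 &
  STATUS="started"
else
  STATUS="ollama-not-installed"
fi
echo "$STATUS"
```

Checking the port first avoids the "address already in use" error you'd get from blindly running `ollama serve` twice.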


Network Permissions

Because Raycast talks to Ollama over localhost, macOS itself typically won't prompt for permission; loopback traffic never leaves your machine. If you run an application firewall such as Little Snitch or LuLu, approve Raycast's connection to 127.0.0.1:11434 when asked. If you suspect a previously denied system permission is interfering, you can reset Raycast's permission prompts:

Terminal
tccutil reset All com.raycast.macos

With these prerequisites locked in, your environment is primed and ready for the extension installation.

Step 3: Installing the Raycast Ollama Extension

With Ollama running locally and Raycast installed, the bridge between your system-wide launcher and your local AI models is a single extension. This section walks you through the precise installation process, ensuring zero ambiguity at every step.

Finding the Extension in the Raycast Store

Raycast maintains a curated extension marketplace accessible directly from the app itself. Here's how to navigate to it:

  1. Open Raycast with your configured hotkey (default: ⌥ Space)
  2. Type "Store" and select Raycast Store
  3. In the search bar, type Ollama
  4. Locate the extension titled "Ollama AI" — authored by the community and vetted by the Raycast team

Alternatively, you can install it directly from the web:

Terminal
https://www.raycast.com/massimiliano_pasquini/raycast-ollama

Click "Install Extension" on the web page, and Raycast will deep-link directly into the installation prompt on your machine.
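
You can also trigger that same deep link from your terminal (macOS only; the URL path is taken from the store page above and could change if the extension is renamed):

```shell
#!/bin/sh
# Raycast deep links use the raycast:// URL scheme; `open` hands the URL
# to Raycast, which then shows the install prompt.
DEEPLINK="raycast://extensions/massimiliano_pasquini/raycast-ollama"
if command -v open >/dev/null 2>&1; then
  open "$DEEPLINK" || echo "Could not open $DEEPLINK (is Raycast installed?)"
else
  echo "Run this on macOS: open $DEEPLINK"
fi
```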


Installing via Raycast CLI (Power User Method)

If you prefer terminal-based workflows, Raycast supports extension management through its CLI toolchain. First, ensure you have the Raycast CLI installed:

Terminal
# Install the Raycast extension toolchain via npm
npm install -g @raycast/api

# The package installs a CLI named `ray`; verify it
ray help

Note: The CLI method is primarily intended for extension development, not end-user installation. For production use, the Store UI method is recommended.


Extension Installation Walkthrough

Once you've clicked Install from either the Store UI or the web portal, you'll see the following permission prompt in Raycast:

| Permission Requested | Reason |
|---|---|
| Network Access | Communicates with Ollama's local HTTP API |
| Clipboard Read/Write | Enables text transformation commands |
| System Services | Allows AI responses to inject into active apps |

Accept all permissions — none of these reach the internet. Every request routes to localhost only, meaning your data never leaves your machine.


Verifying the Installation

After installation, confirm the extension is live:

  1. Open Raycast (⌥ Space)
  2. Type Ollama and you should immediately see a cluster of commands appear:
     • Ollama: Chat
     • Ollama: Ask
     • Ollama: Summarize
     • Ollama: Fix Grammar
  3. Select Ollama: Chat and press Return

If Ollama is running (ollama serve in your terminal), you'll drop directly into an interactive chat session. If you see a connection error, verify the daemon is active:

Terminal
# Check if Ollama is running
curl http://localhost:11434/api/tags

# Expected output (truncated):
# {"models":[{"name":"llama3.2:latest",...}]}

# If not running, start it:
ollama serve

Pulling Your First Model (If Not Done Already)

The extension requires at least one model to be pulled locally before commands execute. A fast, capable default for general use:

Terminal
# Lightweight and fast — ideal for system-wide commands
ollama pull llama3.2

# Higher capability for complex tasks (requires more RAM)
ollama pull mistral

# Verify models are available
ollama list

At this stage, the extension is installed, connected, and ready. The next step moves into precise API configuration — dialing in the endpoint, port, and response behavior to match your exact hardware and workflow demands.

Step 4: Configuring API Endpoints and Ports

With the Raycast Ollama extension installed, the next critical step is ensuring Raycast can actually communicate with your local Ollama instance. This requires a solid understanding of how Ollama exposes its API and how to configure the extension to point at the correct endpoint.

Understanding Ollama's Default Network Configuration

By default, Ollama runs a local REST API server on the following address:

Terminal
http://localhost:11434

This is the single most important value you'll configure. Every AI request Raycast makes — whether it's a text summarization, code explanation, or grammar fix — gets routed through this endpoint as an HTTP request to the Ollama inference engine running on your machine.

You can verify Ollama is running and responsive at any time by hitting the health endpoint directly from your terminal:

Terminal
curl http://localhost:11434
# Expected output: Ollama is running

If this returns an error, Ollama isn't running. Start it with:

Terminal
ollama serve

Pro tip: On macOS, if you installed Ollama via the .dmg GUI application, it runs automatically as a menu bar process. If you installed via Homebrew, you may need to start it manually or configure a launchd service.


Configuring the Extension Endpoint in Raycast

Open Raycast (⌘ Space if you remapped it during the prerequisites; ⌥ Space by default), type Extensions, and navigate to the Ollama extension settings. You'll find the following configurable fields:

| Setting | Default Value | Description |
|---|---|---|
| Ollama API URL | http://localhost:11434 | Base URL for the local Ollama server |
| Request Timeout | 60000 ms | Max wait time before a request fails |
| Default Model | (user-defined) | Model used when no override is specified |

Set the Ollama API URL to http://localhost:11434 unless you've deliberately changed Ollama's default port. If you're running Ollama on a different port — for instance, to avoid conflicts with other local services — you can override it at startup:

Terminal
OLLAMA_HOST=127.0.0.1:11435 ollama serve

In that case, update the Raycast extension URL to match:

Terminal
http://localhost:11435
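
One way to keep the two values in sync is to derive the Raycast URL from the same OLLAMA_HOST setting you start the server with. A sketch using Ollama's defaults:

```shell
#!/bin/sh
# The extension URL is just "http://" plus the host:port Ollama binds to.
# The default below mirrors Ollama's built-in bind address.
OLLAMA_HOST="${OLLAMA_HOST:-127.0.0.1:11434}"
RAYCAST_URL="http://${OLLAMA_HOST}"
echo "Start the server with: OLLAMA_HOST=${OLLAMA_HOST} ollama serve"
echo "Set the extension's Ollama API URL to: ${RAYCAST_URL}"
```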

Handling Remote and Network Ollama Instances

One of the underappreciated capabilities of this setup is that Raycast doesn't require Ollama to run locally. If you're running Ollama on a remote machine, a NAS, or a dedicated GPU server on your local network, you can point Raycast at that machine's IP address instead:

Terminal
http://192.168.1.50:11434

Important security consideration: By default, Ollama only binds to localhost. To expose it on a network interface, you must explicitly set:

Terminal
OLLAMA_HOST=0.0.0.0:11434 ollama serve

Never expose this port to the public internet without authentication. Ollama has no built-in auth layer — treat it like an open database port and firewall it accordingly.


Confirming Model Availability

Before proceeding, verify that your target models are pulled and available via the API:

Terminal
curl http://localhost:11434/api/tags | jq '.models[].name'

This returns a list of all locally available models. The Raycast extension will populate its model selector from this list, so any model you intend to use in a custom command must be pulled first using ollama pull <model-name>.
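
If a custom command depends on a particular model, a small preflight check saves a confusing failure later. A sketch (the model name is an example):

```shell
#!/bin/sh
# Fail fast if the model a command expects has not been pulled yet.
REQUIRED_MODEL="llama3.2"
if curl -s --max-time 2 http://localhost:11434/api/tags 2>/dev/null \
     | grep -q "\"name\":\"${REQUIRED_MODEL}"; then
  echo "ok: ${REQUIRED_MODEL} is available"
else
  echo "missing: run 'ollama pull ${REQUIRED_MODEL}' first"
fi
```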

With your endpoint configured and models confirmed, Raycast now has a clear, low-latency channel to your local AI — zero cloud dependencies, zero API costs, and sub-second response times on capable hardware.

Step 5 Creating Custom AI Commands (Summarize, Rewrite, Fix Grammar)

With your Ollama extension configured and communicating with your local models, the real power comes from building purpose-built AI commands that fire instantly from anywhere on your Mac. Raycast's scripting system lets you craft commands tailored to your exact workflow — no copy-pasting into a chat interface, no context switching.

Understanding Raycast AI Extensions vs. Custom Scripts

Raycast offers two pathways for custom AI commands:

  1. Extension-based commands — Built through the Raycast Ollama extension's prompt configuration panel
  2. Script commands — Shell or JavaScript scripts that call the Ollama API directly

For most users, the extension-based approach covers 90% of use cases. For advanced pipelines, script commands give you full control.


Configuring Prompts in the Ollama Extension

Open Raycast, search for "Ollama", and navigate to Custom Commands. Each command requires:

| Field | Description | Example |
|---|---|---|
| Name | Command identifier in Raycast | Summarize Selection |
| Model | Which Ollama model to use | llama3.2 |
| System Prompt | Persistent instruction to the model | You are a concise summarizer. |
| User Prompt Template | Dynamic prompt with {selection} variable | Summarize this in 3 bullet points: {selection} |
| Output | Where result appears | Clipboard / HUD / Detail View |
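
Under the hood, the extension substitutes the selected text into the {selection} placeholder before the prompt is sent to Ollama. A simplified illustration of that expansion (the real extension is TypeScript; this shows only the string logic, with sample text):

```shell
#!/bin/sh
# Expand the {selection} placeholder the way the extension does before
# sending the prompt (simplified: assumes no special sed characters).
TEMPLATE='Summarize this in 3 bullet points: {selection}'
SELECTION='Quarterly revenue rose 12 percent on strong services growth.'
PROMPT=$(printf '%s' "$TEMPLATE" | sed "s/{selection}/${SELECTION}/")
echo "$PROMPT"
```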

The Three Essential Commands

1. Summarize

This command distills any selected text — emails, articles, documentation — into digestible bullet points.

System Prompt:

Terminal
You are a precise summarization assistant. Extract only the essential information. Never add opinions or information not present in the source text. Respond only with the summary, no preamble.

User Prompt:

Terminal
Summarize the following text into 3-5 concise bullet points:

{selection}

Set Output to Detail View so longer summaries remain readable without cluttering your clipboard.


2. Rewrite

The rewrite command transforms awkward prose into polished, professional copy. Ideal for Slack messages, documentation, and emails.

System Prompt:

Terminal
You are an expert editor and technical writer. Rewrite the provided text to be clear, concise, and professional. Preserve the original meaning and tone intent. Return only the rewritten text with no explanation.

User Prompt:

Terminal
Rewrite the following to be clearer and more professional:

{selection}

Set Output to Clipboard so you can paste the improved version instantly with ⌘V.


3. Fix Grammar

Surgical grammar correction without altering your voice — critical for developers who write documentation or non-native English speakers.

System Prompt:

Terminal
You are a grammar correction tool. Fix all grammatical errors, punctuation mistakes, and spelling issues in the provided text. Do not change the writing style, tone, or word choices unless grammatically necessary. Return only the corrected text.

User Prompt:

Terminal
Fix the grammar and punctuation in the following text:

{selection}

Advanced: Script Command for Batch Processing

For power users needing programmatic control, call Ollama's REST API directly via a Raycast Script Command:

Terminal
#!/bin/bash
# Required parameters:
# @raycast.schemaVersion 1
# @raycast.title Fix Grammar (Script)
# @raycast.mode silent

# Reads the clipboard (copy your text first), not the live selection
SELECTION=$(pbpaste)

# Build the JSON payload with python3 so quotes and newlines in the
# clipboard text are escaped correctly instead of breaking the request
PAYLOAD=$(python3 -c 'import json, sys; print(json.dumps({
    "model": "llama3.2",
    "prompt": "Fix grammar only, return corrected text: " + sys.argv[1],
    "stream": False,
}))' "$SELECTION")

RESPONSE=$(curl -s http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" | python3 -c "import sys, json; print(json.load(sys.stdin)['response'])")

echo "$RESPONSE" | pbcopy

Save this as fix-grammar.sh, make it executable with chmod +x fix-grammar.sh, and place it in a folder you've registered as a script directory in Raycast (Extensions → Add Script Directory). The ~/.config/raycast/scripts path is a common convention, but any registered folder works.


Model Selection Strategy

Not every task needs your most powerful model. Match model size to command complexity:

| Command | Recommended Model | Why |
|---|---|---|
| Fix Grammar | phi3 or gemma2:2b | Fast, lightweight, simple task |
| Rewrite | llama3.2 | Balanced quality and speed |
| Summarize | llama3.2 or mistral | Requires comprehension depth |

Pro tip: Smaller models respond in under a second on Apple Silicon — for grammar fixes, that speed difference is transformative when you're using the command dozens of times per day.
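
To pick models empirically rather than by intuition, time a trivial generation per candidate. A rough sketch, assuming the listed example models are already pulled:

```shell
#!/bin/sh
# Crude per-model latency check: seconds to answer a one-word prompt.
# Only runs if the ollama CLI is installed; model names are examples.
for MODEL in gemma2:2b llama3.2; do
  if command -v ollama >/dev/null 2>&1; then
    START=$(date +%s)
    echo "Reply with one word: ok" | ollama run "$MODEL" >/dev/null 2>&1
    END=$(date +%s)
    echo "${MODEL}: $((END - START))s"
  else
    echo "${MODEL}: skipped (ollama CLI not found)"
  fi
done
```

Second-level resolution is coarse, but it's enough to separate a sub-second small model from a multi-second large one.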

Step 6 Optimizing Hotkeys for Lightning-Fast Access

The difference between a good AI workflow and a great one comes down to friction. Every extra click, every context switch, every moment spent reaching for a menu is cognitive overhead that breaks your flow state. Hotkeys eliminate that friction entirely — transforming AI assistance from a deliberate tool into an invisible extension of your thinking.

The Raycast Hotkey Architecture

Raycast supports four distinct layers of keyboard shortcut assignment:

| Layer | Scope | Best Used For |
|---|---|---|
| Global Hotkey | System-wide, works in any app | Your single most-used AI command |
| Extension Hotkey | Triggers specific extension commands | Frequently used but secondary commands |
| Alias | Opens Raycast + pre-fills command | Quick access without memorizing full names |
| Quicklink | Raycast shortcut to a specific prompt | Templated, repeatable AI tasks |

Assigning Hotkeys to Your AI Commands

Navigate to Raycast Preferences → Extensions → Ollama AI and locate each custom command you built in the previous section. Click the hotkey field next to any command and press your desired key combination.

Recommended hotkey scheme for AI commands:

Terminal
⌘ + Space        → Open Raycast (main launcher, as remapped earlier)
⌃ + ⌥ + S        → AI Summarize (selected text)
⌃ + ⌥ + R        → AI Rewrite (selected text)
⌃ + ⌥ + G        → Fix Grammar (selected text)
⌃ + ⌥ + O        → Open Ollama Chat (freeform)

Pro tip: Use Control (⌃) + Option (⌥) as your modifier prefix for all AI commands. This combination is rarely claimed by macOS system shortcuts or other applications, giving you a clean, conflict-free namespace.

Avoiding Hotkey Conflicts

Before finalizing any shortcut, verify it isn't already claimed:

Terminal
# Check system-level shortcuts via defaults
defaults read com.apple.symbolichotkeys AppleSymbolicHotKeys

You can also navigate to System Settings → Keyboard → Keyboard Shortcuts and scan through each category. Common conflict zones include:

  • Mission Control owns most ⌃ + Arrow combinations
  • Spotlight defaults to ⌘ + Space
  • Screenshot tools occupy several ⌘ + Shift + number slots
  • Third-party apps like Alfred, 1Password, or Magnet may claim ⌥-based shortcuts

Supercharging Text Selection Workflows

The real power of system-wide AI hotkeys emerges when combined with text selection triggers. Configure your Ollama commands to operate on the currently selected text in any application:

  1. Select text in Safari, Notion, Xcode, Slack — anywhere
  2. Press ⌃ + ⌥ + S
  3. Ollama processes the selection and returns a summary inline, without switching apps

This works because Raycast reads the system clipboard and active selection context. To ensure reliability, enable "Read selected text from active app" in your extension's configuration panel.

Creating a Muscle Memory Map

Write your hotkeys down and stick to them for 30 days. Consistency is what converts conscious shortcuts into subconscious muscle memory. Consider this progression:

Terminal
Week 1:  Use hotkeys consciously, referring to your cheat sheet
Week 2:  Fingers begin finding shortcuts without looking
Week 3:  Hotkeys feel as natural as ⌘+C / ⌘+V
Week 4:  You forget AI assistance is a "tool" — it's just thinking

The goal is zero-latency AI access — where the gap between intention and execution disappears entirely. With the right hotkey architecture in place, Ollama stops being something you open and becomes something you reach for.