macOS Sonoma • Intermediate • 8 min read
by Alex Rivera • May 14, 2024
Step 1 Unleashing AI System-Wide on macOS
The modern macOS power user operates in a fragmented AI landscape — browser tabs pinned to ChatGPT, separate apps for writing assistance, and constant context-switching that destroys flow state. What if your AI wasn't siloed in a browser window but woven directly into the fabric of your operating system, available the instant you need it, responding in milliseconds, with zero data leaving your machine?
That's exactly what happens when you connect Ollama to Raycast. The combination is genuinely transformative for how you interact with your Mac.
Why This Architecture Is Different
Most AI integrations follow the same pattern: open an app, type a prompt, wait for a cloud response, copy the output. This workflow introduces cognitive overhead at every step. Raycast + Ollama breaks this pattern entirely by positioning AI as a first-class system primitive.
| Approach | Latency | Privacy | Offline Support | Context Awareness |
| --- | --- | --- | --- | --- |
| Browser-based ChatGPT | High (network round-trip) | ❌ Data sent to OpenAI | ❌ Requires internet | Limited |
| Native AI apps | Medium | Varies | Sometimes | Minimal |
| Raycast + Ollama | Ultra-low (localhost) | ✅ 100% local | ✅ Fully offline | Deep (system-wide) |
What Ollama Brings to the Equation
Ollama is a lightweight inference server that runs large language models locally on Apple Silicon. It exposes a clean REST API on http://localhost:11434, making it trivially easy for other tools to consume. Models like Llama 3, Mistral, Phi-3, and Gemma 2 run with impressive speed on M-series chips, often matching or exceeding cloud model response times for typical tasks.
Terminal
# Verify Ollama is running and listening
curl http://localhost:11434/api/tags
# Expected output: a JSON list of your locally installed models
Once Ollama is running as a background service, it becomes a persistent AI backbone that any properly configured application can query — including Raycast.
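Because that backbone is plain HTTP plus JSON, any script can talk to it directly. Here's a minimal sketch of composing a one-shot generation request (build_payload is an illustrative helper, and llama3 stands in for whichever model you've pulled):

```shell
# Illustrative helper: build a /api/generate request body with python3,
# so quotes and newlines in the prompt can't break the JSON
build_payload() {
  python3 -c 'import sys, json; print(json.dumps({"model": sys.argv[1], "prompt": sys.argv[2], "stream": False}))' "$1" "$2"
}

# With Ollama running, pipe the payload into the endpoint:
# build_payload llama3 "Explain mutexes in one sentence" \
#   | curl -s -d @- http://localhost:11434/api/generate
```

Building the body with json.dumps rather than string interpolation is deliberate: arbitrary prompt text can't corrupt the request.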
What Raycast Brings to the Equation
Raycast is a system-wide command launcher that has effectively replaced Spotlight for hundreds of thousands of Mac users. Its extension ecosystem allows it to integrate with virtually anything, and its AI command framework enables you to pipe selected text, clipboard contents, or free-form prompts directly into any LLM endpoint.
The critical insight here: Raycast commands are available from anywhere in macOS. Whether you're in Xcode reviewing a function, in Notion drafting a document, in Slack composing a message, or in Terminal debugging a script — a single hotkey invocation brings AI to whatever you're working on, without switching apps.
The Power of Composability
What makes this setup genuinely powerful isn't any single feature — it's composability. You can:
- Select code in any editor → invoke a Raycast AI command → get an explanation injected into your clipboard
- Highlight a dense paragraph in Safari → summarize it in plain language without leaving the page
- Grab an error message from Terminal → run it through a debugging prompt → paste the fix back immediately
This is the difference between AI as a tool you visit and AI as a capability you carry. The sections that follow will walk you through assembling this system from scratch, from installing the necessary Raycast extension to configuring hotkeys that make the entire workflow feel like a native OS feature.
Note: Everything in this guide runs entirely on-device. No API keys, no subscriptions, no telemetry. Your prompts and responses never leave your Mac.
Step 2 Prerequisites: Raycast Setup
Before diving into the Ollama integration, you need a properly configured Raycast environment. Skipping this foundation will cause friction later—so let's get this right from the start.
What You'll Need
| Requirement | Version | Notes |
| --- | --- | --- |
| Raycast | 1.50.0+ | Free plan is sufficient; Raycast Pro is only needed for Raycast's built-in cloud AI |
| macOS | 12 Monterey+ | Ventura or Sonoma strongly recommended |
| Ollama | 0.1.20+ | Must be running as a local service |
| RAM | 8GB minimum | 16GB+ recommended for larger models |
Installing Raycast
If you haven't already installed Raycast, it's a straightforward process. Download the latest stable release directly from raycast.com or install via Homebrew:
Terminal
brew install --cask raycast
Once installed, launch Raycast and complete the initial onboarding. Replace Spotlight immediately—this is non-negotiable for the workflow we're building. Navigate to System Settings → Keyboard → Keyboard Shortcuts → Spotlight, uncheck Show Spotlight search from ⌘Space, then assign ⌘Space inside Raycast's preferences under General → Raycast Hotkey.
Verifying Your Raycast Version
The Ollama extension requires Raycast's extension API to support custom model endpoints. Open the Raycast preferences and confirm your build:
Raycast → About Raycast → Build Number
Alternatively, run this quick check from your terminal:
Terminal
defaults read com.raycast.macos CFBundleShortVersionString
If you're behind on versions, the built-in updater will handle this:
Raycast → Check for Updates
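If you'd rather script the minimum-version check, a small sketch using sort -V works (version_ge is an illustrative helper, and this assumes a sort that supports version ordering):

```shell
# Illustrative helper: succeeds when version $1 >= version $2
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | tail -n1)" = "$1" ]
}

# Compare the installed build against the extension's minimum
RAYCAST_VERSION=$(defaults read com.raycast.macos CFBundleShortVersionString 2>/dev/null || echo "0")
version_ge "$RAYCAST_VERSION" "1.50.0" && echo "Raycast is new enough" || echo "Update Raycast first"
```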
Enabling Extensions in Raycast
By default, Raycast's extension store is accessible, but you want to ensure the Extensions tab is unlocked and the store is reachable. Open Raycast preferences with ⌘, and confirm you can navigate to the Extensions tab without error.
Critical setting to enable before proceeding:
Navigate to Raycast Preferences → Extensions and confirm that installing extensions from the Store is permitted (the exact toggle name varies by Raycast version). Without this, the Ollama extension installation in the next step will fail.
Confirming Ollama Is Running
Raycast needs a live Ollama instance to communicate with. Before configuring anything inside Raycast, verify Ollama is active and responsive:
Terminal
# Check if Ollama is running
curl http://localhost:11434/api/tags
# Expected response (example)
{
"models": [
{
"name": "llama3:latest",
"modified_at": "2024-01-15T10:30:00Z",
"size": 4661211584
}
]
}
If the curl command times out or returns a connection error, start Ollama manually by running ollama serve in a terminal (or simply launch the Ollama app).
Pro tip: Add Ollama to your macOS login items so it starts automatically. Navigate to System Settings → General → Login Items and add the Ollama application. This ensures Raycast always has a model backend available the moment you wake your machine.
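One small quality-of-life note: the size field in the /api/tags response is reported in raw bytes. A quick sketch to make it readable (bytes_to_gb is an illustrative helper):

```shell
# Illustrative helper: convert a byte count to gigabytes (1 GB = 1073741824 bytes)
bytes_to_gb() {
  awk -v b="$1" 'BEGIN { printf "%.1f GB\n", b / 1073741824 }'
}

bytes_to_gb 4661211584   # the llama3 size from the example above → 4.3 GB
```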
Network Permissions
macOS may prompt for network access permissions the first time Raycast attempts to reach your local Ollama instance. Click "Allow": this is localhost communication, not external network access. If you accidentally denied the prompt, reset Raycast's permissions via:
Terminal
tccutil reset All com.raycast.macos
With these prerequisites locked in, your environment is primed and ready for the extension installation.
Step 3 Installing the Raycast Ollama Extension
With Ollama running locally and Raycast installed, the bridge between your system-wide launcher and your local AI models is a single extension. This section walks you through the precise installation process, ensuring zero ambiguity at every step.
Finding the Extension in the Raycast Store
Raycast maintains a curated extension marketplace accessible directly from the app itself. Here's how to navigate to it:
- Open Raycast with your configured hotkey (default: ⌥ Space)
- Type "Store" and select Raycast Store
- In the search bar, type Ollama
- Locate the extension titled "Ollama AI" — authored by the community and vetted by the Raycast team
Alternatively, you can install it directly from the web:
https://www.raycast.com/massimiliano_pasquini/raycast-ollama
Click "Install Extension" on the web page, and Raycast will deep-link directly into the installation prompt on your machine.
Installing via Raycast CLI (Power User Method)
If you prefer terminal-based workflows, Raycast supports extension management through its CLI toolchain. First, ensure you have the Raycast CLI installed:
Terminal
# Install the Raycast developer tooling via npm (provides the ray CLI)
npm install -g @raycast/api
# Verify installation
ray --version
Note: The CLI method is primarily intended for extension development, not end-user installation. For production use, the Store UI method is recommended.
Extension Installation Walkthrough
Once you've clicked Install from either the Store UI or the web portal, you'll see the following permission prompt in Raycast:
| Permission Requested | Reason |
| --- | --- |
| Network Access | Communicates with Ollama's local HTTP API |
| Clipboard Read/Write | Enables text transformation commands |
| System Services | Allows AI responses to inject into active apps |
Accept all permissions — none of these reach the internet. Every request routes to localhost only, meaning your data never leaves your machine.
Verifying the Installation
After installation, confirm the extension is live:
- Open Raycast (⌥ Space)
- Type Ollama — you should immediately see a cluster of commands appear:
  - Ollama: Chat
  - Ollama: Ask
  - Ollama: Summarize
  - Ollama: Fix Grammar
- Select Ollama: Chat and hit ↵
If Ollama is running (ollama serve in your terminal), you'll drop directly into an interactive chat session. If you see a connection error, verify the daemon is active:
Terminal
# Check if Ollama is running
curl http://localhost:11434/api/tags
# Expected output (truncated):
# {"models":[{"name":"llama3.2:latest",...}]}
# If not running, start it:
ollama serve
Pulling Your First Model (If Not Done Already)
The extension requires at least one model to be pulled locally before commands execute. A fast, capable default for general use:
Terminal
# Lightweight and fast — ideal for system-wide commands
ollama pull llama3.2
# Higher capability for complex tasks (requires more RAM)
ollama pull mistral
# Verify models are available
ollama list
At this stage, the extension is installed, connected, and ready. The next step moves into precise API configuration — dialing in the endpoint, port, and response behavior to match your exact hardware and workflow demands.
Step 4 Configuring API Endpoints and Ports
With the Raycast Ollama extension installed, the next critical step is ensuring Raycast can actually communicate with your local Ollama instance. This requires a solid understanding of how Ollama exposes its API and how to configure the extension to point at the correct endpoint.
Understanding Ollama's Default Network Configuration
By default, Ollama runs a local REST API server at http://localhost:11434.
This is the single most important value you'll configure. Every AI request Raycast makes — whether it's a text summarization, code explanation, or grammar fix — gets routed through this endpoint as an HTTP request to the Ollama inference engine running on your machine.
You can verify Ollama is running and responsive at any time by hitting the health endpoint directly from your terminal:
Terminal
curl http://localhost:11434
# Expected output: Ollama is running
If this returns an error, Ollama isn't running. Start it with ollama serve (or launch the Ollama app).
Pro tip: On macOS, if you installed Ollama via the .dmg GUI application, it runs automatically as a menu bar process. If you installed via Homebrew, you may need to start it manually or configure a launchd service.
Configuring the Extension Endpoint in Raycast
Open Raycast with your hotkey (⌘ Space if you remapped it earlier, ⌥ Space otherwise), type Extensions, and navigate to the Ollama extension settings. You'll find the following configurable fields:
| Setting | Default Value | Description |
| --- | --- | --- |
| Ollama API URL | http://localhost:11434 | Base URL for the local Ollama server |
| Request Timeout | 60000 ms | Max wait time before a request fails |
| Default Model | (user-defined) | Model used when no override is specified |
Set the Ollama API URL to http://localhost:11434 unless you've deliberately changed Ollama's default port. If you're running Ollama on a different port — for instance, to avoid conflicts with other local services — you can override it at startup:
Terminal
OLLAMA_HOST=127.0.0.1:11435 ollama serve
In that case, update the Raycast extension URL to match, e.g. http://localhost:11435.
Handling Remote and Network Ollama Instances
One of the underappreciated capabilities of this setup is that Raycast doesn't require Ollama to run locally. If you're running Ollama on a remote machine, a NAS, or a dedicated GPU server on your local network, you can point Raycast at that machine's IP address instead:
http://192.168.1.50:11434
Important security consideration: By default, Ollama only binds to localhost. To expose it on a network interface, you must explicitly set:
Terminal
OLLAMA_HOST=0.0.0.0:11434 ollama serve
Never expose this port to the public internet without authentication. Ollama has no built-in auth layer — treat it like an open database port and firewall it accordingly.
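Before pointing Raycast at a remote box, a quick reachability probe saves debugging time. A sketch (ollama_reachable is an illustrative helper; the IP below is the example address from above):

```shell
# Illustrative helper: succeed only if an Ollama endpoint answers within 2 seconds
ollama_reachable() {
  curl -sf --max-time 2 "$1/api/tags" >/dev/null 2>&1
}

ollama_reachable http://192.168.1.50:11434 && echo "reachable" || echo "no Ollama at that address"
```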
Confirming Model Availability
Before proceeding, verify that your target models are pulled and available via the API:
Terminal
curl http://localhost:11434/api/tags | jq '.models[].name'
This returns a list of all locally available models. The Raycast extension will populate its model selector from this list, so any model you intend to use in a custom command must be pulled first using ollama pull <model-name>.
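If jq isn't installed, python3 (which ships with the Xcode Command Line Tools) does the same job. A sketch, with list_models as an illustrative wrapper:

```shell
# Illustrative helper: extract model names from an /api/tags JSON response
list_models() {
  python3 -c 'import sys, json
for m in json.load(sys.stdin).get("models", []):
    print(m["name"])'
}

# Usage, with Ollama running:
# curl -s http://localhost:11434/api/tags | list_models
```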
With your endpoint configured and models confirmed, Raycast now has a clear, low-latency channel to your local AI — zero cloud dependencies, zero API costs, and sub-second response times on capable hardware.
Step 5 Creating Custom AI Commands (Summarize, Rewrite, Fix Grammar)
With your Ollama extension configured and communicating with your local models, the real power comes from building purpose-built AI commands that fire instantly from anywhere on your Mac. Raycast's scripting system lets you craft commands tailored to your exact workflow — no copy-pasting into a chat interface, no context switching.
Understanding Raycast AI Extensions vs. Custom Scripts
Raycast offers two pathways for custom AI commands:
- Extension-based commands — Built through the Raycast Ollama extension's prompt configuration panel
- Script commands — Shell or JavaScript scripts that call the Ollama API directly
For most users, the extension-based approach covers 90% of use cases. For advanced pipelines, script commands give you full control.
Configuring Prompts in the Ollama Extension
Open Raycast, search for "Ollama", and navigate to Custom Commands. Each command requires:
| Field | Description | Example |
| --- | --- | --- |
| Name | Command identifier in Raycast | Summarize Selection |
| Model | Which Ollama model to use | llama3.2 |
| System Prompt | Persistent instruction to the model | You are a concise summarizer. |
| User Prompt Template | Dynamic prompt with {selection} variable | Summarize this in 3 bullet points: {selection} |
| Output | Where result appears | Clipboard / HUD / Detail View |
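The {selection} placeholder is simple template substitution: the selected text is swapped into the template before the prompt reaches the model. The mechanics can be sketched in bash (render_prompt is illustrative, not the extension's actual code):

```shell
# Illustrative sketch of {selection} substitution (bash pattern replacement)
render_prompt() {
  local template="$1" selection="$2"
  printf '%s\n' "${template//\{selection\}/$selection}"
}

render_prompt "Summarize this in 3 bullet points: {selection}" "Ollama runs models locally."
```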
The Three Essential Commands
1. Summarize
This command distills any selected text — emails, articles, documentation — into digestible bullet points.
System Prompt:
Terminal
You are a precise summarization assistant. Extract only the essential information. Never add opinions or information not present in the source text. Respond only with the summary, no preamble.
User Prompt:
Terminal
Summarize the following text into 3-5 concise bullet points:
{selection}
Set Output to Detail View so longer summaries remain readable without cluttering your clipboard.
2. Rewrite
The rewrite command transforms awkward prose into polished, professional copy. Ideal for Slack messages, documentation, and emails.
System Prompt:
Terminal
You are an expert editor and technical writer. Rewrite the provided text to be clear, concise, and professional. Preserve the original meaning and tone intent. Return only the rewritten text with no explanation.
User Prompt:
Terminal
Rewrite the following to be clearer and more professional:
{selection}
Set Output to Clipboard so you can paste the improved version instantly with ⌘V.
3. Fix Grammar
Surgical grammar correction without altering your voice — critical for developers who write documentation and for non-native English speakers.
System Prompt:
Terminal
You are a grammar correction tool. Fix all grammatical errors, punctuation mistakes, and spelling issues in the provided text. Do not change the writing style, tone, or word choices unless grammatically necessary. Return only the corrected text.
User Prompt:
Terminal
Fix the grammar and punctuation in the following text:
{selection}
Advanced: Script Command for Batch Processing
For power users needing programmatic control, call Ollama's REST API directly via a Raycast Script Command:
Terminal
#!/bin/bash
# Required parameters:
# @raycast.schemaVersion 1
# @raycast.title Fix Grammar (Script)
# @raycast.mode silent

SELECTION=$(pbpaste)

# Build the JSON payload with python3 so quotes and newlines in the
# selection can't break the request body
PAYLOAD=$(python3 -c 'import sys, json; print(json.dumps({"model": "llama3.2", "prompt": "Fix grammar only, return corrected text: " + sys.argv[1], "stream": False}))' "$SELECTION")

RESPONSE=$(curl -s http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" \
  | python3 -c "import sys, json; print(json.load(sys.stdin)['response'])")

echo "$RESPONSE" | pbcopy
Save this as fix-grammar.sh, make it executable with chmod +x fix-grammar.sh, and place it in a folder you have registered as a script directory in Raycast's preferences (Raycast loads script commands from directories you add there, not from a fixed path).
Model Selection Strategy
Not every task needs your most powerful model. Match model size to command complexity:
| Command | Recommended Model | Why |
| --- | --- | --- |
| Fix Grammar | phi3 or gemma2:2b | Fast, lightweight, simple task |
| Rewrite | llama3.2 | Balanced quality and speed |
| Summarize | llama3.2 or mistral | Requires comprehension depth |
Pro tip: Smaller models respond in under a second on Apple Silicon — for grammar fixes, that speed difference is transformative when you're using the command dozens of times per day.
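If you build script commands on top of this, the table's mapping is easy to encode. A sketch (model_for_command is illustrative; the model names follow the recommendations above):

```shell
# Illustrative sketch: pick a model per task, per the table above
model_for_command() {
  case "$1" in
    fix-grammar) echo "phi3" ;;
    rewrite)     echo "llama3.2" ;;
    summarize)   echo "mistral" ;;
    *)           echo "llama3.2" ;;   # sensible fallback
  esac
}

model_for_command fix-grammar   # → phi3
```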
Step 6 Optimizing Hotkeys for Lightning-Fast Access
The difference between a good AI workflow and a great one comes down to friction. Every extra click, every context switch, every moment spent reaching for a menu is cognitive overhead that breaks your flow state. Hotkeys eliminate that friction entirely — transforming AI assistance from a deliberate tool into an invisible extension of your thinking.
The Raycast Hotkey Architecture
Raycast supports several distinct layers of keyboard shortcut assignment:
| Layer | Scope | Best Used For |
| --- | --- | --- |
| Global Hotkey | System-wide, works in any app | Your single most-used AI command |
| Extension Hotkey | Triggers specific extension commands | Frequently used but secondary commands |
| Alias | Opens Raycast + pre-fills command | Quick access without memorizing full names |
| Quicklink | Raycast shortcut to a specific prompt | Templated, repeatable AI tasks |
Assigning Hotkeys to Your AI Commands
Navigate to Raycast Preferences → Extensions → Ollama AI and locate each custom command you built in the previous section. Click the hotkey field next to any command and press your desired key combination.
Recommended hotkey scheme for AI commands:
Terminal
⌘ + Space → Open Raycast (main launcher)
⌃ + ⌥ + S → AI Summarize (selected text)
⌃ + ⌥ + R → AI Rewrite (selected text)
⌃ + ⌥ + G → Fix Grammar (selected text)
⌃ + ⌥ + O → Open Ollama Chat (freeform)
Pro tip: Use Control (⌃) + Option (⌥) as your modifier prefix for all AI commands. This combination is rarely claimed by macOS system shortcuts or other applications, giving you a clean, conflict-free namespace.
Avoiding Hotkey Conflicts
Before finalizing any shortcut, verify it isn't already claimed:
Terminal
# Check system-level shortcuts via defaults
defaults read com.apple.symbolichotkeys AppleSymbolicHotKeys
You can also navigate to System Settings → Keyboard → Keyboard Shortcuts and scan through each category. Common conflict zones include:
- Mission Control owns most ⌃ + Arrow combinations
- Spotlight defaults to ⌘ + Space
- Screenshot tools occupy several ⌘ + Shift + number slots
- Third-party apps like Alfred, 1Password, or Magnet may claim ⌥-based shortcuts
Supercharging Text Selection Workflows
The real power of system-wide AI hotkeys emerges when combined with text selection triggers. Configure your Ollama commands to operate on the currently selected text in any application:
- Select text in Safari, Notion, Xcode, Slack — anywhere
- Press
⌃ + ⌥ + S
- Ollama processes the selection and returns a summary inline, without switching apps
This works because Raycast reads the system clipboard and active selection context. To ensure reliability, enable "Read selected text from active app" in your extension's configuration panel.
Creating a Muscle Memory Map
Write your hotkeys down and stick to them for 30 days. Consistency is what converts conscious shortcuts into subconscious muscle memory. Consider this progression:
Terminal
Week 1: Use hotkeys consciously, referring to your cheat sheet
Week 2: Fingers begin finding shortcuts without looking
Week 3: Hotkeys feel as natural as ⌘+C / ⌘+V
Week 4: You forget AI assistance is a "tool" — it's just thinking
The goal is zero-latency AI access — where the gap between intention and execution disappears entirely. With the right hotkey architecture in place, Ollama stops being something you open and becomes something you reach for.