API Integration Rules
Intermediate
Standards for integrating Ollama's REST API into applications — health checks, error handling, timeout configuration, streaming patterns, and request parameter validation.
File Patterns
`**/*.ts`, `**/*.js`, `**/*.py`
This rule applies to files matching the patterns above.
Rule Content
# API Integration Rules
## Rule
All Ollama API integrations MUST implement health checks, proper error handling, configurable timeouts, and streaming for interactive use cases.
## Health Check (Required)
```typescript
// ALWAYS check server health before first request
async function checkOllamaHealth(): Promise<boolean> {
  try {
    const response = await fetch('http://localhost:11434/api/version', {
      signal: AbortSignal.timeout(5000),
    });
    return response.ok;
  } catch {
    return false;
  }
}
```
## Error Handling (Required)
```typescript
// Good — comprehensive error handling
// Minimal error type so failures are distinguishable from generic errors
class OllamaError extends Error {}

async function ollamaChat(model: string, messages: Message[]) {
  const healthy = await checkOllamaHealth();
  if (!healthy) {
    throw new OllamaError('Ollama server not running. Start with: ollama serve');
  }
  try {
    const response = await fetch('http://localhost:11434/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      signal: AbortSignal.timeout(60000), // 60s for model loading
      body: JSON.stringify({ model, messages, stream: false }),
    });
    if (!response.ok) {
      const error = await response.json();
      throw new OllamaError(`Ollama error: ${error.error}`);
    }
    return await response.json();
  } catch (error) {
    if (error instanceof DOMException && error.name === 'TimeoutError') {
      throw new OllamaError('Request timed out — model may be loading');
    }
    throw error;
  }
}

// Bad — no error handling
async function ollamaChatBad(prompt: string) {
  const res = await fetch('http://localhost:11434/api/chat', {
    method: 'POST',
    body: JSON.stringify({ model: 'llama3.1', messages: [{ role: 'user', content: prompt }] }),
  });
  return res.json(); // No error handling, no timeout, no health check
}
```
## Timeout Configuration
| Operation | Recommended Timeout |
|-----------|-------------------|
| Health check | 5s |
| First request (model loading) | 60s |
| Subsequent requests | 30s |
| Streaming (per chunk) | 10s |
| Embedding generation | 15s |
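The table above can be captured in a small helper so callers never hardcode magic numbers. This is a sketch; the operation names and the `signalFor` helper are illustrative, not part of any Ollama API.

```typescript
// Timeouts in milliseconds, matching the recommendation table.
// Operation names here are illustrative conventions, not an API.
const TIMEOUTS = {
  healthCheck: 5_000,
  firstRequest: 60_000, // allows for model loading
  request: 30_000,
  streamChunk: 10_000,
  embedding: 15_000,
} as const;

type Operation = keyof typeof TIMEOUTS;

// Returns an AbortSignal preconfigured for the given operation,
// ready to pass as fetch's `signal` option.
function signalFor(op: Operation): AbortSignal {
  return AbortSignal.timeout(TIMEOUTS[op]);
}
```

A first request would then use `signalFor('firstRequest')` and subsequent requests `signalFor('request')`, keeping the policy in one place.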
## Streaming Rules
- ALWAYS use streaming (stream: true) for interactive UIs
- Parse each line as independent JSON
- Handle partial JSON chunks gracefully
- Implement a cancel mechanism for long-running generations
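The rules above can be sketched as a buffered line parser plus a cancellable reader. The helper names (`createLineParser`, `streamChat`, `onToken`) are illustrative; the assumption is that the server emits newline-delimited JSON, with each line an independent object.

```typescript
// Splits incoming text into complete JSON lines, buffering any
// partial trailing line until the next chunk arrives.
function createLineParser() {
  let buffer = '';
  return {
    push(chunk: string): unknown[] {
      buffer += chunk;
      const lines = buffer.split('\n');
      buffer = lines.pop() ?? ''; // keep the incomplete trailing line
      return lines.filter((l) => l.trim() !== '').map((l) => JSON.parse(l));
    },
  };
}

// Streaming chat sketch; cancel anytime via controller.abort().
async function streamChat(
  model: string,
  prompt: string,
  onToken: (token: string) => void,
  controller: AbortController,
) {
  const response = await fetch('http://localhost:11434/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    signal: controller.signal,
    body: JSON.stringify({
      model,
      messages: [{ role: 'user', content: prompt }],
      stream: true,
    }),
  });
  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  const parser = createLineParser();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const msg of parser.push(decoder.decode(value, { stream: true }))) {
      onToken((msg as { message?: { content?: string } }).message?.content ?? '');
    }
  }
}
```

Because the parser keeps partial lines in its buffer, a JSON object split across two network chunks is reassembled before parsing, satisfying the "handle partial JSON chunks" rule.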
## Base URL Configuration
```typescript
// Good — configurable base URL
const OLLAMA_BASE_URL = process.env.OLLAMA_HOST || 'http://localhost:11434';
// Bad — hardcoded URL
const url = 'http://localhost:11434'; // Can't change without code edit
```
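Building on the configurable base URL, endpoint paths can be derived in one place so no request helper hardcodes a host. The `endpoint` helper below is an illustrative sketch, not an Ollama API.

```typescript
// Configurable base URL; OLLAMA_HOST is the conventional env var.
const OLLAMA_BASE_URL = process.env.OLLAMA_HOST ?? 'http://localhost:11434';

// Resolves an API path against the base URL using the WHATWG URL parser.
function endpoint(path: string): string {
  return new URL(path, OLLAMA_BASE_URL).toString();
}

// Usage: fetch(endpoint('/api/chat'), { ... })
```

Setting `OLLAMA_HOST=http://gpu-box:11434` then redirects every request to a remote server with no code change.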
## Anti-Patterns
- No health check before requests (silent failures)
- No timeout (requests hang forever if server is down)
- Hardcoded localhost:11434 (can't use remote Ollama)
- Not handling model loading delay on first request
- Using stream: false for interactive applications