Open Source

A.R.E.S

Agentic Retrieval Enhanced Server

A production-grade agentic chatbot server built in Rust with multi-provider LLM support, tool calling, RAG, MCP integration, and advanced research capabilities.

Quick Start
# Clone and setup
git clone https://github.com/dirmacs/ares
cd ares
cp .env.example .env
 
# Pull the default local model (requires Ollama to be installed and running)
ollama pull granite4:tiny-h
 
# Build and run
cargo build
cargo run
 
# Server runs on http://localhost:3000

What is A.R.E.S?

A.R.E.S is a high-performance AI agent runtime built in Rust. It serves as the operating system for AI agents, providing the infrastructure to run agents from multiple providers on a single platform.

Multi-Provider

Run agents from Ollama, OpenAI, or LlamaCpp on a single platform. No vendor lock-in.

Local-First

Runs entirely locally with Ollama and SQLite by default. No external dependencies required.

Open Source

MIT licensed and available on GitHub. Free to use, modify, and self-host forever.

Capabilities

Core Features

Everything you need to build and deploy production-grade AI agents.

Multi-Agent Orchestration

Coordinate multiple AI agents with configurable workflows and specialized routing.

Tool Calling

Type-safe function calling with automatic schema generation and per-agent tool filtering.

Enterprise Security

JWT-based authentication with Argon2 password hashing and API key support.
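
As a rough sketch of how a client might obtain a token, assuming a conventional login endpoint (the path and field names below are illustrative, not taken from the A.R.E.S documentation):

# Hypothetical login request; adjust the path and field names to your deployment
curl -X POST http://localhost:3000/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "username": "admin",
    "password": "change-me"
  }'
# The response is expected to contain an access token (and a refresh token)
# to pass as "Authorization: Bearer <token>" on later requests.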

Streaming Responses

Real-time token streaming from all supported LLM providers.
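
A minimal sketch of what a streaming request could look like, assuming the chat endpoint accepts a stream flag (the field name is an assumption; consult the Swagger UI for the actual schema):

# -N disables curl's output buffering so tokens appear as they arrive
# the "stream" field below is an assumed parameter, not confirmed by the docs
curl -N -X POST http://localhost:3000/api/chat \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Summarize our product catalog",
    "agent_type": "product",
    "stream": true
  }'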

RAG & Knowledge Bases

Pluggable knowledge bases with semantic search via SQLite and Qdrant.
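
As an illustrative sketch only, a knowledge base might be declared in ares.toml along these lines (the table and key names are assumptions, not the project's documented schema):

# Hypothetical knowledge base backed by Qdrant for semantic search
[knowledge_bases.product-docs]
backend = "qdrant"                 # or "sqlite" for the local default
url = "http://localhost:6333"
collection = "product_docs"
embedding_model = "fast"           # reuse a model defined under [models.*]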

Deep Research

Multi-step research capabilities with parallel subagents and built-in web search.

Hot Configuration

TOML-based declarative configuration with automatic hot-reloading.

Workflow Engine

Declarative workflow execution with agent routing and circular reference detection.
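
The /api/workflows/default endpoint and the agents_used field shown later suggest that workflows are named and route through a set of agents; a purely illustrative TOML sketch (key names assumed) might read:

# Hypothetical workflow definition; the exact schema is not shown here
[workflows.default]
entry_agent = "router"
agents = ["router", "sales", "product"]
max_steps = 5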

Flexibility

Multi-Provider LLM Support

A.R.E.S supports multiple LLM providers out of the box. Switch between providers without changing your code.
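
For example, following the [providers.*] pattern shown in the configuration section below, an OpenAI-compatible provider could sit next to the local Ollama one (the API-key field name here is an assumption; see the environment-variable support noted below):

[providers.openai-cloud]
type = "openai"
base_url = "https://api.openai.com/v1"
default_model = "gpt-4"
api_key_env = "OPENAI_API_KEY"   # assumed key name for reading the secret from the environment
 
[models.cloud]
provider = "openai-cloud"
temperature = 0.2

Pointing a model at openai-cloud instead of ollama-local is then the only change needed to move an agent between providers.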

Supported Providers

  • 🦙 Ollama: local LLM inference (default)
  • 🤖 OpenAI: GPT-4 and compatible APIs
  • LlamaCpp: direct GGUF model loading

GPU Acceleration

  • 🟢 CUDA: NVIDIA GPU acceleration
  • 🍎 Metal: Apple Silicon (macOS)
  • 🔷 Vulkan: cross-platform GPU

The LlamaCpp integration supports direct GGUF model loading, with optional GPU acceleration via the CUDA, Metal, or Vulkan backends.
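
If the GPU backends are exposed as Cargo features, which is a common pattern for llama.cpp bindings but an assumption here, enabling one might look like the following (verify the actual feature names in Cargo.toml):

# Hypothetical feature flags; check Cargo.toml for the real names
cargo build --release --features llamacpp,cuda     # NVIDIA GPUs
cargo build --release --features llamacpp,metal    # Apple Silicon
cargo build --release --features llamacpp,vulkan   # cross-platform GPU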

Configuration

Declarative TOML Config

Configure providers, models, agents, tools, and workflows entirely through TOML. Changes are automatically detected and applied without restarting the server.

  • Hot-reloading within 500ms
  • Circular reference detection
  • Environment variable support
  • Unused config warnings
ares.toml
# LLM Providers
[providers.ollama-local]
type = "ollama"
base_url = "http://localhost:11434"
default_model = "granite4:tiny-h"
 
# Models with parameters
[models.fast]
provider = "ollama-local"
temperature = 0.7
max_tokens = 256
 
# Agents with tool filtering
[agents.research]
model = "fast"
tools = ["web_search", "calculator"]
Storage

Database Backends

Choose the right database for your needs. Local SQLite by default, with optional cloud and vector database support.

  • SQLite (libsql): local-first, zero configuration (default)
  • Turso: remote edge database
  • Qdrant: vector database for semantic search
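
A rough sketch of how a backend might be selected in ares.toml; the [database] table and its keys below are assumptions rather than the documented schema:

# Local default: SQLite via libsql, zero configuration
[database]
backend = "sqlite"
path = "ares.db"
 
# Remote alternative (commented out): Turso, with the auth token read from the environment
# backend = "turso"
# url = "libsql://<your-db>.turso.io"
# auth_token_env = "TURSO_AUTH_TOKEN"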

Tools

Built-in Tools

Ready-to-use tools with per-agent filtering.

  • 🧮 Calculator: basic arithmetic operations with type-safe execution
  • 🔍 Web Search: built-in search powered by Daedra, no API keys required

API

RESTful API

Interactive Swagger UI documentation is included. Authentication uses JWT with refresh tokens.

Chat Endpoint

/api/chat
curl -X POST http://localhost:3000/api/chat \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What products do we have?",
    "agent_type": "product"
  }'

Deep Research

/api/research
curl -X POST http://localhost:3000/api/research \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Analyze market trends",
    "depth": 3,
    "max_iterations": 5
  }'

Workflow Execution

/api/workflows
curl -X POST http://localhost:3000/api/workflows/default \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Q4 product sales figures?"
  }'

Workflow Response

Response
{
  "final_response": "Based on Q4 data...",
  "steps_executed": 3,
  "agents_used": ["router", "sales", "product"],
  "reasoning_path": [...]
}
System Design

Architecture

A modular, thread-safe architecture designed for production workloads.

Core components: AresConfigManager, ProviderRegistry, AgentRegistry, ToolRegistry, and WorkflowEngine.

Configuration Layer

Thread-safe config management with hot-reloading. TOML-based declarative configuration.

Agent Layer

ConfigurableAgents with per-agent tool filtering. Multi-agent orchestration with specialized routing.

Execution Layer

The workflow engine executes declarative workflows. An Axum-based API exposes /api/chat, /api/research, and /api/workflows.

Benefits

Why A.R.E.S?

  • Built in Rust for extreme speed and memory safety
  • Production-ready with comprehensive error handling
  • Local-first development with Ollama and SQLite by default
  • Automatic OpenAPI documentation generation
  • Comprehensive unit and integration tests
  • MIT licensed - open source forever

Requirements

  • 🦀 Rust 1.91+
  • 🦙 Ollama (recommended)
  • just (command runner)

The server runs on port 3000 by default; Swagger UI is available at /swagger-ui/.

Quality Assurance

Comprehensive Testing

A.R.E.S includes unit tests, integration tests, and end-to-end API tests.

  • 🧪 Unit & Integration: mocked tests that run without external services (cargo test)
  • 🦙 Live Ollama Tests: tests against a real Ollama instance for validation (just test-ignored)
  • 🌐 API Tests (Hurl): end-to-end API testing with the Hurl test runner (just hurl)

Get Started with A.R.E.S

Open source and MIT licensed. Clone the repository, configure your agents, and deploy production-grade AI infrastructure.