Open Source

A.R.E.S

Agentic Retrieval Enhanced Server

A production-grade agentic chatbot server built in Rust with multi-provider LLM support, tool calling, RAG, MCP integration, and advanced research capabilities.

Quick Start
# Clone and setup
git clone https://github.com/dirmacs/ares
cd ares
cp .env.example .env
 
# Pull the default local model (requires Ollama to be installed and running)
ollama pull granite4:tiny-h
 
# Build and run
cargo build
cargo run
 
# Server runs on http://localhost:3000

What is A.R.E.S?

A.R.E.S is a high-performance AI agent runtime built in Rust. It serves as the operating system for AI agents, providing the infrastructure to run agents from multiple providers on a single platform.

Multi-Provider

Run agents from Ollama, OpenAI, or LlamaCpp on a single platform. No vendor lock-in.

Local-First

Runs entirely locally with Ollama and SQLite by default. No external dependencies required.

Open Source

MIT licensed and available on GitHub. Free to use, modify, and self-host forever.

Capabilities

Core Features

Everything you need to build and deploy production-grade AI agents.

Multi-Agent Orchestration

Coordinate multiple AI agents with configurable workflows and specialized routing.

Tool Calling

Type-safe function calling with automatic schema generation and per-agent tool filtering.

Enterprise Security

JWT-based authentication with Argon2 password hashing and API key support.
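
As a rough sketch of how a client might obtain a token, assuming a conventional login endpoint (the path and field names below are illustrative, not taken from the A.R.E.S documentation):

# Hypothetical login request; adjust the path and field names to your deployment
curl -X POST http://localhost:3000/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "username": "admin",
    "password": "change-me"
  }'
# The response is expected to contain an access token (and a refresh token)
# to pass as "Authorization: Bearer <token>" on later requests.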

Streaming Responses

Real-time token streaming from all supported LLM providers.
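
A minimal sketch of what a streaming request could look like, assuming the chat endpoint accepts a stream flag (the field name is an assumption; consult the Swagger UI for the actual schema):

# -N disables curl's output buffering so tokens appear as they arrive
# the "stream" field below is an assumed parameter, not confirmed by the docs
curl -N -X POST http://localhost:3000/api/chat \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Summarize our product catalog",
    "agent_type": "product",
    "stream": true
  }'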

RAG & Knowledge Bases

Pluggable knowledge bases with semantic search via SQLite and Qdrant.
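
As an illustrative sketch only, a knowledge base might be declared in ares.toml along these lines (the table and key names are assumptions, not the project's documented schema):

# Hypothetical knowledge base backed by Qdrant for semantic search
[knowledge_bases.product-docs]
backend = "qdrant"                 # or "sqlite" for the local default
url = "http://localhost:6333"
collection = "product_docs"
embedding_model = "fast"           # reuse a model defined under [models.*]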

Deep Research

Multi-step research capabilities with parallel subagents and built-in web search.

Hot Configuration

TOML-based declarative configuration with automatic hot-reloading.

Workflow Engine

Declarative workflow execution with agent routing and circular reference detection.
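
The /api/workflows/default endpoint and the agents_used field shown later suggest that workflows are named and route through a set of agents; a purely illustrative TOML sketch (key names assumed) might read:

# Hypothetical workflow definition; the exact schema is not shown here
[workflows.default]
entry_agent = "router"
agents = ["router", "sales", "product"]
max_steps = 5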

Flexibility

Multi-Provider LLM Support

A.R.E.S supports multiple LLM providers out of the box. Switch between providers without changing your code.
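
For example, following the [providers.*] pattern shown in the configuration section below, an OpenAI-compatible provider could sit next to the local Ollama one (the API-key field name here is an assumption; see the environment-variable support noted below):

[providers.openai-cloud]
type = "openai"
base_url = "https://api.openai.com/v1"
default_model = "gpt-4"
api_key_env = "OPENAI_API_KEY"   # assumed key name for reading the secret from the environment
 
[models.cloud]
provider = "openai-cloud"
temperature = 0.2

Pointing a model at openai-cloud instead of ollama-local is then the only change needed to move an agent between providers.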

Supported Providers

  • 🦙 Ollama: local LLM inference (default)
  • 🤖 OpenAI: GPT-4 and compatible APIs
  • LlamaCpp: direct GGUF model loading

GPU Acceleration

  • 🟢 CUDA: NVIDIA GPU acceleration
  • 🍎 Metal: Apple Silicon (macOS)
  • 🔷 Vulkan: cross-platform GPU

The LlamaCpp integration supports direct GGUF model loading, with optional GPU acceleration via the CUDA, Metal, or Vulkan backends.
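
If the GPU backends are exposed as Cargo features, which is a common pattern for llama.cpp bindings but an assumption here, enabling one might look like the following (verify the actual feature names in Cargo.toml):

# Hypothetical feature flags; check Cargo.toml for the real names
cargo build --release --features llamacpp,cuda     # NVIDIA GPUs
cargo build --release --features llamacpp,metal    # Apple Silicon
cargo build --release --features llamacpp,vulkan   # cross-platform GPU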

Configuration

Declarative TOML Config

Configure providers, models, agents, tools, and workflows entirely through TOML. Changes are automatically detected and applied without restarting the server.

  • Hot-reloading within 500ms
  • Circular reference detection
  • Environment variable support
  • Unused config warnings
ares.toml
# LLM Providers
[providers.ollama-local]
type = "ollama"
base_url = "http://localhost:11434"
default_model = "granite4:tiny-h"
 
# Models with parameters
[models.fast]
provider = "ollama-local"
temperature = 0.7
max_tokens = 256
 
# Agents with tool filtering
[agents.research]
model = "fast"
tools = ["web_search", "calculator"]
Storage

Database Backends

Choose the right database for your needs. Local SQLite by default, with optional cloud and vector database support.

  • SQLite (libsql): local-first, zero configuration (default)
  • Turso: remote edge database
  • Qdrant: vector database for semantic search
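
A rough sketch of how a backend might be selected in ares.toml; the [database] table and its keys below are assumptions rather than the documented schema:

# Local default: SQLite via libsql, zero configuration
[database]
backend = "sqlite"
path = "ares.db"
 
# Remote alternative (commented out): Turso, with the auth token read from the environment
# backend = "turso"
# url = "libsql://<your-db>.turso.io"
# auth_token_env = "TURSO_AUTH_TOKEN"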

Tools

Built-in Tools

Ready-to-use tools with per-agent filtering.

  • 🧮 Calculator: basic arithmetic operations with type-safe execution
  • 🔍 Web Search: built-in search powered by Daedra, no API keys required

API

RESTful API

Interactive Swagger UI documentation is included. Authentication uses JWT with refresh tokens.

Chat Endpoint

/api/chat
curl -X POST http://localhost:3000/api/chat \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What products do we have?",
    "agent_type": "product"
  }'

Deep Research

/api/research
curl -X POST http://localhost:3000/api/research \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Analyze market trends",
    "depth": 3,
    "max_iterations": 5
  }'

Workflow Execution

/api/workflows
curl -X POST http://localhost:3000/api/workflows/default \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Q4 product sales figures?"
  }'

Workflow Response

Response
{
  "final_response": "Based on Q4 data...",
  "steps_executed": 3,
  "agents_used": ["router", "sales", "product"],
  "reasoning_path": [...]
}
System Design

Architecture

A modular, thread-safe architecture designed for production workloads.

Core components: AresConfigManager, ProviderRegistry, AgentRegistry, ToolRegistry, and WorkflowEngine.

Configuration Layer

Thread-safe config management with hot-reloading. TOML-based declarative configuration.

Agent Layer

ConfigurableAgents with per-agent tool filtering. Multi-agent orchestration with specialized routing.

Execution Layer

The workflow engine executes declarative workflows. An Axum-based API exposes /api/chat, /api/research, and /api/workflows.

Benefits

Why A.R.E.S?

  • Built in Rust for extreme speed and memory safety
  • Production-ready with comprehensive error handling
  • Local-first development with Ollama and SQLite by default
  • Automatic OpenAPI documentation generation
  • Comprehensive unit and integration tests
  • MIT licensed - open source forever

Requirements

  • 🦀 Rust 1.91+
  • 🦙 Ollama (recommended)
  • just (command runner)

The server runs on port 3000 by default; Swagger UI is available at /swagger-ui/.

Quality Assurance

Comprehensive Testing

A.R.E.S includes unit tests, integration tests, and end-to-end API tests.

  • 🧪 Unit & Integration: mocked tests that run without external services (cargo test)
  • 🦙 Live Ollama Tests: tests against a real Ollama instance for validation (just test-ignored)
  • 🌐 API Tests (Hurl): end-to-end API testing with the Hurl test runner (just hurl)

Get Started with A.R.E.S

Open source and MIT licensed. Clone the repository, configure your agents, and deploy production-grade AI infrastructure.