docs: design admin auth and api user management
@@ -0,0 +1,345 @@
# Hermes Admin Auth, API Users, Usage, and Audit Design

## Objective

Add a secure administrative access layer and a managed public API gateway to the Hermes control plane.

The finished system will:

- Protect the entire control-plane website and all management endpoints with an environment-configured admin login.
- Let the admin create and manage API users with individually issued API keys.
- Give each API key independent access to the pre-Hermes API, post-Hermes API, or both.
- Enforce per-key requests-per-minute and monthly token limits.
- Store complete request prompts and response content for 90 days.
- Provide downloadable JSONL audit logs without displaying message content on the website.
- Use a dedicated database on an existing PostgreSQL server.

## System Architecture

The Compose stack will contain these logical services:

### Control Plane

The existing Node control-plane server continues serving the administrative UI on port `7843`.

It will:

- Require an authenticated admin session for every page and management endpoint.
- Provide API-user creation, editing, rotation, revocation, deletion, and JSONL-log download endpoints.
- Display the approved API-user management table.
- Connect to the dedicated PostgreSQL database through `DATABASE_URL`.

### Public API Gateway

A new Node gateway service will be the only externally exposed AI API.

It owns the public ports:

- `8645`: pre-Hermes OpenAI-compatible API
- `8646`: post-Hermes OpenAI-compatible API

For every request, it will:

1. Read the bearer API key.
2. Hash the key and find its API user.
3. Validate active status, expiry, and pre/post permission.
4. Enforce requests-per-minute and monthly token limits.
5. Forward the request to the correct internal Hermes upstream.
6. Stream the upstream response to the client while capturing it for audit storage.
7. Store request metadata, full prompt content, full response content, status, latency, and token usage.
8. Update the API user's last-used timestamp and usage totals.

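
Step 6 could be handled by forwarding each upstream chunk to the client right away and keeping a copy for the audit log. The following is a minimal sketch under that assumption, not the gateway's actual implementation; `createResponseCapture`, `forward`, and the `disconnected` flag are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper: forwards chunks to the client and keeps a copy for audit storage.
function createResponseCapture(forward) {
  const chunks = [];
  let disconnected = false;

  return {
    // Call for every chunk received from the upstream.
    onChunk(chunk) {
      if (disconnected) return; // nothing more to forward or record
      const buf = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
      chunks.push(buf); // copy kept for the audit log
      forward(buf);     // streamed to the client immediately
    },
    // Call when the client connection closes before the upstream finishes.
    onClientDisconnect() {
      disconnected = true;
    },
    // Returns the captured body plus a marker for partial responses.
    result() {
      return { body: Buffer.concat(chunks).toString("utf8"), disconnected };
    },
  };
}

// Usage: the array stands in for the client socket.
const sent = [];
const capture = createResponseCapture((buf) => sent.push(buf.toString("utf8")));
capture.onChunk('data: {"delta":"Hel"}\n\n');
capture.onChunk('data: {"delta":"lo"}\n\n');
console.log(sent.length, capture.result().disconnected); // prints: 2 false
```

Buffers are concatenated before decoding so that a multi-byte character split across two chunks is still decoded correctly in the stored copy.
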

### Pre-Hermes Upstream

The pre-Hermes service runs Hermes' native direct provider proxy:

```text
hermes proxy start --provider <provider> --host 0.0.0.0 --port <internal-port>
```

This path forwards OpenAI-compatible requests directly to an authenticated provider. It does not run the Hermes agent, tools, memory, or instructions.

The service will be reachable only on the internal Compose network.

### Post-Hermes Upstream

The post-Hermes service runs:

```text
hermes gateway run
```

with Hermes' full agent API enabled through:

```text
API_SERVER_ENABLED=true
API_SERVER_HOST=0.0.0.0
API_SERVER_PORT=<internal-port>
```

This path runs requests through the full Hermes agent, including configured tools, memory, instructions, and session behavior.

The service will be reachable only on the internal Compose network.

## Authentication

### Admin Authentication

The admin username and password come from:

```text
HERMES_ADMIN_USERNAME
HERMES_ADMIN_PASSWORD
```

The control plane will not start in an externally accessible configuration if either value is missing or weak.

Login behavior:

- Credentials are compared using timing-safe comparisons.
- A successful login creates a cryptographically random admin session token.
- Only a hash of the session token is stored in PostgreSQL.
- The browser receives the token in an HTTP-only, SameSite cookie.
- Sessions have a configurable lifetime and can be invalidated through logout.
- Every control-plane page, management endpoint, and application asset requires a valid session. Only the login endpoint, its dedicated minimal assets, and health endpoints are unauthenticated.

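
The login behavior above can be sketched with Node's built-in `crypto` module. This is an illustrative outline only: the function names, the cookie name `hermes_admin_session`, the choice of `SameSite=Strict`, and the 32-byte token size are assumptions, not decisions made by this design.

```javascript
const crypto = require("node:crypto");

// Compares two strings without leaking where they first differ.
// Hashing both sides first gives timingSafeEqual the equal-length inputs it requires.
function safeEqual(a, b) {
  const ha = crypto.createHash("sha256").update(String(a)).digest();
  const hb = crypto.createHash("sha256").update(String(b)).digest();
  return crypto.timingSafeEqual(ha, hb);
}

// Hypothetical login check; `env` stands in for process.env.
function checkAdminLogin(env, username, password) {
  if (!env.HERMES_ADMIN_USERNAME || !env.HERMES_ADMIN_PASSWORD) return false;
  const userOk = safeEqual(username, env.HERMES_ADMIN_USERNAME);
  const passOk = safeEqual(password, env.HERMES_ADMIN_PASSWORD);
  return userOk && passOk; // both comparisons always run
}

// Creates a session: the raw token goes to the browser, only its hash to PostgreSQL.
function createSession(ttlHours, now = new Date()) {
  const token = crypto.randomBytes(32).toString("base64url");
  const tokenHash = crypto.createHash("sha256").update(token).digest("hex");
  const expiresAt = new Date(now.getTime() + ttlHours * 3600 * 1000);
  // Cookie name and SameSite=Strict are assumptions.
  const cookie =
    `hermes_admin_session=${token}; HttpOnly; SameSite=Strict; Path=/; Max-Age=${ttlHours * 3600}`;
  return { token, tokenHash, expiresAt, cookie };
}
```

On each later request, the server would hash the cookie value the same way and look up the hash in `admin_sessions`, so a database leak does not expose usable session tokens.
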

### API-Key Authentication

API keys will use a recognizable prefix and random secret, such as:

```text
hms_<random-secret>
```

Rules:

- The plaintext key is shown only once, immediately after creation or rotation.
- PostgreSQL stores only a SHA-256 hash and a short display suffix.
- Rotating a key invalidates the previous key immediately.
- Revoking or deleting an API user invalidates its key immediately.
- Each key independently permits pre-Hermes, post-Hermes, or both.

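
As a rough illustration of these rules, key issuance and lookup might look like the sketch below. The function names, the 32-byte secret, and the four-character display suffix are assumptions; the design fixes only the `hms_` prefix, the SHA-256 hash, and the existence of a short suffix.

```javascript
const crypto = require("node:crypto");

const KEY_PREFIX = "hms_";

// Hashes a plaintext key for storage and lookup.
function hashApiKey(plaintext) {
  return crypto.createHash("sha256").update(plaintext).digest("hex");
}

// Generates a new key. Only `keyHash` and `displaySuffix` would be persisted;
// `plaintext` is shown once to the admin and never stored.
function generateApiKey() {
  const secret = crypto.randomBytes(32).toString("base64url"); // size is an assumption
  const plaintext = KEY_PREFIX + secret;
  return {
    plaintext,
    keyHash: hashApiKey(plaintext),
    displaySuffix: plaintext.slice(-4), // suffix length is an assumption
  };
}

// Extracts the key from an `Authorization: Bearer hms_...` header, or returns null.
function parseBearerKey(authorizationHeader) {
  const match = /^Bearer\s+(\S+)$/i.exec(authorizationHeader || "");
  if (!match || !match[1].startsWith(KEY_PREFIX)) return null;
  return match[1];
}
```

The gateway would call `parseBearerKey` on the incoming header, pass the result through `hashApiKey`, and look up that hash in `api_keys`; a `null` result maps to the `401` case.
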

## Limits and Enforcement

Each API user has:

- Requests-per-minute limit
- Monthly token limit
- Optional expiration timestamp
- Active or revoked status
- Pre-Hermes permission
- Post-Hermes permission

The gateway will enforce limits using PostgreSQL transactions so behavior remains consistent across restarts and multiple gateway instances.

OpenAI-compatible errors:

- `401`: missing or invalid API key
- `403`: valid key without permission for the selected API
- `410`: expired or revoked API key
- `429`: requests-per-minute or monthly token limit exceeded
- `502`: selected Hermes upstream unavailable

Monthly usage is calculated by UTC calendar month.

Before forwarding a request, the gateway rejects keys whose recorded monthly usage has already reached the configured limit. After completion, it reconciles the request using the upstream's reported token usage. A single in-flight request can therefore finish above the monthly boundary; subsequent requests are blocked.

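
The pre-forwarding checks and the UTC month boundary can be expressed as a pure decision function. The sketch below is one possible shape, assuming the counts in `usage` were already read inside a PostgreSQL transaction; the object fields and the error `code` strings are hypothetical.

```javascript
// Start of the UTC calendar month containing `now`.
function utcMonthStart(now) {
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), 1));
}

// Hypothetical pre-forwarding check. `user` is null when no key hash matched.
function checkAccess(user, route, usage, now) {
  if (!user) return { status: 401, code: "invalid_api_key" };
  const expired = user.expiresAt !== null && user.expiresAt <= now;
  if (user.status !== "active" || expired) {
    return { status: 410, code: "key_expired_or_revoked" };
  }
  const allowed = route === "pre" ? user.preAllowed : user.postAllowed;
  if (!allowed) return { status: 403, code: "route_not_permitted" };
  if (usage.requestsThisMinute >= user.rpmLimit) {
    return { status: 429, code: "rate_limit_exceeded" };
  }
  if (usage.tokensThisMonth >= user.monthlyTokenLimit) {
    return { status: 429, code: "monthly_token_limit_exceeded" };
  }
  return { status: 200, code: null };
}
```

For example, a user with a 1000-token monthly limit and 999 recorded tokens passes the check. If that request reports 500 tokens on completion, the recorded total becomes 1499 and the next call to `checkAccess` returns `429`.
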

## PostgreSQL Data Model

The system uses a dedicated PostgreSQL database supplied through:

```text
DATABASE_URL
```

Required logical tables:

### `admin_sessions`

- Session token hash
- Created timestamp
- Expiration timestamp
- Last-seen timestamp
- Revoked timestamp

### `api_users`

- Stable ID
- Display name
- Status
- Pre-Hermes permission
- Post-Hermes permission
- Requests-per-minute limit
- Monthly token limit
- Expiration timestamp
- Created timestamp
- Updated timestamp
- Last-used timestamp
- Revoked timestamp

### `api_keys`

- Stable ID
- API-user ID
- Key hash
- Key display suffix
- Created timestamp
- Revoked timestamp

The schema supports key history while allowing only one active key per API user.

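
One way to keep key history while allowing a single active key is a partial unique index over rows that are not revoked. The sketch below embeds such a migration as a string and pairs it with a rotation helper. Table and column names, the UUID types, and `rotateKey` are assumptions rather than the final schema; `client` is assumed to expose `query(text, values)` in the style of a node-postgres client.

```javascript
// Hypothetical migration text. gen_random_uuid() is built in from PostgreSQL 13;
// older servers would need the pgcrypto extension.
const API_KEYS_MIGRATION = `
CREATE TABLE IF NOT EXISTS api_keys (
  id             uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  api_user_id    uuid NOT NULL,
  key_hash       text NOT NULL UNIQUE,
  display_suffix text NOT NULL,
  created_at     timestamptz NOT NULL DEFAULT now(),
  revoked_at     timestamptz
);

-- Partial unique index: at most one row per user may have revoked_at IS NULL.
CREATE UNIQUE INDEX IF NOT EXISTS api_keys_one_active_per_user
  ON api_keys (api_user_id)
  WHERE revoked_at IS NULL;
`;

// Revokes the current key and inserts its replacement in one transaction,
// so the old key stops working at the moment the new one becomes valid.
async function rotateKey(client, apiUserId, keyHash, displaySuffix) {
  await client.query("BEGIN");
  try {
    await client.query(
      "UPDATE api_keys SET revoked_at = now() WHERE api_user_id = $1 AND revoked_at IS NULL",
      [apiUserId]
    );
    await client.query(
      "INSERT INTO api_keys (api_user_id, key_hash, display_suffix) VALUES ($1, $2, $3)",
      [apiUserId, keyHash, displaySuffix]
    );
    await client.query("COMMIT");
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}
```

The `IF NOT EXISTS` clauses let the migration run repeatedly without error, and the index turns an accidental second active key into a constraint violation instead of silent data drift.
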

### `usage_events`

- API-user ID
- API-key ID
- Pre/post route
- Request timestamp
- Completion timestamp
- HTTP status
- Model
- Prompt tokens
- Completion tokens
- Total tokens
- Latency
- Error code

### `message_logs`

- Usage-event ID
- Full request body
- Full captured response
- Response content type
- Streaming flag
- Created timestamp
- Automatic deletion timestamp

Request and response bodies are stored as JSONB when valid JSON and as text when an upstream returns non-JSON content.

## Log Retention and Downloads

Full prompt and response logs are retained for exactly 90 days.

A scheduled cleanup task deletes expired `message_logs` only. `usage_events` remain available so aggregate usage totals and operational audit metadata survive after message content expires.

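
A minimal sketch of the retention arithmetic and the cleanup query follows. The column name `delete_after` and both function names are hypothetical, and `client` is again assumed to expose `query(text, values)` and return a `rowCount`, as a node-postgres client does.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Computes the automatic deletion timestamp stored with each message log.
function deletionTimestamp(createdAt, retentionDays = 90) {
  return new Date(createdAt.getTime() + retentionDays * MS_PER_DAY);
}

// Hypothetical cleanup step: removes message content only and leaves usage_events untouched.
async function cleanupExpiredLogs(client, now = new Date()) {
  const result = await client.query(
    "DELETE FROM message_logs WHERE delete_after <= $1",
    [now]
  );
  return result.rowCount; // number of expired logs removed in this run
}
```

A log created at `2024-01-01T00:00:00Z` would receive a deletion timestamp of `2024-03-31T00:00:00Z`. Because the query is driven by the stored timestamp, a failed run leaves rows in place and the next scheduled run picks them up.
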

Message content is never displayed in the control-plane UI.

An authenticated admin can download JSONL logs:

- For one API user
- For all API users
- Filtered by start and end timestamp
- Limited to the retained 90-day window

Each JSONL record includes:

- API user ID and display name
- Route: pre or post
- Timestamp
- Model
- Request ID
- HTTP status
- Token counts
- Latency
- Full request body
- Full response content
- Error details when applicable

Downloads stream records from PostgreSQL instead of loading the entire export into memory.

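
Streaming the export means serializing one row at a time. The generator below sketches that idea, assuming rows arrive from some iterable source such as a database cursor; the function name and every field name are hypothetical and only mirror the record contents listed above.

```javascript
// Yields one JSONL line per row so the caller can write each line to the
// HTTP response as soon as the row is available.
function* toJsonlLines(rows) {
  for (const row of rows) {
    yield JSON.stringify({
      api_user_id: row.apiUserId,
      display_name: row.displayName,
      route: row.route,
      timestamp: row.requestedAt,
      model: row.model,
      request_id: row.requestId,
      status: row.status,
      tokens: { prompt: row.promptTokens, completion: row.completionTokens, total: row.totalTokens },
      latency_ms: row.latencyMs,
      request: row.requestBody,
      response: row.responseBody,
      error: row.error ?? null,
    }) + "\n";
  }
}
```

`JSON.stringify` escapes newlines inside string values, so each record stays on a single line even when prompts or responses contain line breaks.
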

## Admin User Interface

The entire control-plane website requires admin login.

After login, the existing visual language remains unchanged. A new `API Users` pane will use the approved structured table layout.

Desktop table columns:

- User and status
- Masked API-key suffix
- Access: pre, post, or both
- Monthly token and requests-per-minute limits
- Last used
- Expiration
- Actions menu

Actions:

- Create API user
- Edit display name
- Edit pre/post permissions
- Edit limits
- Set or remove expiration
- Rotate key
- Download JSONL logs
- Revoke
- Delete

The create and rotate flows show the plaintext key once and require the admin to acknowledge that it cannot be retrieved later.

On narrow screens, each table row becomes a compact stacked record with the same actions.

## Delete and Revoke Semantics

- **Revoke** disables the API user immediately but preserves the user record and allows later reactivation.
- **Delete** permanently removes the API user from the active management list and invalidates its keys.
- Usage and message logs for deleted users remain until the normal 90-day message-log expiration.
- Audit records preserve the deleted user's ID and last known display name.

## Failure Handling

- Database unavailable at startup: management and public gateway services fail closed.
- Database unavailable during a request: reject the request instead of bypassing authentication or limits.
- Upstream unavailable: return an OpenAI-compatible `502` and record the failure event.
- Client disconnect during streaming: stop forwarding when possible and store the partial response with a disconnect marker.
- Log-write failure after a completed request: emit a critical service log and mark the usage event as audit-incomplete.
- Cleanup failure: retry on the next scheduled run without blocking API traffic.

## Deployment Configuration

New required environment variables:

```text
DATABASE_URL
HERMES_ADMIN_USERNAME
HERMES_ADMIN_PASSWORD
```

Additional configurable values will include:

```text
HERMES_ADMIN_SESSION_TTL_HOURS
HERMES_LOG_RETENTION_DAYS=90
HERMES_PRE_UPSTREAM_URL
HERMES_POST_UPSTREAM_URL
```

The pre/post native Hermes services will no longer publish their internal ports. Only the public gateway publishes `8645` and `8646`.

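
Startup validation of these variables might look like the sketch below, which fails closed when a required value is missing. `loadConfig` is a hypothetical name, the 12-character minimum is an assumed stand-in for the unspecified "weak" password check, and the 12-hour session default is likewise an assumption; only the 90-day retention default comes from this design.

```javascript
// Hypothetical startup validation; `env` stands in for process.env.
function loadConfig(env) {
  const required = ["DATABASE_URL", "HERMES_ADMIN_USERNAME", "HERMES_ADMIN_PASSWORD"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  // Assumed weakness rule: the design does not define "weak".
  if (env.HERMES_ADMIN_PASSWORD.length < 12) {
    throw new Error("HERMES_ADMIN_PASSWORD is too weak");
  }

  const retentionDays = Number(env.HERMES_LOG_RETENTION_DAYS ?? 90);
  const sessionTtlHours = Number(env.HERMES_ADMIN_SESSION_TTL_HOURS ?? 12); // default is an assumption
  for (const [name, value] of [["retention days", retentionDays], ["session TTL hours", sessionTtlHours]]) {
    if (!Number.isInteger(value) || value <= 0) {
      throw new Error(`Invalid ${name}: expected a positive integer`);
    }
  }

  return {
    databaseUrl: env.DATABASE_URL,
    adminUsername: env.HERMES_ADMIN_USERNAME,
    adminPassword: env.HERMES_ADMIN_PASSWORD,
    retentionDays,
    sessionTtlHours,
    preUpstreamUrl: env.HERMES_PRE_UPSTREAM_URL,   // no default invented here
    postUpstreamUrl: env.HERMES_POST_UPSTREAM_URL, // no default invented here
  };
}
```

Calling `loadConfig(process.env)` before binding any port would let both the control plane and the gateway refuse to start on an incomplete configuration.
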

## Testing and Verification

Automated tests will cover:

- Admin login success and failure
- Timing-safe credential checks
- Session creation, expiry, logout, and revocation
- Protection of control-plane HTML, application assets, and every management endpoint, while keeping only login assets and health endpoints public
- API-user creation, editing, rotation, revocation, deletion, and expiry
- Plaintext keys shown once and hashes stored at rest
- Pre/post permission enforcement
- Requests-per-minute enforcement
- Monthly UTC token-limit enforcement
- Non-streaming forwarding and logging
- Streaming forwarding, capture, partial responses, and disconnect handling
- Prompt and response JSONL downloads
- 90-day retention cleanup
- Database migration idempotency
- Failure-closed behavior when PostgreSQL is unavailable
- Compose service routing and health checks

Manual verification will confirm:

- The login screen and API-user table match the existing control-plane style.
- API keys work against the permitted public endpoint and fail against forbidden endpoints.
- Last-used, limits, status, and expiration display correctly.
- Rotated, revoked, deleted, and expired keys stop working immediately.
- Downloaded JSONL contains complete retained prompts and responses.