Real-time collective intelligence through semantic clustering and consensus detection in Discord conversations
BaselHack 2025 – Collective Intelligence through Semantic Clustering
AI Consensus Platform transforms Discord conversations into structured collective intelligence. Built in 24 hours for BaselHack 2025, the system automatically discovers consensus by embedding messages in semantic space, clustering them using machine learning, and identifying agreement patterns through geometric analysis. No voting mechanisms or manual categorization required—just natural conversation that the AI organizes in real-time.
The core innovation treats each message as a point in 1536-dimensional semantic space. Every incoming message is embedded using OpenAI's text-embedding-3-small model. A content-based cache (SHA-256 hashing) eliminates redundant API calls—critical when discussions reference similar concepts repeatedly.
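A minimal sketch of such a cache (the `embed_cached` helper and the module-level dict are illustrative, not the project's actual API):

```python
import hashlib

from openai import OpenAI

client = OpenAI()
_embedding_cache: dict[str, list[float]] = {}

def embed_cached(text: str) -> list[float]:
    # SHA-256 over normalized content, not the Discord message ID,
    # so identical text from different messages reuses one embedding
    key = hashlib.sha256(text.strip().encode("utf-8")).hexdigest()
    if key not in _embedding_cache:
        response = client.embeddings.create(
            model="text-embedding-3-small",
            input=text,
        )
        _embedding_cache[key] = response.data[0].embedding
    return _embedding_cache[key]
```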
On startup, the bot performs incremental historical scraping, fetching only new messages since the last run. Each cached message undergoes relevance checking against the active question using cosine similarity:
$$
\text{sim}(\mathbf{m}, \mathbf{q}) = \frac{\mathbf{m} \cdot \mathbf{q}}{\lVert \mathbf{m} \rVert \, \lVert \mathbf{q} \rVert}
$$

where $\mathbf{m}$ and $\mathbf{q}$ are the embedding vectors for the message and question. Messages exceeding the 0.40 threshold are automatically added to the discussion pool, providing rich historical context.
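In code, the relevance gate reduces to a dot product; a sketch with our own helper names:

```python
import numpy as np

HISTORICAL_THRESHOLD = 0.40  # live messages use the looser 0.30 (see below)

def cosine_similarity(m: np.ndarray, q: np.ndarray) -> float:
    return float(np.dot(m, q) / (np.linalg.norm(m) * np.linalg.norm(q)))

def is_historically_relevant(message_vec: np.ndarray, question_vec: np.ndarray) -> bool:
    return cosine_similarity(message_vec, question_vec) >= HISTORICAL_THRESHOLD
```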
K-Means Initialization: Once a question has accumulated enough messages, K-Means (k=4) establishes the initial cluster structure. Embeddings are L2-normalized for consistent distance metrics:

$$
\hat{\mathbf{e}} = \frac{\mathbf{e}}{\lVert \mathbf{e} \rVert_2}
$$

On unit-length vectors, minimizing within-cluster variance maximizes intra-cluster similarity while increasing inter-cluster separation.
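A sketch of the initialization step using scikit-learn (parameter choices beyond k=4 are assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

def initial_clusters(embeddings: list[list[float]]) -> np.ndarray:
    """Run K-Means (k=4) on L2-normalized embeddings; returns cluster labels."""
    X = normalize(np.asarray(embeddings))  # row-wise L2 normalization
    return KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
```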
Centroid Computation: Each cluster's "semantic center of gravity" is the arithmetic mean of all member embeddings:
$$
\mathbf{c}_k = \frac{1}{\lvert S_k \rvert} \sum_{\mathbf{e}_i \in S_k} \mathbf{e}_i
$$

where $S_k$ is the set of messages in cluster $k$. This centroid mathematically represents the cluster's consensus position.
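In NumPy this is a one-liner; per the design decision noted below, the mean is taken over the original rather than the normalized embeddings:

```python
import numpy as np

def cluster_centroid(member_embeddings: np.ndarray) -> np.ndarray:
    # Mean of the original (un-normalized) embeddings,
    # preserving semantic magnitude
    return member_embeddings.mean(axis=0)
```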
Label Generation: To create two-word labels, we identify the 5 messages nearest to the centroid using cosine similarity, then pass them to GPT-4o-mini with constrained prompts. This ensures labels emerge from the most representative messages, not outliers. Duplicate detection compares new labels against existing ones using both string matching and embedding similarity (0.90 threshold), with automatic retry logic employing progressively harder prompts.
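A sketch of the labeling step; the exact prompt wording is an assumption, and retry/duplicate handling is omitted:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

def label_cluster(centroid: np.ndarray, embeddings: np.ndarray,
                  texts: list[str]) -> str:
    # Rank members by cosine similarity to the centroid, take the top 5
    sims = embeddings @ centroid / (
        np.linalg.norm(embeddings, axis=1) * np.linalg.norm(centroid)
    )
    representative = [texts[i] for i in np.argsort(sims)[-5:]]
    prompt = ("Give a two-word label capturing the shared idea of these "
              "messages:\n" + "\n".join(representative))
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()
```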
Dynamic Assignment: New messages join the nearest cluster if similarity exceeds 0.30; otherwise, they enter an unassigned buffer. Periodic re-clustering (every 1 second when message count changes) maintains quality without thrashing.
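A sketch of the assignment rule (helper names and the buffer structure are illustrative):

```python
import numpy as np

LIVE_THRESHOLD = 0.30

def assign_message(vec: np.ndarray, centroids: dict[int, np.ndarray],
                   unassigned: list[np.ndarray]) -> int | None:
    """Attach a new message to its nearest cluster, or buffer it."""
    best_id, best_sim = None, -1.0
    for cluster_id, c in centroids.items():
        sim = float(np.dot(vec, c) / (np.linalg.norm(vec) * np.linalg.norm(c)))
        if sim > best_sim:
            best_id, best_sim = cluster_id, sim
    if best_sim >= LIVE_THRESHOLD:
        return best_id
    unassigned.append(vec)  # swept into a cluster on the next re-cluster pass
    return None
```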
Consensus emerges when clusters meet two criteria: high intra-cluster similarity and significant participation (≥60% of discussants). Intra-cluster similarity is computed as the average pairwise cosine similarity:
$$
\bar{s}_k = \frac{2}{\lvert S_k \rvert \left( \lvert S_k \rvert - 1 \right)} \sum_{i < j} \cos(\mathbf{e}_i, \mathbf{e}_j)
$$

Clusters with $\bar{s}_k$ above the similarity threshold are identified as consensus regions and highlighted in real-time on the dashboard.
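A sketch of the consensus check; the similarity threshold is left as a parameter since its exact value isn't stated here:

```python
from itertools import combinations

import numpy as np

def intra_cluster_similarity(embeddings: np.ndarray) -> float:
    """Average pairwise cosine similarity over all cluster members."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = [float(np.dot(a, b)) for a, b in combinations(normed, 2)]
    return sum(sims) / len(sims)

def is_consensus(embeddings: np.ndarray, authors: set[str],
                 all_authors: set[str], sim_threshold: float) -> bool:
    """Both criteria: tight cluster and >= 60% of discussants represented."""
    participation = len(authors) / len(all_authors)
    return (intra_cluster_similarity(embeddings) >= sim_threshold
            and participation >= 0.60)
```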
The backend maintains a single active question in-memory, stored in a QuestionState object. This design reflects the hackathon scope and optimizes for clustering—Agglomerative Clustering must iterate over all embeddings repeatedly, making in-memory access essential.
State changes trigger atomic JSON writes to data/ directory files. This provides crash recovery without database transaction complexity or ORM overhead. For discussions with 100+ messages, clustering completes in <500ms—database round-trips would add 2-3 seconds.
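One common pattern matching this description is write-to-temp-then-rename; a sketch (file naming is assumed):

```python
import json
import os
from pathlib import Path

DATA_DIR = Path("data")

def save_state(name: str, state: dict) -> None:
    """Persist state atomically: a crash mid-write never corrupts the file."""
    DATA_DIR.mkdir(exist_ok=True)
    tmp = DATA_DIR / f"{name}.json.tmp"
    tmp.write_text(json.dumps(state, indent=2))
    os.replace(tmp, DATA_DIR / f"{name}.json")  # rename is atomic on POSIX
```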
In-memory state over database — Single active question + compute-intensive clustering eliminated the need for a database. JSON files provide sufficient crash recovery for hackathon scope while keeping clustering sub-second.
L2-normalized embeddings only for distance calculations — Normalization ensures consistent K-Means distances, but centroid computation uses original embeddings to preserve semantic magnitude.
Dual relevance thresholds (0.40 historical, 0.30 live) — Historical messages require higher standards to avoid pollution; live messages get generous thresholds to avoid false negatives.
Content-based embedding cache — SHA-256 hashing of message content (not message IDs) achieved 85% cache hit rates by reusing embeddings when users rephrase similar ideas.
Centroid-based label selection — Initial approaches used random samples or most-liked messages. Centroid-nearest selection consistently produced more representative labels.
Embedding cache optimization — Initial message-ID-based caching failed to reuse embeddings for rephrased ideas. Switching to content hashing (SHA-256) dramatically improved hit rates but required careful whitespace handling.
Label uniqueness — Early clustering generated duplicates like "good idea" across clusters. Implementing embedding similarity checking (≥0.90 triggers retry) with progressively harder prompts solved this.
Clustering stability — Re-clustering on every new message caused clusters to "jump." Adding message-count-change gating and 1-second debounce stabilized visualization.
Historical relevance tuning — Initial 0.30 threshold pulled in tangential content. Separate thresholds for historical (0.40) vs. live (0.30) messages achieved better precision.
Backend: Python 3.12, FastAPI, OpenAI API (text-embedding-3-small, gpt-4o-mini), scikit-learn (K-Means, Agglomerative Clustering), NumPy, discord.py, WeasyPrint, Uvicorn
Frontend: Next.js 16, React 19, TypeScript, Tailwind CSS v4, Recharts, WebSocket
Infrastructure: Docker (Alpine), Sevalla hosting, Nixpacks, JSON persistence
Team: Oliver Baumgartner (backend & AI), Samel Baumgartner, Sven Messmer, Kimi Löffel
Built for BaselHack 2025 – Endress+Hauser Collective Intelligence Challenge