NeuroForge User Guide
Welcome to NeuroForge, the world's first AI Battle Arena: a platform designed to help you compare, analyze, and challenge the world's most advanced AI models head-to-head in real time.
Whether you're an AI researcher, prompt engineer, or just a curious mind exploring how GPT, Claude, and Gemini think differently, NeuroForge turns model comparison into a visual, interactive experience.
NeuroForge isn't just a chatbot playground — it's a competitive benchmarking system. You send one prompt, and three major AI models respond simultaneously. Each response is automatically evaluated, scored, and ranked.
How It Works
When you enter a prompt in the input box, NeuroForge dispatches it to three AI models in parallel — typically GPT-4o, Claude Opus, and Gemini 2.5.
Each model's reply is collected and displayed side-by-side so you can immediately see how their reasoning, tone, and structure differ.
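As a rough illustration of the parallel dispatch described above, here is a minimal TypeScript sketch. The `ModelReply` shape, the `dispatchToAll` function, and the stub callers are hypothetical names invented for this example; they are not NeuroForge's actual internals, and real provider SDK calls would replace the stubs.

```typescript
// Hypothetical shape of one model's reply; field names are illustrative only.
interface ModelReply {
  model: string;
  text: string;
  elapsedMs: number;
}

// A model caller is any async function that takes a prompt and returns text.
type ModelCaller = (prompt: string) => Promise<string>;

// Send the same prompt to every model at once and wait for all of them.
// A failing model yields an empty reply instead of rejecting the whole round.
async function dispatchToAll(
  prompt: string,
  callers: Record<string, ModelCaller>,
): Promise<ModelReply[]> {
  const tasks = Object.entries(callers).map(async ([model, call]) => {
    const start = Date.now();
    try {
      const text = await call(prompt);
      return { model, text, elapsedMs: Date.now() - start };
    } catch {
      return { model, text: "", elapsedMs: Date.now() - start };
    }
  });
  return Promise.all(tasks);
}

// Stub callers standing in for real provider calls (assumption).
const stubCallers: Record<string, ModelCaller> = {
  gpt: async (p) => `GPT says: ${p}`,
  claude: async (p) => `Claude says: ${p}`,
  gemini: async () => {
    throw new Error("simulated outage");
  },
};
```

Because every call starts before any of them is awaited, the round takes roughly as long as the slowest model rather than the sum of all three.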
Automatic Scoring System
Every response is analyzed by NeuroForge's Scoring Engine, which evaluates six aspects of quality:
- Success – Did the model produce a valid, coherent response without errors?
- Speed – How fast did it generate its output relative to others?
- Completeness – Did it fully address all aspects of your question?
- Accuracy – Are the facts correct and logically consistent?
- Readability – How natural and clear is the text to read?
- Structure – Does it format and present the answer well?
Each metric contributes to a total score out of 100, and the top performer is displayed as the Round Winner with an explanation of why it won.
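To make the "total score out of 100" idea concrete, here is a small sketch of a weighted sum over the six metrics. The weights below are assumptions chosen only so that they add up to 100; this guide does not document the real weights, and `totalScore` and `roundWinner` are hypothetical names.

```typescript
// The six metrics named above, each rated from 0 (worst) to 1 (best).
type Metric =
  | "success"
  | "speed"
  | "completeness"
  | "accuracy"
  | "readability"
  | "structure";
type Ratings = Record<Metric, number>;

// ASSUMED weights that sum to 100; the real Scoring Engine weights may differ.
const WEIGHTS: Ratings = {
  success: 20,
  speed: 10,
  completeness: 20,
  accuracy: 25,
  readability: 15,
  structure: 10,
};

// Weighted sum of the clamped ratings, rounded to a whole number out of 100.
function totalScore(ratings: Ratings): number {
  let total = 0;
  for (const metric of Object.keys(WEIGHTS) as Metric[]) {
    const clamped = Math.min(1, Math.max(0, ratings[metric]));
    total += clamped * WEIGHTS[metric];
  }
  return Math.round(total);
}

// The round winner is the model with the highest total; ties go to the first seen.
function roundWinner(scores: Record<string, Ratings>): string {
  let best = "";
  let bestScore = -1;
  for (const [model, ratings] of Object.entries(scores)) {
    const score = totalScore(ratings);
    if (score > bestScore) {
      best = model;
      bestScore = score;
    }
  }
  return best;
}
```

On self-hosted installs the scoring weights are configurable, so treat the numbers here purely as placeholders.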
Exploring the Interface
The Dashboard Layout
When you log in to NeuroForge, you'll land in your main dashboard. At the top of the screen, you'll find the NeuroForge header bar, which shows your remaining prompts, lets you open the Settings panel, and provides Refresh and Logout options.
The sidebar on the left acts as your thread manager — this is where all your conversations live. Each thread represents an ongoing "battle" or conversation between the models.
Thread Management Features:
- Create a new thread using the + New button
- Rename threads directly by clicking their title
- Delete threads you no longer need
- Revisit older threads to analyze how the models performed in the past
If you start typing a prompt without having a thread selected, NeuroForge will automatically create one and name it after your first input — keeping your workspace organized.
Submitting a Prompt
Let's say you create a new thread called "Debate on Quantum Consciousness". You enter your prompt:
"Explain whether quantum mechanics could play a role in human consciousness."
Hit Continue Battle, and the real magic begins.
All three AI models — GPT, Claude, and Gemini — receive your question at the same moment. They start "thinking" in parallel, and when they finish, their responses appear in neatly designed cards side-by-side.
Response Card Details:
- Model name and chosen personality
- Response time (in milliseconds)
- Tokens used
- Total word count
- Calculated score out of 100
A golden banner appears at the top announcing the winner and summarizing what made it stand out.
Response Interactions
Each model's response isn't just static text; you can interact with it.
Hover over a response and you'll find several small icons in the top-right corner of its card:
- 🔍 Analyze – Triggers an "analysis round" where the other two models critique the selected response and propose improvements
- ⛶ Expand – Opens the full text in a centered modal for distraction-free reading and copying
- 📋 Copy – Instantly copies the entire response to your clipboard
Analysis & Tournament Mode
When you click Analyze, the selected model's response is critiqued by the remaining two models. They evaluate it, point out flaws, and attempt to provide a better version of the answer.
This results in one model being disqualified while the others advance. You can keep analyzing until only one model remains — the tournament winner.
Each step in the tournament consumes one prompt, but thanks to NeuroForge's optimization, you get a full 3-model tournament in just 3 prompt credits total, instead of 6.
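The credit arithmetic can be checked with a short sketch: one credit for the opening battle, plus one per analysis round until a single model remains. The `runTournament` function and its `pickLoser` callback are hypothetical; in the real app the critique decides which model is eliminated.

```typescript
interface TournamentResult {
  winner: string;
  eliminated: string[];
  creditsUsed: number;
}

// Run elimination rounds until one model is left, counting prompt credits.
function runTournament(
  models: string[],
  pickLoser: (remaining: string[]) => string,
): TournamentResult {
  if (models.length === 0) throw new Error("at least one model is required");
  let remaining = [...models];
  const eliminated: string[] = [];
  let creditsUsed = 1; // the opening battle prompt
  while (remaining.length > 1) {
    const loser = pickLoser(remaining);
    if (!remaining.includes(loser)) throw new Error(`unknown model: ${loser}`);
    remaining = remaining.filter((m) => m !== loser);
    eliminated.push(loser);
    creditsUsed += 1; // one credit per analysis round
  }
  return { winner: remaining[0], eliminated, creditsUsed };
}

// Example: Gemini is eliminated first, then Claude.
const eliminationOrder = ["Gemini", "Claude"];
const result = runTournament(
  ["GPT", "Claude", "Gemini"],
  () => eliminationOrder.shift() ?? "",
);
// result.winner is "GPT" and result.creditsUsed is 3
```

With three models there are always exactly two elimination rounds, which is where the total of three credits comes from.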
Thread History & Round Statistics
Every thread retains its full conversation history.
Each round within a thread can be expanded or collapsed to review:
- The models' responses for that round
- The score breakdown for each AI
- The winner banner and reasoning
Older rounds are fully viewable, though the Analyze button only works on the current round — ensuring you don't accidentally trigger new analyses on archived data.
Settings & Personalities
Clicking the Settings button in the top bar opens a detailed configuration panel.
Model Selection
Choose which version of each AI to use:
- GPT: 4o mini, 4o, 4.1, 4.1 mini, 5, 5 mini or 5 nano
- Claude: Sonnet 4, Sonnet 4.5, Opus 4, Opus 4.1 or 3 Haiku
- Gemini: 2.5 Flash or 2.5 Flash-Lite
Personality Customization
Assign different personalities to each model:
| Personality | Description |
|---|---|
| Chess Master | Speaks in strategic metaphors |
| Sarcastic | Dry humor and irony |
| Direct & Blunt | No fluff, just straight answers |
More personalities are available in the app, each written with efficient, carefully engineered prompts.
You can mix and match freely — for instance, pit a Sarcastic Claude against a Logical GPT and a Teacher-like Gemini to see wildly different tones and reasoning.
Prompt Limits & Tiers
NeuroForge uses a credit-based system where each action consumes "prompt units." Each tier of account includes a monthly quota:
| Plan | Monthly Prompts | Ideal For |
|---|---|---|
| Free | 3 | Quick tests |
| Basic | 150 | Regular exploration |
| Pro | 300 | Extended use |
Prompt Consumption Rules
Prompts are consumed whenever:
- You send a new prompt to all 3 AIs (1 prompt)
- You trigger an Analyze action (1 prompt)
Prompts reset automatically every month on your billing or sign-up date and do not roll over.
Global Limits
To ensure stability and fairness, some system-wide constraints are applied:
| Limit | Value | Purpose |
|---|---|---|
| Max Token Limit | 2,500 | Total input + output per model |
| Max Response Time | 30 seconds | Timeout for model replies |
| Max Context Memory | 4 messages | Past messages remembered per thread |
| Max Thread Messages | 50 | Maximum interactions per thread |
| Max Input Length | 5,000 chars | Maximum prompt size |
If you're running a self-hosted version of NeuroForge, all these parameters can be customized or disabled entirely from the environment configuration.
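Here is a minimal sketch of how a request might be checked against these limits before it reaches the models. The numbers are copied from the table above; the `prepareRequest` function, the `ThreadMessage` shape, and the choice to reject a full thread outright are assumptions made for illustration.

```typescript
// Values copied from the Global Limits table above.
const LIMITS = {
  maxInputChars: 5000,
  maxContextMessages: 4,
  maxThreadMessages: 50,
};

// Hypothetical message shape.
interface ThreadMessage {
  role: "user" | "model";
  text: string;
}

type RequestCheck =
  | { ok: true; context: ThreadMessage[] }
  | { ok: false; reason: string };

// Validate the prompt and pick the slice of history that is sent as context.
function prepareRequest(prompt: string, history: ThreadMessage[]): RequestCheck {
  if (prompt.length > LIMITS.maxInputChars) {
    return { ok: false, reason: "Input too long" };
  }
  // ASSUMPTION: a full thread is rejected; the real app may only warn.
  if (history.length >= LIMITS.maxThreadMessages) {
    return { ok: false, reason: "Thread is full; start a new thread" };
  }
  // Only the most recent messages are carried forward as context.
  return { ok: true, context: history.slice(-LIMITS.maxContextMessages) };
}
```

Trimming context to the last four messages keeps token usage predictable, at the cost of the models forgetting earlier parts of a long thread.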
Error Handling
NeuroForge handles errors gracefully. You'll always receive a clear message about what went wrong — no cryptic API dumps or console gibberish.
Common Errors:
- Input too long: An error message appears instantly, and no prompt credit is consumed
- Model timeout: "⚠️ Model exceeded maximum response length or time limit"
- Prompt balance depleted: You are prevented from submitting new prompts
If an error occurs before models respond, your prompt credit is not consumed.
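The timeout and the "no credit on early failure" rule can be sketched as below. This is an assumed implementation using `Promise.race`; `withTimeout` and `battle` are hypothetical helper names, and the sketch handles a single model call for brevity.

```typescript
const MAX_RESPONSE_MS = 30000; // "Max Response Time" from the Global Limits table

// Reject if the wrapped promise does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error("Model exceeded maximum response length or time limit")),
      ms,
    );
  });
  // Clear the timer either way so it cannot keep the process alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// ASSUMED policy: a credit is charged only when the model call succeeds.
async function battle(
  balance: number,
  call: () => Promise<string>,
  ms: number = MAX_RESPONSE_MS,
) {
  try {
    const text = await withTimeout(call(), ms);
    return { balance: balance - 1, text };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { balance, error };
  }
}
```

Note that `Promise.race` only stops waiting; it does not cancel the underlying request, so a production version would likely also pass an abort signal to the provider call.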
Security & Authentication
Access to NeuroForge is managed through secure JWT-based tokens. Each token carries your usage limits and permissions.
- Tokens can expire or be reset by administrators
- HTTPS end-to-end encryption
- Input sanitization to prevent injection attacks
- Strict separation between user and admin panels
Example Workflow
Let's walk through a complete example:
- You log in with your token and see your Welcome Dashboard, confirming your current prompt balance
- You create a thread titled "Explain relativity like a pirate"
- You type your prompt and click Start/Continue Battle
- Within seconds, GPT, Claude, and Gemini all respond — one serious, one comedic, one overly poetic
- NeuroForge scores each and declares a winner
- You click Analyze on Gemini's overly poetic response — Claude and GPT critique it mercilessly, disqualifying Gemini
- You analyze once more, this time selecting Claude's response; Claude is disqualified and GPT emerges as the final winner
- Three prompt credits consumed, dozens of insights gained
An AI experiment turned into a sport.
For Developers & Self-Hosting
If you deploy NeuroForge yourself, you can:
- Host the backend on Node.js/Express with MongoDB
- Build the frontend with React
- Control rate limits, scoring weights, and model endpoints through environment variables
- Integrate your own OpenAI, Anthropic, or Google API keys
For more information, see the Self-Hosting documentation by switching the tab at the top of this page to "Admin Panel (self-hosted)".
Conclusion
NeuroForge isn't just a testing tool — it's a lens into the behavior of modern language models. It helps you see how models reason differently, where they excel, and where they fail — and it makes that analysis visual, measurable, and even fun.
So go ahead — create your first thread, send your first battle prompt, and watch the AIs fight for your approval.
NeuroForge Admin Panel Guide
Self-Hosted Edition
This manual walks you through everything an administrator can do in the NeuroForge Admin Panel when running the self-hosted edition: creating access tokens, managing users in bulk, tuning global limits, reading analytics, auditing activity, and watching server health.
The Admin Panel is a control cockpit for access and usage. It does not send prompts itself; it configures and observes the environment the NeuroForge App runs in.
Generate Token (Manual User Creation)
The token generation system issues signed access tokens carrying the username, max prompts, and optional expiration.
Form Fields
| Field | Description |
|---|---|
| Username | The user's identifier (email or handle) |
| Max Prompts | Total monthly allowance |
| Expiration | ISO date (optional) - token stops working after this date |
After Clicking "Generate Token"
- A brand-new token is returned and shown below the form
- Click Copy and hand it to the user (DM, email, etc.)
If the user loses their token later, you'll need to generate a new one. For security, tokens are not stored for future viewing in the admin panel.
Good to Know
- You can create multiple tokens for the same username if needed (e.g., rolling keys)
- Expired tokens are rejected automatically
- New users become visible in the admin panel only after they log in for the first time with their generated token; that first login also activates the account
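As an illustration of what a token carries and how expiry might be checked, consider the sketch below. The `TokenPayload` field names and the `isExpired` helper are assumptions based on the form fields above, not the actual token schema. Signature verification is deliberately left out; a real deployment would rely on a JWT library for that.

```typescript
// ASSUMED payload shape, mirroring the Generate Token form fields.
interface TokenPayload {
  username: string;
  maxPrompts: number;
  expires?: string; // ISO date such as "2025-12-31"; omitted for no expiry
}

const DAY_MS = 24 * 60 * 60 * 1000;

// A token stops working after its expiration date.
// ASSUMPTION: the date is read as UTC and the whole expiry day is still valid.
function isExpired(payload: TokenPayload, now: Date): boolean {
  if (payload.expires === undefined || payload.expires === "") return false;
  const startOfExpiryDay = Date.parse(payload.expires);
  // An unreadable date is treated as expired to fail closed (assumption).
  if (Number.isNaN(startOfExpiryDay)) return true;
  return now.getTime() >= startOfExpiryDay + DAY_MS;
}
```

Failing closed on an unreadable date is a cautious choice for this sketch: a malformed token is rejected rather than silently treated as non-expiring.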
Manage Users
This is your control room for individual accounts and bulk operations.
User Statistics Explained
| Metric | Description |
|---|---|
| Used | Prompts consumed so far (this billing cycle) |
| Remaining | maxPrompts - used (floors at 0) |
| Total | The current maxPrompts assigned |
| Usage % | round((used / maxPrompts) * 100) |
User Statuses
| Status | Color | Meaning |
|---|---|---|
| Active | 🟢 Green | Can still prompt; not near depletion |
| Low | 🟡 Yellow | Over 80% consumed; consider topping up |
| Depleted | 🔴 Red | 100% used OR deliberately terminated |
| Removed | ⚫ Gray | Permanently disabled; cannot be re-enabled |
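The statistics and status rules in the two tables above can be sketched as a single function. The `UserRecord` shape and the `userStats` name are hypothetical, and treating a zero allowance as 100% usage is an assumption made to avoid dividing by zero.

```typescript
type Status = "Active" | "Low" | "Depleted" | "Removed";

// Hypothetical stored fields for one user.
interface UserRecord {
  used: number;
  maxPrompts: number;
  removed: boolean;
}

function userStats(u: UserRecord) {
  const remaining = Math.max(0, u.maxPrompts - u.used); // floors at 0
  // ASSUMPTION: a zero allowance counts as 100% usage (no division by zero).
  const usagePct =
    u.maxPrompts > 0 ? Math.round((u.used / u.maxPrompts) * 100) : 100;

  let status: Status;
  if (u.removed) status = "Removed";
  else if (remaining === 0) status = "Depleted";
  else if (u.used / u.maxPrompts > 0.8) status = "Low";
  else status = "Active";

  return { used: u.used, remaining, total: u.maxPrompts, usagePct, status };
}
```

The Depleted check uses `remaining` rather than the rounded percentage, so a user at 199 of 200 prompts shows as Low instead of being flagged as Depleted by rounding.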
"Depleted" includes Terminated users because termination sets maxPrompts = 0, making usage effectively 100%.
User Actions
Reset Usage
Sets used → 0, remaining → total, usage% → 0%
Example: was 10/10 used → becomes 0/10 used
Update Prompts
Changes total (maxPrompts) without altering used.
Example: user had used=5, total=10
→ update to total=20
→ now used=5, remaining=15, usage=25%
Terminate
Immediately blocks access: sets maxPrompts = 0, used = 0, status appears as Depleted.
To restore, use Update Prompts to a value > 0 (Reset Usage alone won't help because total=0).
Remove
Permanent ban: status becomes Removed, maxPrompts=0. This prevents the account from being re-created by any existing/old token.
You can hide removed users via filters, but they cannot be reactivated.
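The four actions can be modeled as small pure functions over an account record. This is only a sketch of the behavior described above; the `Account` shape and function names are hypothetical, and the guard that leaves removed accounts untouched is an assumption drawn from "cannot be reactivated".

```typescript
// Hypothetical stored fields for one account.
interface Account {
  used: number;
  maxPrompts: number;
  removed: boolean;
}

// Reset Usage: used goes back to 0, total is unchanged.
function resetUsage(a: Account): Account {
  return a.removed ? a : { ...a, used: 0 };
}

// Update Prompts: changes the total without altering used.
function updatePrompts(a: Account, total: number): Account {
  return a.removed ? a : { ...a, maxPrompts: total };
}

// Terminate: blocks access by zeroing both counters; reversible.
function terminateUser(a: Account): Account {
  return { ...a, used: 0, maxPrompts: 0 };
}

// Remove: permanent; later updates are ignored by the guards above.
function removeUser(a: Account): Account {
  return { ...a, maxPrompts: 0, removed: true };
}
```

Returning new objects instead of mutating makes it easy to log the before and after state of each admin action.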
Filters, Views, and Utilities
- Search by username
- Filters: status toggles (All, Active, Low, Depleted, Removed) and usage % ranges
- Card/Table view toggle
- Refresh list
- Export CSV: downloads the currently filtered list
CSV Export Columns:
User, Prompts Used, Prompts Remaining, Total Prompts, % Usage, Status
Import CSV (Bulk User Creation)
- Click Import CSV
- Download the Example CSV to see the required format:

      user,maxPrompts,expires
      jane@acme.com,150,2025-12-31
      dev-team,300,

  expires is optional (leave empty for no expiry)
- Upload your file. You'll see:
  - A progress bar
  - Validation errors (e.g., "row 3: invalid date format")
  - On success, a result table with a new Token column for each user
- Click Download updated CSV with tokens to save the full list
Tokens shown here can be viewed only once. If you close the popup without downloading, you won't be able to retrieve the same tokens again.
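If you want to pre-check a file before uploading it, a validation pass along these lines may help. It is a simplified sketch, not the importer's actual logic: `parseImportCsv` is a hypothetical name, fields are split on plain commas (no quoted-field support), and blank lines are skipped before rows are numbered.

```typescript
interface ImportRow {
  user: string;
  maxPrompts: number;
  expires?: string;
}

interface ImportResult {
  rows: ImportRow[];
  errors: string[];
}

const EXPECTED_HEADER = "user,maxPrompts,expires";
const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

function parseImportCsv(csv: string): ImportResult {
  const lines = csv.split(/\r?\n/).filter((line) => line.trim() !== "");
  const result: ImportResult = { rows: [], errors: [] };
  if (lines.length === 0 || lines[0].trim() !== EXPECTED_HEADER) {
    result.errors.push(`row 1: header must be exactly "${EXPECTED_HEADER}"`);
    return result;
  }
  lines.slice(1).forEach((line, index) => {
    const rowNo = index + 2; // the header is row 1
    const [user = "", max = "", expires = ""] = line
      .split(",")
      .map((cell) => cell.trim());
    const maxPrompts = Number(max);
    if (user === "") {
      result.errors.push(`row ${rowNo}: missing user`);
    } else if (max === "" || !Number.isInteger(maxPrompts) || maxPrompts < 0) {
      result.errors.push(`row ${rowNo}: invalid maxPrompts`);
    } else if (
      expires !== "" &&
      (!ISO_DATE.test(expires) || Number.isNaN(Date.parse(expires)))
    ) {
      result.errors.push(`row ${rowNo}: invalid date format`);
    } else {
      result.rows.push(
        expires === "" ? { user, maxPrompts } : { user, maxPrompts, expires },
      );
    }
  });
  return result;
}
```

Collecting every error instead of stopping at the first one lets you fix a whole file in a single pass.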
Global Limits
These are platform-wide guardrails that apply to all users and all models. Saving a change triggers a quick backend restart so the new values take effect.
Token Limits
| Setting | Description |
|---|---|
| MAX_TOTAL_TOKENS | Ceiling for input + output tokens per response |
| MAX_OUTPUT_TOKENS | Hard cap for model output length (subset of total) |
| GEMINI_OUTPUT_TOKENS | Per-model output cap for Gemini (Gemini tends to be verbose) |
| GEMINI_OUTPUT_ACCEPTABLE | Tolerance limit (Gemini often ignores the requested limit) |
History & Context Limits
| Setting | Description |
|---|---|
| MAX_CONTEXT_MESSAGES | How many prior messages are carried forward per thread |
| MAX_MESSAGE_CHARS | Max chars per past message kept in context |
Thread Limits
| Setting | Description |
|---|---|
| MAX_THREAD_MESSAGES | Soft cap for thread length (users should start a new thread after reaching this) |
Defaults are production-safe. Raise gradually and monitor Server Monitor and Analytics for impact. Saving limits triggers a fast backend restart (seconds).
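For teams that set these values through environment variables, a loader with basic sanity checks might look like the sketch below. The setting names come from the tables above, but `loadLimits` is a hypothetical helper, the `MAX_OUTPUT_TOKENS` default is an assumption, and the other defaults are taken from the user-facing Global Limits table. Disabling a limit entirely is not modeled here.

```typescript
interface GlobalLimits {
  MAX_TOTAL_TOKENS: number;
  MAX_OUTPUT_TOKENS: number;
  MAX_CONTEXT_MESSAGES: number;
  MAX_THREAD_MESSAGES: number;
}

// MAX_OUTPUT_TOKENS default is an ASSUMPTION; the rest mirror the user guide.
const DEFAULT_LIMITS: GlobalLimits = {
  MAX_TOTAL_TOKENS: 2500,
  MAX_OUTPUT_TOKENS: 1000,
  MAX_CONTEXT_MESSAGES: 4,
  MAX_THREAD_MESSAGES: 50,
};

// Read overrides from an env-like object and validate them.
function loadLimits(env: Record<string, string | undefined>): GlobalLimits {
  const limits: GlobalLimits = { ...DEFAULT_LIMITS };
  for (const key of Object.keys(DEFAULT_LIMITS) as (keyof GlobalLimits)[]) {
    const raw = env[key];
    if (raw === undefined || raw.trim() === "") continue; // keep the default
    const value = Number(raw);
    if (!Number.isInteger(value) || value <= 0) {
      throw new Error(`${key} must be a positive integer, got "${raw}"`);
    }
    limits[key] = value;
  }
  // Output is a subset of the total budget, so it can never exceed it.
  if (limits.MAX_OUTPUT_TOKENS > limits.MAX_TOTAL_TOKENS) {
    throw new Error("MAX_OUTPUT_TOKENS cannot exceed MAX_TOTAL_TOKENS");
  }
  return limits;
}
```

In a Node.js process you would typically call `loadLimits(process.env)` once at startup, which fits the restart-on-save behavior described above.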
Analytics
The Analytics tab gives you a visual pulse of the system:
Available Charts
- KPI Cards: Total users, active users, removed, depleted/terminated
- Status Distribution (pie): How your user base segments across statuses
- Top 10 by Prompts Used (bar): Your highest-volume accounts
- New Users per Day (line): Last 14 days growth curve
- Total Prompts: Used vs Remaining (bars)
Watch the "Top 10" and "Used vs Remaining" alongside Global Limits—they'll tell you when to raise ceilings or add capacity.
Export Analytics CSV
One click to snapshot your current view for reporting.
System Logs (Audit Trail)
Everything important gets recorded here. You'll see:
- Admin actions: generate token, update prompts, reset usage, terminate/remove user, clear logs, import/export CSV, etc.
- System events: logins/logouts, backend restarts (from limit changes), rate-limit denials
- User signals: notable user-level system events (if enabled)
- Error events: parse failures, bad CSV rows, invalid tokens, etc.
Log Features
- Search by text
- Filter by type (All/Admin/System/User/Error)
- Export current logs to JSON
- Clear Logs (with confirmation)
Example Log Entries:
SYSTEM / Login – "Admin logged in successfully"
ADMIN / Update Prompts – "User: alice, New Prompts: 200"
ERROR / Import CSV – "Row 4: unknown column 'maxPromptz'"
Server Monitor
A clean, real-time overview for both processes:
Admin Panel Metrics (Port 3002)
- CPU Load – % usage over the last sample window
- Memory – Current RSS in MB
- MongoDB – Connection status + pool count
- API Speed – Rolling average response time
- Sessions – Active admin sessions
- Uptime – Since process start
- Node / Platform / CPU cores – Environment basics
Main Backend Metrics (Port 3001)
- Similar metrics as admin panel
- Also shows Active Threads (open user conversations)
- Shows Total Users count
Live Offline Detection
If your backend is down, it is flagged here as system degraded and shown as offline.
Detection is enabled by default and polls every 5 seconds. Use Refresh Now for a manual check.
If API Speed rises and CPU/Memory aren't pegged, you're likely I/O constrained (e.g., network to model providers). If CPU/Memory peg, scale vertically or horizontally.
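The rule of thumb above can be written down as a tiny triage function. The thresholds, the `Sample` shape, and the use of memory as a percentage (the monitor itself reports RSS in MB) are all assumptions for illustration.

```typescript
// Hypothetical snapshot of the Server Monitor readings.
interface Sample {
  apiSpeedMs: number; // rolling average response time
  cpuPct: number; // CPU load in percent
  memoryPct: number; // ASSUMPTION: RSS expressed as a share of available RAM
}

type Diagnosis = "healthy" | "io-bound" | "resource-bound";

// ASSUMED thresholds: 2000 ms counts as slow, 90% counts as pegged.
function diagnose(s: Sample, slowMs = 2000, peggedPct = 90): Diagnosis {
  const pegged = s.cpuPct >= peggedPct || s.memoryPct >= peggedPct;
  if (pegged) return "resource-bound"; // scale vertically or horizontally
  if (s.apiSpeedMs >= slowMs) return "io-bound"; // likely waiting on providers
  return "healthy";
}
```

Tune both thresholds to your own baseline; the right values depend on instance size and on which model providers you call.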
Limits, Billing, and Prompt Consumption
How It Really Works
- One end-to-end battle prompt (all three models respond) consumes 1 prompt total
- Tournament/Analysis stages consume 1 prompt per analysis round
- The full prompt→analysis→final flow costs 3 prompts, not 6
Cycles & Resets
- User allowances reset monthly on the user's billing date
- Resetting an individual user in Manage Users sets used=0 without changing their total
- Global hard limits (timeouts, tokens, memory) apply regardless of the user's remaining balance
Security & Roles
- Tokens are bearer credentials. Treat them like passwords.
- Removed accounts cannot be re-created by any old token (we mark & block).
- Terminate is reversible (set prompts > 0). Remove is final.
Best Practices
- Prefer expiring tokens for contractors/trials
- Rotate tokens if one is suspected compromised
- Export logs regularly and store them off-box
Troubleshooting
User Can't Log In
- Check System Logs for SYSTEM / Login errors
- Confirm the token has not expired and the user's status is not Removed or Terminated
- If Terminated, use Update Prompts to restore > 0
"Tokens Exceeded" or "Took Too Long" Errors
- Raise MAX_TOTAL_TOKENS or MAX_OUTPUT_TOKENS gradually
- If latency hits the 30s timeout, try lowering output caps or simplifying prompts
Import CSV Fails
- Download the Example CSV and match headers exactly: user,maxPrompts,expires
- Dates must be ISO (YYYY-MM-DD)
- Keep under a few thousand rows for smooth UX; if you plan more, split files
Analytics Show 0 But Users Exist
- Click Refresh
- Check that /api/users returns data (network/auth)
- If you recently changed Global Limits, wait a moment for the backend restart
Backup & Maintenance Tips
- Logs are in your admin app's /logs directory. Snapshot them periodically
- Use PM2 (or your process manager) to keep both Admin and Backend online
- After OS updates or Node upgrades, validate Server Monitor shows both services healthy
Requirements (Self-Hosted)
| Component | Requirement |
|---|---|
| Node.js | v18.20.8+ |
| CPU | ≥ 2 cores (scale with traffic) |
| RAM | ≥ 2 GB (4 GB recommended) |
| Disk | ≥ 500 MB (plus headroom for logs/data) |
| Database | MongoDB or MySQL |
| OS | Linux (Ubuntu, Alma, Amazon Linux, etc.) |
AWS EC2 is a great fit for hosting NeuroForge.
Support, Licensing, and Customization
- Your one-time license covers lifetime use and updates
- We provide install, update, and usage guides for both the Admin Panel and the NeuroForge App
- We also include developer docs (code structure and endpoints) for teams who want to extend NeuroForge
If you fork/modify core code, we can't guarantee support for the modified version. Keep a vanilla backup if you plan experiments.
License integrity must remain intact; tampering leads to a permanent ban.
Real-World Scenarios
A. "Client used up everything mid-demo."
Solution: Manage Users → search client → Update Prompts from 50 → 100 (keeps their used count intact), then click Reset Usage if you want to grant a clean slate.
B. "We offboarded a freelancer."
Solution: Manage Users → Terminate to pause immediately. If you need an irrevocable block (and to prevent token resurrection), Remove. Hide removed users via filter.
C. "We're shipping a long report; responses keep getting cut."
Solution: Global Limits → raise MAX_OUTPUT_TOKENS in steps of 250, keep MAX_TOTAL_TOKENS reasonable, and watch Server Monitor. If response times climb, back off by 100–200.
D. "We imported 100 users and ops wants a list of all their tokens."
Solution: Use Import CSV with your master list → validate → success → Download updated CSV with tokens → store securely (password-vault or encrypted share).
Conclusion
The NeuroForge Admin Panel gives you complete control over your self-hosted AI Battle Arena. Use it wisely, monitor regularly, and your users will have a seamless experience.
For technical support or custom development inquiries, contact your account manager or visit our support portal.