ConceptViz: User Guide & Tutorial

📋 Table of Contents

🚀 Getting Started
📖 Complete Tutorial: Exploring Superhero Features
🆘 Troubleshooting

🚀 Getting Started

System Requirements

⚠️ Display Requirements: This application is optimized for 2K displays (2560×1440). Other resolutions may result in layout issues.

Installation & Setup

1. Backend Setup

cd backend
python run.py
# Backend will run on http://localhost:5000

2. Port Mapping (Required)

# Map backend's port 5000 to frontend's expected port 6006
ssh -L 6006:localhost:5000 localhost
# Keep this terminal open while using the system

3. Frontend Setup

cd frontend
npm run dev
# Frontend will run on http://localhost:3000

💡 Quick Check: Open http://localhost:3000 in your browser. You should see the ConceptViz interface load successfully.
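
If the interface does not load, it can help to confirm that each of the three ports mentioned above is accepting connections. The following is a minimal sketch using only the Python standard library. It is not part of ConceptViz; the helper names `port_open` and `check_services` are hypothetical, and the port numbers are simply the ones listed in the setup steps.

```python
import socket

# Ports taken from the setup steps above; the labels are only descriptive.
SERVICES = {
    "backend": 5000,
    "port mapping (SSH tunnel)": 6006,
    "frontend": 3000,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host: str = "localhost") -> dict:
    """Map each service label to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name:<28} {'reachable' if ok else 'NOT reachable'}")
```

A port reported as not reachable usually points to the step that needs attention: 5000 for the backend, 6006 for the SSH port mapping, 3000 for the frontend.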

📖 Complete Tutorial: Exploring Superhero Features

Follow this comprehensive example to explore superhero-related concepts in Large Language Models using ConceptViz's Identification → Interpretation → Validation workflow. This tutorial demonstrates how to discover, analyze, and validate meaningful concept representations in LLMs.

🎮 Interactive Demo

Experience the complete workflow hands-on with our live demonstration:

🚀 Launch Interactive Demo

💡 Tip: Open the demo in a separate tab and follow along with the tutorial steps below to see each visualization fill in progressively!

📋 Demo Note: This demonstration uses pre-computed data from the superhero analysis for easy exploration. For full functionality with custom queries and models, please deploy ConceptViz locally following our installation guide.

🔍 Concept Query View - Define Your Concept

What you'll do: Start by querying "superhero" to explore superhero-related features in the model.

🎮 Try it: Launch the demo and enter "superhero" in the query box to begin.

Query optimization: Enter "superhero" and observe system suggestions for query refinement

Enter your concept: Type "superhero" in the query input field
Observe optimization: The system automatically suggests query improvements based on semantic analysis
Model selection: Use the dropdown menu to switch between different base models if needed

💡 Pro Tip: The system's query optimization helps refine ambiguous concepts. For "superhero", it might suggest more specific terms like "superhero characters and narratives" to improve feature matching.

🎯 SAE Discovery View - Find Relevant Models

What you'll do: Select the most relevant SAE model based on concept-relevance metrics across different layers.

🎮 Try it: In the demo, examine the layer rankings and click on layer 11 to load the corresponding SAE.

SAE selection: Choose the most relevant SAE model for superhero concepts

Review layer rankings: Examine the heat map showing concept relevance across all network layers
Identify optimal layers: Look for layers with the highest relevance scores (darker blue bars)
Select SAE model: Click the layer you are interested in to load the corresponding SAE
Proceed to exploration: Move to feature exploration once the SAE is loaded
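
Conceptually, the layer ranking amounts to sorting layers by a per-layer relevance score and picking the top one. The sketch below illustrates that idea only. The scores are invented for illustration and are not ConceptViz output, and `rank_layers` is a hypothetical helper, not a function from the tool.

```python
# Hypothetical per-layer relevance scores for the query "superhero".
# The numbers are made up for illustration; they are not ConceptViz output.
layer_scores = {9: 0.41, 10: 0.47, 11: 0.63, 12: 0.52}


def rank_layers(scores: dict[int, float]) -> list[tuple[int, float]]:
    """Return (layer, score) pairs sorted from most to least relevant."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_layers(layer_scores)
best_layer, best_score = ranking[0]
print(f"Most relevant layer: {best_layer} (score {best_score:.2f})")
```

In the interface, the same decision is made visually by picking the layer with the darkest bar.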

💡 Advanced Tip: Click on the "Metrics" column header to access additional pre-computed SAE metrics for more detailed model comparison and custom selection criteria.

🗺️ Feature Explorer View - Navigate the Semantic Space

What you'll do: Explore the 2D semantic space to identify superhero-related features and understand their clustering patterns.

🎮 Try it: In the demo, navigate the 2D visualization, use the sidebar to explore features ranked by similarity to your superhero query, then select the top-ranked feature, 6610.

Feature selection: Navigate the semantic space and select features of interest

Navigate the visualization: Pan and zoom to explore different semantic regions
Identify concept clusters: Look for areas where superhero-relevant features (blue points) concentrate
Select features: Click on individual feature points to examine their details
Use semantic labels: Reference cluster topic labels like "individuals", "power", or "spiritual"

Sidebar exploration: Use the expandable sidebar to browse features ranked by similarity

Expand sidebar: Open the left sidebar to see features ranked by similarity to your query
Scroll through rankings: Browse features ordered by relevance scores
Observe highlighting: Notice how the main view highlights the currently focused feature in red

💡 Navigation Strategy: Focus on the darker blue points in the main visualization - these represent features most relevant to your query. Use the sidebar to explore the complete top-K feature rankings and quickly jump to specific highly-ranked features.
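
One common way to rank features by similarity to a query is cosine similarity between a query embedding and one embedding per feature. The guide does not state which measure ConceptViz uses, so the sketch below is an assumption for illustration. The 3-dimensional vectors are toy data, and every feature ID except 6610 is invented.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_features(query_vec, feature_vecs, k=3):
    """Return the k (feature_id, similarity) pairs most similar to the query."""
    scored = [(fid, cosine_similarity(query_vec, vec))
              for fid, vec in feature_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (assumed, not real model data).
query = [1.0, 0.0, 0.0]
features = {
    6610: [0.9, 0.1, 0.0],
    1234: [0.1, 0.9, 0.0],
    4321: [0.5, 0.5, 0.0],
    777: [-1.0, 0.0, 0.0],
}

for fid, sim in top_k_features(query, features, k=3):
    print(f"feature {fid}: similarity {sim:.3f}")
```

The sidebar ranking corresponds to the sorted list, and the color intensity of the points in the main view corresponds to the similarity values.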

🔬 Feature Details View - Deep Dive Analysis

What you'll do: Examine detailed semantic information about selected superhero-related features.

🎮 Try it: In the demo, click on a feature point to open the detailed analysis panel and examine vocabulary projections and activation patterns. When you are done, remember to click the 'VALIDATE' button.

Feature analysis: Examine vocabulary projections, activation patterns, and explanations

Read feature explanations: Review the automated description (e.g., "references to superhero characters and their narratives")
Examine vocabulary space: Check which tokens this feature most strongly promotes or suppresses in the model's predictions
Analyze activation patterns: Use the activation-similarity matrix to identify potential explanation discrepancies
Review token statistics: Study the maximum activation tokens to understand feature behavior
Identify anomalies: Look for high-activation samples with low semantic similarity to the explanation
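
The anomaly check in the last step can be thought of as a simple filter: flag samples where the feature fires strongly but the text has little to do with the explanation. The sketch below shows that logic under assumptions. The `Sample` type, the thresholds, and the example values are all hypothetical, and both scores are assumed to lie between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    text: str
    activation: float   # feature activation on this sample (assumed 0..1)
    similarity: float   # semantic similarity to the explanation (assumed 0..1)


def find_discrepancies(samples, min_activation=0.7, max_similarity=0.3):
    """Return samples with high activation but low similarity to the explanation."""
    return [s for s in samples
            if s.activation >= min_activation and s.similarity <= max_similarity]


# Invented example data, not taken from ConceptViz.
samples = [
    Sample("Batman saves Gotham again", 0.92, 0.85),
    Sample("The stock market closed higher", 0.81, 0.12),
    Sample("Superman flew over the city", 0.88, 0.90),
    Sample("A quiet walk in the park", 0.05, 0.10),
]

for s in find_discrepancies(samples):
    print(f"Possible explanation mismatch: {s.text!r}")
```

Samples flagged this way suggest that the automated explanation does not fully describe what the feature responds to.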

⚡ Input Activation View - Test with Custom Inputs

What you'll do: Validate feature behavior by testing with custom superhero-related text inputs.

🎮 Try it: In the demo, test custom superhero inputs to see token activations and co-activating features. (Note that the demo uses preset data for this view, but feel free to enter your own text anyway.)

💡 Before Testing: Click the "Validation" button in the Feature Details view to add the current feature to your test history. This tells the system which feature to validate during input testing.

Custom input testing: Test feature responses to superhero-related prompts and discover co-activating features

Enter test sentences: Type superhero-related text like "My favorite hero is Batman"
Observe token activations: See which words trigger the strongest feature responses
Select interesting tokens: Click on tokens with high activation (e.g., "Superman", "Batman")
Discover co-activating features: The system highlights other features that respond to the selected tokens
Switch between features: Use the test history dropdown to compare different feature responses

Before: Initial Feature Explorer view

After: Highlighted co-activating features using bubble sets and red triangles

📝 Feature Discovery: When you select tokens like "Superman", the system reveals other features that also activate for superhero names, helping you discover semantically related feature sets and understand the model's internal representation of heroic concepts.
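
To make "co-activating features" concrete, the sketch below computes feature activations for one token with a simplified SAE encoder, ReLU(W·x + b), and lists the other features that are active alongside the focused one. This is an illustration under assumptions: real SAEs often include extra terms such as a decoder bias, the weights here are toy values, and `sae_encode` and `co_activating` are hypothetical helpers rather than ConceptViz functions.

```python
def sae_encode(x, w_enc, b_enc):
    """Simplified SAE encoder: ReLU(w_enc @ x + b_enc), one value per feature."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_enc, b_enc)
    ]


def co_activating(activations, focus, threshold=0.5):
    """Indices of features, other than `focus`, with activation >= threshold."""
    return [i for i, a in enumerate(activations)
            if i != focus and a >= threshold]


# Toy hidden state for one selected token and toy encoder weights (4 features).
token_hidden = [1.0, 2.0]
w_enc = [
    [1.0, 0.0],
    [0.0, 1.0],
    [-1.0, 0.0],
    [0.5, 0.5],
]
b_enc = [0.0, -1.0, 0.0, -1.0]

acts = sae_encode(token_hidden, w_enc, b_enc)
print("activations:", acts)
print("co-activating with feature 0:", co_activating(acts, focus=0))
```

In the interface, the features returned by the second step are the ones marked with bubble sets and red triangles in the Feature Explorer.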

🎛️ Output Steering View - Causal Validation

What you'll do: Verify causal relationships by manipulating feature activations and observing their impact on model outputs. (The demo uses pre-computed data for this view as well.)

🎮 Try it: In the demo, set up steering prompts and adjust activation sliders to observe how feature manipulation affects text generation.

Activation steering: Control feature influence and observe effects on text generation

Set up steering prompt: Enter an incomplete sentence like "My favorite hero is"
Configure steering strength: Adjust sliders to set different activation levels (positive/negative)
Generate completions: Observe how different steering strengths affect the output
Compare results: Analyze differences between steered and unsteered generations
Validate causality: Confirm that steering toward superhero features produces superhero-related completions

⚠️ Expected Results: With positive steering on superhero features, you should see completions like "Dr. Strange" or "Spider-Man". Negative steering should reduce superhero-related completions, demonstrating the feature's causal influence on model behavior.

💡 Advanced Analysis: Test multiple prompts and steering strengths to build confidence in your feature interpretations.
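
A common formulation of activation steering adds a scaled feature direction to a hidden state before generation continues. The guide does not describe how ConceptViz implements steering internally, so the sketch below is an assumption that only illustrates the arithmetic behind the sliders. The vectors are toy values and `steer` is a hypothetical helper.

```python
def steer(hidden, direction, strength):
    """Return hidden + strength * direction, computed element-wise."""
    return [h + strength * d for h, d in zip(hidden, direction)]


def dot(a, b):
    """Dot product, used here to measure how much of the direction is present."""
    return sum(x * y for x, y in zip(a, b))


# Toy hidden state and toy feature direction (assumed, not real model data).
hidden = [0.25, -0.5, 0.5]
direction = [1.0, 0.0, -1.0]

# Negative, zero, and positive strengths mirror the slider positions.
for strength in (-2.0, 0.0, 2.0):
    steered = steer(hidden, direction, strength)
    print(f"strength {strength:+.1f}: steered={steered}, "
          f"component along feature={dot(steered, direction):+.2f}")
```

Positive strengths increase the component along the feature direction and negative strengths decrease it, which is the effect the steered completions are meant to reveal.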

🎯 Tutorial Summary: You've now completed the full ConceptViz workflow! You've successfully:

Identified superhero-related features through concept querying and SAE discovery
Interpreted feature semantics through spatial exploration and detailed analysis
Validated feature behavior through custom input testing and causal steering

This methodology can be applied to explore any concept of interest in Large Language Models.

🆘 Troubleshooting

Common Issues

❌ Frontend can't connect to backend

🖥️ Display issues (layout problems)