Complete Jupyter AI with FAISS and Ollama on Milk-V Megrez 32GB - Full Guide
Platform: Milk-V Megrez (32GB RISC-V)
OS: Debian (RockOS)
I've successfully deployed a full AI development environment on the Milk-V Megrez 32GB board. Here's my complete guide and lessons learned. It took me several attempts, so treat this document as a guide rather than an exact recipe. My setup works like this: an SSH terminal connects to the Megrez, and my laptop's browser accesses Jupyter Lab running on the board. Jupyter AI on the board uses the Ollama instance on my laptop (the same one behind my OpenWebUI interface), so everything stays local. It was painful to get working, but I feel it was worth the effort.
WHAT WORKS:
- Jupyter AI with magic commands
- FAISS vector database (built from source)
- Ollama with DeepSeek-Coder model (runs on my laptop, used by jupyter ai on the megrez board)
- OpenWebUI (with port mapping adjustments) (runs on my laptop)
- Full RISC-V compatibility
THE BUILD PROCESS: CRITICAL STEPS
- The Jupyter Shell Build Trick
The Problem: Jupyter AI has complex dependencies that conflict when building normally on RISC-V.
The Solution: Build within a Jupyter shell session:
First, establish Rust environment
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env
THEN launch Jupyter and build within the notebook
jupyter lab --ip=0.0.0.0 --no-browser --port=8888
In a Jupyter shell:
pip install jupyter-ai --no-deps
CRITICAL STEP AFTER INSTALLATION: Exit the session
!exit
Then run in a terminal:
pip install faiss-cpu --no-deps
jupyter lab build
Why this works: Jupyter provides a stabilized environment that handles dependency resolution better than bare shell. The jupyter lab build command rebuilds the frontend with the new AI extensions.
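To confirm the rebuild actually picked up the extension, the standard Jupyter listing commands are a quick check (the exact extension names vary by jupyter-ai version, so the expected entries are my assumption):
jupyter labextension list
jupyter server extension list
Look for a jupyter-ai entry in the first and jupyter_ai enabled in the second.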
- FAISS Build Process for RISC-V
Building FAISS from source is mandatory since no pre-built wheels exist for RISC-V:
Clone and build
git clone https://github.com/facebookresearch/faiss.git
cd faiss
mkdir build && cd build
Critical CMake configuration for RISC-V
cmake .. -DFAISS_ENABLE_PYTHON=ON -DFAISS_OPT_LEVEL=generic \
  -DCMAKE_CXX_FLAGS="-march=rv64gc" -DBUILD_TESTING=OFF
Build with limited cores (memory constraints)
make -j4
Install Python bindings
cd faiss/python
pip install .
Key Flags:
- -march=rv64gc: Optimizes for the Megrez's specific RISC-V architecture
- -j4: Uses 4 cores (adjust based on your RAM)
- -DBUILD_TESTING=OFF: Reduces build dependencies
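Once the bindings are installed, a quick smoke test confirms the build actually works. This is a minimal sketch assuming NumPy is available; the dimensions and data are arbitrary:
import numpy as np
import faiss

d = 64                                              # vector dimensionality
xb = np.random.random((1000, d)).astype("float32")  # 1000 random database vectors
index = faiss.IndexFlatL2(d)                        # exact L2 search, no training required
index.add(xb)
distances, ids = index.search(xb[:5], k=3)          # query with the first 5 vectors
print(ids)  # each query should return itself as its own nearest neighbor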
- Ollama Setup & Testing
Installation:
curl -fsSL https://ollama.ai/install.sh | sh
ollama serve &
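One detail that matters for a split setup like mine: Ollama binds to 127.0.0.1 by default, so when it runs on the laptop and the Megrez needs to reach it over the LAN, set OLLAMA_HOST before launching (on the machine hosting Ollama):
OLLAMA_HOST=0.0.0.0 ollama serve &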
Model Deployment:
Pull DeepSeek-Coder
ollama pull deepseek-coder:latest
Test directly
ollama run deepseek-coder "Write factorial function in Python"
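Since Jupyter AI on the board reaches Ollama over HTTP, it's worth testing the REST endpoint directly as well. Replace localhost with the laptop's IP when calling from the Megrez:
curl http://localhost:11434/api/generate -d '{"model": "deepseek-coder:latest", "prompt": "Write factorial function in Python", "stream": false}'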
- PORT OWNERSHIP WARNING: Only One Process Per Port
This is critical: Only one process can bind to a port at a time. If you’re running multiple services, you must:
- Use different ports for each service
- OR use reverse proxy configuration
- OR stop one service before starting another
Common port conflicts:
- Jupyter Lab: 8888
- Ollama: 11434
- OpenWebUI: 8080
Solution: Map services to different ports or use a reverse proxy.
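Before starting a service, you can check which process (if any) already owns a port with standard Debian tools (sudo is needed to see process names):
sudo ss -tlnp | grep -E ':(8888|11434|8080)'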
- OpenWebUI Configuration with Port Mapping
I got OpenWebUI working but it required careful port management:
Install OpenWebUI
curl -fsSL https://openwebui.com/install.sh | bash
But you must modify the port if other services are running
Edit the docker-compose or systemd service to use a different port
Example: Change OpenWebUI to port 8081
In docker-compose.yml or service configuration:
ports:
- "8081:8080"
Then access at http://your-ip:8081
The key is ensuring no port conflicts between Jupyter, Ollama, and OpenWebUI.
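For reference, here is a minimal sketch of the compose service I ended up with. The image tag and the OLLAMA_BASE_URL value are assumptions for illustration; point OLLAMA_BASE_URL at whatever host and port your Ollama instance actually listens on:
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "8081:8080"   # host port 8081 avoids the conflicts listed above
    environment:
      - OLLAMA_BASE_URL=http://192.168.1.50:11434   # hypothetical LAN address of the Ollama host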
- Network Configuration: Base IP Setting
Critical for Jupyter AI: You MUST set the base IP in configuration:
~/.jupyter/jupyter_server_config.py
c.ServerApp.ip = '0.0.0.0'  # Listen on all interfaces
c.ServerApp.allow_origin = '*'  # Allow cross-origin
c.ServerApp.base_url = '/your-prefix'  # If behind reverse proxy
For Ollama integration
c.AiHandler.model_provider = "ollama"
c.AiHandler.model_name = "deepseek-coder:latest"
Why this matters: Jupyter AI makes internal API calls that fail if IP binding isn’t correct.
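A quick way to verify the binding from another machine: if the IP configuration is right you should get an HTTP response (typically an authentication error, which is fine); if it is wrong you get connection refused. The /api/status endpoint is part of the standard Jupyter Server REST API:
curl -i http://your-ip:8888/api/status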
CURRENT LIMITATION: /generate COMMAND NOT WORKING
Status: The /generate command fails with Pydantic validation errors despite everything else working.
Error Pattern:
ValidationError: 1 validation error for Outline
Input should be a valid dictionary or instance of Outline
Workaround: Use direct code generation:
Instead of /generate, use:
%%ai
Write a notebook about machine learning basics
Or use manual implementation:
def manual_generate(prompt):
    # Your custom implementation here, e.g. a direct Ollama call
    generated_code = ask_ollama(prompt)  # helper sketched below
    return generated_code
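The ask_ollama helper used here and in the troubleshooting snippets below is not part of Jupyter AI; it's a small function I'm sketching against Ollama's REST API (assumes the requests package and the default Ollama port; adjust host for a remote instance):
import requests

def ask_ollama(prompt, model="deepseek-coder:latest", host="http://localhost:11434"):
    # Send a prompt to Ollama's /api/generate endpoint and return the full reply
    resp = requests.post(
        f"{host}/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,  # large models on modest hardware can be slow
    )
    resp.raise_for_status()
    return resp.json()["response"]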
AI-POWERED TROUBLESHOOTING METHODOLOGY
When you hit walls, use AI to help troubleshoot:
- Error Analysis with AI
error_message = """Your full error message here"""
prompt = f"""Analyze this error and suggest RISC-V specific solutions: {error_message}"""
response = ask_ollama(prompt)
print(response)
- Dependency Resolution
prompt = """Resolve these conflicting dependencies for RISC-V:
- package-a==1.2 requires libxyz>=2.0
- package-b==3.4 requires libxyz<=1.8
Suggest specific version combinations that might work."""
- Build Optimization
prompt = """Optimize this CMake command for RISC-V with 32GB RAM:
cmake .. -DFAISS_ENABLE_PYTHON=ON -DCMAKE_BUILD_TYPE=Release
Suggest flags for our architecture."""
SUCCESSFUL WORKING COMPONENTS
What DEFINITELY works:
- FAISS vector database with similarity search
- Ollama with DeepSeek-Coder model
- Jupyter Lab with AI magic commands (%ai)
- Direct Ollama API integration
- OpenWebUI with proper port mapping
- Full RISC-V compatibility
What needs work:
- /generate command implementation
- Some Jupyter AI advanced features
STEP-BY-STEP DEPLOYMENT GUIDE
- Start with clean Debian installation
- Install Rust environment first
- Build FAISS from source with RISC-V flags
- Install Ollama and pull DeepSeek-Coder
- Install Jupyter AI within Jupyter shell
- EXIT AND RUN: jupyter lab build
- Configure network settings properly
- Set up OpenWebUI with unique port mapping
- Test with direct API calls before UI
KEY LESSONS LEARNED
- Build order matters - Rust first, then Python, then Jupyter
- Memory management is easier with 32GB, but large builds still need care (limit parallel jobs)
- Port management is critical - only one process per port
- The jupyter lab build step after installation is mandatory
- Direct API calls are more reliable than complex UI frameworks
- AI-assisted troubleshooting is incredibly powerful
NEXT STEPS
- Fix /generate command validation issues
- Experiment with larger models (thanks to 32GB RAM)
- Develop RISC-V optimized build scripts
- Create community documentation
- Explore multi-model deployments with Ollama
Hardware Used: Milk-V Megrez (32GB)
Model: DeepSeek-Coder:latest
Performance: Good for development with 32GB RAM
Stability: Solid once configured properly
This setup proves that serious AI development is possible on RISC-V hardware! The extra 32GB RAM makes a significant difference in what we can deploy.
Let’s keep pushing the boundaries of what’s possible on RISC-V!
Feel free to reach out with questions or share your own experiences. Together we can make RISC-V a first-class citizen in the AI ecosystem.