Jupyter AI with FAISS and Ollama on Milk-V Megrez 32GB

Platform: Milk-V Megrez (32GB RISC-V)
OS: RockOS (Debian-based)

I've successfully deployed a full AI development environment on the Milk-V Megrez 32GB board. Here's my complete guide and lessons learned. It took several attempts, so treat this document as a guide rather than an exact recipe. My setup: an SSH terminal connected to the Megrez, a laptop browser accessing JupyterLab running on the board, and Ollama running on the laptop (fronted by my OpenWebUI interface) serving models to Jupyter AI, all local. It was painful to get working, but I feel it was worth the effort.

WHAT WORKS:

  • Jupyter AI with magic commands
  • FAISS vector database (built from source)
  • Ollama with the DeepSeek-Coder model (runs on my laptop, used by Jupyter AI on the Megrez board)
  • OpenWebUI (runs on my laptop, with port mapping adjustments)
  • Full RISC-V compatibility

THE BUILD PROCESS: CRITICAL STEPS

  1. The Jupyter Shell Build Trick

The Problem: Jupyter AI has complex dependencies that conflict when installed the normal way on RISC-V.

The Solution: Build within a Jupyter shell session:

First, establish the Rust environment:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env

THEN launch Jupyter and build within the notebook:

jupyter lab --ip=0.0.0.0 --no-browser --port=8888

In a Jupyter shell:

pip install jupyter-ai --no-deps

CRITICAL STEP AFTER INSTALLATION: exit the Jupyter session:

!exit

Then run in a terminal:

pip install faiss-cpu --no-deps
jupyter lab build

Why this works: Jupyter provides a stabilized environment that handles dependency resolution better than a bare shell. The jupyter lab build command rebuilds the frontend with the new AI extensions.

  2. FAISS Build Process for RISC-V

Building FAISS from source is mandatory since no pre-built wheels exist for RISC-V:

Clone and build

git clone https://github.com/facebookresearch/faiss.git
cd faiss
mkdir build && cd build

Critical CMake configuration for RISC-V

cmake .. -DFAISS_ENABLE_PYTHON=ON -DFAISS_OPT_LEVEL=generic \
  -DCMAKE_CXX_FLAGS="-march=rv64gc" -DBUILD_TESTING=OFF

Build with limited cores (memory constraints)

make -j4

Install Python bindings

cd faiss/python
pip install .

Key Flags:

  • -march=rv64gc: targets our specific RISC-V architecture (RV64GC)
  • -j4: uses 4 parallel jobs (adjust based on your RAM)
  • -DBUILD_TESTING=OFF: reduces build dependencies
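
To confirm the bindings built correctly, here is a quick smoke test with random vectors (a minimal sketch; nothing in it is board-specific):

import faiss
import numpy as np

d = 64  # vector dimensionality
xb = np.random.random((1000, d)).astype("float32")  # database vectors
xq = np.random.random((5, d)).astype("float32")     # query vectors

index = faiss.IndexFlatL2(d)          # exact L2 index, no training required
index.add(xb)                         # index the database vectors
distances, ids = index.search(xq, 3)  # 3 nearest neighbours per query
print(ids)                            # row i holds the neighbour ids for query i
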
  3. Ollama Setup & Testing

Installation (on the laptop, in my setup):
curl -fsSL https://ollama.ai/install.sh | sh
ollama serve &

Model Deployment:

Pull DeepSeek-Coder

ollama pull deepseek-coder:latest

Test directly

ollama run deepseek-coder "Write factorial function in Python"
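
Since Ollama runs on the laptop in my setup, it is worth confirming the board can reach it over the network before wiring up Jupyter AI. A minimal sketch against Ollama's /api/generate endpoint (LAPTOP_IP is a placeholder; substitute your laptop's address):

import requests

LAPTOP_IP = "192.168.1.100"  # placeholder; use your laptop's actual address

resp = requests.post(
    f"http://{LAPTOP_IP}:11434/api/generate",
    json={
        "model": "deepseek-coder:latest",
        "prompt": "Write factorial function in Python",
        "stream": False,  # return one JSON object instead of a stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])

Note that Ollama binds to 127.0.0.1 by default; to accept connections from the board, you may need to start it on the laptop with OLLAMA_HOST=0.0.0.0.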

  4. PORT OWNERSHIP WARNING: Only One Process Per Port

This is critical: Only one process can bind to a port at a time. If you’re running multiple services, you must:

  • Use different ports for each service
  • OR use reverse proxy configuration
  • OR stop one service before starting another

Common port conflicts:

  • Jupyter Lab: 8888
  • Ollama: 11434
  • OpenWebUI: 8080

Solution: Map services to different ports or use a reverse proxy.
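
Before starting a service, you can check whether a port is already owned (a minimal sketch using only the Python standard library):

import socket

def port_in_use(port, host="0.0.0.0"):
    # Try to bind the port; failure means another process already owns it
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            s.bind((host, port))
            return False
        except OSError:
            return True

for name, port in [("Jupyter Lab", 8888), ("Ollama", 11434), ("OpenWebUI", 8080)]:
    print(f"{name}: port {port} is {'BUSY' if port_in_use(port) else 'free'}")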

  5. OpenWebUI Configuration with Port Mapping

I got OpenWebUI working but it required careful port management:

Install OpenWebUI

curl -fsSL https://openwebui.com/install.sh | bash

You must change the port if other services are already running: edit the docker-compose or systemd service definition to use a free one.

Example: change OpenWebUI to port 8081 in docker-compose.yml (or the equivalent service configuration):

ports:
  - "8081:8080"

Then access at http://your-ip:8081

The key is ensuring no port conflicts between Jupyter, Ollama, and OpenWebUI.

  6. Network Configuration: Base IP Setting

Critical for Jupyter AI: You MUST set the base IP in configuration:

~/.jupyter/jupyter_server_config.py

c.ServerApp.ip = '0.0.0.0'             # Listen on all interfaces
c.ServerApp.allow_origin = '*'         # Allow cross-origin requests
c.ServerApp.base_url = '/your-prefix'  # Only if behind a reverse proxy

For Ollama integration

c.AiHandler.model_provider = "ollama"
c.AiHandler.model_name = "deepseek-coder:latest"

Why this matters: Jupyter AI makes internal API calls that fail if IP binding isn’t correct.
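
One way to verify the binding took effect is to query the Jupyter Server REST API from another machine; the /api endpoint reports the server version without authentication. A minimal sketch (BOARD_IP is a placeholder for the Megrez's address):

import requests

BOARD_IP = "192.168.1.50"  # placeholder; use the Megrez's actual address

resp = requests.get(f"http://{BOARD_IP}:8888/api", timeout=10)
print(resp.status_code, resp.json())  # expect 200 and a version payload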

CURRENT LIMITATION: /generate COMMAND NOT WORKING

Status: The /generate command fails with Pydantic validation errors despite everything else working.

Error Pattern:
ValidationError: 1 validation error for Outline
Input should be a valid dictionary or instance of Outline

Workaround: Use direct code generation.

Instead of /generate, use the cell magic (naming the model explicitly):

%%ai ollama:deepseek-coder
Write a notebook about machine learning basics

Or use a manual implementation:

def manual_generate(prompt):
    # Your custom implementation here, e.g. a direct call to the Ollama API
    generated_code = ...
    return generated_code

AI-POWERED TROUBLESHOOTING METHODOLOGY

When you hit walls, use AI to help troubleshoot:

  1. Error Analysis with AI (ask_ollama is a small helper, sketched after this list):
    error_message = """Your full error message here"""
    prompt = f"""Analyze this error and suggest RISC-V specific solutions: {error_message}"""
    response = ask_ollama(prompt)
    print(response)

  2. Dependency Resolution:
    prompt = """Resolve these conflicting dependencies for RISC-V:
    - package-a==1.2 requires libxyz>=2.0
    - package-b==3.4 requires libxyz<=1.8
    Suggest specific version combinations that might work."""

  3. Build Optimization:
    prompt = """Optimize this CMake command for RISC-V with 32GB RAM:
    cmake .. -DFAISS_ENABLE_PYTHON=ON -DCMAKE_BUILD_TYPE=Release
    Suggest flags for our architecture."""
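
The ask_ollama helper above is not a library function; here is a minimal sketch that wraps Ollama's /api/generate endpoint (OLLAMA_URL is an assumption; point it at wherever Ollama is listening in your setup):

import requests

OLLAMA_URL = "http://localhost:11434"  # assumption; in my setup, http://<laptop-ip>:11434

def ask_ollama(prompt, model="deepseek-coder:latest"):
    # One non-streamed generation request; the full reply comes back as JSON
    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]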

SUCCESSFUL WORKING COMPONENTS

What DEFINITELY works:

  1. FAISS vector database with similarity search
  2. Ollama with DeepSeek-Coder model
  3. Jupyter Lab with AI magic commands (%ai)
  4. Direct Ollama API integration
  5. OpenWebUI with proper port mapping
  6. Full RISC-V compatibility

What needs work:

  1. /generate command implementation
  2. Some Jupyter AI advanced features

STEP-BY-STEP DEPLOYMENT GUIDE

  1. Start with clean Debian installation
  2. Install Rust environment first
  3. Build FAISS from source with RISC-V flags
  4. Install Ollama and pull DeepSeek-Coder
  5. Install Jupyter AI within Jupyter shell
  6. EXIT AND RUN: jupyter lab build
  7. Configure network settings properly
  8. Set up OpenWebUI with unique port mapping
  9. Test with direct API calls before UI

KEY LESSONS LEARNED

  1. Build order matters - Rust first, then Python, then Jupyter
  2. Memory management is better with 32GB but still needs care
  3. Port management is critical - only one process per port
  4. The jupyter lab build step after installation is mandatory
  5. Direct API calls are more reliable than complex UI frameworks
  6. AI-assisted troubleshooting is incredibly powerful

NEXT STEPS

  1. Fix /generate command validation issues
  2. Experiment with larger models (thanks to 32GB RAM)
  3. Develop RISC-V optimized build scripts
  4. Create community documentation
  5. Explore multi-model deployments with Ollama

Hardware Used: Milk-V Megrez (32GB)
Model: DeepSeek-Coder:latest
Performance: Good for development with 32GB RAM
Stability: Solid once configured properly

This setup proves that serious AI development is possible on RISC-V hardware! The extra 32GB RAM makes a significant difference in what we can deploy.

Let’s keep pushing the boundaries of what’s possible on RISC-V!

Feel free to reach out with questions or share your own experiences. Together we can make RISC-V a first-class citizen in the AI ecosystem.