Jupyter AI with FAISS and Ollama on Milk-V Megrez 32GB

Platform: Milk-V Megrez (32GB RISC-V)
OS: RockOS (Debian-based)

I've successfully deployed a full AI development environment on the Milk-V Megrez 32GB board. Here's my complete guide and lessons learned. It took several attempts, so treat this document as a guide rather than an exact recipe. My setup: an SSH terminal connected to the Megrez, a laptop browser accessing JupyterLab running on the board, and Ollama running on the laptop (fronted by my OpenWebUI interface) serving models to Jupyter AI, all local. It was painful to get working, but I feel it was worth the effort.

WHAT WORKS:

  • Jupyter AI with magic commands
  • FAISS vector database (built from source)
  • Ollama with the DeepSeek-Coder model (runs on my laptop, used by Jupyter AI on the Megrez board)
  • OpenWebUI (runs on my laptop, with port mapping adjustments)
  • Full RISC-V compatibility

THE BUILD PROCESS: CRITICAL STEPS

  1. The Jupyter Shell Build Trick

The Problem: Jupyter AI has complex dependencies that conflict when installed the normal way on RISC-V.

The Solution: Build within a Jupyter shell session:

First, establish the Rust environment:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env

THEN launch Jupyter and build within the notebook:

jupyter lab --ip=0.0.0.0 --no-browser --port=8888

In a Jupyter shell:

pip install jupyter-ai --no-deps

CRITICAL STEP AFTER INSTALLATION: exit the Jupyter session:

!exit

Then run in a terminal:

pip install faiss-cpu --no-deps
jupyter lab build

Why this works: Jupyter provides a stabilized environment that handles dependency resolution better than a bare shell. The jupyter lab build command rebuilds the frontend with the new AI extensions.

  2. FAISS Build Process for RISC-V

Building FAISS from source is mandatory since no pre-built wheels exist for RISC-V:

Clone and build

git clone https://github.com/facebookresearch/faiss.git
cd faiss
mkdir build && cd build

Critical CMake configuration for RISC-V

cmake .. -DFAISS_ENABLE_PYTHON=ON -DFAISS_OPT_LEVEL=generic \
  -DCMAKE_CXX_FLAGS="-march=rv64gc" -DBUILD_TESTING=OFF

Build with limited cores (memory constraints)

make -j4

Install Python bindings

cd faiss/python
pip install .

Key Flags:

  • -march=rv64gc: targets our specific RISC-V architecture (RV64GC)
  • -j4: uses 4 parallel jobs (adjust based on your RAM)
  • -DBUILD_TESTING=OFF: reduces build dependencies
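
To confirm the bindings built correctly, here is a quick smoke test with random vectors (a minimal sketch; nothing in it is board-specific):

import faiss
import numpy as np

d = 64  # vector dimensionality
xb = np.random.random((1000, d)).astype("float32")  # database vectors
xq = np.random.random((5, d)).astype("float32")     # query vectors

index = faiss.IndexFlatL2(d)          # exact L2 index, no training required
index.add(xb)                         # index the database vectors
distances, ids = index.search(xq, 3)  # 3 nearest neighbours per query
print(ids)                            # row i holds the neighbour ids for query i
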
  3. Ollama Setup & Testing

Installation (on the laptop, in my setup):
curl -fsSL https://ollama.ai/install.sh | sh
ollama serve &

Model Deployment:

Pull DeepSeek-Coder

ollama pull deepseek-coder:latest

Test directly

ollama run deepseek-coder "Write factorial function in Python"
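
Since Ollama runs on the laptop in my setup, it is worth confirming the board can reach it over the network before wiring up Jupyter AI. A minimal sketch against Ollama's /api/generate endpoint (LAPTOP_IP is a placeholder; substitute your laptop's address):

import requests

LAPTOP_IP = "192.168.1.100"  # placeholder; use your laptop's actual address

resp = requests.post(
    f"http://{LAPTOP_IP}:11434/api/generate",
    json={
        "model": "deepseek-coder:latest",
        "prompt": "Write factorial function in Python",
        "stream": False,  # return one JSON object instead of a stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])

Note that Ollama binds to 127.0.0.1 by default; to accept connections from the board, you may need to start it on the laptop with OLLAMA_HOST=0.0.0.0.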

  4. PORT OWNERSHIP WARNING: Only One Process Per Port

This is critical: Only one process can bind to a port at a time. If you’re running multiple services, you must:

  • Use different ports for each service
  • OR use reverse proxy configuration
  • OR stop one service before starting another

Common port conflicts:

  • Jupyter Lab: 8888
  • Ollama: 11434
  • OpenWebUI: 8080

Solution: Map services to different ports or use a reverse proxy.
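
Before starting a service, you can check whether a port is already owned (a minimal sketch using only the Python standard library):

import socket

def port_in_use(port, host="0.0.0.0"):
    # Try to bind the port; failure means another process already owns it
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            s.bind((host, port))
            return False
        except OSError:
            return True

for name, port in [("Jupyter Lab", 8888), ("Ollama", 11434), ("OpenWebUI", 8080)]:
    print(f"{name}: port {port} is {'BUSY' if port_in_use(port) else 'free'}")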

  5. OpenWebUI Configuration with Port Mapping

I got OpenWebUI working but it required careful port management:

Install OpenWebUI

curl -fsSL https://openwebui.com/install.sh | bash

You must change the port if other services are already running: edit the docker-compose or systemd service definition to use a free one.

Example: change OpenWebUI to port 8081 in docker-compose.yml (or the equivalent service configuration):

ports:
  - "8081:8080"

Then access at http://your-ip:8081

The key is ensuring no port conflicts between Jupyter, Ollama, and OpenWebUI.

  6. Network Configuration: Base IP Setting

Critical for Jupyter AI: You MUST set the base IP in configuration:

~/.jupyter/jupyter_server_config.py

c.ServerApp.ip = '0.0.0.0'             # Listen on all interfaces
c.ServerApp.allow_origin = '*'         # Allow cross-origin requests
c.ServerApp.base_url = '/your-prefix'  # Only if behind a reverse proxy

For Ollama integration

c.AiHandler.model_provider = "ollama"
c.AiHandler.model_name = "deepseek-coder:latest"

Why this matters: Jupyter AI makes internal API calls that fail if IP binding isn’t correct.
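
One way to verify the binding took effect is to query the Jupyter Server REST API from another machine; the /api endpoint reports the server version without authentication. A minimal sketch (BOARD_IP is a placeholder for the Megrez's address):

import requests

BOARD_IP = "192.168.1.50"  # placeholder; use the Megrez's actual address

resp = requests.get(f"http://{BOARD_IP}:8888/api", timeout=10)
print(resp.status_code, resp.json())  # expect 200 and a version payload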

CURRENT LIMITATION: /generate COMMAND NOT WORKING

Status: The /generate command fails with Pydantic validation errors despite everything else working.

Error Pattern:
ValidationError: 1 validation error for Outline
Input should be a valid dictionary or instance of Outline

Workaround: Use direct code generation.

Instead of /generate, use the cell magic (naming the model explicitly):

%%ai ollama:deepseek-coder
Write a notebook about machine learning basics

Or use a manual implementation:

def manual_generate(prompt):
    # Your custom implementation here, e.g. a direct call to the Ollama API
    generated_code = ...
    return generated_code

AI-POWERED TROUBLESHOOTING METHODOLOGY

When you hit walls, use AI to help troubleshoot:

  1. Error Analysis with AI (ask_ollama is a small helper, sketched after this list):
    error_message = """Your full error message here"""
    prompt = f"""Analyze this error and suggest RISC-V specific solutions: {error_message}"""
    response = ask_ollama(prompt)
    print(response)

  2. Dependency Resolution:
    prompt = """Resolve these conflicting dependencies for RISC-V:
    - package-a==1.2 requires libxyz>=2.0
    - package-b==3.4 requires libxyz<=1.8
    Suggest specific version combinations that might work."""

  3. Build Optimization:
    prompt = """Optimize this CMake command for RISC-V with 32GB RAM:
    cmake .. -DFAISS_ENABLE_PYTHON=ON -DCMAKE_BUILD_TYPE=Release
    Suggest flags for our architecture."""
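
The ask_ollama helper above is not a library function; here is a minimal sketch that wraps Ollama's /api/generate endpoint (OLLAMA_URL is an assumption; point it at wherever Ollama is listening in your setup):

import requests

OLLAMA_URL = "http://localhost:11434"  # assumption; in my setup, http://<laptop-ip>:11434

def ask_ollama(prompt, model="deepseek-coder:latest"):
    # One non-streamed generation request; the full reply comes back as JSON
    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]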

SUCCESSFUL WORKING COMPONENTS

What DEFINITELY works:

  1. FAISS vector database with similarity search
  2. Ollama with DeepSeek-Coder model
  3. Jupyter Lab with AI magic commands (%ai)
  4. Direct Ollama API integration
  5. OpenWebUI with proper port mapping
  6. Full RISC-V compatibility

What needs work:

  1. /generate command implementation
  2. Some Jupyter AI advanced features

STEP-BY-STEP DEPLOYMENT GUIDE

  1. Start with clean Debian installation
  2. Install Rust environment first
  3. Build FAISS from source with RISC-V flags
  4. Install Ollama and pull DeepSeek-Coder
  5. Install Jupyter AI within Jupyter shell
  6. EXIT AND RUN: jupyter lab build
  7. Configure network settings properly
  8. Set up OpenWebUI with unique port mapping
  9. Test with direct API calls before UI

KEY LESSONS LEARNED

  1. Build order matters - Rust first, then Python, then Jupyter
  2. Memory management is better with 32GB but still needs care
  3. Port management is critical - only one process per port
  4. The jupyter lab build step after installation is mandatory
  5. Direct API calls are more reliable than complex UI frameworks
  6. AI-assisted troubleshooting is incredibly powerful

NEXT STEPS

  1. Fix /generate command validation issues
  2. Experiment with larger models (thanks to 32GB RAM)
  3. Develop RISC-V optimized build scripts
  4. Create community documentation
  5. Explore multi-model deployments with Ollama

Hardware Used: Milk-V Megrez (32GB)
Model: DeepSeek-Coder:latest
Performance: Good for development with 32GB RAM
Stability: Solid once configured properly

This setup proves that serious AI development is possible on RISC-V hardware! The extra 32GB RAM makes a significant difference in what we can deploy.

Let’s keep pushing the boundaries of what’s possible on RISC-V!

Feel free to reach out with questions or share your own experiences. Together we can make RISC-V a first-class citizen in the AI ecosystem.