Guide: Building PyTorch 2.9.0a0 on RISC-V (Debian/RockOS)

dj53144 · September 3, 2025, 8:26am

Guide: Building PyTorch 2.9.0a0 on RISC-V (Debian/RockOS) for CPU-Only Use

Hello Megrez community,

I recently built PyTorch 2.9.0a0+git9a665ca from source on a RISC-V system running a Debian-based RockOS distribution, targeting a CPU-only setup with Python 3.11. This process was complex and time-consuming, but I’ve successfully installed and tested PyTorch, achieving functional tensor operations. Below is a detailed step-by-step guide to help you replicate this on your RISC-V system, along with warnings about the challenges and tips for using AI to troubleshoot errors.

Warning: Complexity and Time Involved

- Complexity: Building PyTorch on RISC-V is not straightforward. It involves cross-compilation (or native compilation), managing dependencies, configuring CMake, and handling Python packaging. You’ll need familiarity with Linux, compilers, and CMake.
- Time: Compilation can take several hours (4–10 hours depending on hardware), especially for cross-compilation. Ensure you have ~10 GiB of disk space and 16 GiB+ of RAM to avoid crashes.
- Potential Issues: Common pitfalls include compiler mismatches, CMake generator conflicts (Ninja vs. Unix Makefiles), and Python module resolution errors. Patience and careful debugging are essential.

Prerequisites

Before starting, ensure you have:

- A Debian-based system (e.g., RockOS or standard Debian).
- At least 10 GiB free disk space (df -h) and 16 GiB RAM (free -h).
- Python 3.11 in a virtual environment (I used pyenv).
- Administrative privileges for installing dependencies.
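
The disk and RAM checks can also be scripted. The following is a minimal sketch, assuming a Linux system that exposes /proc/meminfo; the function names are made up for this example, and the 10 GiB and 16 GiB thresholds are simply the figures quoted in this guide.

```python
import shutil
from pathlib import Path

GIB = 1024 ** 3


def free_disk_gib(path="."):
    """Free space on the filesystem holding `path`, in GiB."""
    return shutil.disk_usage(path).free / GIB


def total_ram_gib(meminfo="/proc/meminfo"):
    """Total RAM in GiB read from /proc/meminfo, or None if unavailable."""
    try:
        text = Path(meminfo).read_text()
    except OSError:
        return None
    for line in text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) * 1024 / GIB  # the value is in kB
    return None


def meets_prerequisites(disk_gib, ram_gib, need_disk=10, need_ram=16):
    """Compare measured values with the thresholds quoted in this guide."""
    return disk_gib >= need_disk and ram_gib is not None and ram_gib >= need_ram


if __name__ == "__main__":
    disk, ram = free_disk_gib(), total_ram_gib()
    print(f"free disk: {disk:.1f} GiB, total RAM: {ram}")
    print("OK" if meets_prerequisites(disk, ram) else "below the suggested minimums")
```

Note that the kernel usually reports slightly less than the nominal RAM size, so a 16 GiB board may need a lower need_ram value to pass.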

Step-by-Step Guide

Follow these steps to build and install PyTorch 2.9.0a0+git9a665ca on RISC-V for CPU-only use.

Step 1: Set Up the Environment

Install Dependencies:
Install required libraries and tools:
```
sudo apt update
sudo apt install git cmake ninja-build build-essential libopenblas-dev liblapack-dev libprotobuf-dev protobuf-compiler python3-dev python3-pip
```
- libopenblas-dev and liblapack-dev enable optimized matrix operations.
- cmake (version ≥3.27) is critical for configuration.
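
To confirm the installed CMake meets the version mentioned above, the output of cmake --version can be parsed. This is a small sketch; the helper names are hypothetical and the 3.27 minimum is taken from this guide rather than checked against PyTorch's own requirements.

```python
import re
import subprocess


def parse_cmake_version(output):
    """Extract (major, minor, patch) from `cmake --version` output."""
    match = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def cmake_is_new_enough(output, minimum=(3, 27, 0)):
    """True when the parsed version is at least `minimum`."""
    version = parse_cmake_version(output)
    return version is not None and version >= minimum


if __name__ == "__main__":
    try:
        out = subprocess.run(
            ["cmake", "--version"], capture_output=True, text=True, check=True
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        out = ""  # cmake is missing or failed to run
    print(parse_cmake_version(out), cmake_is_new_enough(out))
```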

Set Up Python Virtual Environment:
I used pyenv to manage Python 3.11:

curl https://pyenv.run | bash
echo 'export PATH="$HOME/.pyenv/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(pyenv init --path)"' >> ~/.bashrc
echo 'eval "$(pyenv virtualenv-init -)"' >> ~/.bashrc
source ~/.bashrc
pyenv install 3.11.9
pyenv virtualenv 3.11.9 pytorch-build-env
pyenv activate pytorch-build-env
pip install --upgrade pip
pip install numpy wheel
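
Before moving on, it can help to confirm that the active interpreter really is Python 3.11 inside the virtual environment. A minimal sketch, assuming the environment was created with venv or a recent virtualenv (both make sys.prefix differ from sys.base_prefix); the function names are only for illustration.

```python
import sys


def is_supported_python(version_info, wanted=(3, 11)):
    """True when the interpreter's major.minor version matches `wanted`."""
    return tuple(version_info[:2]) == wanted


def in_virtualenv(prefix=None, base_prefix=None):
    """True when running inside a virtual environment."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    status = "ok" if is_supported_python(sys.version_info) else "expected 3.11"
    print("python:", sys.version.split()[0], status)
    print("virtualenv active:", in_virtualenv())
```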

Step 2: Install RISC-V Toolchain

For RISC-V cross-compilation, install the RISC-V GNU toolchain:

sudo apt install gcc-riscv64-linux-gnu g++-riscv64-linux-gnu

Verify the compilers:

/usr/bin/riscv64-linux-gnu-gcc --version
/usr/bin/riscv64-linux-gnu-g++ --version

Note: On my RockOS system, the compilers were named riscv64-linux-gnu-gcc and riscv64-linux-gnu-g++ (not riscv64-unknown-linux-gnu-gcc). Check your binary names:

ls /usr/bin/*riscv64*linux-gnu*
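
Because the binary names vary between systems, a lookup over the likely prefixes saves some guessing. This is a sketch only: the prefix list is an assumption based on the two names mentioned above, plus an empty prefix as a fallback to plain gcc/g++ for native builds.

```python
import shutil

# Assumed candidates, tried in order; the empty prefix means plain gcc/g++.
CANDIDATE_PREFIXES = ("riscv64-linux-gnu-", "riscv64-unknown-linux-gnu-", "")


def find_compiler(tool, prefixes=CANDIDATE_PREFIXES, which=shutil.which):
    """Return the full path of the first matching compiler, or None."""
    for prefix in prefixes:
        path = which(prefix + tool)
        if path:
            return path
    return None


if __name__ == "__main__":
    print("C compiler:  ", find_compiler("gcc"))
    print("C++ compiler:", find_compiler("g++"))
```

The returned paths are what you would pass to -DCMAKE_C_COMPILER and -DCMAKE_CXX_COMPILER in Step 4.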

Step 3: Clone PyTorch Source

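Clone the PyTorch repository, check out the specific commit I used, and resync the submodules so they match that commit:

git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
git checkout 9a665ca
git submodule sync
git submodule update --init --recursive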

Step 4: Configure and Build PyTorch

Create Build Directory:
```
mkdir build
cd build
```

Configure CMake:
Use the following CMake command to configure a CPU-only build for RISC-V:

cmake .. \
    -DCMAKE_BUILD_TYPE=Release \
    -DPYTHON_EXECUTABLE=$(which python) \
    -DUSE_CUDA=OFF \
    -DUSE_ROCM=OFF \
    -DUSE_NNPACK=OFF \
    -DUSE_QNNPACK=OFF \
    -DUSE_PYTORCH_QNNPACK=OFF \
    -DUSE_CUDNN=OFF \
    -DUSE_FBGEMM=OFF \
    -DUSE_KINETO=OFF \
    -DUSE_NUMPY=ON \
    -DUSE_OPENMP=ON \
    -DUSE_SYSTEM_BLAS=ON \
    -DUSE_SYSTEM_LAPACK=ON \
    -DBUILD_TEST=OFF \
    -DBUILD_SHARED_LIBS=ON \
    -DCMAKE_C_COMPILER=/usr/bin/riscv64-linux-gnu-gcc \
    -DCMAKE_CXX_COMPILER=/usr/bin/riscv64-linux-gnu-g++ \
    -DCMAKE_POLICY_DEFAULT_CMP0126=NEW \
    -DUSE_NCCL=OFF \
    -DBUILD_PYTHON=True

- Adjust compiler paths if your binaries differ (e.g., /usr/bin/riscv64-linux-gnu-gcc-14).
- -DUSE_NCCL=OFF prevents unnecessary GPU library cloning.
- -DBUILD_TEST=OFF skips tests to save time.
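
Since the same long flag list is needed again in Step 5, one option is to keep the options in a single table and generate the command line from it, which also avoids hand-typing the -D prefix on every line. The sketch below only prints a command; the dictionary mirrors the flags listed above, and the helper names are hypothetical.

```python
import shlex
import sys

# Options copied from the CMake command in this guide.
CPU_ONLY_OPTIONS = {
    "CMAKE_BUILD_TYPE": "Release",
    "PYTHON_EXECUTABLE": sys.executable,
    "USE_CUDA": "OFF",
    "USE_ROCM": "OFF",
    "USE_NNPACK": "OFF",
    "USE_QNNPACK": "OFF",
    "USE_PYTORCH_QNNPACK": "OFF",
    "USE_CUDNN": "OFF",
    "USE_FBGEMM": "OFF",
    "USE_KINETO": "OFF",
    "USE_NUMPY": "ON",
    "USE_OPENMP": "ON",
    "USE_SYSTEM_BLAS": "ON",
    "USE_SYSTEM_LAPACK": "ON",
    "BUILD_TEST": "OFF",
    "BUILD_SHARED_LIBS": "ON",
    "CMAKE_C_COMPILER": "/usr/bin/riscv64-linux-gnu-gcc",
    "CMAKE_CXX_COMPILER": "/usr/bin/riscv64-linux-gnu-g++",
    "CMAKE_POLICY_DEFAULT_CMP0126": "NEW",
    "USE_NCCL": "OFF",
    "BUILD_PYTHON": "True",
}


def cmake_defines(options):
    """Turn {"NAME": "VALUE"} into ["-DNAME=VALUE", ...]."""
    return [f"-D{name}={value}" for name, value in options.items()]


def cmake_command(source_dir, options, generator=None):
    """Assemble the full argument list, optionally forcing a generator."""
    cmd = ["cmake", source_dir]
    if generator:
        cmd += ["-G", generator]
    return cmd + cmake_defines(options)


if __name__ == "__main__":
    # Print a shell-quoted command that can be pasted into the build directory.
    print(shlex.join(cmake_command("..", CPU_ONLY_OPTIONS, generator="Unix Makefiles")))
```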

Build PyTorch:
```
make -j$(nproc)
```
Warning: This step is time-consuming (hours). Monitor memory (free -h) and reduce parallelism (e.g., make -j4) if memory is low.

Verify Build Completion:
Look for [100%] Built target functorch in the output. Check artifacts:
```
ls lib/
```
You should see libtorch.so, libtorch_cpu.so, etc.
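
The artifact check can be automated so a partial build is caught before packaging. A minimal sketch, assuming only the two library names mentioned above are required; other files in lib/ are not checked.

```python
from pathlib import Path

# Library names taken from this guide; a real build produces more files.
EXPECTED_LIBS = ("libtorch.so", "libtorch_cpu.so")


def missing_libraries(lib_dir, expected=EXPECTED_LIBS):
    """Return the expected library names that are absent from `lib_dir`."""
    lib_dir = Path(lib_dir)
    return [name for name in expected if not (lib_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_libraries("lib")  # run from inside the build directory
    print("all expected libraries present" if not missing else f"missing: {missing}")
```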

Step 5: Create and Install the Python Wheel

Clean CMake Cache (to avoid Ninja conflicts):

cd ~/pytorch/build
rm -rf CMakeCache.txt CMakeFiles

Build the Wheel:

cd ~/pytorch
python setup.py bdist_wheel

If setup.py tries to use Ninja or unwanted settings, use:

PYTHON_CMAKE_FLAGS="\
-DCMAKE_BUILD_TYPE=Release \
-DPYTHON_EXECUTABLE=$(which python) \
-DUSE_CUDA=OFF \
-DUSE_ROCM=OFF \
-DUSE_NNPACK=OFF \
-DUSE_QNNPACK=OFF \
-DUSE_PYTORCH_QNNPACK=OFF \
-DUSE_CUDNN=OFF \
-DUSE_FBGEMM=OFF \
-DUSE_KINETO=OFF \
-DUSE_NUMPY=ON \
-DUSE_OPENMP=ON \
-DUSE_SYSTEM_BLAS=ON \
-DUSE_SYSTEM_LAPACK=ON \
-DBUILD_TEST=OFF \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_C_COMPILER=/usr/bin/riscv64-linux-gnu-gcc \
-DCMAKE_CXX_COMPILER=/usr/bin/riscv64-linux-gnu-g++ \
-DCMAKE_POLICY_DEFAULT_CMP0126=NEW \
-DUSE_NCCL=OFF \
-DBUILD_PYTHON=True \
-G'Unix Makefiles'" python setup.py bdist_wheel

Install the Wheel:

ls dist/
pip install dist/torch-2.9.0a0+git9a665ca-*.whl
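
Wheel filenames encode the Python version and platform they were built for, so the name in dist/ is a quick way to see whether a wheel fits a given machine. The sketch below splits a filename according to the standard wheel naming scheme; the example filename is hypothetical, since the exact tags depend on your build.

```python
def parse_wheel_tags(filename):
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # an optional build tag adds a sixth field
        raise ValueError(f"unexpected wheel name: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def matches_target(tags, python_tag="cp311", machine="riscv64"):
    """True when the wheel targets the given CPython version and architecture."""
    return tags["python"] == python_tag and tags["platform"].endswith(machine)


if __name__ == "__main__":
    # Hypothetical filename; check `ls dist/` for the real one.
    example = "torch-2.9.0a0+git9a665ca-cp311-cp311-linux_riscv64.whl"
    tags = parse_wheel_tags(example)
    print(tags, matches_target(tags))
```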

Step 6: Verify Installation

Run tests from outside the pytorch directory to avoid source conflicts:

cd ~
python -c "import torch; print(torch.__version__); print(torch.__file__)"

Expected output:

2.9.0a0+git9a665ca
/home/<user>/.pyenv/versions/pytorch-build-env/lib/python3.11/site-packages/torch/__init__.py
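
The reason for leaving the source directory is that python -c normally puts the current directory at the front of the import path, so a local torch/ folder wins over the installed package. A small sketch of a check for that situation follows; the function name is made up, and the demo builds a throwaway directory rather than touching a real checkout.

```python
import tempfile
from pathlib import Path


def shadowing_source_tree(cwd="."):
    """Return the local torch/ package that would shadow the installed one, if any."""
    candidate = Path(cwd) / "torch"
    return candidate if (candidate / "__init__.py").is_file() else None


if __name__ == "__main__":
    # Demonstrate with a temporary directory that mimics a source checkout.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "torch").mkdir()
        (Path(tmp) / "torch" / "__init__.py").write_text("")
        print("fake checkout:", shadowing_source_tree(tmp))
    print("current directory:", shadowing_source_tree("."))
```

If the second line prints a path instead of None, change directory before importing torch.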

Test tensor operations:

python -c "import torch; print(torch.randn(2, 3)); print(torch.cuda.is_available())"

Expected: A random 2x3 tensor and False for CUDA.

Test BLAS performance:

python -c "import torch; a = torch.randn(1000, 1000); b = torch.randn(1000, 1000); print((a @ b).sum())"
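
The one-liner above confirms that matrix multiplication works but does not measure speed. A rough timing sketch follows, using the usual estimate of about 2·n³ floating-point operations for an n×n matrix product; the function names and repeat count are arbitrary choices, and the numbers will vary a lot between boards.

```python
import time


def matmul_gflops(n, seconds):
    """Approximate GFLOP/s for one n x n by n x n product (about 2*n**3 operations)."""
    return 2 * n ** 3 / seconds / 1e9


def time_matmul(n=1000, repeats=5):
    """Average seconds per matmul; torch is imported here so the helper above works without it."""
    import torch

    a, b = torch.randn(n, n), torch.randn(n, n)
    a @ b  # warm-up run, not timed
    start = time.perf_counter()
    for _ in range(repeats):
        a @ b
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    try:
        seconds = time_matmul()
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print(f"{seconds * 1000:.1f} ms per matmul, ~{matmul_gflops(1000, seconds):.2f} GFLOP/s")
```

To see which BLAS the build actually picked up, print(torch.__config__.show()) lists the build configuration.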

Common Issues and Fixes

Ninja vs. Unix Makefiles Mismatch:
If setup.py fails with CMake Error: Error: generator : Ninja, clean the CMake cache (rm -rf build/CMakeCache.txt build/CMakeFiles) and use PYTHON_CMAKE_FLAGS with -G'Unix Makefiles'.

Compiler Not Found:
If CMake reports CMAKE_C_COMPILER not set, verify your toolchain:
```
ls /usr/bin/*riscv64*linux-gnu*
```
Update -DCMAKE_C_COMPILER and -DCMAKE_CXX_COMPILER paths accordingly.

Source Directory Conflict:
If you see ImportError: Failed to load PyTorch C extensions, avoid running Python from ~/pytorch. Use cd ~ first.

Memory Issues:
Monitor memory (free -h). If the build crashes, reduce parallelism (make -j4).
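
For the memory issue in particular, the job count can be derived from available memory instead of guessed. This is a sketch under a loud assumption: the 2 GiB per compile job figure is a guess, not a measured value, so adjust gib_per_job to what you observe on your board.

```python
import os
from pathlib import Path


def available_memory_gib(meminfo="/proc/meminfo"):
    """MemAvailable from /proc/meminfo in GiB, or None if it cannot be read."""
    try:
        text = Path(meminfo).read_text()
    except OSError:
        return None
    for line in text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) / 1024 ** 2  # kB -> GiB
    return None


def pick_jobs(available_gib, cpu_count, gib_per_job=2.0):
    """Limit parallel jobs by both CPU count and an assumed memory cost per job."""
    by_memory = int(available_gib // gib_per_job)
    return max(1, min(cpu_count, by_memory))


if __name__ == "__main__":
    cpus = os.cpu_count() or 1
    mem = available_memory_gib()
    jobs = pick_jobs(mem, cpus) if mem is not None else 1
    print(f"make -j{jobs}")
```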

Tips for Using AI to Troubleshoot

I used an AI assistant (Grok) to debug issues, and it was a lifesaver. Here’s how to leverage AI effectively:

- Provide Full Error Output: Copy-paste the entire error message (e.g., CMake errors, Python tracebacks) into your AI query. This helps pinpoint the issue.
  - Example: Share CMake Error: Error: generator : Ninja with the command you ran.
- Include Context: Mention your system (RISC-V, RockOS), PyTorch version (2.9.0a0+git9a665ca), and whether you’re cross-compiling or building natively.
- Ask for Specific Fixes: Request step-by-step solutions for errors, like “How do I fix a Ninja mismatch in CMake?” or “Why is my compiler not found?”
- Iterate with Follow-Ups: If the AI’s suggestion fails, provide the new error output and ask for clarification. For example, I shared compiler check outputs (ls /usr/bin/*riscv64*) to fix a toolchain issue.
- Verify AI Suggestions: Cross-check AI advice with PyTorch’s official docs or GitHub issues to ensure accuracy.

Final Notes

Building PyTorch on RISC-V is a significant undertaking, but it’s rewarding to get it running. My setup was cross-compiled for RISC-V using riscv64-linux-gnu-gcc/g++ on a RockOS system. If you’re building natively, you may need gcc and g++ instead. The wheel file (dist/torch-2.9.0a0+git9a665ca-*.whl) is portable to other RISC-V systems with Python 3.11 and compatible libraries.

If you hit errors, share them in this thread with:

- Full error output.
- Your CMake command or setup.py invocation.
- Output of ls /usr/bin/*riscv64*linux-gnu* and df -h.
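
To make those reports easier to assemble, the system details can be gathered in one go. A minimal sketch, assuming the compiler glob used throughout this guide; the report layout is just a suggestion, and the error output and build command still need to be pasted in by hand.

```python
import glob
import platform
import shutil
import sys


def build_report(compiler_glob="/usr/bin/*riscv64*linux-gnu*"):
    """Collect basic system details as plain text for a forum post."""
    free_gib = shutil.disk_usage(".").free / 1024 ** 3
    compilers = sorted(glob.glob(compiler_glob)) or ["(none found)"]
    lines = [
        f"machine: {platform.machine()}",
        f"system: {platform.platform()}",
        f"python: {sys.version.split()[0]}",
        f"free disk (cwd): {free_gib:.1f} GiB",
        "compilers:",
        *[f"  {path}" for path in compilers],
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_report())
```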

Good luck, and happy coding with PyTorch on RISC-V!