Boot Problems going from BAD to WORSE

I’ve been having progressively worse problems getting this board to boot. When I first received it, it booted right up and worked nearly on the first try. Over the next few days, however, booting became intermittent and less and less reliable. I should say that whenever the board did boot successfully, the system would run reliably for as long as I wanted.

The first symptoms, visible on the mcu serial debug port, indicated a timeout reading from the SD card. These became more and more frequent. Eventually, with the SD card inserted, it would not even display that (or anything at all). If I took out the SD card, it would attempt to boot from the on-board SPI flash (apparently successfully, but with nowhere to go).

Now, today, even that is failing. When I power on the board, I get the following message (from the risc-v serial debug port), and then the board abruptly turns itself off. I have removed all peripherals and PCI devices from the board, and even tried single sticks of RAM. This is very concerning.

SOPHGO ZSBL
sg2042:v0.2

sg2042 work in single socket mode
chip0 ddr info: raw data=0x29292905, 
    ddr0 size:0x800000000
    ddr1 size:0x0
    ddr2 size:0x0
    ddr3 size:0x0
conf.ini should start with "[sophgo-config]"
rv boot from spi flash
load fw_jump.bin image from sf 0x633871 to memory 0x0 size 269984
load riscv64_Image image from sf 0x675711 to memory 0x2000000 size 22089216

(At this point the power cuts out automatically.) I have to pull the AC plug and wait 30+ seconds, then plug it back in, before it will power back on.
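
For anyone trying to read the numbers in that log, here is a minimal Python sketch that decodes them. It assumes the `ddrN size:` values are byte counts in hex and the image `size` values are byte counts in decimal, which is how the output looks but is not confirmed anywhere in this thread. The function names are made up for illustration.

```python
import re

# Lines copied from the ZSBL output quoted above.
ZSBL_LOG = """\
    ddr0 size:0x800000000
    ddr1 size:0x0
    ddr2 size:0x0
    ddr3 size:0x0
load fw_jump.bin image from sf 0x633871 to memory 0x0 size 269984
load riscv64_Image image from sf 0x675711 to memory 0x2000000 size 22089216
"""

GIB = 1 << 30


def ddr_sizes(log):
    """Return {channel: size_in_bytes} from 'ddrN size:0x...' lines.

    Assumption: the hex value is a size in bytes.
    """
    found = re.findall(r"(ddr\d+) size:(0x[0-9a-fA-F]+)", log)
    return {name: int(size, 16) for name, size in found}


def image_loads(log):
    """Return (name, load_address, size_in_bytes) for each 'load ... image' line."""
    pattern = (r"load (\S+) image from sf 0x[0-9a-fA-F]+ "
               r"to memory (0x[0-9a-fA-F]+) size (\d+)")
    return [(name, int(addr, 16), int(size))
            for name, addr, size in re.findall(pattern, log)]


def overlaps(loads):
    """True if any two loaded images occupy overlapping address ranges."""
    spans = sorted((addr, addr + size) for _, addr, size in loads)
    return any(prev_end > start
               for (_, prev_end), (start, _) in zip(spans, spans[1:]))


if __name__ == "__main__":
    for name, size in ddr_sizes(ZSBL_LOG).items():
        print(f"{name}: {size / GIB:g} GiB")
    for name, addr, size in image_loads(ZSBL_LOG):
        print(f"{name}: {addr:#x}..{addr + size:#x}")
    print("images overlap:", overlaps(image_loads(ZSBL_LOG)))
```

Read this way, only ddr0 is populated (0x800000000 bytes is 32 GiB), and the two images land in separate regions of memory, so nothing in the log itself looks wrong up to the point where the power cuts.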

This feels like an on-board power issue. Like maybe some voltage is going out of spec and the mcu is cutting off power for safety. The fact I have to wait for a while with the AC unplugged makes it feel like some capacitor with too high a voltage needs to slowly discharge…

Any ideas? I’m gonna leave it off for a few hours and reach out to Milk-V.

So, with no RAM installed, the board does not automatically power itself off. So that leads me to believe that something in the risc-v boot code (perhaps detecting a problem) is turning it off, rather than the mcu or a53.

Well now it’s booting, even from the SD card! I didn’t really change anything…
Is this board possessed? :ghost:

Ok guys, maybe ignore the above for now. I think it could be my power supply. I hooked up an oscilloscope to the +12V, +5V and +3.3V rails, and I see quite a bit of noise on all 3 rails. I am going to try a beefier power supply and, if necessary, add some high-quality low-ESR caps on the motherboard end of the plug.

Just confirming you have the v1.3 board just recently shipped?

And you say without an SDCARD it loaded ZSBL from the SPI flash?

Correct, and yes, v1.3 board.

Does anyone know the current requirements for each rail of this Pioneer board? The supply I’m using is a 500W unit with the following limits:
+3.3V: 25A
+5V: 20A
+12V: 38A
-12V: 0.8A
+5Vsb: 3A
(Note: +3.3V and +5V supplies cannot exceed 120W combined… maybe this is the problem?)
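
To put numbers on that note, here is a small Python sketch of the arithmetic (P = V × I per rail, plus the 120W combined cap). The rail figures are the ones listed above; the helper names are made up for illustration, and the board's real draw on each rail is not known from this thread, so this only shows what the supply can deliver, not whether it is enough.

```python
# Rail limits as listed above: name -> (volts, max amps).
# The -12V rail is entered as a magnitude.
RAILS = {
    "+3.3V": (3.3, 25.0),
    "+5V": (5.0, 20.0),
    "+12V": (12.0, 38.0),
    "-12V": (12.0, 0.8),
    "+5Vsb": (5.0, 3.0),
}

# Combined cap for +3.3V and +5V, from the note above.
COMBINED_LIMIT_W = 120.0


def rail_watts(rails):
    """Maximum power per rail in watts (P = V * I)."""
    return {name: volts * amps for name, (volts, amps) in rails.items()}


def amps_left_on_3v3(amps_on_5v, limit_w=COMBINED_LIMIT_W):
    """Current still available on +3.3V for a given +5V load,
    respecting both the combined cap and the +3.3V rail's own limit."""
    v33, max_a33 = RAILS["+3.3V"]
    v5, _ = RAILS["+5V"]
    remaining_w = max(0.0, limit_w - v5 * amps_on_5v)
    return min(max_a33, remaining_w / v33)


if __name__ == "__main__":
    for name, watts in rail_watts(RAILS).items():
        print(f"{name}: {watts:.1f} W max")
    for load in (0.0, 10.0, 20.0):
        print(f"+5V at {load:.0f} A -> +3.3V has "
              f"{amps_left_on_3v3(load):.2f} A available")
```

On their own the two rails would add up to 82.5W + 100W = 182.5W, so the 120W cap means they cannot both run at their label maximum at once. With +5V fully loaded at 20A, only about 6A remains on +3.3V. A combined cap like this is normal for ATX supplies; whether it matters here depends on how much the board actually pulls from +3.3V and +5V.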

I was wondering the same thing because the sg2042 manual said that https://github.com/milkv-pioneer/pioneer-files/blob/main/hardware/SG2042-TRM.pdf. It’s good to know that it does. At least I can rule out the SDCARD then. But even when pulling the SDCARD, I never get any output from the ZSBL on the RV serial console.

I can’t find the exact PSU that shipped with the Pioneer system, but this looks to be the closest:

My current PSU exceeds those ratings. I’m still not sure it’s the power supply, but I ordered a new one anyway. I don’t think it’s temperature related. I did a whole series of kernel compiles using -j63 and monitored temps. I even disconnected the fan for a bit and let the CPU get up to 65 °C… no problems there. The problem seems isolated to early boot / reading from SD, not normal operation after boot.

This is what I have on order:

I’ll second that the issues I experience usually happen during normal boot. I’ve noticed that if the machine is power-cycled too quickly (not via a safe reboot), there is a good chance the boot process will crash somewhere completely random.