After facing increasing difficulty building the Sophgo 6.6 fork with a modern toolchain, I decided to try out the latest upstream effort from @unicornx, seeing as the final major piece, PCIe, is supposedly ready to go. I built the kernel using the defconfig and enabled all the SG2042-related drivers I could find:
Using the latest Sophgo ZSBL, upstream OpenSBI, and sg2042-milkv-pioneer.dtb from the Linux sources, the kernel boots but consistently hangs while initializing the third PCIe RC. It always hangs right after “Link up” and does not respond to any of the watchdog settings I applied or to SysRq triggers. If I disable that specific controller in the device tree, the boot completes, but of course there’s no NVMe or any of the other devices that sit behind that controller. I’ve exhausted all of my debugging experience, so I just wanted to see if anyone here has gotten it to work.
Thank you. I didn’t see anything in Revy’s fork that wasn’t in your 6.18-rc branch, but I tried it just in case, and it hangs at the exact same spot, immediately after initializing the last PCIe controller:
[ 2.457755] sg2042-pcie 7062800000.pcie: host bridge /soc/pcie@7062800000 ranges:
[ 2.465438] sg2042-pcie 7062800000.pcie: IO 0x4cc0c00000..0x4cc0ffffff -> 0x0000000000
[ 2.474041] sg2042-pcie 7062800000.pcie: MEM 0x4cf8000000..0x4cfbffffff -> 0x00f8000000
[ 2.482615] sg2042-pcie 7062800000.pcie: MEM 0x4cfc000000..0x4cffffffff -> 0x00fc000000
[ 2.491178] sg2042-pcie 7062800000.pcie: MEM 0x4e00000000..0x4fffffffff -> 0x4e00000000
[ 2.499739] sg2042-pcie 7062800000.pcie: MEM 0x4d00000000..0x4dffffffff -> 0x4d00000000
[ 2.508314] sg2042-pcie 7062800000.pcie: Memory resource size exceeds max for 32 bits
[ 2.516289] sg2042-pcie 7062800000.pcie: no "phy-names" property found; PHY will not be initialized
[ 2.525509] sg2042-pcie 7062800000.pcie: Link up
<hang>
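As a side note for anyone comparing logs: the window sizes can be read straight off the "ranges" lines above. The snippet below is only a rough sketch for doing that, not anything taken from the driver. The helper names (parse_windows, exceeds_32_bits) are made up for illustration, and the assumption that the "exceeds max for 32 bits" warning relates to a MEM window whose size does not fit in 32 bits is my reading of the message, not something confirmed in this thread.

```python
import re

# Range lines copied from the boot log above (timestamps removed).
LOG = """\
sg2042-pcie 7062800000.pcie: IO 0x4cc0c00000..0x4cc0ffffff -> 0x0000000000
sg2042-pcie 7062800000.pcie: MEM 0x4cf8000000..0x4cfbffffff -> 0x00f8000000
sg2042-pcie 7062800000.pcie: MEM 0x4cfc000000..0x4cffffffff -> 0x00fc000000
sg2042-pcie 7062800000.pcie: MEM 0x4e00000000..0x4fffffffff -> 0x4e00000000
sg2042-pcie 7062800000.pcie: MEM 0x4d00000000..0x4dffffffff -> 0x4d00000000
"""

# Matches "<IO|MEM> 0x<cpu start>..0x<cpu end> -> 0x<pci start>".
RANGE_RE = re.compile(
    r"(IO|MEM)\s+0x([0-9a-fA-F]+)\.\.0x([0-9a-fA-F]+)\s+->\s+0x([0-9a-fA-F]+)"
)


def parse_windows(log):
    """Return one dict per host bridge window found in the log text."""
    windows = []
    for m in RANGE_RE.finditer(log):
        start = int(m.group(2), 16)
        end = int(m.group(3), 16)
        windows.append({
            "kind": m.group(1),
            "cpu_start": start,
            "cpu_end": end,
            "pci_start": int(m.group(4), 16),
            "size": end - start + 1,  # the end address is inclusive
        })
    return windows


def exceeds_32_bits(size):
    """True if the size has any bit set above bit 31."""
    return (size >> 32) != 0


windows = parse_windows(LOG)

if __name__ == "__main__":
    for w in windows:
        flag = "  <- size does not fit in 32 bits" if exceeds_32_bits(w["size"]) else ""
        print(f'{w["kind"]:3} 0x{w["cpu_start"]:010x} size=0x{w["size"]:x}{flag}')
```

By this arithmetic the two identity-mapped windows at 0x4d00000000 and 0x4e00000000 are the only ones that need more than 32 bits to describe their size; whether that has anything to do with the hang is an open question.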
I did a little more debugging and found that the hang happens in the cdns_pcie_host_init_address_translation() function (drivers/pci/controller/cadence/pcie-cadence-host.c) at:
Interestingly, this function has not changed between 6.6 (working) and 6.18 (not working), so I’m digging through other changes to try to understand what is causing this new, faulty behavior.