Slow speed on Duo (Arduino)?

Am I doing something wrong? I get surprisingly slow results when using the 700 MHz core from the Arduino IDE.

/*                Results
   -----------------------------------------
  |         Board           |   Time (µs)   |
  |-----------------------------------------|
  | ESP32 Dev (240 MHz)     |      183      |
  | Nano Connect (150 MHz)  |      334      |
  | Raspi Pico (150 MHz)    |      335      |
  | Raspi Pico W (150 MHz)  |      335      |
  | Nano Connect (133 MHz)  |      377      |
  | Raspi Pico (133 MHz)    |      378      |
  | Raspi Pico W (133 MHz)  |      378      |
  | Nano Every (16 MHz)     |     7412      |
  | Milkv Duo 64 (700 MHz)  |     8073      |
   -----------------------------------------
*/

#include <Arduino.h>

unsigned long startTime = 0;
unsigned long endTime = 0;
unsigned long elapsedTime = 0;
unsigned long fibonacciValue = 0;

void setup() {
  Serial.begin(115200);
}

void loop() {
  startTime = 0;
  endTime = 0;
  elapsedTime = 0;
  startTime = micros();
  yourFunction();
  endTime = micros();
  elapsedTime = endTime - startTime;
  Serial.print("Elapsed Time (microseconds): ");
  Serial.println(elapsedTime);
  delay(10000);
}

void yourFunction() {
  for (volatile int i = 0; i < 20; ++i) {
    fibonacciValue = 0;
    fibonacciValue = fibonacci(10);
  }
}

unsigned long fibonacci(int n) {
  if (n <= 1) {
    return n;
  } else {
    return fibonacci(n - 1) + fibonacci(n - 2);
  }
}

Likewise, I used the GPIO toggle Arduino sketch (with the delays removed) and it can only manage a frequency of about 14 kHz.

Is there some secret to making it run faster?

One reason may be that the I-cache is disabled by default.

See chapter 16.3 of https://occ-intl-prod.oss-ap-southeast-1.aliyuncs.com/resource/XuanTie-OpenC906-UserManual.pdf for how to enable it in assembly. I don’t know how to do it in the Arduino IDE.


It’s also possible that micros() doesn’t have the resolution you would expect. Could you run yourFunction like 100 times, and then divide the timing result by 100 to see if this makes a difference?
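A minimal sketch of that averaging idea, in case it helps. `averageMicros` is a hypothetical helper (not part of the Arduino core); it takes the clock and the workload as parameters so it can be fed `micros` and `yourFunction` from the sketch above. It reads the clock only twice, so a coarse timer tick gets spread over all runs.

```cpp
// Hypothetical helper: times `runs` back-to-back calls of `work` with a single
// pair of clock reads and returns the mean time per call, in clock units.
template <typename Clock, typename Work>
double averageMicros(Clock now, Work work, unsigned long runs) {
  const unsigned long start = now();
  for (unsigned long i = 0; i < runs; ++i) {
    work();
  }
  const unsigned long elapsed = now() - start;
  return static_cast<double>(elapsed) / runs;
}
```

In the sketch above this would be called as `averageMicros(micros, yourFunction, 100)` inside `loop()`; 100 is just the run count suggested here.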

Arduino code to enable the D and I caches… just for fun. I haven’t benchmarked it yet, as I need to attach a serial port :smiley: - and that is tomorrow’s job.

It impacted my blinky - more details follow.

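void setup() {
    Serial.begin(115200);
    Serial.println("Starting");

#if 1
  // t0 is used as scratch and declared as clobbered; x3 is the global
  // pointer (gp) and should not be overwritten behind the compiler's back.
  // On the C906, CSR 0x7c2 is MCOR and CSR 0x7c1 is MHCR.
  asm volatile(
          // I-cache
          "li t0, 0x33\n"
          "csrc 0x7c2, t0\n"   // clear cache-select and INV/CLR bits
          "li t0, 0x11\n"
          "csrs 0x7c2, t0\n"   // select I-cache and invalidate it
          "li t0, 0x1\n"
          "csrs 0x7c1, t0\n"   // MHCR bit 0: enable I-cache

          // D-cache
          "li t0, 0x33\n"
          "csrc 0x7c2, t0\n"   // clear cache-select and INV/CLR bits
          "li t0, 0x12\n"
          "csrs 0x7c2, t0\n"   // select D-cache and invalidate it
          "li t0, 0x2\n"
          "csrs 0x7c1, t0\n"   // MHCR bit 1: enable D-cache
          : : : "t0", "memory"
  );
#endif
  // initialize digital pin LED_BUILTIN as an output.
  pinMode(LED_BUILTIN, OUTPUT);
}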

With the D and I caches enabled, GPIO went from 14 kHz to 500 kHz.

It’s still slow… what is the limit of the internal bus?

EDIT:

@niek - I can’t reply any more, as the forum only allows 3 replies from newbies :smiley:
@Mando_Rick Question for you… did the new Arduino cache code work for you?

Did you try my suggestion and see if it makes a difference?

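#define TEST_PIN 20

void setup() {
  pinMode(TEST_PIN, OUTPUT);
  // Enable the I and D caches (CSR 0x7c2 = MCOR, CSR 0x7c1 = MHCR on the C906).
  // t0 is scratch and listed as clobbered; x3 (gp) is left alone.
  asm volatile(
        "li t0, 0x33\n"
        "csrc 0x7c2, t0\n"   // clear cache-select and INV/CLR bits
        "li t0, 0x11\n"
        "csrs 0x7c2, t0\n"   // select I-cache and invalidate it
        "li t0, 0x1\n"
        "csrs 0x7c1, t0\n"   // enable I-cache

        "li t0, 0x33\n"
        "csrc 0x7c2, t0\n"   // clear cache-select and INV/CLR bits
        "li t0, 0x12\n"
        "csrs 0x7c2, t0\n"   // select D-cache and invalidate it
        "li t0, 0x2\n"
        "csrs 0x7c1, t0\n"   // enable D-cache
        : : : "t0", "memory"
  );
}

void loop() {
  // Toggle as fast as digitalWrite() allows; no delays.
  digitalWrite(TEST_PIN, HIGH);
  digitalWrite(TEST_PIN, LOW);
}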

This lets me measure a 700 kHz pulse train on pin 20. That is a bit more than 500 kHz, but still remarkably little for a 700 MHz clock, if I understood it correctly.

Do you have any idea how to make the little dude work faster?
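For a sense of scale, the figures in this thread can be turned into core cycles per output period. This is only a back-of-the-envelope sketch: `cyclesPerPeriod` is a made-up helper, and it assumes the core really runs at the 700 MHz mentioned above.

```cpp
// Core clock cycles that elapse during one full HIGH+LOW period of the output.
// Both arguments are frequencies in Hz.
constexpr double cyclesPerPeriod(double coreHz, double outputHz) {
  return coreHz / outputHz;
}
```

With 700 MHz and a 700 kHz pulse train this gives 1000 cycles per period, so roughly 500 cycles per digitalWrite() call. The 14 kHz figure measured without caches works out to 50,000 cycles per period.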


Oh. I wasn’t aware that there was a cap on replies. That is weird. Anyway. If you want to talk to me, you can reach me at morten at winkler dot 🇩 🇰

I want MORE speed out of the little core

Shame, I wanted to see where this was headed, thx for the interesting replies.

Still working on it… Ideally I would like to see 2 MHz or so (the more the merrier).
My next test will be using the low-level RTOS software to see if I can manipulate the register directly (judging from the Arduino code, it does that through 4+ layers of function calls).


On my Duo-S, with code in Rust, the absolute fastest I could achieve is 6.0 MHz, according to my cheap 24 MHz logic analyzer.

The code for the loop is the following:

    loop {
        unsafe {
            gpio1.dr().write(|w| w.bits(1 << 14));
            gpio1.dr().write(|w| w.bits(0));
        }
    }


I get 2.9 MHz with the code below. It toggles pin 20 (GP15). I had to add a NOP (an ADDI instruction) after each write before the pin would actually toggle.


#define GPIO2 0x03022000
#define GPIO0 0x03020000


#define GPIO_SWPORT_DR 0x000
#define GPIO_SWPORT_DDR 0x004
#define PIN 15

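// the setup function runs once when you press reset or power the board
void setup() {
  // set the pin's bit in the GPIO0 data direction register (output)
  *(volatile uint32_t*)(GPIO0|GPIO_SWPORT_DDR) = 1 << PIN;

  // Enable the I and D caches (CSR 0x7c2 = MCOR, CSR 0x7c1 = MHCR on the C906).
  // t0 is scratch and listed as clobbered; x3 (gp) is left alone.
  asm volatile(
        "li t0, 0x33\n"
        "csrc 0x7c2, t0\n"   // clear cache-select and INV/CLR bits
        "li t0, 0x11\n"
        "csrs 0x7c2, t0\n"   // select I-cache and invalidate it
        "li t0, 0x1\n"
        "csrs 0x7c1, t0\n"   // enable I-cache

        "li t0, 0x33\n"
        "csrc 0x7c2, t0\n"   // clear cache-select and INV/CLR bits
        "li t0, 0x12\n"
        "csrs 0x7c2, t0\n"   // select D-cache and invalidate it
        "li t0, 0x2\n"
        "csrs 0x7c1, t0\n"   // enable D-cache
        : : : "t0", "memory"
  );
}

// the loop function runs over and over again forever
void loop() {
    // volatile keeps the compiler from merging or dropping the register writes
    *(volatile uint32_t*)(GPIO0|GPIO_SWPORT_DR) = 1 << PIN;
    asm volatile (
       "ADDI x0, x1, 0\n"   // effectively a NOP: the result is discarded into x0
    );

    *(volatile uint32_t*)(GPIO0|GPIO_SWPORT_DR) = 0;
    asm volatile (
       "ADDI x0, x1, 0\n"
    );
}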


Has anyone run a benchmark using the RTOS from the SDK, so that we have something to compare against?
