[Ed. Note: Originally published by Mark Featherston on April 27th, 2012]
In our whitepaper on meeting real-time requirements we address "User Space Drivers with Real-time Priority". Since we have many userspace drivers, we did some testing to see what kind of IRQ dispatch latency you can expect from userspace. The result is not far from what you would get from a kernel driver. In this case I'm using a TS-4200 with no modifications from our stock image aside from the code below.
To measure the IRQ latency I'm feeding a square wave to an interrupt (PC14_IRQ2). A userspace program responds to each IRQ by switching an output (PC0). It then busy waits, reading the input value of PC14_IRQ2, and switches PC0 back as soon as the square wave changes level again. Essentially this just inverts the square wave. The code I'm using is below and can be used to replicate the test.
Code:
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int ret, irqfd = 0, buf, mem, x, i;
    fd_set fds;
    volatile uint16_t *syscon;
    volatile uint32_t *pioc;

    irqfd = open("/proc/irq/31/irq", O_RDONLY | O_NONBLOCK, S_IREAD);
    mem = open("/dev/mem", O_RDWR | O_SYNC);
    syscon = mmap(0, getpagesize(), PROT_READ | PROT_WRITE, MAP_SHARED, mem, 0x30000000);
    pioc = mmap(0, getpagesize(), PROT_READ | PROT_WRITE, MAP_SHARED, mem, 0xfffff000);

    // Set PC0 as an output and drive it low
    pioc[0x810/4] = 0x1;
    pioc[0x834/4] = 0x1;

    // Start from an empty descriptor set
    FD_ZERO(&fds);

    while (1) {
        // Block until the IRQ triggers
        FD_SET(irqfd, &fds);
        ret = select(irqfd + 1, &fds, NULL, NULL, NULL);
        if (FD_ISSET(irqfd, &fds)) {
            // Set the PC0 output immediately after the interrupt
            pioc[0x830/4] = 0x1;
            // Busy wait while reading the input PC14
            while (!(pioc[0x83c/4] & (1 << 14))) {};
            // Clear PC0 output
            pioc[0x834/4] = 0x1;
            FD_CLR(irqfd, &fds);
            read(irqfd, &buf, sizeof(buf));
        }
    }
    return 0;
}
This is compiled using gcc and run at a realtime priority. You can set the priority in code, but it is also simple to use 'chrt -f -p 99 <yourpid>', which puts it in the SCHED_FIFO class at priority 99, above all other processes.
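For setting the priority in code, here is a minimal sketch using the standard Linux sched_setscheduler() call. The helper names clamp_fifo_priority and set_fifo_priority are hypothetical and are not part of the test program above. It assumes the process has permission to use realtime scheduling (normally root).

```c
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper: clamp a requested priority into the range
 * the kernel reports as valid for SCHED_FIFO. */
int clamp_fifo_priority(int prio)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (prio < lo)
        return lo;
    if (prio > hi)
        return hi;
    return prio;
}

/* Hypothetical helper: move the calling process to SCHED_FIFO.
 * Returns 0 on success, -1 on failure (typically a permissions error). */
int set_fifo_priority(int prio)
{
    struct sched_param sp = { .sched_priority = clamp_fifo_priority(prio) };

    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
        perror("sched_setscheduler");
        return -1;
    }
    return 0;
}
```

Calling set_fifo_priority(99) at the top of main(), before the select() loop, should have roughly the same effect as the chrt command.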
On the IRQ latency test I measured about 11µs best case, about 18µs typical, and about 50µs worst case. In the image below green is the PC0 output userspace is controlling, and orange is the square wave. Each vertical division in the grid is 20µs.
For a much better typical/best case we can busy wait instead. This however ties up the CPU at 100% while it is busy waiting. The worst case in this picture is actually about the same as in the IRQ test, at about 50µs. This is likely the same kernel function/driver affecting the worst case for the IRQ latency test. For this test green is still PC0, and orange is still the square wave. The vertical divisions are still 20µs.
The best case with busy waiting is much better. In this image each vertical division is 500ns. The best case is about 200ns for switching the output based on polling from an input. The typical was about 250ns.
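The busy wait test program is not shown here, so the following is only a sketch of what such a polling loop could look like. It reuses the register offsets from the IRQ listing above. The macro names follow my reading of those offsets as the set-output, clear-output and pin-status registers, and the function names invert_step and invert_forever are hypothetical.

```c
#include <stdint.h>

/* Word offsets into the mapped page, copied from the IRQ listing. */
#define PIOC_SODR (0x830 / 4)   /* write 1 to drive the pin high */
#define PIOC_CODR (0x834 / 4)   /* write 1 to drive the pin low */
#define PIOC_PDSR (0x83c / 4)   /* read current pin levels */

#define PC0_MASK  (1u << 0)
#define PC14_MASK (1u << 14)

/* One polling step: read PC14 and drive PC0 to the opposite level.
 * Returns the level written to PC0. */
int invert_step(volatile uint32_t *pio)
{
    if (pio[PIOC_PDSR] & PC14_MASK) {
        pio[PIOC_CODR] = PC0_MASK;
        return 0;
    }
    pio[PIOC_SODR] = PC0_MASK;
    return 1;
}

/* Poll continuously. This never returns and keeps the CPU at 100%. */
void invert_forever(volatile uint32_t *pio)
{
    for (;;)
        invert_step(pio);
}
```

Here pio would be the same pointer that the IRQ listing obtains from mmap() as pioc, with PC0 already configured as an output. Keeping the step in its own function makes it possible to exercise the logic against an ordinary array before running it on hardware.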