Friday, 15 July 2011

Fixedpoint math with Newton-Raphson and the Taylor-series [Part 1]

When working with embedded platforms like ARM and AVR, you might not have a floating-point unit, and you might lack hardware support for division. Floating-point emulation is slow, and so are most C runtimes' software division routines. So what do you do? Use integers, of course! I assume my readers already know the basics of fixedpoint: how to add, subtract, multiply and divide fixedpoint numbers, and how to convert between different bases. What a lot of people struggle with is implementing fixedpoint versions of log2, ln, pow, sqrt, cos, sin and division. While I make little mention of performance, I'll try to cover general algorithms that can be made very efficient, especially if you can trade off memory usage, performance and accuracy.
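For readers who want a quick refresher on those basics, here is a minimal sketch. The base of 16 (Q16.16) and the function names are my own choices, not something fixed by this post; the same pattern works for any base:

```c
#include <stdint.h>

#define BASE 16  /* Q16.16: 16 integer bits, 16 fraction bits */

/* Convert an integer to fixedpoint by scaling it up by the base. */
static int32_t fp_from_int(int32_t i) { return i << BASE; }

static int32_t fp_mul(int32_t a, int32_t b)
{
    /* Widen to 64 bits so the intermediate product can't overflow,
       then scale back down by the base. */
    return (int32_t)(((int64_t)a * b) >> BASE);
}

static int32_t fp_div(int32_t a, int32_t b)
{
    /* Pre-scale the dividend by the base so the quotient stays in fixedpoint. */
    return (int32_t)(((int64_t)a << BASE) / b);
}
```

Addition and subtraction need no scaling at all, which is part of what makes fixedpoint attractive on small cores.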

Before we can implement our math functions, I need to explain the general idea behind Newton-Raphson. So what is Newton-Raphson anyway?

Newton-Raphson is an algorithm for iteratively finding the roots of a function. Each iteration's output is fed back in as the next iteration's input, and convergence is quadratic: roughly speaking, the number of correct digits doubles with each iteration. The initial input is a guess or approximation. The general algorithm is:

X[n+1] = X[n] - f(X[n]) / f'(X[n])

Where f is the root function, f' is the derivative of the root function, X[n] is the value from the last iteration, and X[n+1] is the value for the next iteration. When you have iterated enough, X[n+1] is also the answer. In C-code:

unsigned newton_raphson_unary(
    unsigned val,
    unsigned (*f)(unsigned),
    unsigned (*fd)(unsigned)
)
{
    /* Note: with unsigned arithmetic, the initial guess must be chosen
       so that f(val) never goes negative. */
    for(unsigned i = 0; i < 5; ++i)
        val = val - f(val) / fd(val);
    return val;
}
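To make this concrete, here is a hypothetical plug-in of the square-root root function described below, f(x) = x² - 25 with f'(x) = 2x (the names f_sq25, fd_sq25 and demo_root are mine). One caveat worth seeing up close: with truncating integer division the update can stall one step above the exact root, which is part of why the fixedpoint sqrt later in this post pre-scales its operands instead of working in bare integers:

```c
/* Root function for sqrt(25): f(x) = x^2 - 25 */
static unsigned f_sq25(unsigned x)  { return x * x - 25u; }

/* Its derivative: f'(x) = 2x */
static unsigned fd_sq25(unsigned x) { return 2u * x; }

/* Run five Newton-Raphson steps starting from the guess x = 25. */
static unsigned demo_root(void)
{
    unsigned val = 25u;
    for(unsigned i = 0; i < 5u; ++i)
        val = val - f_sq25(val) / fd_sq25(val);
    return val;
}
```

Starting from 25, the iterates are 25 → 13 → 8 → 6 → 6: the step f(6)/f'(6) = 11/12 truncates to 0, so the sequence stops at 6 rather than the true root 5. Keeping fractional bits around, as the fixedpoint version does, avoids exactly this loss.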
But what does the root function contain? It's very simple. Remember from high school when you learned that the two solutions of a quadratic equation ax² + bx + c = 0 are where the parabola intersects the x-axis? That's exactly what the root function is. The exact answer is where the function intersects the x-axis, because we solve y = f(x) for y = 0. With every iteration, the result of the root function gets closer and closer to 0. In fact, the convergence is quadratic, and the algorithm is self-correcting. You rewrite/expand your root function so that the function you want to compute is eliminated, and move everything to one side of the equation so that it equals 0. For example, a possible root function for the square root could be: x² - N = 0, where N is the input to sqrt, and x is the answer.

But we also need the derivative of the root function. Even if you hate calculus, fear not. All you need to understand are the differentiation rules [Wikipedia], and those can be looked up instead of remembered. If we keep our square root example, the derivative of the function f(x) = x² - N is f'(x) = 2x. This follows from the power rule d/dx(x^k) = kx^(k-1). And N is a constant, so its derivative is 0 and it disappears.

A full square root example follows below, but first we simplify the update like this:
x = x - f(x) / f'(x)
x = x - (x² - val) / (2x)
x = (2x²/2x) - ((x² - val) / 2x)
x = (x² + val) / 2x
x = (x² / 2x) + (val / 2x)
x = (x / 2) + ((val/2) / x)

/* BASE is a constant for the fixedpoint scale */
static unsigned fp_sqrt(unsigned val)
{
    unsigned x;
    int bitpos;
    unsigned long long v;

    if(!val)
        return val;

    /* clz = count-leading-zeros, e.g. GCC's __builtin_clz for a 32-bit
        unsigned. bitpos is the position of the most significant bit,
        relative to "1", i.e. 1 << BASE */
    bitpos = (31 - clz(val)) - BASE;
    
    /* Calculate our first estimate.
        We use the identity 2^a * 2^a = 2^(2*a) or:
         sqrt(2^a) = 2^(a/2)
    */
    if(bitpos > 0) /* val > 1 */
        x = (1u<<BASE)<<(bitpos >> 1);
    else if(bitpos < 0) /* 0 < val < 1 */
        x = (1u<<BASE)>>((-bitpos) >> 1);
    else /* val == 1 */
        x = (1u<<BASE);
    
    /* We need to scale val with base due to the division.
       Also val /= 2, hence the subtraction of one*/
    v = (unsigned long long)val <<  (BASE - 1u);

    /* The actual iteration */
    x = (x >> 1u) + v/x;
    x = (x >> 1u) + v/x;
    x = (x >> 1u) + v/x;
    x = (x >> 1u) + v/x;
    return x;
}
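If you want to try this on a desktop first, the sketch below fills in the two things the listing leaves open, using my own assumptions: BASE = 16 (Q16.16) and clz mapped to GCC/Clang's __builtin_clz for 32-bit values:

```c
#include <stdint.h>

#define BASE 16                              /* Q16.16 fixedpoint */
#define clz(x) ((unsigned)__builtin_clz(x))  /* GCC/Clang; undefined for 0 */

static unsigned fp_sqrt(unsigned val)
{
    unsigned x;
    int bitpos;
    unsigned long long v;

    if(!val)
        return val;

    /* Position of the most significant bit relative to 1.0 (1 << BASE) */
    bitpos = (int)(31u - clz(val)) - BASE;

    /* First estimate, using sqrt(2^a) = 2^(a/2) */
    if(bitpos > 0)        /* val > 1 */
        x = (1u << BASE) << (bitpos >> 1);
    else if(bitpos < 0)   /* 0 < val < 1 */
        x = (1u << BASE) >> ((-bitpos) >> 1);
    else                  /* val == 1 */
        x = 1u << BASE;

    /* Pre-scale: v = (val / 2) << BASE, so v / x lands in fixedpoint */
    v = (unsigned long long)val << (BASE - 1u);

    x = (x >> 1u) + v / x;
    x = (x >> 1u) + v / x;
    x = (x >> 1u) + v / x;
    x = (x >> 1u) + v / x;
    return x;
}
```

For example, fp_sqrt of 2.25 in Q16.16 (147456) settles on exactly 1.5 (98304) after three of the four iterations, with the fourth confirming the fixed point.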
I hope this article was easy enough to understand. Since I've covered the basics of how to use Newton-Raphson and implemented sqrt(), I'll continue with the implementation of more math functions next time.
If anyone has corrections, criticism or questions, I'm all ears. The whole point of this post is to explain this to people who struggle with math (including myself), but are otherwise decent programmers.
