### Integer Maths is Magic!

Along with most Open Energy Monitor based code, my Mk2 sketch is full of floating point calculations.  Although easy to write, FP maths is notoriously slow.  I decided to see whether the operation could be sped up by the use of integer maths.  By 'integer maths', I mean that everything is done with int, long and bit-shifting (>> or <<), rather than float or double.

The attached Word document describes my method and findings.  In short, integer maths is a great deal faster than its FP equivalent and works just as well.  Graphs are presented to show the equivalent performance, and the resulting speed benefits are quantified.

At a rough guess, I would say that the total amount of processing time has been halved.  Much of this gain has been achieved by rewriting the standard HPF filters as suggested by Atmel.  I've also managed to reduce Atmel's implementation by one processing statement.  This saves a small amount of time when processing every V & I sample.
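For anyone who has not opened the attachments, here is a minimal sketch of what an integer-only high-pass (DC-removal) filter can look like. It is not the code from the attached sketches, nor Atmel's implementation: the struct name `IntHighPass`, the shift value `kShift = 8` and the 32-bit accumulator are all assumptions chosen for illustration.

```cpp
#include <stdint.h>

// Illustrative only: a one-pole DC-removal (high-pass) filter that uses
// nothing but integer adds, subtracts and shifts.
// kShift sets the time constant (roughly 2^kShift samples); 8 is an assumed value.
struct IntHighPass {
  static const int kShift = 8;
  int32_t dcAccumulator;  // running DC estimate, scaled up by 2^kShift

  IntHighPass() : dcAccumulator(0) {}

  // Takes a raw ADC sample (0..1023) and returns it with the DC level removed.
  int16_t filter(int16_t rawSample) {
    int16_t dcEstimate = (int16_t)(dcAccumulator >> kShift);
    int16_t filtered = (int16_t)(rawSample - dcEstimate);
    dcAccumulator += filtered;  // nudge the DC estimate towards the input
    return filtered;
  }
};
```

Each sample costs one shift, one subtraction and one addition, which is the kind of saving described above. With inputs in the 0..1023 range the accumulator never goes negative, so the right shift is always well defined.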

A pair of 'core' sketches are attached, these being chopped-down versions of the two sketches with which my measurements were taken.  Although they compile, they are not intended to be run on an Arduino.  Their sole purpose is to convey how the code has been altered to achieve integer maths operation.  [Sorry, there are a few mistakes in the comments - just follow the code!]

A similar upgrade could no doubt be usefully applied to many of the sketches that we all use.  I look forward to hearing about lots of go-faster code before too long ...

### Re: Integer Maths is Magic!

Brilliant stuff!  I will almost certainly use this approach in my code, thanks so much for sharing.

One quick thought for further speed improvements when sampling more than one analog input (e.g. sampling voltage then current in quick succession):

The HPF in EmonLib takes about 200 microseconds to run, so I guess your version takes about 100 microseconds.  Which is a very important number because that just happens to be about the time it takes to run analogRead().

So the end result would be:

1. sample voltage (takes 100 microseconds)
2. sample current (takes 100 microseconds) AND, while waiting for the ADC, run the HPF for the voltage sample, then busy-wait until the ADC has finished.

If we're really clever then both analogRead() operations can be spent doing useful maths, further reducing the time taken to execute one loop.  e.g. the analogRead() to get the voltage can be spent running the HPF on the current sample from the previous iteration.

Just a thought.  I won't have time to tinker with this over the weekend but hopefully will do next week.

Thanks again for the integer maths stuff - looks very useful.

### Re: Integer Maths is Magic!

When using integer maths, the HPF takes much less than 100 µs.  In the setup that I've described above, the entire per-loop processing is only around 50 µs.

As for 'free' time during ADC conversions, try this:

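    ADMUX = 0x40 + currentSensorPin;      // select the input (0 to 5), AVcc as reference
    ADCSRA |= (1 << ADSC);                // start the conversion
    {
      // do something useful while the ADC is busy
      delayMicroseconds(10);              // to see how much time is available
    }
    while (bit_is_set(ADCSRA, ADSC)) {}   // wait for the conversion to finish
    int sampleI = ADC;                    // read the result (sampleI is a placeholder name)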

Works a treat!

### Re: Integer Maths is Magic!

Quote: If we're really clever then both analogRead() operations can be spent doing useful maths, further reducing the time taken to execute one loop.  e.g. the analogRead() to get the voltage can be spent running the HPF on the current sample from the previous iteration.

Finding something useful to do during the first of the two ADC slots is not immediately obvious, but it seems a shame not to use free time if it's there.  My approach has been to delay the processing of both raw samples from one loop to the next.  Then there is always data to process when free time becomes available.

### Re: Integer Maths is Magic!

Why not let periodic interrupts take care of the sampling (storing the data in a circular buffer)? This way the sampling and the calculations are independent: you get samples at a fixed period, a period which is not dependent on the calculation time, and you can even do calculations that take a bit longer than the sampling (at least if you don't need continuous sampling and with enough of a buffer).

Circular buffers are quite easy to implement; you need an array for the buffer and two index variables, one that is updated by the writer (sample routine) and one that is updated by the reader (calculation routine). There are lots of details in the Wikipedia article on circular buffers.
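The description above (one array, one index per side) can be sketched as follows. This is a minimal illustration rather than code from any of the sketches in this thread: `SampleBuffer`, `kSize`, `put()` and `get()` are made-up names, and the capacity of 16 is an assumption.

```cpp
#include <stdint.h>

// Assumed capacity; a power of two lets the indices wrap with a simple mask.
const uint8_t kSize = 16;

// Illustrative single-writer, single-reader ring buffer for ADC samples.
struct SampleBuffer {
  volatile int16_t data[kSize];
  volatile uint8_t writeIndex;  // only changed by the writer (sampling routine)
  volatile uint8_t readIndex;   // only changed by the reader (calculation routine)

  SampleBuffer() : writeIndex(0), readIndex(0) {}

  // Called from the sampling routine. Returns false if the buffer is full,
  // in which case the sample is dropped.
  bool put(int16_t sample) {
    uint8_t next = (uint8_t)((writeIndex + 1) & (kSize - 1));
    if (next == readIndex) return false;
    data[writeIndex] = sample;
    writeIndex = next;
    return true;
  }

  // Called from the main loop. Returns false if there is nothing to read.
  bool get(int16_t &sample) {
    if (readIndex == writeIndex) return false;
    sample = data[readIndex];
    readIndex = (uint8_t)((readIndex + 1) & (kSize - 1));
    return true;
  }
};
```

One slot is deliberately left unused so that "full" and "empty" can be told apart, which means this buffer holds at most 15 samples. Because each index is a single byte written by only one side, no locking is needed between an ISR writer and a main-loop reader on an 8-bit AVR.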

### Re: Integer Maths is Magic!

Love this stuff!  Just hope I have time to tinker before Christmas descends upon us.

BTW, what motivated you to try to squeeze as many samples as possible into a given time period?  My own motivation is to increase the quality of the measurements (more samples per cycle should provide more precise calculated values).

### Re: Integer Maths is Magic!

Isn't it more important that the samples are evenly spaced in time since the calculations give equal weight to each sample?

### Re: Integer Maths is Magic!

I had a play with my sketch this last weekend and increased the number of samples per cycle by about 15%. To be honest, I'm not convinced that I made any difference to the accuracy of the readings compared to my plug in energy monitor or my enviR units.

Perhaps at low power when the waveforms are far from sinusoidal it will marginally improve accuracy. As the power increases the current waveforms err toward sinusoidal and this is easily resolved with a relatively low samples per cycle rate. IMHO anyway.

I will keep the code as the whole thing is a lot cleaner now.

### Re: Integer Maths is Magic!

Quote: Isn't it more important that the samples are evenly spaced in time since the calculations give equal weight to each sample?

http://openenergymonitor.org/emon/node/1681

### Re: Integer Maths is Magic!

Looks promising Robin but I'm surprised that you're still not using interrupts. It would be interesting to see how it compares with my interrupt/PLL based code if you get chance to try it on your rig at some point.

### Re: Integer Maths is Magic!

Well, Martin, my feeling is that if the ADC sub-processor is simply running back-to-back conversions all day long, then its conversion rate should be pretty much uniform.  I intend to knock up a sketch which checks whether the 'hidden' workload has been finished or not before the ADC operation has completed.  Then the amount of available time within ADC conversions can be better understood.

According to my TALLYMODE display, the average number of loops per mains cycle (when rounded down to the nearest integer) is always either 88 or 89.  This is the same value that I get when simply looping around a pair of dummy analogRead() statements.  What's more, I've not seen this value reduce when Serial statements are added, which is encouraging.

On the "loops per mains cycle" scale, my free-running Mk2a code provides around 88, their spacing and regularity being entirely dependent on the performance of the ADC sub-processor.  Your interrupt-based code, as I understand it, provides exactly 50, this value being maintained by some kind of mechanism which varies the speed of the processor's clock to precisely align with the mains frequency.

Developers will no doubt make up their own minds as to which approach they prefer.  It's good that they now have a choice.

### Re: Integer Maths is Magic!

Personally, I will probably err away from using IRQs to trigger the analog sampling for one simple reason: in my system, I use a rather different RF protocol to the normal emonTX.  My emonTX sits patiently and waits to be asked for data from the base unit (and if data is lost then it can be re-sent) - the emonTX never sends RF data without being asked.

The end result is that the RFM12b on my emonTX is almost always in RX mode and could receive an RF packet at any time.  If the RFM12b receives the start of a valid packet then it'll raise an IRQ and it's REALLY important that IRQ is handled ASAP or we'll lose RF data. (the RFM12b only has a 2-byte buffer).  If the ATMEGA is handling 50 IRQs per second for analog sampling then I assume there's a fair chance that it'll be busy handling one of these IRQs when the RFM12b raises its hand asking for help (the ATMEGA disables IRQs while inside an interrupt service routine (ISR), I believe).

I could be wrong.  I guess that if the analog sampling ISR takes very little time (i.e. it just saves the previous conversion and then tells the ADC to start conversion but doesn't wait for conversion to finish) then it might be possible that, even in the worst case scenario, you'll never lose RF data.

### Re: Integer Maths is Magic!

If the ADC interrupt comes regularly at its 104 µs interval (is this correct for the standard Arduino settings?) then you could use 'polling' of the RFM12B status inside the IRQ. AFAIR the RFM12B needs attention every ~200 µs at the nominal 49.2 kb/s data rate.
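On the 104 µs question, the arithmetic can be checked with a few lines. The figures used here are assumptions taken from the usual 16 MHz Arduino setup rather than from this thread: an ADC prescaler of 128 (the Arduino core's default) and 13 ADC clock cycles for a normal conversion. The function name is made up for the example.

```cpp
#include <stdint.h>

// Conversion time in microseconds for back-to-back AVR ADC conversions.
// Assumed defaults for a standard 16 MHz Arduino: prescaler 128, 13 ADC clocks.
uint32_t adcConversionMicros(uint32_t cpuHz, uint32_t prescaler, uint32_t adcClocks) {
  uint32_t adcClockHz = cpuHz / prescaler;      // 16 MHz / 128 = 125 kHz
  return (adcClocks * 1000000UL) / adcClockHz;  // 13 clocks at 125 kHz
}
```

`adcConversionMicros(16000000UL, 128, 13)` gives 104, which matches the interval quoted above. The same arithmetic with a prescaler of 64 gives 52.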

### Re: Integer Maths is Magic!

Great, thanks Jörg.  In which case, I guess the "analog-sampling-using-IRQs" might well work in my situation!

Ian wrote: "To be honest, I'm not convinced that I made any difference to the accuracy of the readings compared to my plug in energy monitor"

This is very interesting.  Does anyone have any empirical evidence to suggest how much of an accuracy improvement we might see by sampling the AC waveform as rapidly as possible (compared to the standard EmonLib technique)?  I know that posh power quality meters costing hundreds of pounds show off about sampling the AC waveform rapidly (where "rapidly" means 64 times per cycle).

### Re: Integer Maths is Magic!

Robin - I don’t want to get into another long debate, it’s just that you seem to be striving really hard to achieve uniform sampling without using the one mechanism that would make it easy. I don’t understand why, and you haven’t offered any reason, but that’s up to you.

Obviously you can never achieve true uniformity unless the cycle time is an exact multiple of the sampling period – hence the phase-locked loop. For me it’s all about the fun of doing something novel or clever, that’s why I liked your original concept so much.

I chose 50 samples per mains cycle because it’s a nice round number and it seemed to be plenty. It could be more, but you need to leave time for radio transmission and temperature measurement for a complete solution.

Jack – you are right that the ADC ISR takes very little time. If well written it shouldn’t interfere with your RF reception at all, even if both are using interrupts. If it does become an issue then all you need to do is slow down the bit rate on the RF channel (from the 49.2 kb/s that Jörg mentions) since your overall data rate is presumably pretty low. You can also speed up the ADC by changing its clock prescaler.

You should be able to calculate the effect of sampling rate for various wave shapes, it would certainly be useful information for the rest of us. Since it is basically integration over time I still think equal spacing of samples is significant unless the time between each sample is included in the power calculation.
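To make that last point concrete, here is a sketch of what "including the time between each sample" could look like: each pair of neighbouring power readings is weighted by the time that actually elapsed between them (the trapezoidal rule). The function name, the use of microsecond timestamps and the use of `double` are assumptions for illustration only; nobody in this thread has posted code like this.

```cpp
#include <stddef.h>
#include <stdint.h>

// Integrates instantaneous power (watts) over unevenly spaced timestamps
// (microseconds, e.g. from micros()) using the trapezoidal rule.
// Returns the energy in joules.
double energyJoules(const double *powerWatts, const uint32_t *timeMicros, size_t count) {
  double energy = 0.0;
  for (size_t i = 1; i < count; ++i) {
    // Unsigned subtraction stays correct across a timer wrap-around.
    double dtSeconds = (timeMicros[i] - timeMicros[i - 1]) * 1e-6;
    double meanPower = 0.5 * (powerWatts[i] + powerWatts[i - 1]);
    energy += meanPower * dtSeconds;
  }
  return energy;
}
```

With evenly spaced samples every `dtSeconds` is the same and this collapses to the usual equal-weight average, which is why equal spacing lets the simpler calculation get away without timestamps.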

### Re: Integer Maths is Magic!

Quote: Obviously you can never achieve true uniformity unless the cycle time is an exact multiple of the sampling period

This is absolutely correct. You could achieve this in Robin's code by using a 'calculation interval' which is an exact multiple of the nominal mains period. For 104 µs, a calculation interval of 130 ms would fit quite nicely. Not sure though if this interferes with Robin's energy bucket evaluations.

But sure, using 'real' (timer) interrupts, zero crossing detection and/or a PLL would be an optimal solution. I just understand that this might be too far away from 'standard Arduino' programming style (if there is such a thing :-) )

BR, Jörg.

### Re: Integer Maths is Magic!

MartinR: "...hence the phase-locked loop..."

Oooh, you're using a PLL!? Awesome!  Is your code available online?  I had a look at your 3-phase thread but I couldn't see any PLL stuff in the code.

### Re: Integer Maths is Magic!

Quote: Robin - I don’t want to get into another long debate, it’s just that you seem to be striving really hard to achieve uniform sampling without using the one mechanism that would make it easy. I don’t understand why, and you haven’t offered any reason, but that’s up to you.

Martin - neither do I.  I just know that my latest version of the Mk2 code is a lot slicker than it was before, and I would therefore like it to be available for people to use.  But if anyone prefers to go down the alternative route of using interrupts, that's fine, and I'm sure you would be happy to help them along the way.  Incidentally, is there an idiot's guide as to how your phase-locking / interrupt mechanism actually works, because I can't follow your code :(

From the various results that I took over the summer with a disc-style meter attached, it could be seen that my original Mk2 design works pretty well.   Not too many micro-pennies appeared to be slipping away.  Mk2a can only work better.  If you can demonstrate that an interrupt-based approach can work better still, I will be most impressed!

### Re: Integer Maths is Magic!

Jack – my PLL code is here: http://openenergymonitor.org/emon/node/1535

Robin – I take your point. You’ve put a lot of effort into making your code easy to read and understand and that has made it very accessible to everyone, which is a good thing. I’m probably too lazy to do that so it’s going to be a much narrower target audience (maybe like Jack?). The concept is pretty simple. You want to divide the mains cycle into equal sized chunks. As Jörg says you could do this by using one of the ATmega hardware timers with a period that divides into 20ms but this will never be exact. The phase-locked loop improves on this by constantly adjusting the timer period so that an exact multiple fits into the current mains period. I’m not even sure it is better but it’s more fun to play with :)
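The idea of "constantly adjusting the timer period" can be sketched in a few lines. This is emphatically not Martin's code, just a bare proportional loop under stated assumptions: a 2 MHz timer clock (16 MHz with a prescaler of 8), 50 samples per 20 ms cycle giving a nominal period of 800 ticks, a loop gain of 1/8 and arbitrary clamping limits. All names are hypothetical.

```cpp
#include <stdint.h>

// Assumed limits either side of the nominal 800-tick period.
const int32_t kMinPeriodTicks = 760;
const int32_t kMaxPeriodTicks = 840;

// One step of a very simple proportional phase-locked loop.
// zeroCrossSample is the offset-removed voltage reading taken in the sample
// slot that is supposed to coincide with the rising zero crossing. A positive
// value means the crossing has already happened, i.e. we sampled late.
// Returns the new timer period in ticks.
uint16_t adjustTimerPeriod(uint16_t periodTicks, int16_t zeroCrossSample) {
  int32_t correction = zeroCrossSample / 8;                // assumed loop gain of 1/8
  int32_t newPeriod = (int32_t)periodTicks - correction;   // late -> shorten the period
  if (newPeriod < kMinPeriodTicks) newPeriod = kMinPeriodTicks;
  if (newPeriod > kMaxPeriodTicks) newPeriod = kMaxPeriodTicks;
  return (uint16_t)newPeriod;
}
```

Called once per mains cycle, this nudges the sample clock until the chosen sample sits on the zero crossing. A real loop would also need filtering of the error and some thought about stability, which this sketch ignores.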

### Re: Integer Maths is Magic!

Quote: my original Mk2 design works pretty well.   Not too many micro-pennies appeared to be slipping away.

Part of me is extremely excited to try to implement some of the fun ideas flying around.  But the pragmatic part of my brain - the part that's well aware that I have a limited amount of time to tinker - is starting to question whether there's much accuracy to be gained from rapid sampling.  Robin, if your original Mk2 design didn't let many micro-pennies slip away, do you yet have evidence that there are real performance benefits (in terms of measurement accuracy) to be gained from rapid sampling (apart from the enjoyment to be gained from the engineering challenges, of course)?

Given the aim of trying to improve measurement accuracy, I wonder if time might be better spent, for example, experimenting with CT clamps which are well matched to the load (and possibly using 2 CT clamps) in order to increase accuracy.

### Re: Integer Maths is Magic!

I think the 'big' step with Robin's new code is that the sampling rate no longer depends so much on 'other' things that have to be done apart from sampling and calculating (like serial output). And the code is very easily readable and understandable and follows 'Arduino style' very much.

Switching to interrupts would be a next (maybe smaller) step to attain perfection. Biggest impact would be that any remaining processing time could be very easily 'used' in the main loop without any thoughts about disturbing the exact ADC sampling. But the code will clearly look strange to many people with Arduino experience.

Using a fixed sampling period (like a 100 µs timer interrupt instead of the 104 µs ADC interrupt) that fits exactly into the nominal mains period at 50 Hz would then be the next (even smaller) step (mains frequency long term stability is extremely high, and short term stability is just 'good enough').

The PLL, if properly done, can give perfectly equidistant sampling which fits perfectly to every mains period (although it works with a presumption about the next mains period based on measuring the last, which might not be correct for fast mains period variations and especially distortion). The gained effect in accuracy may be small compared to the other steps described.

### Re: Integer Maths is Magic!

MartinR wrote: "You should be able to calculate the effect of sampling rate for various wave shapes, it would certainly be useful information for the rest of us"

I've made a start on comparing the effect on RMS current calculations of sampling current at 17 samples-per-cycle versus 45 samples-per-cycle for a number of appliances.  To get up to 45 samples-per-cycle I haven't used any of the clever tricks discussed above; I just stripped the code right back to the bare minimum required to produce an RMS current calculation (i.e. don't sample voltage).

### Re: Integer Maths is Magic!

There seems to be an assumption here that accurate measurements can only be taken if the sampling process is phase-locked to the waveform that is being measured.  I don't understand why this should be so.  The only important thing, surely, is to sample at regular intervals.

Although my Mk2 code had a higher workload during one half of the cycle than the other, that is not the case for the latest version.  Providing that the measurement system is linear, regular sampling should surely be sufficient to achieve nigh-on perfect results.  We're way above the Nyquist frequency for 50Hz, so should be seeing power at quite a few of the lower harmonics as well as the fundamental.

Jörg has touched on the difficulty of tracking frequency and amplitude distortions when a PLL is used.  Obtaining precisely 50 samples per cycle sounds less tricky to me than ensuring that they are all equally spaced.  Any timing jitter within the control loop would seem to negate any benefits that may be gained with a phase-locked sampling algorithm.

Final point: Given that the waveforms that we're monitoring are often far from sinusoidal, might it not be better to sample them asynchronously?  Then any odd parts of the waveform will be blended out in the measuring process.  The current profile from our PV inverter, as posted again below, is far from sinusoidal.  I do hope that someone else will be able to supply an equivalent trace at some stage.

```
cycleCount 152,  samplesDuringThisMainsCycle 68
|                                     p .                                       |
|                                      c. p                                     |
|                                       c    p                                  |
|                                       .c      p                               |
|                                       . c       p                             |
|                                       .   c        p                          |
|                                       .    c         p                        |
|                                       .      c          p                     |
|                                       .       c           p                   |
|                                       .        c            p                 |
|                                       .       c                p              |
|                                       .       c                  p            |
|                                       .       c                   p           |
|                                       .        c                   p          |
|                                       .         c                   p         |
|                                       .           c                  p        |
|                                       .            c                 p        |
|                                       .             c                p        |
|                                       .             c                p        |
|                                       .           c                  p        |
|                                       .         c                    p        |
|                                       .       c                      p        |
|                                       .     c                      p          |
|                                       .       c                  p            |
|                                       .        c               p              |
|                                       .        c             p                |
|                                       .        c           p                  |
|                                       .       c          p                    |
|                                       .     c         p                       |
|                                       .     c       p                         |
|                                       .   c      p                            |
|                                       . c      p                              |
|                                       .c    p                                 |
|                                       .cp                                     |
|                                      pc                                       |
|                                   p   c                                       |
|                                p     c.                                       |
|                             p       c .                                       |
|                           p        c  .                                       |
|                        p         c    .                                       |
|                      p         c      .                                       |
|                   p           c       .                                       |
|                 p            c        .                                       |
|               p             c         .                                       |
|            p                 c        .                                       |
|           p                  c        .                                       |
|         p                    c        .                                       |
|        p                    c         .                                       |
|        p                  c           .                                       |
|       p                  c            .                                       |
|       p                c              .                                       |
|       p                c              .                                       |
|       p                c              .                                       |
|       p                  c            .                                       |
|       p                    c          .                                       |
|       p                       c       .                                       |
|         p                      c      .                                       |
|           p                  c        .                                       |
|             p               c         .                                       |
|               p             c         .                                       |
|                 p           c         .                                       |
|                   p          c        .                                       |
|                      p        c       .                                       |
|                        p        c     .                                       |
|                           p      c    .                                       |
|                             p      c  .                                       |
|                                 p   c .                                       |
|                                    pc .                                       |
PHASECAL = 1.00
```

### Re: Integer Maths is Magic!

Quote:

There seems to be an assumption here that accurate measurements can only be taken if the sampling process is phase-locked to the waveform that is being measured.  I don't understand why this should be so.  The only important thing, surely, is to sample at regular intervals.

With a sinusoidal waveform, there is much more power and energy transferred in the parts with higher voltages than in the parts with lower voltages.

For a 'Gedankenexperiment'  (thought experiment?) go to an extreme: If you have a measuring interval of 5 ms (@50Hz mains, means 20 ms mains period), then this 5 ms interval can either lie around the maximum voltage in the sine curve or around the zero crossing. You get a much higher RMS value for voltage (and current) around maximum than around the zero crossing. So it clearly depends where your measuring interval lies within the mains period (if they are not equal in length).

If you take a less extreme example with a measuring interval of 19.8 ms (50Hz, 20 ms mains period) then taking e.g. 100 RMS values will give a 'beat' effect in the results. They will slowly move up and down (in a sinusoidal form). Sampling a waveform at a frequency different from the frequencies in the waveform will always introduce 'mixing' results. (This is one of the reasons why a 'Nyquist filter' should always be used when sampling analog waveforms with an ADC).

PS: one additional point: an RMS value is only defined over one complete period of the underlying waveform. If you measure only a fraction (or an 'inexact' (?) multiple) of the period, then you do not get the RMS value! This might sound rather theoretical, but has exactly the above mentioned consequences.

AND: if you measure over a large number of periods, then the error gets smaller and smaller and smaller. This is how integrating meters work (more or less).
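The thought experiment is easy to try numerically. The sketch below computes the RMS of a unit-amplitude 50 Hz sine over a window of chosen start and length; the 0.1 ms sample spacing and the function name are assumptions made for the example, and floating point is used only because this runs on a PC rather than in a sketch.

```cpp
#include <math.h>

// RMS of a unit-amplitude 50 Hz sine, sampled every 0.1 ms over a window
// that starts at startMs and lasts lengthMs. Illustrative names and values.
double windowRms(double startMs, double lengthMs) {
  const double kPi = 3.14159265358979323846;
  const double kStepMs = 0.1;     // assumed sample spacing
  const double kPeriodMs = 20.0;  // 50 Hz mains
  int sampleCount = (int)(lengthMs / kStepMs + 0.5);
  double sumSquares = 0.0;
  for (int i = 0; i < sampleCount; ++i) {
    double t = startMs + i * kStepMs;
    double v = sin(2.0 * kPi * t / kPeriodMs);
    sumSquares += v * v;
  }
  return sqrt(sumSquares / sampleCount);
}
```

A 5 ms window centred on the peak, `windowRms(2.5, 5.0)`, comes out near 0.90, while the same window centred on the zero crossing, `windowRms(-2.5, 5.0)`, is near 0.43. A full 20 ms window gives about 0.707 wherever it starts.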

### Re: Integer Maths is Magic!

An interesting point re PLL is what you lock on to. Remember, the voltage and current can be slightly (or even wildly) out of phase.

Which does it make most sense to lock to, if doing PLL style sampling?

Voltage I'm guessing, as the phase of that won't change. And yet it's the current sample (which can change phase) which is of most interest.

P.

### Re: Integer Maths is Magic!

Quote: An interesting point re PLL is what you lock on to. Remember, the voltage and current can be slightly (or even wildly) out of phase.

In principle it doesn't matter. Even if they are out of phase, they have the same frequency (in general). But it will of course be much easier to lock on to voltage as this is always there (not talking about mains outages :-) )! And it has clearly defined zero crossings most of the time (or maxima if you lock to these).

### Re: Integer Maths is Magic!

I wonder if randomly sampling each cycle might be a good approach (as long as you sample many waveforms per RMS calculation:   law of large numbers and all that).  i.e. you take, say, 50 samples per cycle but these are randomly spaced.  As Martin points out, we're doing integration over time, and a much-studied numerical approach for integration uses random sampling: Monte Carlo Integration.  (of course, creating a true random number generator on an Arduino is not a trivial problem).  Random sampling would avoid any "beat frequency" problems (and, like Robin, I'd be a little worried that a software PLL may "jitter" too much, especially given that we're not sampling rapidly enough to actually measure zero-crossings so we have to interpolate to find where the zero crossings should be).
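As a rough feel for how random sampling behaves, here is a PC-side sketch that estimates the RMS of a unit sine from samples taken at random phases. Everything in it is an assumption for illustration: the names, the sample count, and above all the tiny linear congruential generator, which is pseudo-random and stands in for whatever random source an Arduino could actually provide.

```cpp
#include <math.h>
#include <stdint.h>

// Tiny linear congruential generator (Numerical Recipes constants).
// Pseudo-random only; not a true random number generator.
struct Lcg {
  uint32_t state;
  explicit Lcg(uint32_t seed) : state(seed) {}
  double nextUnit() {  // roughly uniform in [0, 1)
    state = (uint32_t)(state * 1664525UL + 1013904223UL);
    return state / 4294967296.0;
  }
};

// Monte Carlo estimate of the RMS of a unit-amplitude sine: sample the
// waveform at random phases and average the squares.
double monteCarloRms(uint32_t sampleCount, uint32_t seed) {
  const double kPi = 3.14159265358979323846;
  Lcg rng(seed);
  double sumSquares = 0.0;
  for (uint32_t i = 0; i < sampleCount; ++i) {
    double v = sin(2.0 * kPi * rng.nextUnit());
    sumSquares += v * v;
  }
  return sqrt(sumSquares / sampleCount);
}
```

With a large sample count the estimate should land close to 0.707, but the error only shrinks with the square root of the number of samples, so for the same sample budget evenly spaced sampling of a periodic waveform will usually do better.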

Robin wrote: "I do hope that someone else will be able to supply an equivalent trace at some stage"

Whilst not exactly equivalent, I posted some current traces (recorded by my emonTX at 47 samples per cycle) here.

By the way, my empirical experiments with 17 samples-per-cycle versus 45 samples-per-cycle are leading me to believe that rapid sampling is a nice-to-have feature but may not actually improve accuracy much.  But I was only measuring current (not voltage) so perhaps larger effects will be seen if sampling both voltage and current.  My results are here.

### Re: Integer Maths is Magic!

If you are measuring a wave with high frequency content (a triac-chopped wave for example) then a high sample rate is essential (Nyquist again). If you're not and you have a pure sine wave, then only two samples per cycle are enough - Nyquist yet again.

Coming back to earth, there is less and less energy in the higher harmonics, so as Jack suggests, it makes little difference with a "normal" sort of wave shape. If you are taking 14 samples per cycle, you should read the 7th harmonic (350 Hz) accurately. If you are taking 46 samples, you should read the 23rd harmonic (1150 Hz).
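The harmonic figures above follow from halving the number of samples per cycle. A small sketch of that arithmetic, with made-up function names and a 50 Hz fundamental assumed:

```cpp
#include <stdint.h>

// Highest harmonic of the fundamental that lies at or below the Nyquist
// limit (half the sample rate) for a given number of samples per cycle.
uint16_t highestHarmonic(uint16_t samplesPerCycle) {
  return samplesPerCycle / 2;
}

// Frequency in Hz of a given harmonic of the mains fundamental.
uint32_t harmonicHz(uint16_t harmonic, uint16_t mainsHz) {
  return (uint32_t)harmonic * mainsHz;
}
```

`highestHarmonic(14)` is 7 (350 Hz at 50 Hz mains) and `highestHarmonic(46)` is 23 (1150 Hz). A component sitting exactly at half the sample rate is a borderline case, so these are best read as upper bounds rather than guarantees.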

"I'd be a little worried that a software PLL may "jitter" too much, especially given that we're not sampling rapidly enough to actually measure zero-crossings so we have to interpolate to find where the zero crossings should be"

Er... If I understood the PLL code correctly, that isn't true. It actually locks so that a measurement happens on the zero crossing - it uses any error measured there to adjust the frequency to maintain lock.

I've no idea what the short-term jitter of the mains is, but if you realise that every generator on the system (both those actively generating and the spinning reserve) represents a flywheel weighing many tons, it can't be all that fast.

### Re: Integer Maths is Magic!

Quote: I've no idea what the short-term jitter of the mains is, but if you realise that every generator on the system (both those actively generating and the spinning reserve) represents a flywheel weighing many tons, it can't be all that fast.

This is the point! The mains fundamental frequency is amazingly stable, but higher frequency content can lead to severe zero crossing distortions which can bring a PLL out of synch or introduce a fixed or varying phase shift (depending on the nature of the distortions).

Quote: If .... you have a pure sine wave, then only two samples per cycle are enough

Enough for what? They are enough to 'recognize' this single frequency in the resulting data. They are not enough to sample the waveform (what we need for real power measurement).

### Re: Integer Maths is Magic!

Jack, I have a Bosch hot-air gun which operates just like your hair-drier.  At low-power, they each only use half of the waveform; it's cheap, and effective for the manufacturer.  As RW has pointed out on your hardware thread, the CT is always adjusting itself so that the DC content of its output waveform moves towards zero.  This process happens all the time, but is less apparent when the waveform is sinusoidal.

It's for precisely this reason (the floating nature of our V and I sensors) that I decided to reinstate independent filters for each of them.  Otherwise, when the current waveform changes shape, as it does with our half-power appliances, we'd be subtracting an inappropriate level of DC offset.  This would cause the calculated power value to be incorrectly inflated.  Warning - micropence alert!!

### Re: Integer Maths is Magic!

Not sure whether this is the right place to post this one as it's not actually to do with integer maths.  It is however to do with analogRead() which was discussed earlier, and the right people seem to be on board, so here goes ...

Here's a couple of sketches to show how analogRead() and the equivalent low-level method behave:

AnRead_speedCheck does 1000 pairs of analogRead() statements and then displays the amount of time taken.  I always see a time of 223 ms, which is just over 89 loops per mains cycle period.

ADC_speedCheck does the same workload but using low-level ADC instructions.  As expected, the time taken is exactly the same.  By use of the delay_microseconds variable, delay can be inserted while ADC conversions are underway.  My rig can accept anything up to 100 µs without any change to the results.  Above 100 µs, the overall time taken starts to become extended.

This sketch also has a flag which is 'cleared' if there is spare time to do so.  If it remains 'set' when the ADC operation has finished, a counter is incremented by one.  This mechanism provides another way of quantifying the amount of usable time that's available per analogRead() instruction.

Not sure whether this material will be much use to anyone, but it seemed like a good idea anyway.  This kind of flag could be a useful addition for a Mk2a test build.  If the flag ever fails to be cleared, my argument about the regularity of Mk2a's sampling intervals would be somewhat blown out of the water!

(files added 27/2/13, maybe I forgot to attach them earlier)