Normalized Power

What’s the best method to calculate normalized power?

Thesis: There probably isn’t one, but the PeaksWare method certainly isn’t the best.

Background: The objective of a normalized power metric is to report a value for relative ‘difficulty in output’ in executing an effort with a certain power profile. The specific desire is that this value be representative of the metabolic and cardiovascular challenge of the effort; it is not generally used to assess neuromuscular, strength, or anaerobic difficulty. There have been additional metrics built off of Normalized Power, most notably V.I. for assessing whether or not an effort was well paced. I want to be specific that the criticism of NP algorithms that follows is an evaluation against the above-stated goal only, not of all the various interpretations and rules of thumb based on it.

Normalized Power herein refers specifically to Normalized Power™, the most widely spread variant of this algorithm used by cyclists, trademarked in 2013 by PeaksWare LLC, the owners of TrainingPeaks and WKO software.

The PeaksWare method for calculating normalized power applies two pieces of logic, followed by an averaging step. Part 1, the power sequence is smoothed to better represent the load on the cardiovascular system. Heart rate, breathing rate, and glucose consumption change smoothly over many seconds, more seconds than the adenosine triphosphate (ATP) and creatine phosphate (CP) energy systems in the muscles can supply. Part 2, the power sequence is weighted. The logic here is that the difficulty in sustaining the production of power goes up really quickly above threshold because limited reservoirs are being drawn down. Those reservoirs are basically blood sugar and blood oxygen content and, to a lesser degree, muscle and liver glycogen. They are not ATP and CP; those were theoretically addressed by the Part 1 smoothing. Part 3, this weighted sequence is averaged to get a single value.

The PeaksWare method of calculation is to use a 30 second square window moving average to pre-smooth the power sequence and use a 4th power weighting function on the smoothed profile. The TrainingPeaks guys claim that this algorithm combined with these parameters achieves the outcome of representing metabolic and cardiovascular challenge, and that the balance of these parameters is generally correct for everyone and generally correct for all variations of efforts.
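Written out, the calculation described above looks roughly like this. The symbols are my own notation: $P_t$ for the 1 Hz power samples, $N$ for their count, and $\bar{P}_t$ for the smoothed series. The trailing direction of the window is an assumption, and the final fourth root, which brings the result back to watts, is how the method is commonly described rather than something spelled out above.

```latex
\bar{P}_t = \frac{1}{30}\sum_{k=0}^{29} P_{t-k}, \qquad
\mathrm{NP} = \left( \frac{1}{N}\sum_{t=1}^{N} \bar{P}_t^{\,4} \right)^{1/4}
```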

This is the point where everyone’s bullshit alarm should be blinking red and the sirens sounding. The 30 second square window moving average is clearly not correct for everyone. Some people are absolutely murdered by power variations on the order of 10 seconds, some people are not. Early in a base phase of training, going a little “into the red” can be a workout ender and require very long recoveries, whereas in the midst of a good season when primed up for racing, an athlete can go deeper into the red and recover from it faster. That’s a change in anaerobic capacity that comes from training that system. Should the results of a metric that purportedly assesses metabolic and cardiovascular challenge be sensitive to the high variability between athletes in anaerobic capacity? No, the metric should be generally insensitive.

Quick test on the PeaksWare method. 10 minute ride total. 9.5 minutes spent at 50% of FTP with a 30 second sprint at 3x FTP in the middle of it. The average power is 63% of FTP. I would suggest that the normalized power for this effort should be lower than FTP. The athlete didn’t demonstrate the capacity to cardiovascularly do anything above FTP. They showed an OK sprint. What does the PeaksWare method suggest for NP for this effort? 120% of FTP. I generally think that’s wrong.
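For anyone who wants to poke at that quick test, here is a minimal sketch of it. It assumes 1 Hz samples, power expressed as a fraction of FTP, a trailing 30 second window, and a final 4th root; the function name `np_rolling` and the start-up handling are mine, not PeaksWare’s. The result should land close to the 120% quoted above.

```python
def np_rolling(power, window=30, exponent=4.0):
    """Smooth with a trailing moving average, weight, average, take the root."""
    smoothed = []
    running = 0.0
    for i, p in enumerate(power):
        running += p
        if i >= window:
            running -= power[i - window]
        # Until a full window exists, average over the samples seen so far
        # (an assumption about start-up handling).
        smoothed.append(running / min(i + 1, window))
    weighted = sum(s ** exponent for s in smoothed) / len(smoothed)
    return weighted ** (1.0 / exponent)


# Power as a fraction of FTP, one sample per second, 10 minutes total.
ride = [0.5] * 600
ride[285:315] = [3.0] * 30  # 30 s sprint at 3x FTP in the middle

avg_power = sum(ride) / len(ride)
norm_power = np_rolling(ride)
print(f"average power: {avg_power:.3f} x FTP")
print(f"NP (30 s window, 4th power): {norm_power:.2f} x FTP")
```

Changing `window` and `exponent` here is a quick way to see how much a single sprint moves the number.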

Further, it’s my assessment (based on my collected power data alone) that even if the parameters of 30 seconds and 4th power give good results for certain kinds of efforts, the algorithm is more sensitive to them than it should be. If we had a better algorithm for normalized power we wouldn’t need to memorize the laundry list of situations where “normalized power doesn’t apply”. The concept always applies; the algorithm, however, sometimes fails.

Onto a recent example: The 2upTT I competed in last week [link]. The effort is particularly appropriate to inform the discussion as it is both “max effort” and not perfectly even pacing on the short time scale but is relatively well paced on the long scale.

The plot below shows the NP reported (y-axis) for each value of the power averaging parameter (x-axis) and the power weighting (colour-axis). The dashed lines mark the PeaksWare parameters, which give a value of 336.8 W normalized power for that effort. Is it reasonable – yes, it’s a reasonable evaluation of the effort. BUT, I do want to highlight a couple of features on this plot. The output curve for any weighting (colour) is much steeper around 30 seconds than it is around 60-90 seconds. The meaning is that for *some* efforts the PeaksWare NP algorithm, in the region where it is used, is quite sensitive to the parameters of the algorithm.

Power Normalization

Followup plot to the above. I put the colour-axis from the above plot on the Y-axis (didn’t adjust colours) and then highlighted all of the combinations of exponent and an averaging window that give the same result as the PeaksWare parameters. Basically, there’s a tradeoff, the more power smoothing you apply to the raw data, the bigger the weighting exponent you need to boost the value for NP up. The less smoothing you apply, the less you need to boost the weighting of the hard efforts. The argument for these parameters cannot be made from one effort, and I am not making one based on the 2upTT being analyzed. Just showing that you can get the same answer many different ways.

Parameter_Map

The question of “why the 4th power” is a glaring one. That rate of scaling is a red flag for me. It may be the most appropriate parameter to put into an algorithm with flawed logic to yield correct results. That doesn’t mean it is a good solution to the overall objective. Let’s assess for a moment what a 30 sec 3x FTP sprint thrown into the effort should mean for normalized power. The PeaksWare algorithm is going to weight a portion of that effort as 81 times as demanding as continuing to ride at FTP. Considering 3x the power was transferred, the effort is really weighted “up” by 27 times. Does the body really respond cardiovascularly by such an enormous factor? My experience is no. Substrate consumption efficiency is measured as degraded in the lab when you draw down CP, but it’s not a factor of 27. A factor of more like ~4-5 seems more appropriate. That parameter doesn’t get the “correct” result in the PeaksWare algorithm, but it could mean that the algorithm and parameter are co-broken and compensating for one another.
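The arithmetic behind those two factors, with $r$ as the ratio of sprint power to FTP and $w$ as the weighting exponent (my notation, and ignoring the smoothing, which only lets the very end of a 30 second sprint reach the full 3x):

```latex
\text{weight per second} = r^{w} = 3^{4} = 81, \qquad
\text{weight per unit of work} = \frac{r^{w}}{r} = r^{\,w-1} = 3^{3} = 27
```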

Final critique: PeaksWare provides no satisfactory analog for instantaneous NP. Such a concept shouldn’t be impossible. As they’re not in the business of providing ANT+ scraping and display to head units (like Garmin for example) they have largely evaded this shortfall. If you ride along with some power variability, it is not logically impossible to assess what the instantaneous draw on the cardiovascular/metabolic systems in your body is/are. Instead of spouting that “instantaneous NP has no meaning”, it’s more appropriate to make your NP algorithm provide the meaning that is logically connected to the concept.


Now, it’s easier to critique than to provide solutions… and I am sure to be critiqued for the above because people love to get religious about their power numbers. So, I’ll present an alternative.

I want to draw on first principles for muscle/O2 transport/substrate consumption energy systems. I am going to guess the weighting factor a priori. The argument is that burning anaerobic fuel is done at a discounted efficiency compared with aerobic fuel burn. When I ask muscles to generate power above FTP, I’ll agree that I’m going in debt, but it’s not the 27x debt from an exponent of 4, it’s more like a 4x or 5x debt. Consider that theoretical 3x FTP sprint, which an athlete can generally do with cadence on flat ground (reasonably achieved with the CP system, not a strength/neuromuscularly limited 4-5x FTP sprint that also relies on torque and may only be achievable sprinting uphill). The exponent should then be between log(4×3)/log(3) and log(5×3)/log(3), i.e. between 2.26 and 2.46. If you think you can only sprint at 2.5x FTP, maybe the exponent is 2.5-2.7, but then you’re probably getting old, or you need to work on getting your gainz!
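The exponent range follows from asking which exponent makes a sprint at some multiple of FTP cost 4x or 5x as much per unit of work. A small sketch of that calculation, where the helper name `exponent_for_debt` is mine and only there for illustration:

```python
import math


def exponent_for_debt(debt_multiplier, power_ratio):
    """Exponent w such that power_ratio ** w == debt_multiplier * power_ratio."""
    return math.log(debt_multiplier * power_ratio) / math.log(power_ratio)


for ratio in (3.0, 2.5):
    low = exponent_for_debt(4, ratio)
    high = exponent_for_debt(5, ratio)
    print(f"sprint at {ratio}x FTP: exponent between {low:.2f} and {high:.2f}")
```

The same helper lets you plug in your own sprint ratio and your own guess at the debt multiplier.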

Then that debt has the potential to be repaid as you work under FTP as extra O2 and glucose are delivered. How long that takes is basically an assessment of how long it takes you to catch your breath after a sprint. Coach Corey typically wanted to know peak HR and HR 1 minute after cresting Emily Murphy Hill (2min @ ~FTP into 40sec max effort sprint), which was certainly not resting HR, but I was usually back to zone 2 with a coasting/pedaling recovery and sometimes all the way back to zone 1. I don’t really have any other assessment for how long it takes to catch my breath except for that one example. It doesn’t matter so much, whether the HR or breathing rate is back down, but those are the simple markers that your body is generally not still trying to “catch up” from an anaerobic effort for much more than a minute after the fact.

InstNormPower

OK, so one simple proposal for power normalization is that you would weight power numbers with an exponent (w) and then take an exponentially weighted moving average with a time constant T. The weighting is done first, representing the effect of the instantaneous cardiovascular efficiency of the effort. The time averaging models the impact to the breathing rate or HR over time. There are assumptions here, but without growing the model to include three parameters I don’t have a bright idea for a solution. The appropriate parameters are guessed to be an exponent of 2.36 and an averaging weight of maybe 1/20 or 1/30 each second. The impulse response with a weighting of 1/20 will have decreased by 80% within the minute, which seems appropriate. With a weighting of 1/30 it will have decreased by 80% before 90 seconds. The summary metric for this normalized power for an effort is simply the average of the instantaneous normalized powers.
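The decay figures can be checked directly. This sketch assumes the usual one-second EWMA update, new = old + weight × (input − old), so an impulse shrinks by a factor of (1 − weight) every second. The function names are mine.

```python
import math


def remaining_fraction(weight, seconds):
    """Fraction of an impulse still present after `seconds` one-second updates."""
    return (1.0 - weight) ** seconds


def seconds_to_decay(weight, remaining=0.2):
    """Smallest whole number of seconds until the response falls to `remaining`."""
    return math.ceil(math.log(remaining) / math.log(1.0 - weight))


for weight in (1 / 20, 1 / 30):
    left_60 = remaining_fraction(weight, 60)
    left_90 = remaining_fraction(weight, 90)
    t80 = seconds_to_decay(weight)
    print(f"weight {weight:.4f}: {left_60:.1%} left at 60 s, "
          f"{left_90:.1%} left at 90 s, 80% decayed after {t80} s")
```

Under that update rule both weightings clear the 80% mark comfortably inside the windows quoted above.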

Power Normalization EWMA

To start to analyze this algorithm, let’s map this normalized power metric against the parameter space for the same TT. I am using the denominator from the exponential weighting as a proxy for the square window width from PeaksWare. They aren’t identical but they are analogs, so I am going to plot the same parameter space and use the same axis for NP even though it overflows the top with this version of the algorithm. Increasing the weighting of an anaerobic effort increases the normalized power as expected. The larger weightings give rise to much larger values; the cause is the order of operations, weighting then time averaging vs time averaging then weighting. It is easy to observe from the plot that increasing the time averaging also increases the normalized power. With the PeaksWare method, you don’t get this effect “in general” although in some cases you will. The interpretation is based on the principle of what we modeled. That is: if you believe the impact of going anaerobic takes longer to recover from (longer time constant), you simultaneously believe that the cardiovascular performance requirement to make that effort is a higher benchmark. Interestingly, after 15-20 seconds of weighting, the algorithm becomes less sensitive to this parameter. The plotted values of exponent 2.3 and 2.4 are demarcated on this plot, showing that an NP estimate in the range of the PeaksWare estimate is achieved. It’s actually unnerving how close.

Parameter_Map_EWMA

Now perhaps most interestingly: what is the instantaneous normalized power profile from the 2upTT? I am plotting here with a ^2.36 weighting and 25 second EWMA time constant.

Power_Summary_EWMA


The trendline points out two really big things. The hardest part of team time trials is the sprint to get on the wheel of the person pulling through. Easy to see that when Will pulls through he is putting me in the hurt box big time, it takes the better part of my effort to recover from those spikes. They are prominent after rotation 1, 2, 3, 5, 7, 8, 9, 11, 12, 15… i.e. most of the swaps, I was most in the hurt-box after getting back on, not when I quit on the front. Interestingly, late in the race, it becomes more prevalent that I am resting when not on the front and going deep when I am on the front. The cause is basically that the speeds are higher due to the downhill and the draft is better.

Quick test on that sprint effort. 10 minute ride total. 9.5 minutes spent at 50% of FTP with a 30 second sprint at 3x FTP in the middle of it. The average power is still 63% of FTP. Instantaneous normalized power peaks at 2.6x FTP which falls appreciably short of 3x FTP. That seems approximately correct to me, maybe a bit high. I had previously suggested that the normalized power for the effort as a whole should be lower than FTP and the result is indeed 72% of FTP. Increasing the EWMA time constant from 25 seconds to 90 would blunt the peak instantaneous NP to only 1.8x FTP, and change the overall effort’s NP by only 1%. This is not wholly surprising; I had previously shown that after 15 sec the algorithm is not terribly sensitive to changes in this parameter.
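Here is a minimal sketch of one reading of the EWMA proposal, run on that same sprint test. Two things are assumptions on my part rather than anything stated above: the EWMA output is brought back to power units with a 1/w root before averaging, and the filter starts already settled at the first sample. The function names are hypothetical. Under that reading the 25 second case gives a peak near 2.6x FTP and an overall value near 0.72x FTP.

```python
def instantaneous_np(power, exponent=2.36, time_constant=25.0):
    """Weight each sample, EWMA the weighted series, then undo the exponent."""
    alpha = 1.0 / time_constant  # per-second EWMA weight
    state = power[0] ** exponent  # assumption: start settled at the first sample
    result = []
    for p in power:
        state += alpha * (p ** exponent - state)
        result.append(state ** (1.0 / exponent))
    return result


def summary_np(power, **kwargs):
    """Summary metric: plain average of the instantaneous values."""
    inst = instantaneous_np(power, **kwargs)
    return sum(inst) / len(inst)


# Power as a fraction of FTP, one sample per second, 10 minutes total.
ride = [0.5] * 600
ride[285:315] = [3.0] * 30  # 30 s sprint at 3x FTP in the middle

for tc in (25.0, 90.0):
    inst = instantaneous_np(ride, time_constant=tc)
    print(f"T = {tc:.0f} s: peak {max(inst):.2f} x FTP, "
          f"overall {summary_np(ride, time_constant=tc):.2f} x FTP")
```

Because the weighting happens before the averaging, a single sprint raises the instantaneous value quickly but contributes much less to the overall number than it does under the rolling-mean method.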

Disadvantages of this algorithm: If you’ve got a really lopsided effort, going kinda hard in one part and really hard in the other part, it’s not going to give you as much “credit” towards an overall normalized power as the PeaksWare strategy would/does. If you think that’s a big disadvantage I’ll propose that you were construing more from your NP numbers than you should have been doing in the first place.


Taft Hill 2-up Team Time Trial

Power summary of the 2Up TT last night with Will.

I think both of us were pretty cooked at the end of Worlds the night before (I did 55 minutes solo to finish) but we opted to go for it anyways. We found out at the start line that there was actually going to be a race against Jim and Mike, we had previously been assuming it was essentially just a race against the clock and the best time of the series, previously set by Will and Tim 2 weeks earlier.

Power_Summary


First off, a power summary. Race time was 37min15sec. Average power was 327 Watts. With a 5% fatigue curve assumption, I should be able to do 103% of FTP… and well… that really didn’t happen.

Pacing was typical for not having done a whole heck of a lot of practice. It *felt* like I was getting pretty good recoveries in between some of the very first efforts and so I was pushing myself when on the front. A lack of 2upTT practice meant that I probably wasn’t reading the situation quite correctly. I did mention to Will that he was killing me when he pulled through after a few swaps. I think that comment may have helped to rein in both of our efforts, and generally it was pretty good for the middle two quarters of the race. I had a weak patch from 28-32 minutes, which shows up in the power a bit but is better evidenced on a later image showing I was taking shorter pulls there.

Power_Overlay


Next plot shows I was doing 50-100 more watts on the front than when in the draft. The two lines come together on the most climbey portion of the course, where Will led over the climbs and I pulled on the front on the flats between. I was OK with this; I had climbed better than Will the day prior so I should probably let him set the pace when the grade was up.

Summary of this plot is 356 Watts average while on the front, 297 Watts average while in the draft.

Effort of Pulls

Next plot breaks up the efforts, showing the discrepancy between front and back position was greater on the way home. Aerodynamics playing a larger role when the grade was slightly downhill and the speeds were higher. Evidenced also by Jim and Mike and all their aero gear helping to pull out a few seconds from us on that portion of the course. The lower plot shows there’s a much higher power variability when on the back, trying to stay in the sweet-spot of the draft.

Length of Pulls

Next plot shows who was taking longer efforts on the front. Generally we were pretty fair, I did a couple longer ones early, I think partly to help calm the pace. I then was only taking short efforts on the front late in the game when my legs started to struggle. For Will to take 4 efforts in a row during the closing stages that were 15+ seconds longer than mine was really really solid. That was the only thing that kept us close to the Great Divide boys.

Sumtotal – Hickey on the front for 22 more seconds than me.

Cadence


Final plot is cadence. I was in the money zone. Despite having sore legs I did have enough focus to execute correctly and keep the cadence up. When the muscles are blown out it’s even more important to shift as much stress as possible to the cardiovascular system and running 95+ rpm is key for me there. You can see the two climbs in the first half where I stood out of the saddle when Will was on the front, even there I was still doing pretty good cadence to try and keep the stress on the cardio.

(19 seconds coasting excluded from the trend lines)


The Motherlode

‘The Motherlode’ is the name given to the 210 mile course variant at the Gold Rush Gravel Grinder. The event leaves from Spearfish, South Dakota and heads up and into the Black Hills. Last year I raced the 110 mile ‘Gold Rush’ course and decided to go for the big one this time around, more sightseeing opportunities… or something like that. There is also a ‘Gold Dust’ 70 mile option on offer.


The “Grand” Loop

Pre-dawn rollout.

A photo posted by Joshua Krabbe (@jdkrabbe) on

Two miles high at 9am. #WYMTM



the map
the profile


Gold Rush Gravel Grinder – 2016

Tim and I drove up to Spearfish Friday for Gold Rush and camped with James, Keith and David in the municipal campground about 400 yards from the start line. Also there were Cindy and Darcy and Darrel from First City who I think also both camped and Kurt (rides for another team in Ft.Collins) and his wife Jody who offered to be our emergency support in case of crisis. If you’ve been plagued by the DK200 vs Nick Frey clickbait circulating on the web in the last week you know that having someone’s cellphone number who you can call for the bailout if needed is a critical aspect to racing gravel.

I was feeling like I had achieved some sense of race fitness in the past couple weeks after a long spring of feeling like it was coming along but not quite there yet. Last weekend out in Steamboat the race plans had been scuppered and I ended up putting in a 9 hour training day… maybe that was good because it made sure I didn’t really need to drain the batteries completely racing that entire distance. Who knows if I would’ve been able to recharge them in time for this weekend. It did however, largely erase any fears I had about racing for 110 miles on gravel. I had a good workout Wednesday at Worlds and was feeling pretty confident that I could ride with the lead group and see how things would play out late in the race. General strategy was to follow anything that seemed reasonable and not follow anything that seemed unreasonable because it was going to be too hot to really recover and keep racing if you dug too deep early on. (That battery comment was a metaphor – in case you’re unsure about motors in the peloton)

A lead group of a dozen formed on the first short steep hill and two climbers gapped the main pack on an early roller a few miles later. I sensed quite a bit of resignation from the pack as the duo included last year’s winner and I ended up doing more than my share of towing. I went solo just before the crosswind ended not super keen to do work for everyone else all day long. It took about ten miles before two more bridged up to me and then as a trio we soon caught the leaders making five rolling into Aid 1 at mile 32.

Temperatures were probably already above 85 by this point and I had finished two bottles. I drank two more standing at the aid station and left with a belly sloshing with fluids and two more full bottles in addition to my CamelBak which I hadn’t started yet. Extra time to drink and fill those two bottles meant I was detached from the front group but with forty miles to the next aid it seemed like the right thing. I couldn’t make the catch with a strong effort leaving the aid station. I imagined it was the dude from Lincoln NB whom I’d raced in Scottsbluff three weeks prior who was rallying the troops to make time on me, obviously the world was out to get me, heat stroke and conspiracy theories go hand in hand right? I had some ups and downs through the next bit, flatting partially (I was running Stan’s and lost pressure down to about 15 psi) and I was struggling to eat anything in the heat. I think I chewed a bit of a Luna bar for more than a mile at one point before managing to swallow part of it and spitting the rest out. I reminded myself that 5th place was a place worth racing for and soon after I patched all the cracks in my mental-game I was rewarded by seeing the first of the leaders come back to me on the road. I would pass him and another before draining all of my fluids 5 miles prior to arriving at the mile 68 aid station. Luckily I had a drop-bag with 98 oz water and a Pepsi waiting for me at the aid where I arrived about 90 seconds after the lead pair had departed. I borrowed a floor pump from support crew Jody and ID’d that the tyre needed a tube added, which requires disassembling the tubeless stem, not easy to do in a hurry. It would have been nice if I could’ve just added air, oh well. This was much better than doing it roadside.

I filled two bottles, and downed the remainder which retrospectively is something like a 58 oz chug. I rolled out with a distended stomach from all the fluids about 10 minutes off the lead but with renewed hope and hydration. I was looking forward to racing the remainder of the race with gravity on my side and now that my head was checked back in I was really enjoying things. I threw caution to the wind on the rifleridge descent (Strava KOM to prove it) and closed to within 7 minutes. I turned myself inside out on the beast of a climb to the cement ridge at mile 85 where the temperature is purported to have been 103 degrees and made a 45 second stop at the final aid downing two bottles and filling two more, holding the gap to only 7 minutes at the top. I rolled hard on the descent but didn’t make much time on the lone leader finishing second by about 6 minutes in the end. Total race time a shade over 6 and a half hours.

Find it – Grind it. #RoadsLikeThese #PedalPower



Steamboat Springs – Rapha Prestige

This is where ma @boobicycles is at home! #RaphaPrestige


Wildflowers and pedal-bikes. Only 53 miles to go! #RoadsLikeThese #RaphaPrestige


Both all-women teams kicked our butts. Actually almost every team kicked our butts. #RaphaPrestige


@run_no3 jamming up some gravel rollers on his @BooBicycles RS-X. Starting to get hot out here.


@nateprewitt surveys the final 25 miles of #RaphaPrestige


We made it and no one is dead! 230ish dirty kms and 3000ish m climbing. #RaphaPrestige


Boo Bicycles loaned Tim and me some demo bikes for the week leading up to and including this ride. Super generous of them, super fun to ride. I thoroughly, thoroughly enjoyed riding mine and am now plotting a way to save enough nickels and dimes to buy one of my own. Time will tell.


Robidoux Quick’n'Dirty

Met a cool dude Todd from Kearney who needed to make sure I gave @your_group_ride his best.



Rattlesnake Rally

A photo posted by Joshua Krabbe (@jdkrabbe) on


Results:

results


One Speed Open

I was riding fixed (48×17) and managed the little technical bits alright but really needed to slow down.



Stagecoach Classic 2016

Rooftop Cartagena

I returned to Winter Park this past weekend to take in the 3rd annual Stagecoach Classic. I had attended the 2nd edition after having missed out on the inaugural event because I was in Cartagena with Jason in January 2014 and too busy drinking Club Colombia, sitting beside rooftop pools, and riding bikes up and down giant mountain passes to think about zipping around the nordic trails.

Notable discussion at the finish line about how/why the waxing was apparently slower this time around compared to last year even though the conditions should have been faster with the warm conditions. The tracks were perhaps a little softer than the previous year, but not so soft that they were breaking down as a result of hard kicking. Following the weekend I crunched the numbers to compare finishing times of people who competed both years. On average people were about 3 minutes slower than last year with 31/44 of those competing both years being slower than last.

Histogram
Also calculated as a percentage of finish time.

The wax probably wasn’t really the contributing factor. Upon review of my GPS data there were some course modifications, in particular one notable one that made the course longer. Mystery seemingly solved. I felt notably better on the long steep climb at km 8 than I did the previous year and felt notably worse in the double-poling sections than I did last year… probably due to having lived at 5000 ft long enough to have made some notable adaptations and also due to weak-ass abs. I served myself some significant double-poling intervals the next day to inflict some damage. Some core-improvement before the Birkie would really be nice, it’s too late to make a huge difference but probably worth trying.

Main Difference
The field near km ~17. Red line shows the longer 2016 route.

Finish
Finish descent.
Altered due to new road I think.
Km8
After Aid #1.
Different but not longer.

Overall I’m a fan of the changes, made the course closer to 30kms, it was still a bit short. There are some simple loops that could be added around km 7-8 that could add another bit of distance to the front half of the race as well without dramatically altering anything or making things too much more difficult, just padding in some extra distance.
