Pain pressure threshold reliability of trained and untrained raters
Lance Ranek, SPT, Shaina Lonneman, SPT, and Hailey Lucht, SPT
20 Comments on “Pain pressure threshold reliability of trained and untrained raters”
What are your thoughts on why the untrained tester had the most reliable test-retest measurements? Even though one of the testers was untrained, do you think that repeating the test for 39 subjects benefited the untrained tester?
Great question, Colton. As the untrained rater myself, I personally believe that after 39 subjects with several measurements each, I unintentionally became comfortable applying pressure, even if it may not have been at a constant 5 N/s. However, we did not reassess whether the trained testers continued to apply a constant 5 N/s throughout the study, which perhaps should have happened once a week during the study period. That said, the untrained tester yielded measurements that are considered to have good reliability, which suggests that training to apply 5 N/s may not be needed. Lastly, further analysis could be conducted to determine during which period (beginning, middle, or end) tester three had the most reliable measurements, compared with the overall reliability for the entire study, to show whether unintentional training occurred over time. Hope this answers your question!
Hi Colton,
Great question. We were definitely not expecting the untrained rater to have the most reliable test-retest measurements. Although we cannot say for certain why these results occurred, a few factors may have led to this. First, even though the untrained tester did not take part in the training process, they were aware of the training that the trained testers went through and of the desired rate of 5 N/s. Second, we did not limit conversation between applications of the algometer, so a participant's level of awareness may have differed depending on the conversations the participant had with each tester. As you suggest, taking part in the research project could have led to the untrained tester becoming “trained” over the course of the study; it would be interesting to look at the data in a different way to determine whether the untrained tester's reliability increased as the project went on. Additionally, the trained testers were only tested right after the training took place, so we are unable to say whether they consistently applied 5 N/s throughout the entire study. Thanks for your question!
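The block-wise check suggested in these replies (reliability at the beginning, middle, and end of the study) could be sketched roughly as follows. This is only an illustration, not the analysis used in the study: the choice of ICC(3,1) (two-way mixed, consistency, single measurement), the function names `icc_3_1` and `icc_by_block`, and the idea of splitting subjects into three consecutive blocks by enrollment order are all assumptions made for the example.

```python
from statistics import mean

def icc_3_1(scores):
    """ICC(3,1): two-way mixed, consistency, single measurement.

    scores: one row per subject, each row holding k sessions
    (for example [initial, retest]) for a single rater.
    Choosing this ICC form is an assumption for illustration.
    """
    n = len(scores)
    k = len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]
    # Two-way ANOVA sums of squares: subjects, sessions, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

def icc_by_block(scores, n_blocks=3):
    """Split subjects (in enrollment order) into consecutive blocks
    and return one ICC per block; the last block takes any remainder."""
    size = len(scores) // n_blocks
    results = []
    for b in range(n_blocks):
        start = b * size
        end = (b + 1) * size if b < n_blocks - 1 else len(scores)
        results.append(icc_3_1(scores[start:end]))
    return results
```

If the untrained rater's ICC rose from the first block to the last, that would be consistent with unintentional training over time, although blocks of about 13 subjects each would give wide confidence intervals.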
Would you hypothesize that PPT would change throughout a testing session as each location on a participant was tested 9 times? How would this affect/not affect your results?
Thank you for your question. Yes, this was one of our limitations. There is a chance that sensitization occurred, as nine measurements were taken at each location. We tried to prevent soreness by waiting for a rest period between one tester's measurements and between testers. Additionally, we took the average of each tester's measurements in case there was a large change. Sensitization could have altered our results, with the patient either becoming more sensitive to the change in pressure, resulting in a lower PPT, or developing a higher tolerance before noticing a change, resulting in a higher PPT. This in turn would affect the inter-rater reliability and make the results less reliable between the three of us. We could have analyzed the order in which we tested to see whether a particular testing order changed the results and reliability.
Olivia, this is definitely a factor that we thought about. I agree with Shaina’s response. Based on our literature review, our procedure and rest breaks were in line with previous research. However, it is still likely that some individuals became hyper- or hyposensitive during testing. Thanks for the question.
You mention that a topic of future research could be to determine if other training protocols would yield clinically significant results. Do you have any thoughts as to how the training protocol used could have been more effective? Is there a reason to develop a training protocol if the untrained tester did just as well as, or even better than, the trained testers?
Thanks for your question, Charlotte. The training protocol we used consisted of practicing applying a constant rate of pressure of 5 N/s for ten minutes a day for one week. I personally feel that this protocol could have been more effective if we had regularly checked the trained testers' ability to apply the constant 5 N/s throughout the study. Because this was only checked before subject participation began, it is difficult for us to determine whether the trained testers actually applied this rate consistently during the study. At this point it is difficult to say whether a training protocol is needed to get reliable results, as the results of our study indicate that one may not be. However, it would be beneficial for similar future studies to obtain results consistent with ours to further support that no training is needed to yield reliable results. Hope this answers your question!
Great question, Charlotte. I agree with Hailey: it would be beneficial to make sure that the two trained raters remained “trained” at applying 5 N/s throughout the study, as the study ran over several months. This could be addressed by checking at least once a week what pressure rate the two testers were applying, the same way it was checked during the training phase of the study. This would help ensure that a constant rate of pressure was being applied throughout the study. If this were added to the protocol in future studies and the untrained tester still had good reliability, it would suggest that training is not necessary to gather PPT measurements. Hope this answers your question!
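A weekly check like the one described here could be as simple as fitting a line to a recorded force trace and comparing its slope with the 5 N/s target. The sketch below is a hypothetical illustration: it assumes the algometer can export time and force samples, and the function names and the 0.5 N/s tolerance are invented for the example rather than taken from the study.

```python
def loading_rate(times_s, forces_n):
    """Least-squares slope of force (N) against time (s), in N/s."""
    n = len(times_s)
    t_mean = sum(times_s) / n
    f_mean = sum(forces_n) / n
    num = sum((t - t_mean) * (f - f_mean)
              for t, f in zip(times_s, forces_n))
    den = sum((t - t_mean) ** 2 for t in times_s)
    return num / den

def within_target(rate, target=5.0, tolerance=0.5):
    """True if the measured rate is within tolerance of the target.

    The 0.5 N/s tolerance is a placeholder, not a value from the study.
    """
    return abs(rate - target) <= tolerance
```

Logging this result for each tester every week would show whether the trained raters drifted away from the target rate, and whether the untrained rater drifted toward it.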
It is likely that your test subjects were of varying sizes and body compositions. Do you think there could have been any variability in your results based on the test subjects' muscle size or the amount of body tissue covering the muscle? Did you have a specific method for finding the muscle belly for each site to make sure it was consistent with each patient?
Thanks for your question, Abi. There definitely could have been, and most likely was, variability based on multi-tissue size, composition, orientation, etc. But hopefully this variation was controlled for each individual participant by marking the sites. Testing sites were selected based on palpation; no precise measurements were taken. For our study, we were looking at the reliability among the raters, so we were not too worried about variability between participants; we just needed to keep each individual’s testing location consistent. That is why we marked each testing site and asked the participants to re-mark it as needed until the retest. As long as each rater took PPT measurements at the same location for the same participant, our results should have been reliable, even if different participants’ testing locations were a little different. Hopefully this answers your question.
Great question, Abby. Lance answered the question the same way I would have. The only thing that could have been a factor is when patients forgot to re-mark their testing locations. Further analysis could have been run comparing the participants who remembered to re-mark with the ones who forgot, to see whether the results were still reliable. That said, some participants clearly re-marked their testing location in a different spot. In those cases we would re-palpate and re-mark the location ourselves. We did note which participants remembered to re-mark and which did not.
How did you decide on the parameters of 5 N/s for the application of pressure and the locations of the first dorsal interosseous muscle and tibialis anterior? Based on your study, do you think that future research should continue to use these parameters?
Thanks for the question, Hans. Hailey answered this question how I would have. I think it would be interesting to see how a range of pressure rates (2 N/s, 5 N/s, 10 N/s, etc.) would compare. This would give a better idea of which rate of pressure may be most appropriate. What is appropriate for one body part might not be appropriate for another. There are a lot of factors to consider!
Thanks for your question, Hans. We chose 5 N/s as the desired pressure rate after reading through past research on PPT. Many of the articles in our literature review used 5 N/s as the target rate of pressure application. Although some studies used other rates, we chose this one in hopes of being consistent with the literature and better able to compare our results with current research. As for the testing locations, these two had been used in previous studies, along with a few others. We decided on the first dorsal interosseous and tibialis anterior because these locations are easily accessible and convenient for participants to expose for testing purposes. To answer your second question, it is difficult to say whether future research should continue to use these parameters. The results of our study would indicate that these parameters may not be necessary to yield reliable and consistent results, as our untrained tester had some of the most reliable results. However, multiple factors could have contributed to this, including the training protocol not being as effective as we would have liked, or the untrained tester applying the constant rate of pressure without even knowing it. Hope this helps, and thanks for your question!
Hailey answered this question perfectly. It may be beneficial for future studies to test other areas with untrained versus trained raters to see whether they get similar results. In addition, our study could be replicated to see whether the results would be similar. A replication could improve the training protocol to ensure that 5 N/s was consistently applied throughout the entire length of the study.
Interesting results! In Table 3, any thoughts as to why the Rater 2 initial mean was higher than the retest mean for Site 1 Left? It seems the means for the rest were lower at the initial test time compared to retest time.
That is a great observation! I do not think I ever noticed that. The only explanation I can think of is that the patients were more sensitive to, or more aware of, the sensation change. We did not analyze the order in which Rater 2 took her measurements, since we simply randomized it. That could have been a factor in the mean.
I see there was a random order of trainers for each new subject, but is there a possibility that after the first and second examiner, the patient was more familiar with the test and could associate the length of time it would take to reach that change in sensation instead of focusing on that change in feeling every single time? Could the participant still be familiarizing themselves with what is an actual change in sensation and be producing inaccurate results?
That is an interesting way of thinking about it. It could have been a factor, but the subject usually vocalized the change within 30 seconds. That would be another thing to consider. To answer your second question, the subject could still have been trying to familiarize themselves with the change, but we tried to limit this as much as possible by using multiple trials. We did discuss throwing out the first trial to help prevent this, but instead we took the average of three. The results did not show a large change, as for the most part all had good reliability.