Data Variability
This assignment covers two experimental procedures (Parts A and B). In Part A, data have already been collected
and the aim is to find a urinary analyte that can be used as a marker of urine production rate. Part B estimates
the required number of participants for a clinical trial of a new antihypertensive treatment. Both Part A and
Part B should be written up as scientific reports with an introduction, aims, hypotheses, results, discussion
and conclusions. The methods section only has to describe the data handling (statistical) processes that were
performed.
Part A – Data variability
1- For this exercise, use the accompanying data spreadsheet (Excel file attached) to calculate the
mean and standard deviation of each variable across the group of participants for each day on which they were
monitored. The male participants provided 24 h urine collections on a number of consecutive days. Some data
are missing, so ensure that you have the correct order of days and parameters.
2- For each excreted parameter, calculate the value as an excretion rate per 24 h and compare it with the
urine volume for each day.
3- Plot the average data for each parameter (mean value). Use error bars to denote the SD. Put the urinary
parameter on the Y axis and the urine volume for each day in sequence on the X axis.
4- Comment on the variability in the data for each parameter and the analytes that correlate best with urinary
volume. This activity investigates if there is a suitable urinary analyte that can be used as an estimate of
urinary volume over time (ml/min), without the need to time the collection and measure the volume.
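
The Part A calculations can be sketched in Python as follows. This is only an illustration under stated assumptions, not the required method: the function names and the example numbers are hypothetical (they are not taken from the spreadsheet), missing values are assumed to be represented as None, and the excretion rate assumes that the analyte is reported as a concentration per litre and the urine volume in ml per 24 h.

```python
from math import sqrt
from statistics import mean, stdev


def daily_summary(values_by_day):
    """Return {day: (mean, sd, n)} for one variable, ignoring missing values.

    values_by_day maps a day number to the list of participant values for
    that day; None marks a missing value (an assumption about the data).
    """
    summary = {}
    for day in sorted(values_by_day):
        present = [v for v in values_by_day[day] if v is not None]
        if not present:
            continue  # no data at all for this day
        # The sample SD needs at least two observations.
        sd = stdev(present) if len(present) > 1 else float("nan")
        summary[day] = (mean(present), sd, len(present))
    return summary


def excretion_per_24h(concentration_per_litre, volume_ml_per_24h):
    """Amount excreted in 24 h = concentration x volume (ml converted to litres)."""
    return concentration_per_litre * volume_ml_per_24h / 1000.0


def pearson_r(xs, ys):
    """Pearson correlation coefficient, using only pairs with both values present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    mx = mean(x for x, _ in pairs)
    my = mean(y for _, y in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)


# Hypothetical urine volumes (ml/24 h) for four participants over two days.
example_volumes = {1: [1500, 1800, None, 1200], 2: [1600, 1400, 1700, None]}
volume_summary = daily_summary(example_volumes)
```

The daily means and SDs returned by daily_summary are the values to plot with error bars, and pearson_r applied to the daily mean of an analyte against the daily mean urine volume gives one simple way to rank the analytes by how closely they follow urine volume.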
Part B – Sample Size
1- For ethical and efficiency reasons, it is good practice to determine the minimum number of participants
required for valid statistical analysis. This is called a power calculation and depends on a number of
parameters, including the variation/range in the population, whether the selected cohort is normally
distributed, the desired statistical power of the test (usually at least 80%) and the P-value threshold
accepted as significant (usually P < 0.05).
2- A company wants to test whether its new formulation X is better than the current drug C. Use the following
information to calculate how many participants are required. It is known
from published studies that the reduction in the blood pressure of hypertensive patients can be regarded as
being normally distributed during treatment with both drugs. It is also known that drug C reduces the blood
pressure of hypertensive patients by a mean value of about 10 mm Hg. Previous studies indicate that
formulation X is more potent than drug C and will reduce mean blood pressure by about 15 mm Hg. This is
regarded as a clinically relevant improvement. Moreover, clinical knowledge suggests that the standard
deviation of the reduction in blood pressure with both drugs can be taken as 5 mm Hg. The level of
significance is the probability of obtaining a statistically significant test result, even when there is no real
difference. This is conventionally taken as 2.5% for one-tailed tests. Nevertheless, other values would be
conceivable, depending on the question to be answered. The statistical power is the probability of identifying
a real difference with the statistical test and is often taken as 80%.
3- Use the ClinCalc Sample Size Calculator to determine the number of participants required if the predicted
decrease in blood pressure is 14, 15 or 16 mmHg.
Link: https://clincalc.com/stats/samplesize.aspx
4- Repeat the calculation, this time changing the standard deviation to 4, 5 and 6 mmHg while keeping the
decrease in blood pressure at 15 mmHg.
5- Tabulate your results for each of the predicted outcomes and comment on the influence of the outcome on
the required sample size.
6- Describe what the possible Type I and Type II errors are and how they can be minimised using the above
process.
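
As a cross-check on the calculator results, the sample size for comparing two independent group means can be approximated with the standard normal-approximation formula n per group = 2 (z_alpha + z_beta)^2 SD^2 / (difference in means)^2. The sketch below assumes equal group sizes, a two-sided significance level of 5% (equivalent to 2.5% one-tailed) and 80% power; the function name n_per_group is hypothetical, and the result may differ slightly from the ClinCalc output depending on the formula and rounding that the calculator uses.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(mean_control, mean_treatment, sd, alpha=0.05, power=0.80):
    """Approximate participants needed per group for two independent means.

    Assumes normally distributed outcomes, a common standard deviation,
    equal group sizes and a two-sided test at level alpha.
    """
    delta = mean_treatment - mean_control
    if delta == 0:
        raise ValueError("The two means must differ.")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


# Step 3: vary the predicted decrease with formulation X (drug C = 10 mmHg, SD = 5 mmHg).
by_decrease = {d: n_per_group(10, d, 5) for d in (14, 15, 16)}

# Step 4: vary the standard deviation (decrease with formulation X fixed at 15 mmHg).
by_sd = {sd: n_per_group(10, 15, sd) for sd in (4, 5, 6)}

for label, table in (("decrease (mmHg)", by_decrease), ("SD (mmHg)", by_sd)):
    for key, n in table.items():
        print(f"{label} = {key}: {n} per group, {2 * n} in total")
```

Because the required number scales with the square of SD divided by the difference in means, a smaller expected difference or a larger standard deviation increases the sample size, which is the pattern the tabulated results in step 5 should show.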