A Statistical Analysis of Electricity Costs Data


Suppose that the mean download time for a commercial tax preparation site is 2.0 seconds. Suppose that the download time is normally distributed, with a standard deviation of 0.5 second. What is the probability that a download time is:

(a)  Above 1.8 seconds?

μ=2, σ=0.5

x=1.8 → z=(x-μ)/σ = (1.8-2)/0.5 = -0.4

P(x>1.8) = P(z>-0.4) = 0.6554

Therefore the probability that a download time is above 1.8 seconds is 65.54%.

(b) Between 1.5 and 2.5 seconds?

x=1.5 → z=(x-μ)/σ = (1.5-2)/0.5 = -1

x=2.5 → z=(x-μ)/σ = (2.5-2)/0.5 = 1

P(1.5<x<2.5) = P(-1<z<1) = 0.6827

Therefore the probability that a download time is between 1.5 and 2.5 seconds is 68.27%.
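The table look-ups in parts (a) and (b) can be cross-checked numerically. The following is a minimal sketch using `NormalDist` from the Python standard library; the variable names are illustrative and not part of the original solution.

```python
from statistics import NormalDist

# Download-time model from the problem statement: mean 2.0 s, sd 0.5 s.
download = NormalDist(mu=2.0, sigma=0.5)

# (a) P(X > 1.8) is the complement of the CDF evaluated at 1.8.
p_above_1_8 = 1 - download.cdf(1.8)

# (b) P(1.5 < X < 2.5) is the difference of two CDF values.
p_between = download.cdf(2.5) - download.cdf(1.5)

print(f"P(X > 1.8)       = {p_above_1_8:.4f}")
print(f"P(1.5 < X < 2.5) = {p_between:.4f}")
```

Working with the CDF directly avoids rounding the z-score before the table look-up.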

(c) 99% of the download times are slower (a higher number of seconds taken to download) than how many seconds?

If 99% of the download times are above x, then P(X>x) = 0.99, so x is the 1st percentile: z = -2.326

z=(x-μ)/σ → x = z*σ + μ

x = -2.326*0.5 + 2 = 0.837

Therefore 99% of download times are slower than 0.837 seconds.
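The cutoff in part (c) comes from the inverse normal CDF, and the direction of the tail matters. As a hedged sketch (variable names are illustrative), the snippet below computes both candidates: the 1st percentile is the value that 99% of download times exceed, while the 99th percentile is the value that 99% of download times fall below.

```python
from statistics import NormalDist

# Same model as above: mean 2.0 s, sd 0.5 s.
download = NormalDist(mu=2.0, sigma=0.5)

# Value exceeded by 99% of download times (1st percentile, z = -2.326).
x_exceeded_by_99 = download.inv_cdf(0.01)

# Value that 99% of download times fall below (99th percentile, z = +2.326).
x_above_99 = download.inv_cdf(0.99)

print(f"99% of times are slower than {x_exceeded_by_99:.3f} s")
print(f"99% of times are faster than {x_above_99:.3f} s")
```

Since "slower" means a higher number of seconds, the first quantity is the one the question asks for.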

Part B

Introduction

The data below are electricity costs (in dollars) from fifty randomly selected one-bedroom apartments: 82, 90, 95, 96, 102, 108, 109, 111, 114, 116, 119, 123, 127, 128, 129, 130, 130, 135, 137, 139, 141, 143, 144, 147, 148, 149, 149, 150, 151, 153, 154, 157, 158, 163, 165, 166, 167, 168, 171, 172, 175, 178, 183, 185, 187, 191, 197, 202, 206, 213. This paper presents the characteristics of the data, such as the mean, median, maximum, minimum and standard deviation, and compares the sample's distribution with the theoretical properties of the normal distribution. It also presents a box plot, a histogram and a quantile-quantile normal probability plot, which are used to determine whether the data is normally distributed.

Testing Normality

Box Plot

A box plot indicates the positions of the five-number summary of a data set. The interior of the box represents the interquartile range, the difference between the third and first quartiles, which contains the middle half of the distribution. Whiskers extend to the minimum and maximum values of the data, while the line dividing the interior of the box marks the position of the median.

Table 1: Box Plot Statistics

Statistic    Utility Charge (dollars)
Minimum      82
Q1           127.25
Median       148.5
Q3           167.75
Maximum      213

Figure 1: A box plot to test normality of electricity costs for 50 one-bedroom apartments

Table 1 shows the quartile values of the electricity costs data. From the data, 12 apartments have electricity costs between $127.25 and $148.5, which represents 24% of the data set, and 12 apartments have electricity costs between $148.5 and $167.75, which represents another 24%. Combining the two halves of the box shows that about half (48%) of the apartments' electricity costs fall between the first and third quartiles. Figure 1 suggests that the data is normally distributed, since the median line divides the box almost equally: the difference between the median and Q1 is 21.25, and the difference between Q3 and the median is 19.25, so the two halves of the box differ by only 2 points. The random sample of fifty apartments presents no outliers, and the box is placed roughly centrally between the whiskers, which suggests that the data is not skewed and is approximately normally distributed.
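The five-number summary and the outlier check can be reproduced with a short script. This is a minimal sketch with two assumptions that the text does not state: the quartiles use the "inclusive" interpolation method (which happens to reproduce the values in Table 1), and outliers are flagged with the common 1.5 × IQR fence rule.

```python
from statistics import quantiles

costs = [
    82, 90, 95, 96, 102, 108, 109, 111, 114, 116,
    119, 123, 127, 128, 129, 130, 130, 135, 137, 139,
    141, 143, 144, 147, 148, 149, 149, 150, 151, 153,
    154, 157, 158, 163, 165, 166, 167, 168, 171, 172,
    175, 178, 183, 185, 187, 191, 197, 202, 206, 213,
]

# Quartiles by linear interpolation between order statistics (assumed method).
q1, median, q3 = quantiles(costs, n=4, method="inclusive")
iqr = q3 - q1

# 1.5 * IQR fences: a conventional outlier rule, assumed here.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [c for c in costs if c < lower_fence or c > upper_fence]

print("five-number summary:", min(costs), q1, median, q3, max(costs))
print("fences:", lower_fence, upper_fence, "outliers:", outliers)
```

Under these assumptions the fences are 66.5 and 228.5, so every observation lies inside them.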

Histogram

Table 2: Histogram statistics

Bin      Frequency
82       1
95.1     2
108.2    3
121.3    5
134.4    6
147.5    7
160.6    9
173.7    7
186.8    4
199.9    3
213      2
More     1

Figure 2 shows that the electricity data distribution has a single peak, containing 9 apartments as indicated in Table 2, which represents 18% of the 50 households. The bin containing the peak also contains the median electricity cost of $148.5, while the mean of $147.06 falls in the preceding bin (147.5), as given in the data characteristics table below. Only one apartment has an electricity cost of $82, the minimum value in the data set, while 3 apartments, representing 6% of the data, have electricity costs above $200. No outliers are found in the data, since no individual value lies outside the overall pattern. Bin 186.8 shows a steeper reduction in the number of apartments with electricity costs above the third quartile of $167.75. Hence the bins to the right of the peak are not a mirror image of those to the left. However, save for the sharp drop at bin 186.8, the distribution shows more of the characteristics of symmetry than of right or left skewness.
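The bin frequencies can be recounted from the raw data. The sketch below assumes each bin label in Table 2 is an inclusive upper edge (the usual spreadsheet convention, not stated in the text) and uses the edges exactly as printed. With these edges the maximum value 213 falls in the last bin rather than under "More", so the top two cells may differ from Table 2, possibly because of rounding in the original bin edge.

```python
from bisect import bisect_left

costs = [
    82, 90, 95, 96, 102, 108, 109, 111, 114, 116,
    119, 123, 127, 128, 129, 130, 130, 135, 137, 139,
    141, 143, 144, 147, 148, 149, 149, 150, 151, 153,
    154, 157, 158, 163, 165, 166, 167, 168, 171, 172,
    175, 178, 183, 185, 187, 191, 197, 202, 206, 213,
]

# Upper bin edges as printed in Table 2 (assumed inclusive).
edges = [82, 95.1, 108.2, 121.3, 134.4, 147.5, 160.6, 173.7, 186.8, 199.9, 213]

# One counter per bin, plus a final "More" slot for values above the last edge.
counts = [0] * (len(edges) + 1)
for c in costs:
    # bisect_left returns the index of the first edge that is >= c.
    counts[bisect_left(edges, c)] += 1

# Locate the modal bin among the labelled edges.
peak_index = max(range(len(edges)), key=counts.__getitem__)

for edge, count in zip(edges, counts):
    print(f"{edge:>6}  {count}")
print(f"  More  {counts[-1]}")
print("peak bin:", edges[peak_index])
```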

Comparison of Data Characteristics to Theoretical Properties

Table 3: Data characteristics

Statistic             Value
Mean                  147.06
Standard Error        4.481837269
Median                148.5
Mode                  130
Standard Deviation    31.69137525
Sample Variance       1004.343265
Kurtosis              -0.544163238
Skewness              0.015845641
Range                 131
Minimum               82
Maximum               213
Sum                   7353
Count                 50

 

The mean and median of a normal distribution are equal. As indicated in Table 3 above, the average electricity cost is mean = 147.06, while the central value in the data is median = 148.5. The two measures of central tendency are approximately equal, which indicates that the data is symmetrical and consistent with a normal distribution.
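The entries in Table 3 can be recomputed from the raw data. This is a minimal sketch that assumes the table was produced with the usual spreadsheet definitions: sample standard deviation with an n - 1 denominator, and bias-adjusted sample skewness and excess kurtosis. Those two formulas are an assumption about the tool used, not something the text states.

```python
from math import sqrt
from statistics import mean, median, mode, stdev

costs = [
    82, 90, 95, 96, 102, 108, 109, 111, 114, 116,
    119, 123, 127, 128, 129, 130, 130, 135, 137, 139,
    141, 143, 144, 147, 148, 149, 149, 150, 151, 153,
    154, 157, 158, 163, 165, 166, 167, 168, 171, 172,
    175, 178, 183, 185, 187, 191, 197, 202, 206, 213,
]

n = len(costs)
m = mean(costs)
s = stdev(costs)           # sample standard deviation (n - 1 denominator)
std_error = s / sqrt(n)    # standard error of the mean

# Standardised values used by the skewness and kurtosis formulas.
z = [(c - m) / s for c in costs]

# Bias-adjusted sample skewness (assumed spreadsheet-style definition).
skewness = n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)

# Bias-adjusted sample excess kurtosis (assumed spreadsheet-style definition).
excess_kurtosis = (
    n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
    - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
)

print("mean", m, "median", median(costs), "mode", mode(costs))
print("sd", round(s, 4), "se", round(std_error, 4))
print("skewness", round(skewness, 4), "excess kurtosis", round(excess_kurtosis, 4))
```

Note that the standard error (about 4.48) and the standard deviation (about 31.69) are different quantities: the first describes the uncertainty of the sample mean, the second the spread of the individual costs.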

A normal distribution is defined by its mean, µ, and standard deviation, σ. The electricity costs from the fifty apartments have mean µ = 147.06 and standard deviation σ = 31.69. The interquartile range of a normal distribution can be described by the equation:

IQR ≈ 1.349σ

which would yield an IQR of 42.7504 for this data. Calculating the interquartile range from Table 1 using the formula Q3 - Q1 gives 40.5, which is about 1.28 times the standard deviation and close to the theoretical 1.35. The range of normally distributed data is commonly taken to be about 6 times the standard deviation, which would be 190.1483. The range of our data set is, however, 131.
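The factor relating the interquartile range of a normal distribution to σ is the distance between the 25th and 75th percentiles of the standard normal. The following sketch derives that factor and compares the theoretical and observed spreads; it assumes the "inclusive" quartile method, which reproduces Table 1, and the variable names are illustrative.

```python
from statistics import NormalDist, quantiles, stdev

costs = [
    82, 90, 95, 96, 102, 108, 109, 111, 114, 116,
    119, 123, 127, 128, 129, 130, 130, 135, 137, 139,
    141, 143, 144, 147, 148, 149, 149, 150, 151, 153,
    154, 157, 158, 163, 165, 166, 167, 168, 171, 172,
    175, 178, 183, 185, 187, 191, 197, 202, 206, 213,
]

s = stdev(costs)

# IQR of a standard normal: distance between its 25th and 75th percentiles.
std_normal = NormalDist()
iqr_factor = std_normal.inv_cdf(0.75) - std_normal.inv_cdf(0.25)
theoretical_iqr = iqr_factor * s

# Observed IQR and range from the sample.
q1, _, q3 = quantiles(costs, n=4, method="inclusive")
observed_iqr = q3 - q1
observed_range = max(costs) - min(costs)

print(f"IQR factor      = {iqr_factor:.4f}")
print(f"theoretical IQR = {theoretical_iqr:.2f}, observed IQR = {observed_iqr}")
print(f"observed IQR / sd   = {observed_iqr / s:.3f}")
print(f"observed range / sd = {observed_range / s:.2f} (rule of thumb: 6)")
```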

Normal distributions can also be described by the empirical rule, which states that 68% of the data should lie within one standard deviation of the mean, 95% within two standard deviations of the mean and 99.7% within three standard deviations of the mean[1].

Table 4: Percentage of data within 1, 2 and 3 standard deviations of the mean

                 Lower Limit    Upper Limit    Count    Percentage
Within 1 std.    115.3686247    178.7513753    33       66
Within 2 std.    83.6772495     210.4427505    48       96
Within 3 std.    51.98587425    242.1341258    50       100

Table 4 above presents the percentages of the data within 1, 2 and 3 standard deviations of the mean. The values lying within 1 standard deviation make up 66% of the data, approximately equal to the expected 68%; those within 2 standard deviations make up 96%, approximately equal to the expected 95%; and the values lying within 3 standard deviations make up the whole data set, 100%, which is close to the expected 99.7%.
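The counts in Table 4 follow from a direct tally of the observations inside each interval. The sketch below illustrates this, using the sample mean and sample standard deviation; the dictionary name `within` is illustrative.

```python
from statistics import mean, stdev

costs = [
    82, 90, 95, 96, 102, 108, 109, 111, 114, 116,
    119, 123, 127, 128, 129, 130, 130, 135, 137, 139,
    141, 143, 144, 147, 148, 149, 149, 150, 151, 153,
    154, 157, 158, 163, 165, 166, 167, 168, 171, 172,
    175, 178, 183, 185, 187, 191, 197, 202, 206, 213,
]

m, s = mean(costs), stdev(costs)

# For k = 1, 2, 3: count the values inside [mean - k*sd, mean + k*sd].
within = {}
for k in (1, 2, 3):
    lower, upper = m - k * s, m + k * s
    count = sum(lower <= c <= upper for c in costs)
    within[k] = (count, 100 * count / len(costs))
    print(f"within {k} sd: [{lower:.2f}, {upper:.2f}] -> {count} values ({within[k][1]:.0f}%)")
```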

Skewness is a measure of the lack of symmetry of a distribution, which means that for a symmetrical distribution the skewness should be equal to zero[2]. For the electricity data, the skewness is slightly above zero at 0.015845641, which means that the data has a slightly longer right tail, though not enough to make the distribution asymmetrical. The skewness of the data is approximately zero, which indicates that the data is symmetrical.

Kurtosis, on the other hand, measures the peakedness and tail weight of a distribution curve relative to a normal distribution. It categorises distributions into those with relatively heavy tails (leptokurtic), those with kurtosis similar to that of the normal distribution (mesokurtic) and those with light tails (platykurtic)[3]. The kurtosis of a normal distribution is 3, which becomes zero after subtracting 3 to get the excess kurtosis. Excess kurtosis cannot fall below -2 and has no upper bound. The electricity costs data set has an excess kurtosis of -0.544163238, which indicates that the distribution has a slightly flatter curve than a normal distribution with the same mean and standard deviation. However, this is close enough to zero for the data to be approximated by a normal distribution.

Quantile-Quantile Normal Probability Plot

Figure 3: A Q-Q Normal probability plot to test normality of electricity costs for 50 one-bedroom apartments

Figure 3 shows the individual points for the electricity costs of the 50 apartments. In the middle, the points lie close to the diagonal reference line, and they curve off faintly at the extremes, forming a slight S shape. The upper tail bends slightly to the right of the reference line, while the lower tail bends slightly to the left of it. This suggests that the data set has quantiles comparable to those of a normal distribution with the same mean and standard deviation. However, the short tails evident in the figure suggest that the data exhibits less variability in the extremes than expected in a normal distribution.
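The points of a Q-Q plot pair each ordered observation with the corresponding quantile of a fitted normal distribution. The sketch below builds those pairs without plotting; the plotting positions (i - 0.5)/n are one common convention and an assumption here, since the text does not say which positions were used for Figure 3.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

costs = sorted([
    82, 90, 95, 96, 102, 108, 109, 111, 114, 116,
    119, 123, 127, 128, 129, 130, 130, 135, 137, 139,
    141, 143, 144, 147, 148, 149, 149, 150, 151, 153,
    154, 157, 158, 163, 165, 166, 167, 168, 171, 172,
    175, 178, 183, 185, 187, 191, 197, 202, 206, 213,
])
n = len(costs)

# Normal distribution with the sample's mean and standard deviation.
fitted = NormalDist(mean(costs), stdev(costs))

# Theoretical quantiles at the assumed plotting positions (i - 0.5) / n.
theoretical = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# A correlation near 1 means the points lie close to a straight line.
r = pearson_r(theoretical, costs)
print(f"Q-Q correlation: {r:.4f}")
print(f"smallest: observed {costs[0]}, expected {theoretical[0]:.1f}")
print(f"largest:  observed {costs[-1]}, expected {theoretical[-1]:.1f}")
```

Comparing the smallest and largest observations with their expected quantiles gives a numeric view of the short tails described above: the observed minimum is higher, and the observed maximum lower, than the fitted normal would predict.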

Conclusion

From the data characteristics, which are the most basic indicators of normality, to the graphical representations of the data, the results indicate that the data is symmetrical and approximately normally distributed. However, it is worth noting that the data does not follow a perfect normal distribution, which would require the mean and median to be equal and the skewness and excess kurtosis to be zero, among other characteristics. The data shows small deviations from these properties, small enough for it to be approximated by a normal distribution. In conclusion, the data has a minimum value of $82, a median value of $148.5 and a maximum value of $213. Additionally, the electricity costs are approximately normally distributed with mean µ = 147.06 and standard deviation σ = 31.69.

References

Ramachandran, K.M., and Chris P. Tsokos. 2009. Mathematical Statistics with Applications. Burlington, MA, USA: Elsevier Academic Press.

[1] K.M. Ramachandran and Chris P. Tsokos. 2009. Mathematical Statistics with Applications. Burlington, MA, USA: Elsevier Academic Press.

[2] K.M. Ramachandran and Chris P. Tsokos. 2009. Mathematical Statistics with Applications.

[3] Ibid.

September 25, 2023