STATISTICS FOR ECONOMICS
Unit I
Chapter 1: INTRODUCTION
IMPORTANT CONCEPTS:
1. Meaning of Statistics in plural sense – It is a
collection of numerical facts.
2. Meaning of Statistics in singular sense – It deals with the
collection, presentation, analysis
and interpretation of quantitative information.
3. Definition of statistics in plural sense – It means an
aggregate of facts affected to a marked
extent by a multiplicity of causes, numerically expressed,
enumerated or estimated according
to a reasonable standard of accuracy, collected in a
systematic manner for a predetermined
purpose and placed in relation to each other.
4. Consumer – A person who buys goods and services for the
satisfaction of human wants.
5. Producer – A person who produces goods.
6. Service holder – A person who is working or in a job and
gets paid for it.
7. Service Provider – A person who gives services to others
for a payment.
8. Economic activity – Activities undertaken for monetary
gain or to earn income.
9. Economics is divided into three parts:
a] Consumption b] Production c] Distribution
10. Consumption: In consumption, we study wants, their
origin, nature and characteristics and
the laws governing them.
11. Production: It refers to all activities which are
undertaken to produce goods and services
for generation of income and satisfaction of wants.
12. Distribution: Economic activity which studies how income
generated from the production
process is distributed among the factors of production.
13. Data: Economic facts in terms of numbers.
14. Importance of Statistics:
Statistics is widely used in many fields.
a] Importance to the Government – Statistics is used in
administration and in the efficient
functioning of departments. The government collects data to
fulfil its welfare objectives.
b] Importance of Statistics in Economics:
1] Statistics helps in making economic laws like law of
demand and concept of
elasticity.
2] It helps in understanding and solving economic problems.
3] It helps in studying market structure.
4] It helps in finding mathematical relations between
variables.
1 mark questions:
1. Define statistics in plural form.
2. What is economic activity?
3. Who is a service provider?
4. Who is a consumer?
5. What is meant by production?
3 mark questions (FAQ):
1. Define statistics in singular sense.
2. How is statistics important in Economics?
3. How is statistics important to the Government?
Chapter 2: COLLECTION OF DATA
Points to remember:
1. Collection of data is the first important aspect of
statistical survey.
2. Data – Information which can be expressed in numbers.
3. Two sources of data – Primary and Secondary.
Primary data – data collected by the investigator himself.
Secondary data – data collected by someone else and
used by the investigator.
4. Difference between Primary and Secondary Data
a] Primary data is original data collected by the
investigator while secondary data is
already existing and not original.
b] Primary data is always collected for a specific purpose
while secondary data has
already been collected for some other purpose.
c] Primary data is more expensive to collect, whereas
secondary data is less expensive.
5. Methods / Sources of Collection of Primary Data :
a] Direct Personal Interview – Data is personally collected
by the interviewer.
b] Indirect Oral Investigation – Data is collected from
third parties who have
information about subject of enquiry.
c] Information from correspondents – Data is collected from
agents appointed in the
area of investigation.
d] Mailed questionnaire – Data is collected through
questionnaire [list of questions]
mailed to the informant.
e] Questionnaire filled by enumerators – Data is collected
by trained enumerators who
fill questionnaires.
f] Telephonic interviews – Data is collected by the
interviewer through an interview over
the telephone.
6. Questionnaire – A list of questions with space for answers.
Pilot Survey – A try-out of the questionnaire on a small group to find its shortcomings.
7. Qualities of a good questionnaire :
a] A covering letter with objectives and scope of survey.
b] Minimum number of questions.
c] Avoid personal questions.
d] Questions should be clear and simple.
e] Questions should be logically arranged.
8. Difference between census method and sampling method:
Census Method
1) Every unit of the population is studied
2) More reliable and accurate results
3) Expensive method
4) Suitable when the population is of
heterogeneous nature
Sampling Method
1) Only a few units of the population are studied
2) Less reliable and accurate results
3) Less expensive method
4) Suitable when the population is of
homogeneous nature
9. Personal Interview Method :
Advantages
1) Highest response rate
2) Allows all types of questions
3) Allows clearing doubts regarding questions
Disadvantages
1) Most expensive
2) Informants can be influenced
3) Takes more time
Mailed Questionnaire Method:
Advantages
1) Least expensive
2) Only method to reach remote areas
3) Informants are not influenced by an interviewer
Disadvantages
1) Long response time
2) Cannot be used by illiterates
3) Doubts regarding questions cannot be cleared
Telephonic Interview Method:
Advantages
1) Relatively low cost
2) Relatively high response rate
3) Less influence on informants
Disadvantages
1) Limited use
2) Reactions cannot be watched
3) Respondents can be influenced
Census Method – Data collected from each and
every unit of population.
Sample Method – Data is collected from few
units of the population and result is applied
to the whole group.
Sources of Secondary Data:
1. Published Sources – Government publications (e.g., Census of India),
semi-government publications etc.
2. Unpublished Sources – Data collected by organizations for
their own records and not
published.
Sampling Methods: 1] Random sampling 2] Non-random sampling
1. Random Sampling – It is a sampling method in which all
the items have equal chance of
being selected and the individuals who are selected are just
like the ones who are not
selected.
2. Non-random sampling – It is a sampling method in which
all the items do not have an
equal chance of being selected and judgment of the
investigator plays an important role.
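As a rough illustration of random sampling, the Python sketch below draws a sample in which every unit has an equal chance of being selected. The population of 100 numbered households, the sample size of 10 and the seed are assumptions made only for this example.

```python
import random

# Hypothetical population: 100 households numbered 1 to 100.
population = list(range(1, 101))

# Fixed seed so the draw can be repeated; the value itself is arbitrary.
rng = random.Random(42)

# random.sample selects units without replacement,
# giving every unit the same chance of being chosen.
sample = rng.sample(population, 10)
print(sample)
```

In non-random sampling, by contrast, the investigator would choose the 10 households by judgment rather than by a chance mechanism.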
Types of Statistical errors:
1] Sampling errors 2] Non-sampling errors
Sampling Error: It is the difference between sample value
and actual value of a
characteristic of a population.
Non-sampling errors: Errors that occur at the stage of
collecting data.
Types of non-sampling errors:
a] Errors of measurement due to incorrect response.
b] Errors of non-response of some units of the sample
selected.
c] Sampling bias occurs when sample does not include some
members of the target
population.
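The sampling error defined above can be sketched in Python as the gap between a sample estimate and the true population value. The income figures, the sample size and the seed are hypothetical and serve only to show the calculation.

```python
import random
from statistics import mean

# Hypothetical population: monthly incomes (in thousands) of 10 households.
population = [12, 15, 18, 20, 22, 25, 28, 30, 35, 45]
population_mean = mean(population)   # actual value of the characteristic

rng = random.Random(1)               # arbitrary fixed seed
sample = rng.sample(population, 4)   # random sample of 4 units
sample_mean = mean(sample)           # sample value (the estimate)

# Sampling error = sample value - actual (population) value.
sampling_error = sample_mean - population_mean
print(population_mean, sample_mean, sampling_error)
```

Drawing a larger sample tends to bring the sample mean closer to the population mean, which is why sampling errors can be reduced by increasing the sample size.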
Census of India – It provides complete and continuous
demographic record of population.
National Sample Survey Organization – It conducts national
surveys on socio-economic
issues.
Sarvekshana – Quarterly journal published by NSSO.
1 mark question:
1. What are the main sources of data?
2. Which of the two types of data are collected for a
definite purpose?
3. Which type of data involves less time and is less
expensive?
4. Name 2 sources of errors in data collection.
5. Name 2 agencies at national level that deal with the
collection and tabulation of statistical
data.
6. What is pilot survey?
7. Define sampling error.
8. Name 2 examples of secondary data.
9. Which method is used for estimation of population?
10. Name the journal published by NSSO.
3 mark questions:
1. Which of the following methods give better results and
why ?
a] sample b] census
[Hint: depends on survey objective; census useful when
population size is small]
2. Which of the following errors is more serious and why?
a] Sampling error b] Non sampling error
[Hint: Non sampling errors are more serious as sampling
errors can be minimized by
taking a larger sample]
3. Distinguish between primary data and secondary data.
4 mark questions:
1. Which of the following methods gives better results and
why?
a] Census b] Sample
2. Write four differences between census and sample methods.
3. What are the advantages of mailing questionnaire?
4. Distinguish between random and non random sampling.
6 mark questions:
1. Write 3 advantages and disadvantages each of indirect
oral investigation.
2. Distinguish between:
a] Primary data and Secondary data
b] Census method and Sample method
3. Distinguish between primary data and secondary data.
Which data is more reliable and
why?
4. What do you mean by questionnaire? State five principles
which should be followed while
drafting a good questionnaire.
5. Discuss the method of collecting data through
questionnaires filled by enumerators. Also
give its two merits and two demerits.
Chapter 3: Organization of Data
1. Classification of Data: The process of grouping data
according to their characteristics is
known as classification of data.
2. Objectives of Classification:
a] To simplify complex data
b] To facilitate understanding
c] To facilitate comparison
d] To make analysis and interpretation easy.
e] To arrange and put the data according to their common
characteristics.
3. Statistical Series: Systematic arrangement of statistical
data
I. Can be on the basis of individual units :
The data can be individually presented in two forms:
i] Raw data: Data collected in original form.
ii] Individual Series: The arrangement of raw data
individually. It can be expressed in
two ways.
a] Alphabetical arrangement : Alphabetical order
b] Array: Ascending or descending order.
II. Can be on the basis of Frequency Distribution:
Frequency distribution refers to a table in which observed
values of a variable are
classified according to their numerical magnitude.
1. Discrete Series: A variable is called discrete if the
variable can take only some
particular values.
2. Continuous Series: A variable is called continuous if it
can take any value in a given
range. In constructing continuous series we come across
terms like:
a] Class: Each given interval is called a class, e.g., 0-5,
5-10.
b] Class limit: Each class has two limits, the lower limit and
the upper limit.
c] Class interval: Difference between the upper limit and the
lower limit of a class.
d] Range: Difference between the largest and the smallest value of the variable.
e] Mid-point or Mid Value: (Upper limit + Lower limit)/2
f] Frequency: Number of items [observations] falling within
a particular class.
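The class interval and mid-value of a class can be checked with a short Python sketch. The function names are chosen here for illustration, and the mid-value is taken as half the sum of the two class limits.

```python
def class_interval(lower, upper):
    """Width of a class: upper limit minus lower limit."""
    return upper - lower


def mid_value(lower, upper):
    """Mid-point of a class: (upper limit + lower limit) / 2."""
    return (upper + lower) / 2


classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
for lower, upper in classes:
    print(f"{lower}-{upper}: interval = {class_interval(lower, upper)}, "
          f"mid-value = {mid_value(lower, upper)}")
```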
i] Exclusive Series: The upper limit of each class is excluded
from that class; an item equal to the upper limit
is counted in the next class. E.g.,
Marks 0-10 10-20 20-30 30-40
Number of Students 2 5 2 1
ii] Inclusive Series: Upper class limits of classes are
included in the respective classes.
E.g.,
Marks 0-9 10-19 20-29
Number of Students 2 5 2
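The difference between the two methods shows up at the class boundaries. The Python sketch below is an illustration only: it assumes whole-number marks and classes of width 10 starting at 0, and the function names are hypothetical.

```python
def exclusive_class(value, width=10):
    """Class under the exclusive method: lower <= value < upper."""
    lower = (value // width) * width
    return f"{lower}-{lower + width}"


def inclusive_class(value, width=10):
    """Class under the inclusive method: lower <= value <= upper."""
    lower = (value // width) * width
    return f"{lower}-{lower + width - 1}"


print(exclusive_class(10))  # 10-20, because 10 is excluded from 0-10
print(inclusive_class(19))  # 10-19, because the upper limit 19 is included
```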
Open End Classes: The lower limit of the first class and the
upper limit of the last class are
not given. E.g.,
Marks Below 20 20-30 30-40 40-50 50 and above
Number of Students 7 6 12 5 3
iii] Cumulative Frequency Series: It is obtained by
successively adding the frequencies
of the values of the classes according to a certain law.
a] ‘Less than’ Cumulative Frequency Distribution :
The frequencies of each class interval are added
successively.
b] ‘More than’ Cumulative Frequency Distribution:
The more than cumulative frequency is obtained by finding
the cumulative totals of
frequencies starting from the highest value of the variable
to the lowest value.
E.g.:
Marks No. of students
0-10 2
10-20 5
20-30 10
30-40 12
40-50 17
50-60 4
Less than 10 2
Less than 20 7
Less than 30 17
Less than 40 29
Less than 50 46
Less than 60 50
More than 0 50
More than 10 48
More than 20 43
More than 30 33
More than 40 21
More than 50 4
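The two cumulative columns above can be reproduced with a short Python sketch. It uses the frequencies from the example table; the variable names are chosen here for illustration.

```python
from itertools import accumulate

frequencies = [2, 5, 10, 12, 17, 4]   # classes 0-10, 10-20, ..., 50-60

# 'Less than': running total starting from the first class.
less_than = list(accumulate(frequencies))

# 'More than': running total starting from the last class,
# then reversed back into class order.
more_than = list(accumulate(reversed(frequencies)))[::-1]

print(less_than)   # [2, 7, 17, 29, 46, 50]
print(more_than)   # [50, 48, 43, 33, 21, 4]
```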
1 mark questions :
1. What is meant by classification of data?
2. What is meant by discrete series?
3. What is meant by inclusive series?
3 mark questions:
1. Distinguish between Exclusive series and inclusive
series.
2. Distinguish between discrete series and continuous
series.
4 mark questions:
1. Construct a frequency distribution table for the
following marks of 32 students in the form
of a continuous series according to the exclusive method.
12 33 23 25 18 35 37 49 54 51
37 15 37 15 33 42 45 47 55 69
65 63 46 29 18 37 46 59 29 35
45 27
Chapter 4: PRESENTATION OF DATA
Important terms and concepts:
1. Tabulation – Orderly arrangement of data in rows and
columns.
2. Objectives of Tabulation:
a] Helps in understanding and interpreting the data easily.
b] It helps in comparing data.
c] It saves space and time.
d] Tabulated data can be easily presented in the form of
diagrams and graphs.
3. Main parts of a table.
a] Title of the table – It is a brief explanation of
contents of the table.
b] Table number – It is given to be used for reference.
c] Captions – A word or phrase which explains the content of
a column of a table.
d] Stubs – Stubs explain contents of row of a table.
e] Body of the table: Most important part of table as it
contains data.
f] Head note: Head note is inserted to convey complete
information of title.
g] Source note refers to the source from which information
has been taken.
h] Foot note: It is used for pointing exceptions to the
data.
Types of Table:
1. Simple Table – data are presented according to one
characteristic only.
2. Double Table – data are presented about two interrelated
characteristics of a particular
variable.
3. Three way table – This table gives information regarding
three interrelated characteristics
of a particular variable.
4. Manifold table – This table explains more than three
characteristics of the data.
Diagrammatic Presentation of Data
Utility or uses of diagrammatic presentation:
1. Makes complex data simple.
2. Diagrams are attractive.
3. Diagrams save time when compared to other methods.
4. Diagrams create a lasting impression on the minds of
observers.
Limitations of diagrammatic presentation:
1. They do not provide detailed information.
2. Diagrams can be easily misinterpreted.
3. Diagrams can take much time and labour.
4. Exact measurement is not possible in diagrams.
Kinds of diagrams:
I. Line diagrams – Lines are drawn vertically to show large
number of items.
II. Bar diagram
1. Simple Bar diagrams – These diagrams represent only one
particular type of data.
2. Multiple Bar diagrams – These diagrams represent more
than one type of data at a time.
3. Subdivided Bar diagram or Component Bar diagram – These
diagrams present total values
and parts in a set of a data.
III. Pie diagrams – Circle may be divided into various
sectors representing various
components.
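For a pie diagram, each component is usually converted into a sector angle as (component / total) × 360 degrees. The Python sketch below shows this conversion; the expenditure figures are hypothetical.

```python
# Hypothetical household expenditure by item.
expenditure = {"Food": 50, "Clothing": 20, "Rent": 30}

total = sum(expenditure.values())

# Angle of each sector in degrees: share of the total times 360.
angles = {item: amount / total * 360 for item, amount in expenditure.items()}

for item, angle in angles.items():
    print(f"{item}: {angle:.1f} degrees")
```

The angles of all sectors should add up to 360 degrees, which is a quick check on the arithmetic.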
GRAPHIC PRESENTATION OF DATA
Advantages of Graphic Presentation:
1. Graphs represent complex data in a simple form.
2. Values of median, mode can be found through graphs.
3. Graphs create long lasting effect on people’s mind.
Disadvantages of graphic Presentation
1. Graphs do not show precise values.
2. Only experts can interpret graphs.
3. Graphs may suggest wrong conclusions.
Rules of Constructing graph:
1. The heading of the graph should be simple, clear and self
explanatory.
2. Graphs should always be drawn with reference to some
scale.
3. False baselines should be drawn if the difference between
zero and the smallest value is
high.
4. Index should be made if different lines are drawn as in
time series graphs.
Types of Graphs:
1. Line frequency graphs – Such graphs are used to represent
discrete series.
2. Histogram – A two dimensional diagram whose length shows
frequency and the breadth
shows size of class interval.
Frequency Polygon: A histogram becomes frequency polygon
when a line is drawn
joining midpoints of tops of all rectangles in a histogram.
Frequency Curve: A smooth curve drawn through the points
corresponding to the class frequencies
gives the frequency curve of the data.
Ogive: A curve obtained by plotting cumulative frequency data on
graph paper.
1 mark questions:
1. Give the meaning of tabulation.
2. What is the heading of rows called?
3. When should false base line be used?
4. Which graph can be used to find value of median? [Hint:
ogives]
5. What is histogram?
6. What is double table?
3 mark questions:
1. State three rules of drawing a table.
2. Represent the following data with Histogram
Wages 0-10 10-20 20-30 30-40 40-50 50-60
No. of Workers 5 12 8 30 15 8
3. Construct histogram from the following:
Midpoints 5 15 25 35 45 55
Frequencies 6 12 23 30 16 8
4. Prepare a blank table to show
1] Year : 2004, 2005
2] Faculty : Arts, Science, Commerce
3] Gender : Male, Female
5. Represent the following using pie diagram
Items of Expenditure Amount spent
Food 40
Clothing 20
Fuel and lighting 50
House Rent 70
Miscellaneous 20
6 marks questions :
1. Construct less than and more than ogive :
X 20-40 40-60 60-80 80-100
f 3 7 11 9
2. Draw less than and more than ogive :
Profits 10-20 20-30 30-40 40-50 50-60 60-70
No. of Companies 4 7 10 20 17 2
3. Make histogram and frequency polygon from :
Class 0-20 20-40 40-60 60-80 80-100
Frequency 10 4 6 14 16
4. Distinguish between frequency polygon and frequency curve
through an example.
5. Discuss the difference between simple table and complex
table. Use example.
Chapter 5: Measures of Central Tendency
Important Terms and Concepts:
1. Average: It is a value which is typical or representative
of a set of data.
Averages are also called Measures of Central Tendency.
2. Functions of Average:
i] Presents complex data in a simple form.
ii] Facilitates comparison.
iii] Helps government to form policies.
iv] Useful in Economic analysis.
3. Essentials of a good Average:
i. Simple to calculate.
ii. It should be easy to understand.
iii. Rigidly defined.
iv. Based on all items of observation.
v. Least affected by extreme values.
vi. Capable of further algebraic treatment.
vii. Least affected by sampling fluctuation.
viii. Graphic measurement possible.
4. Types of Averages:
i. Arithmetic Mean
ii. Median
iii. Mode
iv. Quartiles
5. Arithmetic Mean (X̄)
It is the most common measure of central tendency.
It is obtained by dividing the sum of all observations in a
series by the total number of
observations.
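The definition can be sketched in Python for an individual series and for a continuous series. For the continuous series the sketch assumes the usual direct method, in which each class is represented by its mid-value; the individual marks are hypothetical, while the grouped data are the marks of 50 students from the cumulative frequency example in Chapter 3.

```python
def mean_individual(values):
    """Sum of all observations divided by the number of observations."""
    return sum(values) / len(values)


def mean_grouped(classes, frequencies):
    """Direct method: sum of (frequency x mid-value) / total frequency."""
    mids = [(lower + upper) / 2 for lower, upper in classes]
    weighted_sum = sum(f * m for f, m in zip(frequencies, mids))
    return weighted_sum / sum(frequencies)


marks = [10, 20, 30, 40, 50]   # hypothetical individual series
print(mean_individual(marks))  # 30.0

classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
frequencies = [2, 5, 10, 12, 17, 4]
print(mean_grouped(classes, frequencies))  # 1740 / 50 = 34.8
```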
7. Merits of Arithmetic Mean:
1] Easy to calculate
2] Simple to understand
3] Based on all observations
4] Capable of further mathematical calculations.
Demerits :
1] Affected by extreme values.
2] Cannot be calculated in open-end series.
3] Cannot be graphically ascertained.
4] Sometimes gives misleading or absurd results.
Median (M)
It is defined as the middle value of the series, when the
data is arranged in ascending or
descending order.
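For an individual series, the median can be located as in the Python sketch below. When the number of items is even, the sketch takes the average of the two middle items, which is the usual convention but is not stated in these notes; the data are hypothetical.

```python
def median(values):
    """Middle value of the data after arranging it in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of items: the single middle item.
        return ordered[middle]
    # Even number of items: average of the two middle items.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([7, 3, 9, 1, 5]))  # 5
print(median([4, 8, 2, 6]))     # 5.0
```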
Merits
1. Easy to understand and easy to compute.
2. Not unduly affected by extreme observations.
3. It can be located graphically.
4. Appropriate average in case of open end classes.
Demerits:
1. Not based on all observations.
2. It requires arrangement of data.
3. Not capable of further algebraic treatment.
Quartiles:
Quartiles divide the data into four equal parts.
There are three Quartiles – Q1, Q2, Q3
Q2 is called Median.
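A minimal Python sketch for an individual series is given below. It assumes the common textbook rule that the k-th quartile is the size of the k(N+1)/4-th item of the ordered data, with linear interpolation when that position is fractional; other conventions exist, and the data and function names are hypothetical.

```python
def item_at(ordered, position):
    """Value at a 1-based, possibly fractional, position in sorted data."""
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        return ordered[whole - 1]
    # Interpolate between the two neighbouring items.
    return ordered[whole - 1] + fraction * (ordered[whole] - ordered[whole - 1])


def quartiles(values):
    """Return (Q1, Q2, Q3) using the k(N+1)/4-th item rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least 3 observations")
    return tuple(item_at(ordered, k * (n + 1) / 4) for k in (1, 2, 3))


q1, q2, q3 = quartiles([10, 20, 30, 40, 50, 60, 70])
print(q1, q2, q3)  # 20 40 60
```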
Mode (Z)
It is the value which occurs the most frequently in a
series.
Calculation of Mode
Individual Series:
i. By observation, identify the value that occurs most
frequently in the series.
ii. By conversion into a discrete series, then identify
the value which has the
highest frequency.
Discrete Series:
i. By Inspection Method.
ii. Grouping Method: By preparing Grouping Table and then
preparing Analysis table.
Continuous Series:
i. Determination of Modal class by Inspection Method or
Grouping table and Analysis
table.
ii. Applying the formula
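These notes do not state the formula itself. The Python sketch below assumes the standard textbook form, Z = l + (f1 - f0) / (2f1 - f0 - f2) × h, where l is the lower limit of the modal class, f1 its frequency, f0 and f2 the frequencies of the preceding and succeeding classes, and h the class interval. The modal class is found by inspection (highest frequency), and the data are taken from the Chapter 3 example.

```python
def mode_continuous(classes, frequencies):
    """Mode of a continuous series with one clear modal class."""
    m = frequencies.index(max(frequencies))      # modal class by inspection
    lower, upper = classes[m]
    f1 = frequencies[m]
    # A missing neighbour (first or last class) is treated as frequency 0.
    f0 = frequencies[m - 1] if m > 0 else 0
    f2 = frequencies[m + 1] if m < len(frequencies) - 1 else 0
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * (upper - lower)


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
frequencies = [2, 5, 10, 12, 17, 4]
# Modal class is 40-50: Z = 40 + (17 - 12) / (34 - 12 - 4) * 10
print(round(mode_continuous(classes, frequencies), 2))  # 42.78
```

The sketch does not cover the grouping method, which is needed when the highest frequency is not clearly concentrated in one class.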
Merits of Mode
i. It is easy to understand and simple to calculate.
ii. Not affected by extreme values.
iii. Can be located graphically.
iv. Easily calculated in case of open-end classes.
Demerits of Mode
i. Not rigidly defined.
ii. If mode is ill defined, mathematical calculation is
complicated.
iii. Not based on all items.
iv. Not suited to algebraic treatment.
12. Relationship between Mean Median and Mode
i. In case of symmetrical distribution
Mean = Median = Mode
ii. In case of asymmetrical distribution
Mode = 3 Median – 2 Mean
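The empirical relationship can be applied as in the short Python sketch below, which estimates the mode from a given mean and median. The figures are hypothetical, and the relationship is only an approximation for moderately asymmetrical distributions.

```python
def empirical_mode(mean, median):
    """Mode estimated as 3 x Median - 2 x Mean."""
    return 3 * median - 2 * mean


print(empirical_mode(mean=25, median=26))  # 3*26 - 2*25 = 28
print(empirical_mode(mean=30, median=30))  # symmetrical case: 30
```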
1 mark questions:
1. Define an average.
2. Define mode.
3. Age of 5 students is 22, 24, 26, 21, 20. Find the modal
age.
4. What is the relationship of Mean, Median and Mode in an
asymmetrical distribution?
3 marks questions:
1. Calculate the Mean & Median from the following data :
Marks 10-20 20-30 30-40 40-50 50-60 60-70
No. of Students 5 5 5 20 10 5
2. Calculate Mode from the following data.
Marks 0-10 10-20 20-30 30-40 40-50 50-60 60-70
No. of Students 2 5 8 10 8 5 2
4 mark questions:
1. Mention any 2 Merits and Demerits each of Arithmetic
Mean.
2. What are the requisites of a good average?
Chapter 6: Measures of Dispersion
1. Dispersion refers to the variation of the items around an
average.
2. Objectives of Dispersion
a) To determine the reliability of an average.
b) To compare the variability of two or more series.
c) It serves the basis of other statistical measures such as
correlation etc.
d) It serves the basis of statistical quality control.
Properties of good measure of Dispersion
a) It should be easy to understand.
b) Easy to calculate.
c) Rigidly defined
d) Based on all observations.
e) Should not be unduly affected by extreme values.
Measures of Dispersion may be either absolute measures or
relative measure.
Absolute Measures of Dispersion are
a) Range
b) Quartile Deviation
c) Mean Deviation
d) Standard Deviation
Relative Measures of Dispersion are
a) Coefficient of Range
b) Coefficient of Quartile Deviation
c) Coefficient of Mean Deviation
d) Coefficient of Variation
Graphical method of dispersion
Lorenz Curve
Range
It is the difference between the largest and smallest value
of distribution.
Computation of Range
Range = L – S
Coefficient of Range = (L – S)/(L + S)
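Taking L as the largest value and S as the smallest value, the range and its coefficient can be computed as in this Python sketch. The wage figures are hypothetical.

```python
def value_range(values):
    """Range = L - S."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Coefficient of Range = (L - S) / (L + S)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


wages = [20, 35, 50, 65, 80]          # hypothetical daily wages
print(value_range(wages))             # 80 - 20 = 60
print(coefficient_of_range(wages))    # 60 / 100 = 0.6
```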
Merits of Range
1. It is simple to understand and easy to calculate.
2. It is widely used in statistical quality control.
Demerits of Range
1. It is affected by extreme values in the series.
2. It cannot be calculated in case of open end series.
3. It is not based on all items.
Inter quartile range and quartile deviation
Inter quartile range is the difference between Upper
Quartile (Q3) and Lower Quartile Q1.
Quartile deviation is half of inter quartile range.
Computation of Inter quartile range and quartile deviation
Inter quartile Range = Q3 – Q1
Quartile Deviation (Q.D.) = (Q3 – Q1)/2
Coefficient of Q.D. = (Q3 – Q1)/(Q3 + Q1)
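Once Q1 and Q3 are known, the three measures follow directly, as the Python sketch below shows. The quartile values used are hypothetical.

```python
def interquartile_range(q1, q3):
    """Inter quartile range = Q3 - Q1."""
    return q3 - q1


def quartile_deviation(q1, q3):
    """Q.D. = (Q3 - Q1) / 2."""
    return (q3 - q1) / 2


def coefficient_of_qd(q1, q3):
    """Coefficient of Q.D. = (Q3 - Q1) / (Q3 + Q1)."""
    return (q3 - q1) / (q3 + q1)


q1, q3 = 20, 60                      # hypothetical quartiles
print(interquartile_range(q1, q3))   # 40
print(quartile_deviation(q1, q3))    # 20.0
print(coefficient_of_qd(q1, q3))     # 0.5
```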
Merits of Q.D
1. Easy to compute
2. Less affected by extreme values.
3. Can be computed in open ended series.
Demerits of Q.D
1. Not based on all observations
2. It is influenced by change in sample and suffers from
instability.
Mean Deviation
Mean Deviation is defined as the arithmetic average of the
absolute deviations [ignoring signs]
of various items from Mean or Median.
Computation of Mean Deviation
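For an individual series, the definition above amounts to: Mean Deviation = (sum of absolute deviations from the chosen average) / (number of items). The Python sketch below follows that definition; the function name, the `about` parameter and the data are chosen here for illustration.

```python
from statistics import mean, median


def mean_deviation(values, about="mean"):
    """Average of absolute deviations taken from the mean or the median."""
    centre = mean(values) if about == "mean" else median(values)
    # abs() drops the + and - signs of the deviations.
    return sum(abs(x - centre) for x in values) / len(values)


data = [1, 2, 3, 4, 10]                     # hypothetical observations
print(mean_deviation(data, about="mean"))   # mean 4: (3+2+1+0+6)/5 = 2.4
print(mean_deviation(data, about="median")) # median 3: (2+1+0+1+7)/5 = 2.2
```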
Merits of Mean Deviation
1. Based on all observations.
2. It is less affected by extreme values.
3. Simple to understand and easy to calculate.
Demerits of Mean Deviation
1. It ignores ± signs in deviations.
2. It is difficult to compute when deviations come in
fractions.
Standard Deviation:
It is defined as the root mean square deviation.
Features of Standard Deviation:
1. Value of its deviation is taken from Arithmetic Mean.
2. + and – signs of deviations taken from mean are not
ignored.
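"Root mean square deviation" can be read step by step: take deviations from the arithmetic mean, square them (which removes the signs without ignoring them), average the squares, and take the square root. The Python sketch below follows these steps for an individual series, dividing by N; the data are hypothetical. It also shows the coefficient of variation, assuming the usual definition (standard deviation / mean) × 100.

```python
from math import sqrt


def standard_deviation(values):
    """Square root of the mean of squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sqrt(sum(squared_deviations) / n)


def coefficient_of_variation(values):
    """Relative measure: (standard deviation / mean) x 100."""
    mean = sum(values) / len(values)
    return standard_deviation(values) / mean * 100


data = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical observations, mean 5
print(standard_deviation(data))          # sqrt(32 / 8) = 2.0
print(coefficient_of_variation(data))    # 2 / 5 * 100 = 40.0
```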
Merits of Standard Deviation
i. Rigidly defined
ii. Based on