Exploratory Data Analysis (EDA)
Demonstrate the range, summary, mean, variance, median, standard deviation, histogram, box plot, and scatter plot using a population dataset.
Exploratory Data Analysis (EDA) is the process of analyzing and visualizing data to understand it better and glean insight from it. EDA can involve many steps, but a data analyst commonly takes the following:
• Import the data
• Clean the data
• Process the data
• Visualize the data
# ggplot2 is loaded here, although the plots below use base R graphics
library(ggplot2)
# Import the data
india_population <- read.csv("india_population.csv")
# View the data
View(india_population)
print(india_population$Population)
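The listing goes straight from viewing the data to summarizing it, so the cleaning step from the list above is not shown. A minimal sketch of what it might look like, using a small made-up data frame (the `toy` name, the `Year` column, and all values are placeholders, not the contents of india_population.csv):

```r
# Toy data standing in for the real file; values are placeholders only.
toy <- data.frame(
  Year = c(2001, 2002, 2003, 2004, 2005),
  Population = c(100, 110, NA, 130, 140)
)

# Inspect column types and count missing values per column.
str(toy)
colSums(is.na(toy))

# Keep only the rows where Population is present.
toy_clean <- toy[!is.na(toy$Population), ]
nrow(toy_clean)
```

Whether dropping rows is appropriate depends on the data; for a yearly series, a missing year may be better investigated than removed.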
# Process the data
mean(india_population$Population)
Output:
[1] 964957502
summary(india_population$Population)
Output:
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
4.099e+08 6.421e+08 1.010e+09 9.650e+08 1.321e+09 1.380e+09
var(india_population$Population)
Output:
[1] 1.277892e+17
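The range, median, and standard deviation named in the heading are not computed in the listing. A minimal sketch with base R's range(), median(), and sd(), using a made-up vector `pop` rather than the real Population column:

```r
# Placeholder values, not taken from india_population.csv.
pop <- c(100, 110, 120, 130, 140)

range(pop)        # smallest and largest value
diff(range(pop))  # spread between them
median(pop)       # middle value
sd(pop)           # sample standard deviation

# sd() is the square root of var(), so the two always agree.
all.equal(sd(pop), sqrt(var(pop)))
```

With the real data, the same calls would take india_population$Population as their argument.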
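# Visualize the data
# Histogram of the population values
hist(india_population$Population, main = "Indian Population")
# Box plot of the population values
boxplot(india_population$Population, main = "Indian Population")
# Scatter plot of each value against its row index
plot(india_population$Population, main = "Indian Population", col = "red")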
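Calling plot() on a single column draws each value against its row index. If the dataset also has a year column (an assumption here; the column names in india_population.csv are not shown), a scatter plot of population against year could be sketched as follows, again with placeholder values:

```r
# Placeholder data; Year is an assumed column name.
toy <- data.frame(
  Year = c(2001, 2002, 2003, 2004, 2005),
  Population = c(100, 110, 120, 130, 140)
)

# Two-variable scatter plot: Year on the x-axis, Population on the y-axis.
plot(toy$Year, toy$Population,
     main = "Population by Year (toy data)",
     xlab = "Year", ylab = "Population",
     col = "red", pch = 19)
```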