Plotting in R (ggplot2)
First, we load the iris data and look at its first few rows.
data("iris")
head(iris)
attach(iris)
table(Species)
## Species
##     setosa versicolor  virginica 
##         50         50         50

Basic R plotting

# Scatter plot of sepal length against petal length
# The first argument is plotted on the x axis and the second argument on the y axis
plot(Petal.Length, Sepal.Length)

# xlab and ylab set the labels on the x and y axes
# main is the title of the plot and col is the color of the points
# pch sets the point symbol; try values such as 1, 2, 3 and so on to see different shapes and fills
plot(Petal.Length, Sepal.Length, xlab = "Petal Length", ylab = "Sepal Length", col="red", main = "Petal Length vs Sepal Length", pch = 2)

##### Note: a scatter plot like this is a form of bivariate analysis, because it deals with two variables.
##### Analyzing more than two variables is called multivariate analysis, and analyzing a single variable is called univariate analysis, e.g. looking at the mean, count, median, range or a histogram.
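
To make the univariate and bivariate distinction concrete, here is a minimal sketch using base R only. It assumes the built-in iris data; the variable names (sl_mean, pl_sl_cor and so on) are illustrative and not part of the original walkthrough.

```r
data("iris")

# Univariate summaries: one variable at a time
sl_mean   <- mean(iris$Sepal.Length)
sl_median <- median(iris$Sepal.Length)
sl_range  <- range(iris$Sepal.Length)   # c(min, max)

# Bivariate summary: how two variables move together
pl_sl_cor <- cor(iris$Petal.Length, iris$Sepal.Length)

print(c(mean = sl_mean, median = sl_median))
print(sl_range)
print(pl_sl_cor)
```

A correlation close to 1 would agree with the upward trend visible in the scatter plot.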


Let's plot some histograms

hist(Sepal.Width)

#### You can also add a color, x and y labels, and a title
hist(Sepal.Width, xlab = "Sepal Width", ylab = "Frequency", main = "Histogram of Sepal Width", col ="orange")

#### Let's draw some boxplots

With a boxplot we can see the median, the IQR and the outliers.
The best part is that we can see how a continuous variable changes across the levels of a categorical variable (2nd plot).
boxplot(Sepal.Width)

# You can also add a color, labels etc.
# And you can see the difference between the different species
boxplot(Sepal.Width~Species, main = "Boxplot", col = "skyblue")

Here the middle line inside the box denotes the median of the data.
Circles above or below the whiskers denote outliers.
The upper edge of the box is the 75th percentile and the lower edge is the 25th percentile; the distance between them is known as the interquartile range (IQR).
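
The numbers behind a boxplot can also be computed directly. The following is a small sketch using base R functions (quantile, IQR and boxplot.stats); the variable names are illustrative. Note that boxplot() uses hinges, which can differ slightly from quantile() on some datasets.

```r
data("iris")
sw <- iris$Sepal.Width

# 25th percentile, median and 75th percentile (lower edge, middle line, upper edge)
q <- quantile(sw, probs = c(0.25, 0.5, 0.75))

# Interquartile range: 75th percentile minus 25th percentile
iqr <- IQR(sw)

# boxplot.stats() returns the values that boxplot() draws;
# $out holds the points lying more than 1.5 * IQR beyond the box
out <- boxplot.stats(sw)$out

print(q)
print(iqr)
print(sort(out))
```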

Let's move to ggplot2

Here gg stands for grammar of graphics.

ggplot2 takes a layered approach. The first layer sets up the plot area (the data and the axes), and the second layer draws something on it.

First, let's load the library
library(ggplot2)
##### geom_point() can take the color of the points
ggplot(data = iris, aes(y = Sepal.Length, x = Petal.Length)) +geom_point(col = "green")

# We can add more layers to this
# Here geom_point means we want a scatter plot
## The aesthetics (aes) can include the x value, y value, color, shape and size
# If you remove geom_point, only the empty axes are drawn and no data is shown

#---------------------------------one layer------------------------------- + another layer
ggplot(data = iris, aes(y = Sepal.Length, x = Petal.Length, col= Species)) +geom_point()

# We can also use a different point shape for each species
ggplot(data = iris, aes(y = Sepal.Length, x = Petal.Length, shape = Species)) +geom_point()

# Both together
ggplot(data = iris, aes(y = Sepal.Length, x = Petal.Length, col= Species, shape = Species)) +geom_point()
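
The layered approach can be made explicit by storing each step in a variable. This is a minimal sketch, not the only way to build the plot; the variable names are illustrative, and labs() is a ggplot2 function that the walkthrough above does not use.

```r
library(ggplot2)

# Base layer: data and aesthetic mappings only; printing this shows empty axes
base_plot <- ggplot(data = iris,
                    aes(x = Petal.Length, y = Sepal.Length, col = Species))

# Next layer: draw the points on top of the base
scatter <- base_plot + geom_point()

# Further additions: axis labels, a title and a theme
final_plot <- scatter +
  labs(x = "Petal Length", y = "Sepal Length",
       title = "Petal Length vs Sepal Length") +
  theme_bw()

print(final_plot)
```

Keeping the base plot in a variable makes it easy to try different geoms without retyping the data and aesthetics.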

Frequency polygon

# A frequency polygon is like connecting the tops of the histogram bars with a line
ggplot(data = iris, aes(x = Sepal.Width))+geom_freqpoly(col = "blue", bins = 10)

Histogram

ggplot(data = iris, aes(x = Sepal.Width))+geom_histogram(col = "blue", fill = "skyblue")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

# We can also change the number of bins in the histogram
# By default it is 30

ggplot(data = iris, aes(x = Sepal.Width))+geom_histogram(bins = 10, col = "blue", fill = "skyblue")
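
The message printed by geom_histogram suggests binwidth as an alternative to bins. Here is a short sketch of that option; the width of 0.2 is an arbitrary choice for illustration, not a recommended value.

```r
library(ggplot2)

# binwidth = 0.2 means each bar covers 0.2 units of Sepal.Width,
# instead of asking for a fixed number of bins
hist_bw <- ggplot(data = iris, aes(x = Sepal.Width)) +
  geom_histogram(binwidth = 0.2, col = "blue", fill = "skyblue")

print(hist_bw)
```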

library(vcdExtra)
## Loading required package: vcd
## Loading required package: grid
## Loading required package: gnm
data("Arthritis")
head(Arthritis)
ggplot(data = Arthritis, aes(x = Age))+geom_histogram(bins = 10, col = "blue", fill = "skyblue")

# Let's add a new column
Arthritis$Binary_Result <- ifelse(Arthritis$Improved == "None", 0, 1)
head(Arthritis)
Arthritis$Binary_Result <- factor(Arthritis$Binary_Result)

# Here 0 represents not improved and 1 represents improved
ggplot(data = Arthritis, aes(x = Age, fill = Binary_Result))+geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Bar Graph

# We find that this dataset contains more females than males
ggplot(data = Arthritis, aes(x= Sex)) +geom_bar()

# If we want to see the improvement status for both sexes
# We see that the share of improved patients is higher among females
ggplot(data = Arthritis, aes(x= Sex, fill = Binary_Result)) +geom_bar()

# We also want to see the effect of placebo versus treatment on arthritis
# Here we see that the treatment has more effect than the placebo

ggplot(data = Arthritis, aes(x= Treatment, fill = Binary_Result)) +geom_bar()
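
Stacked counts make ratios hard to compare when the groups differ in size. One possible way to compare shares directly is sketched below, assuming the Arthritis data from the vcd package and the same 0/1 recoding used earlier; the variable names are illustrative.

```r
library(ggplot2)
data("Arthritis", package = "vcd")

# Same recoding as earlier: 0 = not improved, 1 = improved
Arthritis$Binary_Result <- factor(ifelse(Arthritis$Improved == "None", 0, 1))

# Row-wise proportions: share of each outcome within each sex
sex_tab  <- table(Arthritis$Sex, Arthritis$Binary_Result)
sex_prop <- prop.table(sex_tab, margin = 1)
print(round(sex_prop, 2))

# position = "fill" rescales every bar to height 1, so shares are comparable
prop_plot <- ggplot(data = Arthritis, aes(x = Sex, fill = Binary_Result)) +
  geom_bar(position = "fill") +
  ylab("Proportion")
print(prop_plot)
```

The same pattern works with Treatment in place of Sex.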

Box plot to see the change of a continuous variable in terms of categorical variables

Let's learn this part with the mtcars dataset

#View(mtcars)

# We see that cars with 4 or 5 gears get more miles per gallon than cars with 3 gears
# Cars with 3 gears are less efficient (in terms of miles per gallon) than the other two groups (cars with 4 and 5 gears)
boxplot(mtcars$mpg~mtcars$gear, main = "Boxplot", col = "skyblue")

# Let's do the same thing with ggplot2, along with another variable
# This is called multivariate analysis (analyzing more than 2 variables)

# We see that cars with 4 gears do not have 8 cylinders
# Cars with 3 gears include only a single 4-cylinder car, and so on
ggplot(data = mtcars, aes(x = factor(gear), y = mpg, fill = factor(cyl) )) + geom_boxplot()
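
Observations read off a grouped boxplot can be checked with a quick cross-tabulation. This is a small base R sketch; the variable names are illustrative.

```r
# Count of cars for every gear / cylinder combination;
# a zero means that combination has no box in the plot
gear_cyl <- table(gear = mtcars$gear, cyl = mtcars$cyl)
print(gear_cyl)

# Median miles per gallon for each number of gears
mpg_by_gear <- tapply(mtcars$mpg, mtcars$gear, median)
print(mpg_by_gear)
```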

geom_smooth plotting

# Let's create a dataset for the geom_smooth example
a <- c(1, 2, 3, 4, 5, 6, 7)
b<-c(10, 22, 34, 55, 60, 65, 79)
new_data <- data.frame(a, b)
head(new_data)
##   a  b
## 1 1 10
## 2 2 22
## 3 3 34
## 4 4 55
## 5 5 60
## 6 6 65
ggplot(data = new_data, aes(x = a, y = b) )+geom_point()

# geom_smooth fits a smooth curve through the points
# The grey band is the confidence interval around the fitted curve (95% by default)

ggplot(data = new_data, aes(x = a, y = b) )+geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

# If we set method = "lm", the curve is fitted with a linear model, so it will be a straight line
ggplot(data = new_data, aes(x = a, y = b) )+geom_smooth(method = "lm")
## `geom_smooth()` using formula 'y ~ x'

# If we want to plot the points and the smooth function together
# You can also change the theme; there are several themes, like theme_bw(), theme_dark() and so on
ggplot(data = new_data, aes(x = a, y = b) )+geom_point()+geom_smooth(method = "lm") +theme_bw()
## `geom_smooth()` using formula 'y ~ x'
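
The straight line drawn with method = "lm" is an ordinary least-squares fit, so its intercept and slope can be inspected with lm(). The sketch below rebuilds the same small dataset so that it runs on its own; se = FALSE is a geom_smooth argument that hides the grey band, and the variable names are illustrative.

```r
library(ggplot2)

new_data <- data.frame(a = c(1, 2, 3, 4, 5, 6, 7),
                       b = c(10, 22, 34, 55, 60, 65, 79))

# Fit b as a linear function of a; this is the line geom_smooth(method = "lm") draws
fit <- lm(b ~ a, data = new_data)
print(coef(fit))   # intercept and slope

# Points plus the fitted line, without the confidence band
line_only <- ggplot(data = new_data, aes(x = a, y = b)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x)
print(line_only)
```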

Thank you for your time. If you have any questions or suggestions, please comment below. :)