---
title: 'Chapter 13: Plotting Regression Interactions'
author: "Callie Silver"
output:
  html_document:
    code_download: yes
    fontsize: 8pt
    highlight: textmate
    number_sections: yes
    theme: cerulean
    toc: yes
    toc_float:
      collapsed: no

---

```{r setup, include=FALSE}

knitr::opts_chunk$set(echo = TRUE)
```

# Basics 
The following chapter will include:

- Basic information about interactions and simple slopes 
- Background to plotting interactions in R 
- A real-life example 
- Code to simulate data set  
- Continuous X Continuous Regression: code and interpretation 
- Nominal X Continuous Regression: code and interpretation 
- Nominal X Nominal Regression: code and translation 

## What is an interaction?
**Interaction:** When the effect of one independent variable differs based on the level or magnitude of another independent variable 

* *y* = A + B + A*B 
    + ***y*** = dependent variable 
    + **A** = independent variable 
    + **B** = independent variable 
    + **A*****B** = interaction between A and B 
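
To make this notation concrete, here is a small sketch of how R expands `A*B` inside a model formula. The toy data frame and the `Demo.` names are made up for illustration and are not part of this chapter's data set.

```{r, message=FALSE, echo=TRUE}
#Hypothetical toy data, for illustration only
Demo.Inter <- data.frame(y = c(1, 3, 2, 5, 4, 6),
                         A = c(1, 2, 3, 4, 5, 6),
                         B = c(0, 1, 0, 1, 0, 1))
#In a formula, A*B is shorthand for A + B + A:B (both main effects plus the interaction)
Demo.Terms <- attr(terms(y ~ A * B), "term.labels")
Demo.Terms
#The interaction column in the design matrix is simply the product of A and B
Demo.Matrix <- model.matrix(y ~ A * B, data = Demo.Inter)
all(Demo.Matrix[, "A:B"] == Demo.Inter$A * Demo.Inter$B)
```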
    
For more information about interactions in regression:      
[Click here](https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=3&ved=0ahUKEwiFrsq38YjTAhUBUCYKHQ9BDeAQFggsMAI&url=https%3A%2F%2Fwww.researchgate.net%2Ffile.PostFileLoader.html%3Fid%3D555e3ef65f7f710c7a8b45a3%26assetKey%3DAS%253A273781539442688%25401442286013206&usg=AFQjCNGAHVtFnQL3Pjp-lfBBmcCbF6vpkg&sig2=Pn2LABHMrV4Oxt0qpcnOgQ&cad=rja) for
Jaccard & Turrisi (2003), *Interaction Effects in Multiple Regression*

## What is a simple slope?

- A simple slope is a regression line at one level of a predictor variable 

- Think of simple slopes as the visualization of an interaction 
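
As a rough sketch of the idea, a simple slope can be computed by hand from the regression coefficients: in a model with predictors x, z, and x\*z, the slope of x at a given level of z is the x coefficient plus the interaction coefficient times that level of z. The noise-free toy data and the `Demo.` names below are assumptions made purely for illustration.

```{r, message=FALSE, echo=TRUE}
#Hypothetical, noise-free toy data: y = 1 + 2*x + 3*z + 0.5*x*z
Demo.Slope <- expand.grid(x = -2:2, z = -2:2)
Demo.Slope$y <- 1 + 2*Demo.Slope$x + 3*Demo.Slope$z + 0.5*Demo.Slope$x*Demo.Slope$z
Demo.Slope.Model <- lm(y ~ x * z, data = Demo.Slope)
Demo.b <- coef(Demo.Slope.Model)
#Simple slope of x at each chosen level of z: b_x + b_xz * z
Demo.z.Levels <- c(-1, 0, 1)
Demo.Simple.Slopes <- unname(Demo.b["x"] + Demo.b["x:z"] * Demo.z.Levels)
Demo.Simple.Slopes
```

Each value is the slope of one regression line in a simple slopes plot, so different values across levels of z are what an interaction looks like numerically.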

**How do we plot these things in R?...** 

## Interaction Plotting Packages 
When running a regression in R, it is likely that you will be interested in interactions. The following packages and functions are good places to start, but the rest of this chapter will teach you how to make custom interaction plots. 

* **lm() function:** your basic regression function that will give you interaction terms 
* **stargazer package, stargazer() function:** pretty summary of regression results 
* **rockchalk package, plotSlopes() function:** quick and basic graph of simple slopes

**Why can't we just use these packages?...** 

## Benefits of Custom Interaction Plots 
#### Using **effects** and **ggplot2**

- Full control over what you're plotting (i.e. x level of x variable)
- More accurate calculations of mean/error/etc.
- Can enter in own values for confidence intervals, standard error bars, etc.
- Can customize every aspect of the graphs (color, size of text, data points)

We go from "quick & dirty" simple slope plots to "pretty & customizable" graphs

## The Packages You Need

If you don't already have these packages installed, use the following functions to do so: 
``` {r, message=FALSE, echo=TRUE}

#install.packages("car") #An extremely useful/in-depth regression package 
#install.packages("stargazer") #Produces easy to read regression results (similar to what you get in SPSS)
#install.packages("effects") #We will use this to create our interactions 
#install.packages("ggplot2") #Our incredibly powerful and versatile graphing package 

```
**One last thing before we get started...**

If the words "interaction" or "linear model" are sounding a little foreign, check out [Chapter 12](http://ademos.people.uic.edu/Chapter12.html) for an awesome regression refresher!!

# Continuous x Continuous Regression
**IQ and Work Ethic as Predictors of GPA**

For all the examples in this chapter, we are actually going to simulate our own data. This eliminates the need for downloading a data set / calling in data. 

## Simulate your data 
  
``` {r, message=FALSE, echo=TRUE}
library(car) #Even though we already installed "car", we have to tell R to load this package before we can use it
#You can choose whatever number you want for the seed; it just makes the random draws reproducible
set.seed(150)
#Let's give our data set 250 participants (n), perhaps college students!
n <- 250
#Normal distribution of work ethic (X) with mean = 2.75 and SD = .75 (roughly a 1-5 scale: 1 = poor work ethic, 5 = great work ethic)
X <- rnorm(n, 2.75, .75)
#We want a normal distribution of IQ (Z)
#I fixed the mean of IQ to 15 so that the regression equation works realistically, SD = 15
Z <- rnorm(n, 15, 15)
#We then create Y using a regression equation (adding a bit of random noise)
Y <- .7*X + .3*Z + 2.5*X*Z + rnorm(n, sd = 5)
#This code rescales Y (GPA) to run from 0 to 4.0 (the logical max for GPA)
Y <- (Y - min(Y)) / (max(Y) - min(Y))*4
#Finally, we put our data together with the data.frame() function
GPA.Data <- data.frame(GPA=Y, Work.Ethic=X, IQ=Z)
```
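The rescaling line in the chunk above is a min-max transformation. The short sketch below shows what it does to a handful of made-up numbers (the `Demo.` values are hypothetical and only for illustration): the smallest value becomes 0, the largest becomes 4, and everything else keeps its relative position in between.

```{r, message=FALSE, echo=TRUE}
#Hypothetical raw scores, for illustration only
Demo.Raw <- c(-10, 0, 5, 30)
#Subtract the minimum, divide by the range, then stretch to the 0-4 GPA scale
Demo.Scaled <- (Demo.Raw - min(Demo.Raw)) / (max(Demo.Raw) - min(Demo.Raw))*4
Demo.Scaled
range(Demo.Scaled)
```

Note that this guarantees at least one simulated participant has a GPA of exactly 0 and one has exactly 4.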

## Center your independent variables 
**Why do we center our variables?**

- To avoid problems of multicollinearity! When a model has multicollinearity, it doesn't know which term to give the variance to ("You gave me 3 lines that are the same!" - angry model)
- When we center our IVs, a value of 0 on each IV represents its mean 
- When you interact X * Z, you are adding a new predictor (XZ) that strongly correlates with X and Z 
- If you center your variables first, the XZ term is far less correlated with X and Z (close to orthogonal), so the model can tell the terms apart
- **Exceptions:** Don't center physical data or when there is a true, meaningful 0 
```{r, message=FALSE, echo=TRUE}
GPA.Data$IQ.C <- scale(GPA.Data$IQ, center = TRUE, scale = FALSE)[,]
GPA.Data$Work.Ethic.C <- scale(GPA.Data$Work.Ethic, center = TRUE, scale = FALSE)[,]
```
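To see why centering helps, here is a small sketch on a made-up, perfectly balanced grid of values (the `Demo.` names and numbers are assumptions for illustration, not the chapter's data). It checks that centering sets the mean to 0, leaves the SD alone, and removes the correlation between a predictor and the product term. With real data the centered correlation will usually be small rather than exactly 0.

```{r, message=FALSE, echo=TRUE}
#Hypothetical balanced grid of two predictors whose means are far from 0
Demo.Center <- expand.grid(X = 11:15, Z = 11:15)
Demo.Center$X.C <- scale(Demo.Center$X, center = TRUE, scale = FALSE)[,]
Demo.Center$Z.C <- scale(Demo.Center$Z, center = TRUE, scale = FALSE)[,]
#Centering moves the mean to 0 but does not change the SD
mean(Demo.Center$X.C)
c(sd(Demo.Center$X), sd(Demo.Center$X.C))
#Correlation between X and the product term, before and after centering
Demo.Cor.Raw <- cor(Demo.Center$X, Demo.Center$X * Demo.Center$Z)
Demo.Cor.Centered <- cor(Demo.Center$X.C, Demo.Center$X.C * Demo.Center$Z.C)
c(Raw = Demo.Cor.Raw, Centered = Demo.Cor.Centered)
```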
## Run your regression models 
Use **lm() function** to run model with and without interaction

- Additive effects = + 
- Multiplicative (interaction) effects = * 

Use **stargazer()** to get a pretty, user-friendly chart of your results

```{r, message=FALSE, echo=TRUE,results='asis'}
GPA.Model.1 <- lm(GPA~IQ.C+Work.Ethic.C, GPA.Data)
GPA.Model.2 <- lm (GPA~IQ.C*Work.Ethic.C, GPA.Data)

library(stargazer)
stargazer(GPA.Model.1, GPA.Model.2,type="html",	
          column.labels = c("Main Effects", "Interaction"),	
          intercept.bottom = FALSE,	
          single.row=FALSE, 	
          notes.append = FALSE,	
          header=FALSE)	
```

## Plot your interaction 
When we are plotting the simple slopes of a continuous IV X continuous IV, we have to specify what levels of each we want to examine. There are 3 methods for choosing levels: **hand picking, quantiles, standard deviation**

For the next 3 methods, we are going to evaluate the centered Work Ethic IV at three values (-1.1, 0, and 1.1, which we will label poor, average, and hard workers), but for the centered IQ IV, we will show 3 different theoretical ways to choose our levels. 

### Plotting simple slopes: Hand Picking

- Hand picking is useful if you have specific predictions in your data set 
- If you are working with IQ, a drug, or age - numbers are relevant and are useful to pick! 
- For our example, let's go with -15, 0, 15 for our centered IQ (1 SD above and below mean) 
- **c()** will give you the exact values and **seq()** will give you a range from a to b, increasing by c 
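
As a quick sketch of the difference between the two functions, the values below are illustrative only and the `Demo.` names are made up for this example:

```{r, message=FALSE, echo=TRUE}
#c() gives you exactly the values you type in
Demo.Picked <- c(-15, 0, 15)
Demo.Picked
#seq() gives you an evenly spaced range: from a, to b, increasing by c
Demo.Seq <- seq(-2.5, 2.5, by = .5)
Demo.Seq
length(Demo.Seq)
```

Either kind of vector can be supplied inside **xlevels** when you build your interaction.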


```{r, message=FALSE, echo=TRUE}
library(effects)
#Run the interaction 
Inter.HandPick <- effect('IQ.C*Work.Ethic.C', GPA.Model.2,
                                              xlevels=list(IQ.C = c(-15, 0, 15),
                                              Work.Ethic.C = c(-1.1, 0, 1.1)),
                                              se=TRUE, confidence.level=.95, typical=mean)

#Put data in data frame 
Inter.HandPick <- as.data.frame(Inter.HandPick)

#Check out what the "head" (first 6 rows) of your data looks like
head(Inter.HandPick)
  
#Create a factor of the IQ variable used in the interaction                   
Inter.HandPick$IQ <- factor(Inter.HandPick$IQ.C,
                      levels=c(-15, 0, 15),
                      labels=c("1 SD Below Population Mean", "Population Mean", "1 SD Above Population Mean"))
                     
#Create a factor of the Work Ethic variable used in the interaction 
Inter.HandPick$Work.Ethic <- factor(Inter.HandPick$Work.Ethic.C,
              levels=c(-1.1, 0, 1.1),
              labels=c("Poor Worker", "Average Worker", "Hard Worker"))

library(ggplot2)                
Plot.HandPick<-ggplot(data=Inter.HandPick, aes(x=Work.Ethic, y=fit, group=IQ))+
      geom_line(size=2, aes(color=IQ))+
      ylim(0,4)+
      ylab("GPA")+
      xlab("Work Ethic")+
      ggtitle("Hand Picked Plot")


Plot.HandPick 
#In R, you have to "call for" your graphs after you make them in order to see them
                
#Code to save plot to your computer 
#ggsave("Plot.HandPick.png", Plot.HandPick, width = 5, height = 5, units = "in")
```
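If you are curious what the **fit** column represents, the sketch below reproduces the general idea in base R: build a grid of every combination of the chosen levels with **expand.grid()**, then ask the model for its predicted value on each row with **predict()**. This is only an illustration on made-up, noise-free data (the `Demo.` names are hypothetical), not a description of how the effects package works internally.

```{r, message=FALSE, echo=TRUE}
#Hypothetical, noise-free toy data: y = 1 + 0.5*x + 0.2*z + 0.3*x*z
Demo.Pred <- expand.grid(x = -2:2, z = -2:2)
Demo.Pred$y <- 1 + 0.5*Demo.Pred$x + 0.2*Demo.Pred$z + 0.3*Demo.Pred$x*Demo.Pred$z
Demo.Pred.Model <- lm(y ~ x * z, data = Demo.Pred)
#Every combination of the levels we want to plot (3 levels of x by 3 levels of z = 9 rows)
Demo.Grid <- expand.grid(x = c(-1, 0, 1), z = c(-2, 0, 2))
#Predicted value of y for each row of the grid
Demo.Grid$fit <- predict(Demo.Pred.Model, newdata = Demo.Grid)
Demo.Grid
```

Each row of the grid corresponds to one point on a simple slopes plot: x goes on the x-axis, fit goes on the y-axis, and each level of z gets its own line.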

#### Interpretation of Hand Picked Plot
This plot here is an example of pretty much the simplest you can get with ggplot. These are the default settings with respect to all aesthetic elements. As we go through this chapter, I will give you bits of code that will help you make your graph prettier, more colorful, or better suited for publishing.

For even more **ggplot** fun, refer to [Chapter 10](http://ademos.people.uic.edu/Chapter10.html) or this awesome [ggplot Cheat Sheet](https://www.rstudio.com/wp-content/uploads/2015/03/ggplot2-cheatsheet.pdf)

In terms of what this graph is telling us, we can visualize the fact that for smart people (1 SD above the population mean, which is not determined by our data set), as their work ethic increases, so does their GPA. A similar pattern is seen for people with average IQs, though the effect is not nearly as strong. For people 1 SD below the population mean on IQ, as their work ethic increases, it appears as though their GPA actually decreases. Interesting! Maybe they get more confused with the material? Who knows! 

### Plotting simple slopes: Quantile 

- Let's use levels that are based on quantiles (bins based on probability)
- You can ask for as many or as few quantiles as you want
- Non-parametric; based on probability and does not assume normality of IV 
- For this example, let's ask for 5 quantiles and have them rounded to 2 decimal places 
```{r, message=FALSE, echo=TRUE}
#Make your new IQ variable that asks for quantiles  
IQ.Quantile <- quantile(GPA.Data$IQ.C, probs=c(0,.25,.50,.75,1))
IQ.Quantile <- round(IQ.Quantile, 2)
IQ.Quantile 

library(effects)
#Run your interaction
Inter.Quantile <- effect('IQ.C*Work.Ethic.C', GPA.Model.2,
                                      xlevels=list(IQ.C = c(-35.44, -9.78, -0.04, 9.89, 41.90),
                                      Work.Ethic.C = c(-1.1, 0, 1.1)),
                                      se=TRUE, confidence.level=.95, typical=mean)
#Put data into data frame
Inter.Quantile <- as.data.frame(Inter.Quantile)

#Create factors of the different variables in your interaction: 

Inter.Quantile$IQ<-factor(Inter.Quantile$IQ.C,
                      levels=c(-35.44, -9.78, -0.04, 9.89, 41.90),
                      labels=c("0%", "25%", "50%", "75%", "100%"))
                     
Inter.Quantile$Work.Ethic<-factor(Inter.Quantile$Work.Ethic.C,
              levels=c(-1.1, 0, 1.1),
              labels=c("Poor Worker", "Average Worker", "Hard Worker"))
```
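One optional shortcut: **quantile()** returns a named vector, so the values and the percentage labels can be pulled out of the result instead of being typed in by hand. The sketch below uses made-up numbers (the `Demo.` names are hypothetical) to show the idea; whether you prefer this over typing the values is a matter of taste.

```{r, message=FALSE, echo=TRUE}
#Hypothetical scores from 0 to 100, for illustration only
Demo.Values <- 0:100
Demo.Quantile <- quantile(Demo.Values, probs = c(0, .25, .50, .75, 1))
#The numbers themselves, and the labels R attaches to them
Demo.Levels <- unname(Demo.Quantile)
Demo.Labels <- names(Demo.Quantile)
Demo.Levels
Demo.Labels
#Both can be handed straight to factor()
Demo.Factor <- factor(c(25, 0, 100), levels = Demo.Levels, labels = Demo.Labels)
Demo.Factor
```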
**FUN WITH FONTS**

**install.packages("extrafont")**

I did not include this package up front, as it is totally optional! 
If you want to play around with different font options, install this package and load it

After installation/loading, you will want to run the following code:
**font_import()**

This code can take a few minutes to run, which is why I have not included it in the coded section of this chapter. 

The last function you will need is: **fonts()** 
You will get a list of all the fonts accessible to you in R

```{r, message=FALSE, echo=TRUE, warning=FALSE}
library(extrafont)
#font_import() # run this line of code once to import your fonts
library(ggplot2) 
Plot.Quantile<-ggplot(data=Inter.Quantile, aes(x=Work.Ethic, y=fit, group=IQ))+
      geom_line(size=2, aes(color=IQ))+
      ylab("GPA")+
      xlab("Work Ethic")+
      scale_color_manual(values=c("#42c5f4","#54f284","#f45dcc",  
                             "#ff9d35","#d7afff"))+ #custom color coding 
      theme_bw()+ #deleting the gray background 
      theme(text = element_text(family="Impact", size=14, color="black"))+ #changing font!
      ggtitle("Quantile Plot") #adding a title! 

Plot.Quantile

```

#### Interpretation of Quantile Plot
So to recap the codes we learned in this plot, we now know how to change fonts, get rid of the gray background, add a title, and choose custom colors! 

You may be wondering where I got these funky letter/number combinations that translate into colors. If you google "html color picker", you can copy the color code of any color your heart desires! Pretty neat!! 

Now, in terms of what we're learning from this graph - we can see the interaction effects a lot more clearly. It seems as though all people in the 25th percentile or higher are experiencing some degree of a positive relationship between work ethic and GPA. As work ethic and IQ increase, so does GPA! Unfortunately, for the group below the 25th percentile, there is a pretty clear negative relationship, indicating that as their work ethic increases, their GPA actually decreases. Not good! 

### Plotting simple slopes: Standard Deviation 

- Lastly, let's choose our levels based on the standard deviation of the data 
- We can select values based on the mean and SD of our data 
- For this example, we will do 3 values: M - 1SD, M, M + 1SD, where M = mean 
- Once again, we are going to round off these values to 2 decimal places with **round()**
- Note: because we have centered our data, M = 0 & remember, centering doesn't change our SD
- Another note: since we "hand picked" what we know to be the traditional mean and SD for IQ, these levels should look very similar to our first simple slopes graph! 

```{r, message=FALSE, echo=TRUE}
#Create our new variable for IQ based on the actual mean/standard deviation in our data set

IQ.SD <- c(mean(GPA.Data$IQ.C)-sd(GPA.Data$IQ.C),
           mean(GPA.Data$IQ.C),
           mean(GPA.Data$IQ.C)+sd(GPA.Data$IQ.C))

IQ.SD <- round(IQ.SD, 2)
IQ.SD
# Note: the mean is 0 because we mean centered our data, meaning we said, make 
# the mean of our data = 0! Also, we see that our standard deviations are pretty 
# darn close to the expected population standard deviations. Keep in mind that 
# this is simulated data, and most data in the real world will not produce such 
# "typical" data 
Inter.SD <- effect(c("IQ.C*Work.Ethic.C"), GPA.Model.2,
                     xlevels=list(IQ.C=c(-14.75, 0, 14.75),
                                  Work.Ethic.C=c(-1.1, 0, 1.1))) 
# put data in data frame 
Inter.SD <- as.data.frame(Inter.SD)

# Create factors of the different variables in your interaction 
Inter.SD$IQ<-factor(Inter.SD$IQ.C,
                      levels=c(-14.75, 0, 14.75),
                      labels=c("1 SD Below Mean", "Mean", "1 SD Above Mean"))
                     
Inter.SD$Work.Ethic<-factor(Inter.SD$Work.Ethic.C,
              levels=c(-1.1, 0, 1.1),
              labels=c("Poor Worker", "Average Worker", "Hard Worker"))

# Plot this bad boy!
Plot.SD<-ggplot(data=Inter.SD, aes(x=Work.Ethic, y=fit, group=IQ))+
      geom_line(size=1, aes(color=IQ))+ #Can adjust the thickness of your lines
      geom_point(aes(colour = IQ), size=2)+ #Can adjust the size of your points
      geom_ribbon(aes(ymin=fit-se, ymax=fit+se),fill="gray",alpha=.6)+ #Can adjust your error bars
      ylim(0,4)+ #Puts a limit on the y-axis
      ylab("GPA")+ #Adds a label to the y-axis
      xlab("Work Ethic")+ #Adds a label to the x-axis
      ggtitle("Standard Deviation Plot")+ #Title
      theme_bw()+ #Removes the gray background 
      theme(panel.grid.major=element_blank(),
          panel.grid.minor=element_blank(),
          legend.key = element_blank())+ #Removes the lines 
     scale_fill_grey()
Plot.SD
```

#### Interpretation of SD plot
**Note:** the error bars in this graph are very hard to see, because we have very 
little error in our simulated data set. For a full APA style graph, error bars 
would be expected. 

# Continuous x Categorical Regression 
**IQ x Gender (Male/Female) as predictors of GPA** 

Now that we have gone through one full example of regression interactions, the next two sections should be a bit easier. This upcoming section is going to look at how
you would run/plot a regression with 1 continuous predictor variable and 1 categorical predictor variable. 

Going off of our last example, let's say we now want to investigate how work ethic interacts with gender (as a categorical variable). Things get slightly trickier... Let's check it out!
```{r, message=FALSE, echo=TRUE}
#Once again, we are going to begin by simulating our data 
#Remember, your seed can be set to anything!
set.seed(140)
#Staying with 250 participants for consistency's sake 
N <- 250
#Normal distribution of work ethic (X) with mean = 2.75 and SD = .75 (roughly a 1-5 scale: 1 = poor work ethic, 5 = great work ethic)
X <- rnorm(N, 2.75, .75)
#Our newest variable, G, is a binary variable (0,1) for gender 
#We are asking the computer to draw N values from a pool of 0s and 1s and call the result G
G <- sample(rep(c(0,1),N),N,replace = FALSE)
#This is our equation to create Y
Y <- .7*X + .3*G + 2*X*G + rnorm(N, sd = 5)
#Rescale our Y variable to run from 0 to 4 (because it is GPA)
Y <- (Y - min(Y)) / (max(Y) - min(Y))*4
#Finally, let's put all our variables into a data frame 
#This is basically telling the computer "put all these variables I just made into one data set"
GPA.Data.2<-data.frame(GPA=Y, Work.Ethic=X, Gender=G)
#Don't forget to center our continuous variable! (Make sure to center the variable from the new data set, GPA.Data.2)
GPA.Data.2$Work.Ethic.C <- scale(GPA.Data.2$Work.Ethic, center = TRUE, scale = FALSE)[,]
```
## Dummy Coding 

Here is where things get a little different...

**What is Dummy Coding?** 

- It is the most common and basic way to analyze categorical variables in regression 
- Every variable has a baseline/reference group that other "levels" get compared to
- R dummy codes automatically when it detects factor variables 
- The question we are asking is: "how much does each group deviate from the reference?"

In this particular case, since there are only two levels of the variable Gender (male and female), it is quite a simple dummy code of 0, 1. All males in the data set are assigned a 0 and all females are assigned a 1. 

Previously, I wrote that R dummy codes automatically. While we get the 0s and 1s automatically, it is far more intuitive to rename our factor to something that makes more sense. 

- We are creating a new variable, called Gender.F, where F stands for factor 
- This variable now has levels with words, instead of just 0s and 1s 
- **Note:** it is very important that your labels are spelled right (or that you consistently spell your labels incorrectly) because you will be entering in these exact labels again when you create your interactions 

```{r, message=FALSE, echo=TRUE}
GPA.Data.2$Gender.F <- factor(GPA.Data.2$Gender,
                                   levels=c(0,1),
                                   labels=c("Male","Female"))
```
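To peek at the dummy codes R builds behind the scenes, you can ask for the design matrix of a factor. The sketch below uses a tiny made-up vector (the `Demo.` names are hypothetical): the first level, "Male", becomes the reference group, and R creates a single 0/1 column for "Female".

```{r, message=FALSE, echo=TRUE}
#Hypothetical gender codes for four people, for illustration only
Demo.Gender <- factor(c(0, 1, 1, 0),
                      levels = c(0, 1),
                      labels = c("Male", "Female"))
#The design matrix shows the dummy code that lm() would use for this factor
Demo.Dummy <- model.matrix(~ Demo.Gender)
Demo.Dummy
```

The column name is the variable name pasted together with the level label, which is why spelling your labels consistently matters.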


## Run your regression models 

Use **lm() function** to run model with and without interaction

- Additive effects = + 
- Multiplicative (interaction) effects = * 
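
If the difference between `+` and `*` feels abstract, you can ask R which terms each formula expands to. The variable names `y`, `x`, and `g` below are placeholders of my own, so treat this as a quick sketch rather than part of the analysis.

```r
# Hypothetical variable names, used only to inspect how R expands formulas
f.additive    <- y ~ x + g
f.interaction <- y ~ x * g

# Term labels show which effects each formula asks lm() to estimate
attr(terms(f.additive), "term.labels")
attr(terms(f.interaction), "term.labels")
```

The `*` formula gives you both main effects plus the `x:g` interaction term, so you do not have to type the main effects separately.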

**Use stargazer() to visualize your results**

```{r, message=FALSE, echo=TRUE,results='asis'}
GPA.2.Model.1 <- lm(GPA~Work.Ethic.C+Gender.F, GPA.Data.2)
GPA.2.Model.2 <- lm(GPA~Work.Ethic.C*Gender.F, GPA.Data.2)

library(stargazer)
stargazer(GPA.2.Model.1, GPA.2.Model.2,type="html",	
          column.labels = c("Main Effects", "Interaction"),	
          intercept.bottom = FALSE,	
          single.row=FALSE, 	
          notes.append = FALSE,	
          header=FALSE)	
```

Let's go right into creating our interaction!

Keep in mind, we already turned Gender into a Factor with labeled levels, so we can refer to the actual names of the levels (instead of numbers).

```{r, message=FALSE, echo=TRUE}
library(effects)
#Our interaction
Inter.GPA.2 <- effect('Work.Ethic.C*Gender.F', GPA.2.Model.2,
                                          xlevels=list(Work.Ethic.C = c(-1.1, 0, 1.1)),
                                          se=TRUE, confidence.level=.95, typical=mean)

#Put data in data frame 
Inter.GPA.2<-as.data.frame(Inter.GPA.2)

#Create factors of the interaction variables                      
Inter.GPA.2$Work.Ethic<-factor(Inter.GPA.2$Work.Ethic.C,
              levels=c(-1.1, 0, 1.1),
              labels=c("Poor Worker", "Average Worker", "Hard Worker"))
Inter.GPA.2$Gender<-factor(Inter.GPA.2$Gender.F,
              levels=c("Male", "Female"))
#Note: when I create this Gender factor, I will no longer use ".F" so I don't have to rename my legend in my plot 

library(ggplot2)
#Plot it up!
Plot.GPA.2<-ggplot(data=Inter.GPA.2, aes(x=Work.Ethic, y=fit, group=Gender))+
      coord_cartesian(ylim = c(0,4))+  
#For ylim, specify the range of your DV (in our case, 0-4)
      geom_line(size=2, aes(color=Gender))+
      ylab("GPA")+
      xlab("Work Ethic")+
      ggtitle("Work Ethic and Gender as Predictors of GPA")+
      theme_bw()+ 
        theme(panel.grid.major=element_blank(),
        panel.grid.minor=element_blank())+
      scale_fill_grey()
Plot.GPA.2
```
#### Interpretation of Continuous x Categorical Interaction Plot
As you can see, there is not much of an interaction, which we would expect after seeing that our interaction effect was not statistically significant. 
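
One way to put numbers on a plot like this is to compute the simple slope for each group straight from the interaction model's coefficients. The sketch below uses a hypothetical toy data set (`toy`, with made-up effect sizes), not the GPA data: the slope for the reference group is the coefficient on the continuous predictor, and the slope for the other group adds the interaction coefficient to it.

```r
# Hypothetical toy data, simulated only to illustrate simple slopes
set.seed(1)
n <- 200
toy <- data.frame(
  x = rnorm(n),
  g = factor(sample(c("Male", "Female"), n, replace = TRUE),
             levels = c("Male", "Female"))
)
female <- as.numeric(toy$g == "Female")
toy$y <- 1 + 0.5 * toy$x + 0.3 * female + 0.8 * toy$x * female + rnorm(n)

# Continuous x categorical interaction model
fit <- lm(y ~ x * g, data = toy)
b <- coef(fit)

# Slope of x for the reference group (Male)
slope.male <- unname(b["x"])
# Slope of x for Female = reference slope + interaction coefficient
slope.female <- unname(b["x"] + b["x:gFemale"])

c(Male = slope.male, Female = slope.female)
```

When the two slopes are close, the lines in the plot will look nearly parallel, which is what "not much of an interaction" looks like.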

# Categorical x Categorical Regression 

**Tutors and Gender as Predictors of GPA**

For the final example of the chapter, we are going to look at plotting interactions with 2 categorical predictors. We know that students differ in their access to/use of tutoring and it would be interesting to see how Gender interacts with tutoring services. 

Students in this study either have: 

- No Tutor 
- Group Tutor 
- Private Tutor 

## Data simulation 
```{r, message=FALSE, echo=TRUE}
#Set up simulation	
set.seed(244)	
N <- 250	
Q <- sample(rep(c(-1,0,1),N),N,replace = FALSE)	#Q = Tutor Status
G <- sample(rep(c(0,1),N*3/2),N,replace = FALSE) #G = Gender

#Our equation to create Y	
Y <- .5*Q + .25*G + 2.5*Q*G+ 1 + rnorm(N, sd=2)	

#Rescale our Y so it runs from 0 to 4
Y = (Y - min(Y)) / (max(Y) - min(Y))*4

#Build our data frame	
GPA.Data.3<-data.frame(GPA=Y,Tutor=Q,Gender=G)	

```

## Dummy coding 
Similar to the last example, we are now going to create factors with dummy codes. This time, however, we need to do this for BOTH predictor variables (gender & tutor) because we have 2 categorical variables. 

```{r, message=FALSE, echo=TRUE}
GPA.Data.3$Tutor.F <- factor(GPA.Data.3$Tutor,
                                levels=c(-1,0,1),
                                labels=c("No Tutor", "Group Tutor", "Private Tutor"))
GPA.Data.3$Gender.F <- factor(GPA.Data.3$Gender,
                                   levels=c(0,1),
                                   labels=c("Male", "Female"))
```

## Run your regression 
Once again, we look at both our main effects model & interaction model and use stargazer to compare the two models. 
```{r, message=FALSE, echo=TRUE, results='asis'}
GPA.3.Model.1<-lm(GPA ~ Tutor.F+Gender.F, data = GPA.Data.3)	
GPA.3.Model.2<-lm(GPA ~ Tutor.F*Gender.F, data = GPA.Data.3)	

stargazer(GPA.3.Model.1, GPA.3.Model.2,type="html",	
          column.labels = c("Main Effects", "Interaction"),	
          intercept.bottom = FALSE,	
          single.row=TRUE, 	
          notes.append = FALSE,	
          omit.stat=c("ser"),	
          star.cutoffs = c(0.05, 0.01, 0.001),	
          header=FALSE)	
```
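
Because Tutor has three levels, the interaction is spread across two coefficients in the table, so it can help to test them jointly. One common way (my suggestion, not something the analysis above ran) is to compare the two nested models with base R's `anova()`. The sketch uses a hypothetical balanced toy data set called `toy` with a made-up effect.

```r
# Hypothetical toy data, simulated only to illustrate the model comparison
set.seed(2)
toy <- data.frame(
  tutor  = factor(rep(c("No Tutor", "Group Tutor", "Private Tutor"), times = 40),
                  levels = c("No Tutor", "Group Tutor", "Private Tutor")),
  gender = factor(rep(c("Male", "Female"), each = 60),
                  levels = c("Male", "Female"))
)
# Made-up effect: only females with a private tutor get a boost
boost <- ifelse(toy$tutor == "Private Tutor" & toy$gender == "Female", 1.5, 0)
toy$y <- 2 + boost + rnorm(nrow(toy))

main.fit  <- lm(y ~ tutor + gender, data = toy)
inter.fit <- lm(y ~ tutor * gender, data = toy)

# F test of whether adding the interaction terms improves the fit
comparison <- anova(main.fit, inter.fit)
comparison
```

The `Df` column of the second row shows how many interaction coefficients were added (two here, for a 3 x 2 design).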

## Now for the interaction plot! 

```{r, message=FALSE, echo=TRUE}
#The Interaction
Inter.GPA.3 <- effect('Tutor.F*Gender.F', GPA.3.Model.2,
                        se=TRUE)
#Data Frame
Inter.GPA.3.DF<-as.data.frame(Inter.GPA.3)

# Relabel them to put them back in order
Inter.GPA.3.DF$Tutor.F <- factor(Inter.GPA.3.DF$Tutor.F,
                                levels=c("No Tutor", "Group Tutor", "Private Tutor"),
                                labels=c("No Tutor", "Group Tutor", "Private Tutor"))
Inter.GPA.3.DF$Gender.F <- factor(Inter.GPA.3.DF$Gender.F,
                                   levels=c("Male", "Female"),
                                   labels=c("Male", "Female"))

#Create plot
Plot.GPA.3<-ggplot(data=Inter.GPA.3.DF, aes(x=Tutor.F, y=fit, group=Gender.F))+
    geom_line(size=2, aes(color=Gender.F))+
    geom_ribbon(aes(ymin=fit-se, ymax=fit+se,fill=Gender.F),alpha=.2)+
    ylab("GPA")+
    xlab("Tutor")+
    ggtitle("Tutors and Gender as GPA Predictors")+
    theme_bw()+
    theme(text = element_text(size=12),
        legend.text = element_text(size=12),
        legend.direction = "horizontal",
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        legend.position="top")
Plot.GPA.3
```
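
If you are curious where fitted values and standard errors like these come from, you can get similar numbers without the effects package by predicting on a grid of every factor combination. This is a minimal sketch on a hypothetical balanced toy data set (`toy`), using base R's `expand.grid()` and `predict()`; it is an illustration under those assumptions, not a description of what `effect()` does internally.

```r
# Hypothetical toy data, simulated only to illustrate predict() on a grid
set.seed(3)
toy <- data.frame(
  tutor  = factor(rep(c("No Tutor", "Group Tutor", "Private Tutor"), times = 40),
                  levels = c("No Tutor", "Group Tutor", "Private Tutor")),
  gender = factor(rep(c("Male", "Female"), each = 60),
                  levels = c("Male", "Female"))
)
toy$y <- rnorm(nrow(toy), mean = 2)

fit <- lm(y ~ tutor * gender, data = toy)

# One row per Tutor x Gender combination
grid <- expand.grid(tutor = levels(toy$tutor), gender = levels(toy$gender))

# Fitted value and standard error for each cell
pred <- predict(fit, newdata = grid, se.fit = TRUE)
grid$fit <- unname(pred$fit)
grid$se  <- unname(pred$se.fit)
grid
```

With two categorical predictors and their full interaction, each fitted value is just the observed mean of that cell, which is a handy sanity check on your plot.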

A final little note...

There are definitely easier ways to make plots in R, and I want to show you with this final example the difference between using effects/ggplot and simpler code. The simpler code is helpful for visualizing your data as you work through your analysis, but when it comes to publishing, ggplot will give you the quality you need!

```{r, message=FALSE, echo=TRUE}
plot(Inter.GPA.3, multiline = TRUE)	
```
