Posts on The Big Short

Recent content in Posts on The Big Short (/post/)
Hugo -- gohugo.io
© 2021 C. Wang (https://www.wangchucheng.com/) and R. Ma (https://www.ruiqima.com/)
Wed, 21 Apr 2021 00:00:00 +0000

Blog Post 7
/post/2021-04-21-post-title/
Wed, 21 Apr 2021 00:00:00 +0000

    library(tidyverse)  # %>% and select()
    library(glmnet)     # cv.glmnet()

    # Lasso Variable Selection
    set.seed(902333)
    ss <- DataframeName %>%
      select(SPREAD, SP500, GOLD, OIL, CHHUSD, JPYUSD, RGDP, UNRATE, IPI,
             'Copper Price', 'Median Income', BCI, CCI, rec)
    # Design matrix of predictors and the recession indicator as response
    x <- model.matrix(ss$rec~., ss)
    y <- ss$rec
    # alpha = 1 selects the lasso penalty. rec is a 0/1 indicator and
    # cv.glmnet defaults to family = "gaussian", so this is a linear,
    # not a logistic, lasso fit.
    cv_mod_ss <- cv.glmnet(x, y, alpha = 1)
    plot(cv_mod_ss)
    coef(cv_mod_ss)
    ## 15 x 1 sparse Matrix of class "dgCMatrix"
    ##                         1
    ## (Intercept)  2.057904e+00
    ## (Intercept)  .
    ## SPREAD      -1.604249e-01
    ## SP500       -3.292787e-04
    ## GOLD         .
    ## OIL          6.202401e-03
    ## CHHUSD       8.

Blog Post 6
/post/2021-04-16-post-title/
Fri, 16 Apr 2021 00:00:00 +0000

    suppressWarnings(suppressMessages(library("tidyverse")))
    # Running Pre-Process R Script
    source(here::here("content", "load_and_clean_data.R"))
    # Reading Main Dataset
    dataset <- read_csv(here::here("dataset", "Main_Dataset.csv"))
    # Multi-Collinearity Test: pairwise correlations, rounded to 2 digits
    Mulcol <- data.frame(dataset$SPREAD, dataset$SP500, dataset$GOLD,
                         dataset$OIL, dataset$CHHUSD, dataset$JPYUSD,
                         dataset$RGDP, dataset$UNRATE, dataset$rec)
    round(cor(Mulcol), digits = 2)
    ##                dataset.SPREAD dataset.SP500 dataset.GOLD dataset.OIL
    ## dataset.SPREAD           1.00         -0.12         0.34        0.30
    ## dataset.SP500           -0.12          1.00         0.65        0.50
    ## dataset.GOLD             0.34          0.65         1.00        0.80
    ## dataset.OIL              0.30          0.50         0.80        1.00
    ## dataset.CHHUSD          -0.32         -0.55        -0.88       -0.74
    ## dataset.JPYUSD          -0.34         -0.20        -0.60       -0.
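The correlation table in Blog Post 6 is easier to read once the strongly correlated pairs are pulled out. The sketch below does that for the four columns that appear in full in the excerpt (SPREAD, SP500, GOLD, OIL). The values are copied from the printed output; the cutoff of 0.8 and the names `cor_block`, `cutoff`, and `high_pairs` are assumptions made for illustration, not something the post specifies.

```r
# Upper-left 4 x 4 block of the correlation matrix printed in Blog Post 6.
vars <- c("SPREAD", "SP500", "GOLD", "OIL")
cor_block <- matrix(
  c( 1.00, -0.12, 0.34, 0.30,
    -0.12,  1.00, 0.65, 0.50,
     0.34,  0.65, 1.00, 0.80,
     0.30,  0.50, 0.80, 1.00),
  nrow = 4, byrow = TRUE, dimnames = list(vars, vars)
)

# Hypothetical cutoff for "high" correlation; not taken from the post.
cutoff <- 0.8

# Look only above the diagonal so each pair is reported once
# and the trivial 1.00 self-correlations are skipped.
idx <- which(abs(cor_block) >= cutoff & upper.tri(cor_block), arr.ind = TRUE)

high_pairs <- data.frame(
  var1 = rownames(cor_block)[idx[, "row"]],
  var2 = colnames(cor_block)[idx[, "col"]],
  r    = cor_block[idx]
)
print(high_pairs)
```

With the full matrix from `cor(Mulcol)` in place of `cor_block`, the same lines would list every pair at or above the chosen cutoff.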
Blog Post 2
/post/2021-04-07-post-title/
Wed, 07 Apr 2021 00:00:00 +0000

    library(tidyverse)
    library(lubridate)  # ymd(), year(), month(), day()

    # Data Manipulation: parse Date and add year, month and day columns
    yc_clean <- yc %>%
      mutate(Date = ymd(Date)) %>%
      mutate(year = year(Date), month = month(Date), day = day(Date))
    # Initial EDA
    yc_clean %>%
      ggplot(aes(x = Date, y = SPREAD)) +
      geom_line(color = "blue") +
      labs(title = "Spread over time") +
      theme(plot.title = element_text(hjust = 0.5))

During recessions the spread, which is the 30-year yield minus the 1-year yield, takes a negative value. In the years following a recession the economy heads into an expansionary period and the spread takes positive values.

Blog Post 5
/post/2021-04-05-post-title/
Mon, 05 Apr 2021 00:00:00 +0000

We are adding at least 10 additional datasets to our main yield curve dataset, combining them on the date, in the format used by the main dataset. We ran into two problems while combining the data. The first was finding series that go back far enough in time that the main dataset is preserved once the new data is added. The second was figuring out how to extend series beyond their original time frame so that we have enough data points.

Blog Post 4
/post/2021-03-29-post-4/
Mon, 29 Mar 2021 00:00:00 +0000

    # Initial Model Selection Testing
    dfsub <- df %>%
      select(SPREAD, SP500, GOLD, OIL, CHHUSD, JPYUSD, RGDP, UNRATE, rec)
    # Full logistic regression and intercept-only model on the same data
    model <- glm(rec ~ SPREAD + SP500 + GOLD + OIL + CHHUSD + JPYUSD + RGDP + UNRATE,
                 data = dfsub, family = binomial)
    model.null <- glm(rec ~ 1, data = dfsub, family = binomial)
    # Likelihood-ratio (chi-squared) test of the full model against the null
    anova(model, model.null, test = "Chisq")
    ## Analysis of Deviance Table
    ##
    ## Model 1: rec ~ SPREAD + SP500 + GOLD + OIL + CHHUSD + JPYUSD + RGDP +
    ##     UNRATE
    ## Model 2: rec ~ 1
    ##   Resid. Df Resid. Dev Df Deviance Pr(>Chi)
    ## 1       161     69.
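The `anova(..., test = "Chisq")` call in Blog Post 4 is a likelihood-ratio test: it compares the drop in residual deviance between the two models against a chi-squared distribution. The sketch below reproduces that calculation by hand on simulated data, since the project data is not part of this excerpt. The variables `x1`, `x2`, `y`, the sample size, and the coefficients are all made up for illustration.

```r
set.seed(1)  # arbitrary seed for the simulated data
n <- 200

# Simulated stand-in for the project data: two predictors, binary response.
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- rbinom(n, size = 1, prob = plogis(-1 + 1.5 * sim$x1))

full <- glm(y ~ x1 + x2, data = sim, family = binomial)
null <- glm(y ~ 1, data = sim, family = binomial)

# Likelihood-ratio statistic: how much the predictors reduce the deviance.
lr_stat <- deviance(null) - deviance(full)
# Degrees of freedom: number of extra parameters in the full model.
lr_df <- df.residual(null) - df.residual(full)
# Upper-tail chi-squared probability gives the p-value.
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# The same numbers as reported by anova(); null model passed first
# so that the Df and Deviance entries come out positive.
tab <- anova(null, full, test = "Chisq")
print(tab)
print(c(statistic = lr_stat, df = lr_df, p = p_value))
```

A small p-value here says that the predictors together explain the response better than the intercept alone; it says nothing about which individual predictor matters.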
Blog Post 3
/post/2021-03-26-post-3/
Fri, 26 Mar 2021 00:00:00 +0000

    library(tidyverse)
    library(lubridate)  # ymd(), year(), month(), day()

    # Data Manipulation: parse Date and add year, month and day columns
    yc <- yc %>%
      mutate(Date = ymd(Date)) %>%
      mutate(year = year(Date), month = month(Date), day = day(Date))
    # Changing Col names
    colnames(rgdp)[colnames(rgdp) == "A191RL1Q225SBEA"] <- 'RGDP'
    colnames(rgdp)[colnames(rgdp) == "DATE"] <- 'Date'
    colnames(UE)[colnames(UE) == "DATE"] <- 'Date'
    colnames(rec_indicator)[colnames(rec_indicator) == "DATE"] <- 'Date'
    colnames(rec_indicator)[colnames(rec_indicator) == "JHDUSRGDPBR"] <- 'rec'
    # Joining Datasets: keep only dates present in every table
    temp <- yc %>% inner_join(rgdp, by = "Date")
    temp2 <- temp %>% inner_join(UE, by = "Date")
    df <- temp2 %>% inner_join(rec_indicator, by = "Date")
    # Create a column that reads "Yes" for a recession and "No" otherwise
    df$rec_char <- ifelse(df$rec == 0, "No", "Yes")
    # Creating Figure
    df_2 = df %>% group_by(year) %>% mutate(mean_spread = mean(SPREAD, na.

Blog Post 1
/post/2021-03-10-post-1/
Wed, 10 Mar 2021 00:00:00 +0000

We plan to work with data on the US bond yield curve from 1977 to 2018. The variables include the date, the values of the yield curve for the 1-year and 30-year maturities, and the difference between the 30-year and 1-year maturities, called the “spread”. The data was extracted from Bloomberg and is used to predict economic recessions. Yield curves are good indicators of recessions: when the yield curve inverts, it suggests that investors think it is risky to hold bonds over the short term, so they demand a higher yield on short-term bonds.
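The spread and the inversion described in Blog Post 1 can be sketched in a few lines of R. The dates and yields below are hypothetical values chosen for illustration, not the Bloomberg data, and the column names `y1`, `y30`, and `inverted` are assumptions; only `SPREAD` and `Date` match names used elsewhere in these posts.

```r
# Hypothetical yields in percent; NOT taken from the Bloomberg dataset.
yields <- data.frame(
  Date = as.Date(c("2000-01-01", "2000-04-01", "2000-07-01", "2000-10-01")),
  y1   = c(5.0, 5.5, 6.0, 5.8),   # 1-year maturity
  y30  = c(6.5, 6.0, 5.7, 5.9)    # 30-year maturity
)

# Spread as defined in the post: 30-year yield minus 1-year yield.
yields$SPREAD <- yields$y30 - yields$y1

# The curve is inverted when short-term bonds yield more than long-term ones,
# which makes the spread negative.
yields$inverted <- yields$SPREAD < 0

print(yields)
```

In this made-up series only the third row is inverted, because the 1-year yield (6.0) exceeds the 30-year yield (5.7).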