The Big Short

Recent content on The Big Short
Hugo -- gohugo.io
© 2021 C. Wang (https://www.wangchucheng.com/) and R. Ma (https://www.ruiqima.com/)
Wed, 21 Apr 2021 00:00:00 +0000

Blog Post 7
/post/2021-04-21-post-title/
Wed, 21 Apr 2021 00:00:00 +0000

# Lasso Variable Selection
set.seed(902333)
ss <- DataframeName %>%
  select(SPREAD, SP500, GOLD, OIL, CHHUSD, JPYUSD, RGDP, UNRATE, IPI,
         'Copper Price', 'Median Income', BCI, CCI, rec)
x <- model.matrix(ss$rec~., ss)
y <- ss$rec
cv_mod_ss <- cv.glmnet(x, y, alpha = 1)
plot(cv_mod_ss)
coef(cv_mod_ss)
## 15 x 1 sparse Matrix of class "dgCMatrix"
##                         1
## (Intercept)  2.057904e+00
## (Intercept)  .
## SPREAD      -1.604249e-01
## SP500       -3.292787e-04
## GOLD         .
## OIL          6.202401e-03
## CHHUSD       8.

Blog Post 6
/post/2021-04-16-post-title/
Fri, 16 Apr 2021 00:00:00 +0000

suppressWarnings(suppressMessages(library("tidyverse")))

# Running Pre-Process R Script
source(here::here("content", "load_and_clean_data.R"))

# Reading Main Dataset
dataset <- read_csv(here::here("dataset", "Main_Dataset.csv"))

# Multi-Collinearity Test
Mulcol <- data.frame(dataset$SPREAD, dataset$SP500, dataset$GOLD, dataset$OIL,
                     dataset$CHHUSD, dataset$JPYUSD, dataset$RGDP,
                     dataset$UNRATE, dataset$rec)
round(cor(Mulcol), digits = 2)
##                dataset.SPREAD dataset.SP500 dataset.GOLD dataset.OIL
## dataset.SPREAD           1.00         -0.12         0.34        0.30
## dataset.SP500           -0.12          1.00         0.65        0.50
## dataset.GOLD             0.34          0.65         1.00        0.80
## dataset.OIL              0.30          0.50         0.80        1.00
## dataset.CHHUSD          -0.32         -0.55        -0.88       -0.74
## dataset.JPYUSD          -0.34         -0.20        -0.60       -0.

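
The correlation matrix in Blog Post 6 is read by eye: pairs such as GOLD and OIL (0.80) or GOLD and CHHUSD (-0.88) stand out as strongly related. As a rough sketch of how that scan could be automated, the helper below lists every pair whose absolute correlation exceeds a cutoff. The function name `high_cor_pairs`, the 0.7 cutoff, and the simulated data are assumptions for illustration and are not part of the project code.

```r
# Simulated stand-ins for three indicators; the values are made up.
set.seed(1)
n <- 200
gold <- rnorm(n)
toy <- data.frame(
  GOLD   = gold,
  OIL    = 0.9 * gold + rnorm(n, sd = 0.3),  # built to track GOLD closely
  SPREAD = rnorm(n)                          # independent of the other two
)

# Hypothetical helper: list each pair of columns whose absolute
# correlation is above `cutoff` (0.7 is an arbitrary choice).
high_cor_pairs <- function(data, cutoff = 0.7) {
  cm <- cor(data, use = "pairwise.complete.obs")
  # upper.tri() keeps each pair once and drops the diagonal of 1s
  idx <- which(abs(cm) > cutoff & upper.tri(cm), arr.ind = TRUE)
  data.frame(
    var1 = rownames(cm)[idx[, "row"]],
    var2 = colnames(cm)[idx[, "col"]],
    r    = round(cm[idx], 2),
    stringsAsFactors = FALSE
  )
}

pairs_found <- high_cor_pairs(toy)
print(pairs_found)
```

On the real data, the same call could be made on `Mulcol`. A flagged pair only suggests redundancy; it does not say which of the two variables to drop.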
Blog Post 2
/post/2021-04-07-post-title/
Wed, 07 Apr 2021 00:00:00 +0000

# Data Manipulation
# (funs() is deprecated in dplyr, so the date parts are created directly)
yc_clean = yc %>%
  mutate(Date = ymd(Date)) %>%
  mutate(year = year(Date), month = month(Date), day = day(Date))

# Initial EDA
yc_clean %>%
  ggplot(aes(x = Date, y = SPREAD)) +
  geom_line(color = "blue") +
  labs(title = "Spread over time") +
  theme(plot.title = element_text(hjust = 0.5))

During recessions, the spread, which is the difference between the 30-year and 1-year maturity yields, is negative. In the years following a recession, the economy heads into expansionary periods in which the spread is positive.

Blog Post 5
/post/2021-04-05-post-title/
Mon, 05 Apr 2021 00:00:00 +0000

We are adding at least 10 additional datasets to our main yield curve dataset, joining them on the date format used in the main dataset. We encountered a few problems while combining the data. One was finding data that went far enough back in time that we could keep the full range of the main dataset after adding the new data. Another was figuring out how to extend data beyond its time frame so that we would have sufficient data points.

Blog Post 4
/post/2021-03-29-post-4/
Mon, 29 Mar 2021 00:00:00 +0000

# Initial Model Selection Testing
dfsub <- df %>%
  select(SPREAD, SP500, GOLD, OIL, CHHUSD, JPYUSD, RGDP, UNRATE, rec)
# Both models are fit on dfsub so they are compared on the same data
model <- glm(rec~SPREAD+SP500+GOLD+OIL+CHHUSD+JPYUSD+RGDP+UNRATE, data=dfsub, family=binomial)
model.null = glm(rec ~ 1, data=dfsub, family = binomial)
anova(model, model.null, test="Chisq")
## Analysis of Deviance Table
##
## Model 1: rec ~ SPREAD + SP500 + GOLD + OIL + CHHUSD + JPYUSD + RGDP +
##     UNRATE
## Model 2: rec ~ 1
##   Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1       161     69.

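
The `anova(..., test="Chisq")` call in Blog Post 4 compares the full logistic model against the intercept-only model through their deviances. As a minimal sketch of what that test computes, the example below rebuilds the statistic by hand on simulated data. The data, the single predictor `x`, and the names `lr_stat`, `lr_df`, and `p_value` are assumptions for illustration. Unlike the post, the null model is passed first here, so the degrees of freedom and deviance difference come out positive.

```r
# Simulated binary outcome with one informative predictor (made-up data).
set.seed(42)
n <- 300
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-1 + 1.5 * x))
toy <- data.frame(x = x, y = y)

full <- glm(y ~ x, data = toy, family = binomial)
null <- glm(y ~ 1, data = toy, family = binomial)

# Likelihood-ratio statistic: the drop in deviance from null to full model
lr_stat <- deviance(null) - deviance(full)
# Degrees of freedom: the number of extra parameters in the full model
lr_df <- df.residual(null) - df.residual(full)
# Upper-tail chi-squared probability gives the p-value
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# The built-in table reports the same quantities in its second row
tab <- anova(null, full, test = "Chisq")
print(tab)
```

A small p-value says the predictors jointly improve on the intercept-only model. It does not say which individual predictors matter.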
Blog Post 3
/post/2021-03-26-post-3/
Fri, 26 Mar 2021 00:00:00 +0000

# Data Manipulation
# (funs() is deprecated in dplyr, so the date parts are created directly)
yc <- yc %>%
  mutate(Date = ymd(Date)) %>%
  mutate(year = year(Date), month = month(Date), day = day(Date))

# Changing Col names
colnames(rgdp)[colnames(rgdp) == "A191RL1Q225SBEA"] <- 'RGDP'
colnames(rgdp)[colnames(rgdp) == "DATE"] <- 'Date'
colnames(UE)[colnames(UE) == "DATE"] <- 'Date'
colnames(rec_indicator)[colnames(rec_indicator) == "DATE"] <- 'Date'
colnames(rec_indicator)[colnames(rec_indicator) == "JHDUSRGDPBR"] <- 'rec'

# Joining Datasets
temp <- yc %>% inner_join(rgdp, by = "Date")
temp2 <- temp %>% inner_join(UE, by = "Date")
df <- temp2 %>% inner_join(rec_indicator, by = "Date")

# Create a column that reads "Yes" for a recession and "No" otherwise
df$rec_char = ifelse(df$rec == 0, "No", "Yes")

# Creating Figure
df_2 = df %>% group_by(year) %>% mutate(mean_spread = mean(SPREAD, na.

Blog Post 1
/post/2021-03-10-post-1/
Wed, 10 Mar 2021 00:00:00 +0000

We plan to work with data on the US bond yield curve from 1977 to 2018. The variables include the date, the values of the yield curve at the 1-year and 30-year maturities, and the difference between the 30-year and 1-year yields, called the “spread”. The data was extracted from Bloomberg and is used to predict economic recessions. Yield curves are good indicators of recessions: an inverted yield curve suggests that investors consider it risky to hold bonds over the short term, so they demand a higher yield on short-term bonds.

About
/about/
Fri, 12 Feb 2021 09:20:50 -0500

This is a website for the final project for MA[46]15 Data Science with R by Team: The Big Short. The members of this team are below.

Stephen Nalepa
Stephen is a senior majoring in Economics and Mathematics at Boston University.
Stephen’s GitHub Account

Abdullah Albijadi
Economics student at Boston University.
Abdullah’s GitHub Account

Gil Lotzky
Gil is a senior Human-Computer Interaction undergraduate student at Boston University.
Gil’s GitHub Account

Alex Jalali
Statistics undergraduate student at Boston University.

Analysis
/analysis/

Our data analysis is motivated by the relationships between different economic indicators and whether or not a recession occurred. We are particularly interested in inverted yield curves, the unemployment rate, real GDP, and the S&P 500. From logic and economic knowledge, we have ideas of how these indicators change in relation to a recession; for example, we know real GDP drops when a recession happens. However, we plan to explore these relationships in more detail to gauge how significant an impact these economic indicators have on a recession.

Big Picture
/big_picture/

Impact of Economic Recession Indicators

A recession is a decline in economic activity that impacts individuals and businesses. When economic activity decreases, revenues and profits decline as well. Businesses then need to cut costs and let go of employees. Companies and individuals are unable to pay off their loans, which leads to large debt. Stock prices and dividends decline, people get laid off, and small businesses close. Individuals have to change their lifestyles to adjust to their loss of income.

The Big Short

Data
/data/

Data Collection Procedure

The data we used for this project comes from various sources. Specifically, our data comes from a curated Kaggle dataset, datasets from FRED, datasets from the Organization for Economic Co-operation and Development, and one dataset from Macrotrends, a database of economic data. We found these datasets by searching for key indicators that affect recessions.
For example, we would search for “Unemployment Rate” and, once we found a dataset, make sure that it contained unemployment rates as well as a time component.
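
The time component matters because each new dataset is joined to the main yield curve data by date. As a rough sketch of that check and join, the example below uses two tiny made-up tables. The helper `has_time_component` and all values are hypothetical, and base R `merge()` stands in for the `inner_join()` calls used in the blog posts.

```r
# Hypothetical quarterly series; the values are invented for illustration.
main <- data.frame(
  Date   = as.Date(c("2000-01-01", "2000-04-01", "2000-07-01", "2000-10-01")),
  SPREAD = c(0.5, -0.2, -0.4, 0.1)
)
unrate <- data.frame(
  DATE   = as.Date(c("2000-04-01", "2000-07-01", "2000-10-01", "2001-01-01")),
  UNRATE = c(3.8, 4.0, 3.9, 4.2)
)

# Hypothetical check: the candidate dataset must have a complete Date column.
has_time_component <- function(data, col) {
  col %in% names(data) && inherits(data[[col]], "Date") && !anyNA(data[[col]])
}
stopifnot(has_time_component(unrate, "DATE"))

# Match the main dataset's column name, then keep only dates present in both.
names(unrate)[names(unrate) == "DATE"] <- "Date"
merged <- merge(main, unrate, by = "Date")  # merge() is an inner join by default
print(merged)
```

Because the join is an inner join, dates that appear in only one table are dropped. This is why a new dataset that does not reach far enough back shortens the combined data.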