Posts

Showing posts from October, 2024

Time-series

Daniel Tafmizi, Dr. Friedman, November 4, 2024, LIS 4317 Module 10
Github: daniel.R/Work.R/LIS4370Rprog/YOYexchanges.R at main · DanielDataGit/daniel.R

My first graph is a simple time series analysis with multiple variables. It extends the plots from my last assignment, which showed market cap over time. Those plots hid the changes in the smaller exchanges, because the Nasdaq and NYSE were about five times bigger than the others. Market cap on its own also does not offer much value. To fix this, I accumulated the year-over-year percentage changes. This was fairly difficult to do; I had trouble finding resources on an efficient way to sum for each year. Eventually, I used dplyr's mutate and cumsum to calculate the changes cumulatively, and I used ggplot to graph the result.

My second graph is a time series fo...
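The cumulative year-over-year step can be sketched roughly as follows. This is a minimal illustration, not the code from the linked script: the data frame, its column names (`exchange`, `year`, `yoy_pct`), and the values are hypothetical placeholders, and the only parts taken from the post are the use of dplyr's `mutate` with `cumsum` and plotting with ggplot.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: made-up YoY percentage changes for two placeholder exchanges.
yoy <- data.frame(
  exchange = rep(c("A", "B"), each = 3),
  year     = rep(2021:2023, times = 2),
  yoy_pct  = c(10, -5, 20, 2, 3, -1)
)

# Running total of the YoY percentages, computed separately for each exchange.
cumulative <- yoy %>%
  arrange(exchange, year) %>%
  group_by(exchange) %>%
  mutate(cum_pct = cumsum(yoy_pct)) %>%
  ungroup()

# One line per exchange, so smaller exchanges are not dwarfed by larger ones.
p <- ggplot(cumulative, aes(x = year, y = cum_pct, colour = exchange)) +
  geom_line() +
  labs(y = "Cumulative YoY change (%)")
```

Note that `cumsum` adds the percentages together rather than compounding them, so the result is a running sum of yearly changes and not a total return.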

Multivariate analysis

Daniel Tafmizi, Dr. Friedman, October 29, 2024, LIS 4317 Module 9
Github: daniel.R/Work.R/LIS4370Rprog/stockexchange.R at main · DanielDataGit/daniel.R

This week I worked on preparing for the final project. I want to create something similar to the Gapminder life-expectancy vs. GDP per capita graph: one that shows the market cap of major stock exchanges vs. their year-over-year (YTY) % change, over time. To start, I searched for data and found a good source at world-exchanges.org. I used their "statistics portal" to get some data for a multivariable visualization.

The graph shows the market cap and YTY change in 2023 for a few of the largest stock exchanges. I used color to group by region and size to show how many companies make up each stock exchange.

Alignment: This can be added by maintaining a certain style amongst elements of a similar connection. Repetition: This can be added by using a consistent style to tie together separate ...

Corr analysis with ggplot2

Daniel Tafmizi, Dr. Friedman, October 17, 2024, LIS 4317 Module 8
Github: daniel.R/Work.R/LIS4370Rprog/mod8.R at main · DanielDataGit/daniel.R (github.com)

I attempted to recreate a visualization seen in Few's book on pg. 277. The goal of this graph was to break down many elements of a large dataset in a visually approachable manner, while also showing correlation data. I initially ran into some trouble because I tried using facet_wrap, which is meant for wrapping panels of a single discrete variable. After viewing some documentation (links in the github), I realized I needed to use facet_grid since I had two discrete variables (auto and manual). Using stat_cor added the correlation coefficient and the p-value to the panels. I used theme elements to make the graph prettier.

I agree with Few's recommendations. Correlation analysis on a large dataset can get very confusing very quickly. It is important to break down the dataset to make it more approachable. I think I accomplished this in the above graph that di...
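As a rough sketch of the facet_grid and stat_cor combination mentioned in this post: the example below assumes the built-in mtcars data and picks `wt`, `mpg`, `am`, and `vs` purely for illustration, so the variables are not necessarily those in the linked script. `stat_cor` comes from the ggpubr package.

```r
library(ggplot2)
library(ggpubr)  # provides stat_cor()

# Assumption: mtcars, with two discrete variables turned into labelled factors.
cars <- mtcars
cars$am <- factor(cars$am, labels = c("automatic", "manual"))
cars$vs <- factor(cars$vs, labels = c("V-shaped", "straight"))

p <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  stat_cor(method = "pearson") +  # prints R and the p-value in each panel
  facet_grid(am ~ vs) +           # rows by transmission, columns by engine shape
  theme_minimal()
```

With `facet_grid(am ~ vs)` each panel holds one combination of the two factors, and `stat_cor` computes its coefficient from only the points in that panel.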

Mod 7 R

Daniel Tafmizi, Dr. Friedman, October 15, 2024, LIS 4317 Module 7
Github: daniel.R/Work.R/LIS4370Rprog/dataviz2.R at main · DanielDataGit/daniel.R (github.com)

I used two methods to analyze the distribution of the mtcars dataset. First, I created a scaled heatmap to visualize how various attributes change across each car, ordered by mpg. Although this method doesn't provide a direct statistical analysis of the data's distribution, it offers valuable insight into how the attributes are spread out across different vehicles. Second, I used a function called ggpairs, which was new to me and proved to be a powerful tool for distribution analysis. This function generates three types of visualizations: density plots, scatterplots, and correlation coefficients. Together they show the distribution of data both within individual attributes and across multiple attributes, while also highlighting how the variables are correlated. My next goal with this function is to learn h...
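The two approaches in this post might look roughly like the sketch below. It is an illustration under assumptions rather than the linked script: it uses base R's `heatmap` for the scaled heatmap (the post does not say which heatmap function was used), and the four columns passed to `ggpairs` are an arbitrary subset chosen to keep the plot small. `ggpairs` comes from the GGally package.

```r
library(GGally)  # provides ggpairs()

# Order the cars by mpg, then standardise each column (mean 0, sd 1)
# so attributes measured in different units share one colour scale.
ordered <- mtcars[order(mtcars$mpg), ]
scaled  <- scale(ordered)

# Rowv/Colv = NA keeps the mpg ordering instead of clustering rows and columns.
heatmap(scaled, Rowv = NA, Colv = NA, scale = "none")

# Pairwise matrix: densities on the diagonal, scatterplots below,
# correlation coefficients above (the ggpairs defaults for numeric columns).
pairs_plot <- ggpairs(mtcars[, c("mpg", "hp", "wt", "qsec")])
```

Printing `pairs_plot` draws the matrix.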

Mod 6 R Visualization

Daniel Tafmizi, Dr. Friedman, October 7, 2024, LIS 4317 Module 6
Github: daniel.R/Work.R/LIS4370Rprog/rVIz.R at main · DanielDataGit/daniel.R (github.com)

The graphical representations above draw inspiration from Few's multiple distribution displays. Using the Johnson & Johnson dataset, which provides a time series of quarterly earnings from 1960 to 1980, I aimed to compare the quarters and examine potential seasonality in profits. The first graph shows a consistent trend up to 1975, after which Q4 earnings appear to fall short of Q1, Q2, and Q3. This observation is reinforced by the box plot, which shows that Q4 lacks the high values beyond the 75th percentile that characterize the later years of Q1, Q2, and Q3. Notably, Q3 exhibited the highest earnings range during the 1960-1980 period. To deepen the analysis, further research could investigate how these earnings impact stock prices and explore the correlations associated with the increased Q3 values.
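A minimal sketch of the per-quarter comparison, assuming the built-in `JohnsonJohnson` time series is the dataset meant here. It uses base R graphics only and is not the code from the linked script; the data frame `jj` and its column names are introduced for illustration.

```r
# Reshape the quarterly ts object into one row per observation.
jj <- data.frame(
  year     = as.integer(floor(time(JohnsonJohnson))),
  quarter  = factor(paste0("Q", cycle(JohnsonJohnson))),
  earnings = as.numeric(JohnsonJohnson)
)

# One box per quarter, to compare the earnings distributions side by side.
boxplot(earnings ~ quarter, data = jj,
        xlab = "Quarter", ylab = "Earnings per share (USD)")

# One line per quarter over time, to look for seasonal divergence.
wide <- tapply(jj$earnings, list(jj$year, jj$quarter), mean)
matplot(as.integer(rownames(wide)), wide, type = "l", lty = 1,
        xlab = "Year", ylab = "Earnings per share (USD)")
legend("topleft", legend = colnames(wide), col = 1:4, lty = 1)
```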