Data Analysis Project [15 marks]

Due date: 5pm, October 4, 2019

Requirement: Please complete the tasks according to the requirements below. Copy all of your R code at the end of this Word file, then submit the Word file via the Turnitin link in iLearn. You do not need to upload your data file.
Task 1: CAPM and the Fama-French three-factor model [6 marks]
a. Please download the daily price data for your assigned stock from Yahoo Finance.
b. Download the Fama-French daily three-factor data from Ken French's data library (https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html).

c. Calculate the daily log return of your assigned stock using the daily adjusted price (Adj Close).
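As a sketch, the log return can be computed from the adjusted close column; the small data frame below stands in for the real Yahoo Finance download (the column name `Adj.Close` is what `read.csv` produces from Yahoo's "Adj Close" header).

```r
# Tiny stand-in for the Yahoo Finance download (read.csv turns "Adj Close"
# into "Adj.Close")
prices <- data.frame(
  Date      = as.Date(c("2019-01-02", "2019-01-03", "2019-01-04")),
  Adj.Close = c(100.0, 102.0, 101.0)
)

# Daily log return: r_t = log(P_t) - log(P_{t-1}); the first day has no return
prices$log_ret <- c(NA, diff(log(prices$Adj.Close)))
```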


d. Merge your stock data with the Fama-French three-factor data [Hint: you can use the merge() function], then fill in the following summary statistics table. [Ensure the stock return and the three factors are measured on the same scale.] [2 marks]

(Mkt.Rf stands for market risk premium, SD stands for standard deviation and N stands
for number of observations)
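A minimal sketch of the merge and the summary statistics, using made-up numbers in place of the two downloads. Note the scale point: Ken French's daily factor files are quoted in percent, so the stock's log return (a decimal) is multiplied by 100 before merging.

```r
# Made-up daily data standing in for the two downloads
stock <- data.frame(
  Date      = as.Date(c("2019-01-02", "2019-01-03")),
  Stock.Ret = c(0.0198, -0.0098)          # daily log returns (decimal)
)
ff <- data.frame(
  Date   = as.Date(c("2019-01-02", "2019-01-03", "2019-01-04")),
  Mkt.Rf = c(0.21, -0.15, 0.05),          # Ken French's files are in percent
  SMB    = c(0.10, 0.02, -0.03),
  HML    = c(-0.05, 0.08, 0.01),
  RF     = c(0.01, 0.01, 0.01)
)

# Put the stock return on the same (percent) scale as the factors
stock$Stock.Ret <- stock$Stock.Ret * 100

# Inner join on Date: factor days without a matching stock day are dropped
m <- merge(stock, ff, by = "Date")

# Summary statistics for the table: mean, SD, and N for every series
summ <- sapply(m[, setdiff(names(m), "Date")],
               function(x) c(Mean = mean(x), SD = sd(x), N = length(x)))
round(summ, 4)
```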
e. Please create a new variable Stock.Rf, equal to the difference between the stock return and RF, and run the regression:

Stock.Rf = α + β·Mkt.Rf + ε
Report the regression results in a table (1 mark) and also make a figure, where Mkt.Rf is in the X-axis, and Stock.Rf is in the Y-axis, and also plot the fitted line in the figure (1 mark).
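The steps in part e can be sketched as below; the simulated data frame `m` stands in for the merged sample from part d, and the coefficient values are arbitrary.

```r
# Simulated merged data (the real m comes from the merge in part d)
set.seed(1)
n <- 250
m <- data.frame(Mkt.Rf = rnorm(n, mean = 0.03, sd = 1), RF = 0.01)
m$Stock.Ret <- 0.02 + 1.1 * m$Mkt.Rf + m$RF + rnorm(n, sd = 0.5)

# e. Excess stock return, then the CAPM regression
m$Stock.Rf <- m$Stock.Ret - m$RF
capm <- lm(Stock.Rf ~ Mkt.Rf, data = m)
summary(capm)                      # regression table for the report

# Scatter plot with the fitted line
plot(m$Mkt.Rf, m$Stock.Rf, xlab = "Mkt.Rf", ylab = "Stock.Rf",
     main = "CAPM: Stock.Rf vs Mkt.Rf")
abline(capm, col = "red", lwd = 2)
```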

f. Please run the regression:

Stock.Rf = α + β1·Mkt.Rf + β2·SMB + β3·HML + ε
Report the regression results in a table (1 mark) and make your comments by comparing the two tables from parts e and f (1 mark).
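The three-factor regression only changes the model formula; again the simulated `m` stands in for the merged sample from part d, with arbitrary coefficients.

```r
# Simulated merged data (the real m comes from part d)
set.seed(2)
n <- 250
m <- data.frame(Mkt.Rf = rnorm(n), SMB = rnorm(n), HML = rnorm(n))
m$Stock.Rf <- 0.01 + 1.0 * m$Mkt.Rf + 0.4 * m$SMB - 0.3 * m$HML +
  rnorm(n, sd = 0.5)

# f. Fama-French three-factor regression
ff3 <- lm(Stock.Rf ~ Mkt.Rf + SMB + HML, data = m)
summary(ff3)

# For the comparison in f: look at how the Mkt.Rf loading, the alpha, and
# the adjusted R-squared change once SMB and HML are added
```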

Task 2: [9 marks]
In the attached dataset named "firm financial dataset", you are provided with financial information on US firms over a long sample period. It is a panel dataset.
Here is a list of variables:
gvkey: firm identifier; each firm has a unique identifier;
fyear: fiscal year of the data
conm: firm name
sic: SIC industry code (https://en.wikipedia.org/wiki/Standard_Industrial_Classification)
at: total assets (millions)
bkvlps: book value per share (dollars)
ceq: common equity (millions)
che: cash and short-term investments (millions)
csho: common shares outstanding (millions)
dlc: debt in current liabilities (millions)
dltt: long term debt (millions)
ebitda: earnings before interest, taxes, depreciation and amortisation (millions)
invt: inventory (millions)
ppent: property, plant and equipment (millions)
sale: net sales (millions)
prcc_f: end of fiscal year share price (dollars)

Requirements:

a. First, keep observations with SIC codes between 2000 and 3999 (manufacturing firms) and with fiscal years between 1975 and 2015.
Second, construct variables based on the sample remaining from the first step using the following definitions:
• Market leverage: market leverage = total liabilities/market value of assets = (DLTT+DLC)/(DLTT+DLC+CSHO×PRCC_F);
• Cash holding: cash holding=cash and short-term investments/total asset=CHE/AT;
• Tangibility: tangibility=(Inventory+ Net Property, Plant and Equipment)/Total Asset=(INVT+PPENT)/AT;
• Market to book ratio: M/B= stock price of fiscal year/book value per share=PRCC_F/BKVLPS;
• Size: size=log(SALE)
• Profitability: profitability=EBITDA/Total Assets=EBITDA/AT;

Third, combine these variables with the original dataset and clean the data to:
• exclude Inf, -Inf, and NAs;
• exclude negative market leverage, tangibility, and market-to-book ratio measures;
[ 3 marks]
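The three steps above can be sketched as below. The tiny made-up panel replaces the real "firm financial dataset"; the constructed variable names (`mkt_lev`, `cash`, `tang`, `mb`, `size`, `prof`) are placeholders.

```r
# Tiny made-up panel in place of the real file; column names follow the
# variable list above
firms <- data.frame(
  gvkey  = 1:4,
  fyear  = c(1980, 1980, 1970, 1980),
  sic    = c(2500, 5000, 2500, 3000),
  at     = 100, bkvlps = c(10, 10, 10, 0), che = 20, csho = 10,
  dlc    = 5, dltt = 30, ebitda = 15, invt = 10, ppent = 40,
  sale   = 200, prcc_f = 25
)

# First: manufacturing firms (SIC 2000-3999), fiscal years 1975-2015
firms <- subset(firms, sic >= 2000 & sic <= 3999 &
                       fyear >= 1975 & fyear <= 2015)

# Second: construct the variables
firms$mkt_lev <- with(firms, (dltt + dlc) / (dltt + dlc + csho * prcc_f))
firms$cash    <- firms$che / firms$at
firms$tang    <- with(firms, (invt + ppent) / at)
firms$mb      <- firms$prcc_f / firms$bkvlps
firms$size    <- log(firms$sale)
firms$prof    <- firms$ebitda / firms$at

# Third: drop Inf/-Inf/NA, and negative leverage, tangibility, and M/B
vars <- c("mkt_lev", "cash", "tang", "mb", "size", "prof")
keep <- Reduce(`&`, lapply(firms[vars], is.finite)) &
        firms$mkt_lev >= 0 & firms$tang >= 0 & firms$mb >= 0
firms <- firms[keep, ]
```

In this toy sample, firm 2 is dropped by the SIC filter, firm 3 by the fiscal-year filter, and firm 4 by the Inf screen (its `bkvlps` of 0 makes M/B infinite), leaving one clean observation.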

b) Run an OLS regression. The Y variable is Book leverage. The X variables are Tangibility, Market to book ratio, Size and Profitability.

You can use the method of Rajan and Zingales (1995): average the Y and X variables across fiscal years for each firm, then run the OLS regression on the firm-level averages.

[Note: by doing so, we end up with a cross-sectional dataset in which each firm has one Y value and four X values.]
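The averaging-then-OLS approach can be sketched as below. The simulated panel stands in for the cleaned sample from part (a); the variable names are placeholders, and the market-leverage measure constructed above is used as the dependent variable purely for illustration.

```r
# Simulated cleaned panel standing in for the result of part (a)
set.seed(3)
panel <- data.frame(gvkey = rep(1:50, each = 5))
panel$tang <- runif(nrow(panel), 0, 1)
panel$mb   <- runif(nrow(panel), 0.5, 3)
panel$size <- rnorm(nrow(panel), 5, 1)
panel$prof <- rnorm(nrow(panel), 0.1, 0.05)
panel$mkt_lev <- 0.2 + 0.3 * panel$tang - 0.02 * panel$mb +
  rnorm(nrow(panel), sd = 0.05)

# Average every variable within firm: one row per gvkey
avg <- aggregate(cbind(mkt_lev, tang, mb, size, prof) ~ gvkey,
                 data = panel, FUN = mean)

# Cross-sectional OLS on the firm averages
fit <- lm(mkt_lev ~ tang + mb + size + prof, data = avg)
summary(fit)
```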

Present your estimation results in an organized table (similar to Table IX of Rajan and Zingales, 1995). [2 marks]
Also briefly discuss your results in the context of capital structure theories. [1 mark]

c) Media commentators have been saying for a while that U.S. corporations hold too much cash. [https://www.nytimes.com/2013/03/10/opinion/sunday/putting-corporate-cash-to-work.html?_r=0] Plot the median and mean of cash holding by year. Is there any trend in cash holding in your sample? (2 marks) If there is a trend, please discuss, in your opinion, why U.S. firms are holding more cash over time. (1 mark)

[Note: for each year from 1975 to 2015, you need to calculate the median and mean of the cash holdings across all firms.]
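A sketch of the by-year plot, with a simulated panel in place of the cleaned sample; `cash` is the CHE/AT variable from part (a), and an upward drift is built into the fake data so the example shows a visible trend.

```r
# Simulated panel in place of the cleaned sample from part (a)
set.seed(4)
panel <- data.frame(fyear = rep(1975:2015, each = 30))
panel$cash <- 0.08 + 0.003 * (panel$fyear - 1975) +
  abs(rnorm(nrow(panel), sd = 0.05))

# Median and mean of cash holding for every fiscal year
med <- aggregate(cash ~ fyear, data = panel, FUN = median)
avg <- aggregate(cash ~ fyear, data = panel, FUN = mean)

plot(avg$fyear, avg$cash, type = "l", lwd = 2,
     xlab = "Fiscal year", ylab = "Cash holding",
     ylim = range(c(avg$cash, med$cash)))
lines(med$fyear, med$cash, lty = 2)
legend("topleft", c("Mean", "Median"), lwd = c(2, 1), lty = c(1, 2))
```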

R code:
