# Time-series R function design

• Status: Closed
• Budget: €250 - €750 EUR
• Total Bids: 11

## Project Description

We need R functions to help analyse historical financial time-series data. Broadly, we are interested in determining relationships between returns and in causal structures for forecasting. Data will be supplied as CSV files in a standardized format and will cover thousands of distinct time series at either x-minute or daily resolution. This is NOT a school assignment! Functions will be tested on multiple data sets to ensure they work as designed.

Four functions are required:

1) Correlation coefficients for the items in a user-specified CSV folder, computed over a user-defined date range. Output as a symmetric matrix in CSV. (Only daily data will be used here.)

Example of function:

CorrFunc(FolderPath,FirstDate,LastDate,OutputLocation)

Example output:

| | A | B |
|---|---|---|
| A | 1 | 0.8 |
| B | 0.8 | 1 |
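
As a rough illustration of item 1, a minimal base-R sketch might look like the following. The posting does not describe the standardized CSV layout, so the column names `Date` (YYYY-MM-DD) and `Close` are assumptions, as is the choice of simple daily returns and pairwise-complete correlation for series with missing days.

```r
# Sketch only. Assumes each CSV has columns `Date` (YYYY-MM-DD) and `Close`;
# these names are hypothetical, the real layout is not given in the posting.
CorrFunc <- function(FolderPath, FirstDate, LastDate, OutputLocation) {
  files <- list.files(FolderPath, pattern = "\\.csv$", full.names = TRUE)
  ids   <- tools::file_path_sans_ext(basename(files))
  first <- as.Date(FirstDate)
  last  <- as.Date(LastDate)

  # One data frame of simple returns per file, with the column named after the file
  rets <- lapply(seq_along(files), function(i) {
    d <- read.csv(files[i], stringsAsFactors = FALSE)
    d$Date <- as.Date(d$Date)
    d <- d[order(d$Date), ]
    d <- d[d$Date >= first & d$Date <= last, ]
    out <- data.frame(Date = d$Date[-1],
                      r = diff(d$Close) / head(d$Close, -1))
    names(out)[2] <- ids[i]
    out
  })

  # Outer join on date, then correlate using only dates both series share
  wide <- Reduce(function(a, b) merge(a, b, by = "Date", all = TRUE), rets)
  m <- cor(wide[, -1, drop = FALSE], use = "pairwise.complete.obs")

  write.csv(m, OutputLocation)
  invisible(m)
}
```

Note that in this sketch a return computed across a missing day spans more than one day; whether such observations should be dropped is a design decision the sketch leaves open.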

2) Johansen/E-G/ADF cointegration test for the items in a user-specified CSV folder, over a user-defined date range. Include some meaningful measure of "strength", such as significance, and an ordering by strength of the cointegrated items for each base item. Two CSV matrix outputs: the first a yes/no dummy for unit root, the second a matrix with other relevant information.

Example of function:

CointFunc(FolderPath,FirstDate,LastDate,Min-p-Val,OutputLocation)

Example output:

| | A | B |
|---|---|---|
| A | 1 | 1 |
| B | 1 | 1 |

| | A | B |
|---|---|---|
| A | 3.2 [...] | 2.5 [...] |
| B | 2.5 [...] | 3.2 [...] |
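
To make the two-matrix output of item 2 concrete, here is a hedged base-R sketch of the Engle-Granger two-step idea for every ordered pair of series. It is a simplification under stated assumptions: the second step is a plain Dickey-Fuller regression with no lag augmentation, and the yes/no flag compares the statistic with an approximate asymptotic 5% critical value of -3.34 (two series, constant in the first-step regression) instead of computing a p-value. The names `eg_stat` and `coint_matrices` are hypothetical, and the input is assumed to be a numeric matrix of aligned prices with one named column per item. A production version would more likely rely on a package such as `urca` for Johansen and augmented tests.

```r
# Engle-Granger two-step statistic for one ordered pair (base R only).
# Assumption: y and x are aligned numeric price vectors without NAs.
eg_stat <- function(y, x) {
  u     <- residuals(lm(y ~ x))   # step 1: cointegrating regression
  du    <- diff(u)                # step 2: Dickey-Fuller regression on residuals
  lag_u <- u[-length(u)]
  fit   <- lm(du ~ 0 + lag_u)     # no constant, no lagged differences
  unname(coef(summary(fit))["lag_u", "t value"])
}

# Build the yes/no matrix and the statistic matrix for all ordered pairs.
# `crit` is an approximate 5% critical value, hard-coded as an assumption.
coint_matrices <- function(prices, crit = -3.34) {
  ids  <- colnames(prices)
  stat <- matrix(NA_real_, length(ids), length(ids), dimnames = list(ids, ids))
  for (i in ids) for (j in ids) {
    if (i != j) stat[i, j] <- eg_stat(prices[, i], prices[, j])
  }
  # 1 = residuals look stationary (cointegrated), 0 = not; diagonal stays NA
  list(flag = (stat < crit) * 1L, stat = stat)
}
```

Ordering the columns of each row of `stat` from most to least negative would give one possible "strength" ranking per base item.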

3) Linear regression for the items in two user-specified CSV folders, dependent and independent (for example, folder 1 has 50 files and folder 2 has 2000 files, and each file in one has to be tested against each file in the other), given a user-defined time window for returns and a date range. The time-series data has gaps for a variety of reasons, which we need to account for by selecting only useful data, e.g. where we can measure returns from Time A to Time B on both the dependent and the independent series. Time A can be on the day preceding Time B (e.g. 3pm on the preceding day to 10am). This has to be implemented in a smart way, so that we are comparing apples to apples and not discarding data unnecessarily. As an option, we want to be able to define minimum and maximum returns separately for the dependent and independent variables.

Example of function:

LinRegFunc(FolderPathDep,FolderPathIndep,FirstDate,LastDate,Time1,Time2,DepMinR,DepMaxR,IndepMinR,IndepMaxR,OutputLocation)

Example output:

| Independent | Dependent | Slope | SDE | window | n |
|---|---|---|---|---|---|
| Z | A | 1.2 | [...] | [...]% | 620 |
| Z | B | 0.1 | [...] | 3.8% | 580 |
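
The gap handling described for item 3 could be approached roughly as sketched below. Everything about the data layout is an assumption: intraday files are taken to have a character column `DateTime` formatted as `YYYY-MM-DD HH:MM` and a numeric `Close`, and `Time1`/`Time2` are zero-padded `HH:MM` strings. The helper names `price_at`, `paired_returns` and `fit_pair` are hypothetical. The idea is to define each Time A to Time B window once, on a calendar shared by both series, and to keep a window only if both series have a price at both ends. SDE and window are not defined in the posting, so the sketch reports only the slope and `n`.

```r
# Prices observed at clock time `hm`, named by calendar day.
price_at <- function(d, hm) {
  keep <- substr(d$DateTime, 12, 16) == hm
  setNames(d$Close[keep], substr(d$DateTime[keep], 1, 10))
}

# Returns from Time1 to Time2 measured over identical windows for both series.
paired_returns <- function(dep, indep, Time1, Time2) {
  days <- sort(unique(c(substr(dep$DateTime, 1, 10),
                        substr(indep$DateTime, 1, 10))))
  # If Time1 is not earlier than Time2, the window opens on the preceding day
  # present in the combined data (assumption: that is the intended "day before").
  start_day <- if (Time1 >= Time2) c(NA_character_, head(days, -1)) else days
  ret <- function(d) {
    unname(price_at(d, Time2)[days] / price_at(d, Time1)[start_day] - 1)
  }
  out <- data.frame(Date = days, dep = ret(dep), indep = ret(indep),
                    stringsAsFactors = FALSE)
  out[complete.cases(out), ]   # drop windows missing a price on either side
}

# Optional min/max return filters, then an ordinary least-squares fit.
fit_pair <- function(pr, DepMinR = -Inf, DepMaxR = Inf,
                     IndepMinR = -Inf, IndepMaxR = Inf) {
  ok <- pr$dep >= DepMinR & pr$dep <= DepMaxR &
        pr$indep >= IndepMinR & pr$indep <= IndepMaxR
  fit <- lm(dep ~ indep, data = pr[ok, ])
  c(Slope = unname(coef(fit)["indep"]), n = sum(ok))
}
```

Using the shared calendar is deliberately conservative: a day missing from one series removes that window for the pair, but never stretches one series' window over a longer period than the other's.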

4) Data test function that takes an output from the third function and uses the dependent and independent items in their respective folders to find out: a) whether regression did or did not take place when we expected it, b) whether that regression happened with a positive or negative independent variable, c) whether we were expecting regression to go "up" or "down", d) given the return (from the historical data) of the dependent variable at Time C, how much regression took place as a % of what we expected, and e) what minimum and maximum values for the return we observed during the period Time B to Time C. We define a minimum predictability threshold so that not every combination has to be tested. Output as CSV.

Example of function:

LinRegTest(LinRegFile,FolderPathDep,FolderPathIndep,FirstDate,LastDate,Time3,MinSDE)

Example output:

Independent, Dependent, ExpectedReturn, ObservedReturn, Obs/Exp, IndepPosNeg, UpOrDown, MinRetValArray, MaxRetValArray, DateOfObservation
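
One possible reading of how a single output row for item 4 is derived is sketched below. It rests on an interpretation the posting does not confirm: that the expected dependent return is the fitted slope times the observed independent return, that `Obs/Exp` is the plain ratio of observed to expected, and that the minimum and maximum are taken over the dependent returns observed between Time B and Time C. The function name `score_row` and its arguments are hypothetical.

```r
# Sketch of one output row; all semantics here are assumptions (see lead-in).
#   slope     : slope from the item 3 output for this pair
#   indep_ret : independent return over the Time A to Time B window
#   dep_ret_c : dependent return measured at Time C
#   dep_path  : dependent returns observed between Time B and Time C
score_row <- function(slope, indep_ret, dep_ret_c, dep_path) {
  expected <- slope * indep_ret
  data.frame(
    ExpectedReturn = expected,
    ObservedReturn = dep_ret_c,
    ObsOverExp     = dep_ret_c / expected,
    IndepPosNeg    = if (indep_ret >= 0) "Pos" else "Neg",
    UpOrDown       = if (expected >= 0) "Up" else "Down",
    MinRet         = min(dep_path),
    MaxRet         = max(dep_path),
    stringsAsFactors = FALSE
  )
}

# Example call with made-up numbers
score_row(slope = 1.2, indep_ret = -0.01, dep_ret_c = -0.006,
          dep_path = c(-0.002, -0.008, -0.006))
```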

There will be four milestones: one for items 1 and 2 together, one for item 3, one for item 4, and one for final project completion after user feedback and tinkering.