AssetsModelling {fPortfolio} | R Documentation |
A collection and description of functions which
generate multivariate artificial data sets of assets,
which fit the parameters to a multivariate normal,
skew normal, or (skew) Student-t distribution and
which compute some benchmark statistics. In addition
a function is provided which allows for the selection
and clustering of individual assets from portfolios
using hierarchical and k-means clustering approaches.
The functions are:
assetsSim | Simulates a data set of assets, |
assetsSelect | Asset Selection from Portfolios, |
assetsFit | Fits the parameter of a data set of assets, |
assetsStats | Computes benchmark statistics of asset sets, |
assetsMeanCov | Computes mean and covariance matrix, |
assetsTest | Test for multivariate Normal distribution, |
print | S3 print method for an object of class 'fASSETS', |
plot | S3 plot method for an object of class 'fASSETS', |
summary | S3 summary method for an object of class 'fASSETS'. |
assetsSim(n, dim = 2, model = list(mu = rep(0, dim), Omega = diag(dim),
    alpha = rep(0, dim), df = Inf), assetNames = NULL)
assetsSelect(x, method = c("hclust", "kmeans"), kmeans.centers = 5,
    kmeans.maxiter = 10, doplot = TRUE, ...)
assetsFit(x, method = c("st", "snorm", "norm"), title = NULL,
    description = NULL, fixed.df = NA, ...)
assetsMeanCov(x, method = c("cov", "mve", "mcd", "nnve", "shrink", "bagged"),
    check = TRUE, force = TRUE, baggedR = 100, ...)
assetsStats(x)
assetsTest(x, method = c("shapiro", "energy"), Replicates = 100,
    title = NULL, description = NULL)

## S3 method for class 'fASSETS':
print(x, ...)
## S3 method for class 'fASSETS':
plot(x, which = "ask", ...)
## S3 method for class 'fASSETS':
summary(object, which = "all", ...)
assetNames |
[assetsSim] - a vector of character strings of length dim allowing
for modifying the names of the individual assets.
|
baggedR |
[assetsMeanCov] - an integer value, the number of bootstrap replicates, by default 100. This value is only used if method="bagged" .
|
check |
[assetsMeanCov] - a logical flag. Should the covariance matrix be tested to be positive definite? By default TRUE .
|
description |
[assetsFit] - a character string, assigning a brief description to an "fASSETS" object.
|
doplot |
[assetsSelect] - a logical, should a plot be displayed? |
fixed.df |
[assetsFit] - either NA , the default, or a numeric value assigning the
number of degrees of freedom to the model. In the case that
fixed.df=NA the value of df will be included in the
optimization process, otherwise not.
|
force |
[assetsMeanCov] - a logical flag. Should the covariance matrix be forced to be positive definite? By default TRUE .
|
kmeans.centers |
[assetsSelect] - either the number of clusters or a set of initial cluster centers. If the first, a random set of rows in x are chosen as the
initial centers.
|
kmeans.maxiter |
[assetsSelect] - the maximum number of iterations allowed. |
method |
[assetsFit] - a character string, which type of distribution should be fitted? method="st" denotes a multivariate skew-Student-t distribution, method="snorm" a multivariate skew-Normal distribution, and method="norm" a multivariate Normal distribution. By default a multivariate normal distribution will be fitted to the empirical market data.
[assetsMeanCov] - a character string, which determines how to compute the covariance matrix. If method="cov" is selected then the standard covariance will be computed by R's base function cov, if method="shrink" is selected then the covariance will be computed using the shrinkage approach as suggested in Schaefer and Strimmer [2005], if method="bagged" is selected then the covariance will be calculated from the bootstrap aggregated (bagged) version of the covariance estimator.
[assetsSelect] - a character string, which clustering method should be applied? Either hclust for hierarchical clustering of dissimilarities, or kmeans for k-means clustering.
[assetsTest] - a character string, which selects the test to be applied. If method="shapiro" then Shapiro's multivariate Normality test will be applied as implemented in R's contributed package mvnormtest. If method="energy" then the E-statistic (energy) for testing multivariate Normality will be used as proposed and implemented by Szekely and Rizzo [2005] using parametric bootstrap.
|
model |
[assetsSim] - a list of model parameters: mu a vector of mean values, one for each asset series, Omega the covariance matrix of assets, alpha the skewness vector, and df the number of degrees of freedom which is a measure for
the fatness of the tails (excess kurtosis). For a symmetric distribution alpha is a vector of zeros.
For the normal distributions df is not used and set to
infinity, Inf . Note that all assets have the same value
for df .
|
n, dim |
[assetsSim] - integer values giving the number of data records to be simulated, and the dimension of the assets set. |
object |
[summary] - An object of class fASSETS .
|
Replicates |
[assetsTest] - an integer value, the number of bootstrap replicates, by default 100. This value is only used if method="energy" .
|
title |
[assetsFit] - a character string, assigning a title to an "fASSETS" object.
|
which |
which of the five plots should be displayed? which can
be either a character string, "all" (displays all plots)
or "ask" (interactively asks which one to display), or a
vector of 5 logical values; for those elements which are set
TRUE the corresponding plot will be displayed.
|
x |
[assetsFit][assetsStats][assetsMeanCov] - a numeric matrix of returns or any other rectangular object like a data.frame or a multivariate time series object which can be transformed by the function as.matrix to an object of class matrix.
[plot][print] - An object of class fASSETS .
|
... |
optional arguments to be passed. |
Data sets of assets x can be expressed as multivariate
'timeSeries' objects, as 'data.frame' objects, or any other rectangular
object which can be transformed into an object of class 'matrix'.
Parameter Estimation:
The functions assetsFit, for parameter estimation, and assetsSim, for
the simulation of assets sets, use code based on functions from the
contributed packages "mvtnorm" and "sn".
The functionality required for fitting data to a multivariate Normal,
skew-Normal, or skew-Student-t distribution is available from builtin
functions, so it is not necessary to load the packages "mvtnorm" and "sn".
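As a rough illustration of what simulating from a model list means in the simplest case (alpha all zero and df = Inf, i.e. a multivariate Normal), the following base R sketch draws records with mean vector mu and covariance matrix Omega through a Cholesky factor. It is not the code used by assetsSim; the values of mu, Omega and n are made up for the example.

```r
# Sketch only: multivariate Normal draws with given mu and Omega.
# mu, Omega and n are hypothetical example values.
set.seed(1953)
mu <- c(0.01, 0.02)
Omega <- matrix(c(0.04, 0.01,
                  0.01, 0.09), nrow = 2, byrow = TRUE)
n <- 1000

# chol() returns the upper triangular R with t(R) %*% R == Omega,
# so rows of Z %*% R have covariance Omega when Z is standard Normal.
Z <- matrix(rnorm(n * length(mu)), ncol = length(mu))
X <- sweep(Z %*% chol(Omega), 2, mu, "+")

# Sample moments should be close to the model parameters
colMeans(X)
cov(X)
```

The skew-Normal and skew-Student-t cases need the additional alpha and df parameters and are not covered by this sketch.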
Assets Mean and Covariance:
The function assetsMeanCov computes the mean vector and covariance
matrix of an assets set. For the covariance matrix one can select from
three choices: the standard covariance computation through R's base
function cov, and a shrunk and a bagged version of the covariance.
The latter two choices implement the covariance computation from the
functions cov.shrink() and cov.bagged(), which are part
of the contributed R package corpcor.
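To make the idea of a bootstrap aggregated covariance concrete, here is a minimal base R sketch: rows are resampled with replacement, cov is computed on each resample, and the results are averaged. This only illustrates the principle under that assumption; it is not the implementation in corpcor or in assetsMeanCov, and the helper name bagged_cov and the data x are hypothetical.

```r
# Sketch only: standard and bagged covariance on made-up monthly returns.
set.seed(42)
x <- matrix(rnorm(120 * 4, sd = 0.05), ncol = 4,
            dimnames = list(NULL, paste0("A", 1:4)))

# Standard estimates, as with method = "cov"
mu <- colMeans(x)
Sigma <- cov(x)

# Hypothetical helper: average cov() over R bootstrap resamples of the rows
bagged_cov <- function(x, R = 100) {
  n <- nrow(x)
  acc <- matrix(0, ncol(x), ncol(x),
                dimnames = list(colnames(x), colnames(x)))
  for (i in seq_len(R)) {
    idx <- sample.int(n, n, replace = TRUE)
    acc <- acc + cov(x[idx, , drop = FALSE])
  }
  acc / R
}

SigmaBagged <- bagged_cov(x, R = 100)
```

The argument R plays the role that baggedR plays in assetsMeanCov: more replicates give a smoother estimate at higher computational cost.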
Assets Statistics:
The function assetsStats implements benchmark formulas and
statistics as reported in the help page of the hedge fund software
from www.AlternativeSoft.com. The computed statistics are listed
in the 'Value' section below. Note that the functions were written for
monthly recorded data sets. If you use or generate asset sets on a
different time scale, you have to scale them properly.
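The scaling issue can be sketched as follows, assuming the common convention that a per-period mean is multiplied by the number of periods per year and a per-period volatility by its square root. Whether assetsStats uses exactly these formulas is not stated here, so treat the helper annualize and its argument periods_per_year as hypothetical.

```r
# Sketch only: annualizing under the usual mean * k, sd * sqrt(k) convention.
set.seed(7)
r <- rnorm(120, mean = 0.01, sd = 0.04)   # made-up monthly returns

# Hypothetical helper: k = 12 for monthly data, 52 for weekly data
annualize <- function(r, periods_per_year) {
  c(paMean = mean(r) * periods_per_year,
    paVola = sd(r) * sqrt(periods_per_year))
}

monthly <- annualize(r, 12)   # appropriate for monthly records
weekly  <- annualize(r, 52)   # what the same numbers would mean if weekly
monthly
weekly
```

Applying the monthly factor to data recorded on another time scale misstates both figures, which is why the data have to be rescaled first.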
Assets Selection:
The function assetsSelect calls the functions hclust or kmeans
from R's "stats" package. hclust performs a hierarchical cluster
analysis on the set of dissimilarities hclust(dist(t(x))) and kmeans
performs a k-means clustering on the data matrix itself.
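The two underlying calls can be tried directly with the "stats" package. In this sketch x is a made-up returns matrix with one column per asset; the data are transposed so that the assets, not the time points, are clustered. Transposing for kmeans is an assumption made for the example and may differ from what assetsSelect does internally.

```r
# Sketch only: the stats functions behind assetsSelect on made-up data.
set.seed(1)
x <- matrix(rnorm(60 * 6), ncol = 6,
            dimnames = list(NULL, paste0("A", 1:6)))

# Hierarchical clustering of the assets, as in hclust(dist(t(x)))
hc <- hclust(dist(t(x)))
hc$order              # ordering of the assets along the dendrogram

# k-means clustering of the assets into two groups (assumed transposition)
km <- kmeans(t(x), centers = 2, iter.max = 10)
km$cluster            # cluster membership per asset
```

The arguments centers and iter.max correspond to kmeans.centers and kmeans.maxiter in assetsSelect.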
Assets Tests:
The function assetsTest provides two tests for multivariate
Normality of an assets set.
assetsFit
returns an S4 object of class "fASSETS", with the following slots:
@call |
the matched function call. |
@data |
the input data in form of a data.frame. |
@description |
allows for a brief project description. |
@fit |
the results as a list returned from the underlying fitting function. |
@method |
the selected method to fit the distribution, one
of "norm" , "snorm" , "st" .
|
@model |
the model parameters describing the fitted parameters in
form of a list, model=list(mu, Omega, alpha, df).
|
@title |
a title string. |
@fit$dp |
a list containing the direct parameters beta, Omega, alpha.
Here, beta is a matrix of regression coefficients with
dim(beta)=c(nrow(X), ncol(y)) , Omega is a
covariance matrix of order dim , alpha is
a vector of shape parameters of length dim .
|
@fit$se |
a list containing the components beta, alpha, info. Here, beta and alpha are the standard errors for the corresponding point estimates; info is the observed information matrix for the working parameter, as explained below. |
@fit$optim |
the list returned by the optimizer optim ; see the
documentation of this function for explanation of its
components.
|
Note that the @model slot can be used as input to the
function assetsSim for simulating a portfolio of assets similar
to the original portfolio data, usually market assets.
assetsMeanCov
returns a list with two entries named mu and Sigma.
The first denotes the vector of assets means, and the second the
covariance matrix. Note that the output of this function can be
used as data input for the portfolio functions to compute the
efficient frontier.
assetsSelect
if method="hclust" was selected then the function returns an
S3 object of class "hclust", otherwise if method="kmeans" was
selected then the function returns an object of class list. For
details we refer to the help pages of hclust and kmeans.
assetsSim
returns a matrix; the artificial data records represent the assets
of the portfolio. Row names and column names are not created, they
have to be added afterwards.
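Adding the names afterwards takes two assignments. In the sketch below the matrix sim is only a stand-in for the output of assetsSim, and the asset names and the monthly date labels are invented for the example.

```r
# Sketch only: 'sim' stands in for a matrix returned by assetsSim.
set.seed(3)
sim <- matrix(rnorm(12 * 3, sd = 0.05), ncol = 3)

# Hypothetical asset names, one per column
colnames(sim) <- c("AssetA", "AssetB", "AssetC")

# Hypothetical monthly labels, one per row
months <- seq(as.Date("2000-01-01"), by = "month", length.out = nrow(sim))
rownames(sim) <- format(months, "%Y-%m")

head(sim, 3)
```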
assetsStats
returns a data frame with the following entries per column and asset:
Records - number of records (length of time series),
paMean - annualized (pa, per annum) Mean of Returns,
paAve - annualized Average of Returns,
paVola - annualized Volatility (standard Deviation),
paSkew - Skewness of Returns,
paKurt - Kurtosis of Returns,
maxDD - maximum Drawdown,
TUW - Time under Water,
mMaxLoss - Monthly maximum Loss,
mVaR - Monthly 99
mModVaR - Monthly 99
mSharpe - Monthly Sharpe Ratio,
mModSharpe - Monthly Modified Sharpe Ratio, and
skPrice - Skewness/Kurtosis Price.
assetsTest
returns an object of class fHTEST.
Adelchi Azzalini for R's sn package,
Torsten Hothorn for R's mvtnorm package,
Juliane Schaefer and Korbinian Strimmer for R's corpcor package,
Alan Genz and Frank Bretz for the underlying Fortran Code,
Maria Rizzo and Gabor Szekely for R's energy package,
Diethelm Wuertz for the Rmetrics port.
Azzalini A. (1985); A Class of Distributions Which Includes the Normal Ones, Scandinavian Journal of Statistics 12, 171–178.
Azzalini A. (1986); Further Results on a Class of Distributions Which Includes the Normal Ones, Statistica 46, 199–208.
Azzalini A., Dalla Valle A. (1996); The Multivariate Skew-normal Distribution, Biometrika 83, 715–726.
Azzalini A., Capitanio A. (1999); Statistical Applications of the Multivariate Skew-normal Distribution, Journal Roy. Statist. Soc. B61, 579–602.
Azzalini A., Capitanio A. (2003); Distributions Generated by Perturbation of Symmetry with Emphasis on a Multivariate Skew-t Distribution, Journal Roy. Statist. Soc. B65, 367–389.
Breiman L. (1996); Bagging Predictors, Machine Learning 24, 123–140.
Genz A., Bretz F. (1999); Numerical Computation of Multivariate t-Probabilities with Application to Power Calculation of Multiple Contrasts, Journal of Statistical Computation and Simulation 63, 361–378.
Genz A. (1992); Numerical Computation of Multivariate Normal Probabilities, Journal of Computational and Graphical Statistics 1, 141–149.
Genz A. (1993); Comparison of Methods for the Computation of Multivariate Normal Probabilities, Computing Science and Statistics 25, 400–405.
Hothorn T., Bretz F., Genz A. (2001); On Multivariate t and Gauss Probabilities in R, R News 1/2, 27–29.
Ledoit O., Wolf M. (2003); Improved Estimation of the Covariance Matrix of Stock Returns with an Application to Portfolio Selection, Journal of Empirical Finance 10, 603–621.
Rizzo M.L. (2002); A New Rotation Invariant Goodness-of-Fit Test, PhD dissertation, Bowling Green State University.
Schaefer J., Strimmer K. (2005); A Shrinkage Approach to Large-Scale Covariance Estimation and Implications for Functional Genomics, Statist. Appl. Genet. Mol. Biol. 4, 32.
Szekely G.J., Rizzo M.L. (2005); A New Test for Multivariate Normality, Journal of Multivariate Analysis 93, 58–80.
Szekely G.J. (1989); Potential and Kinetic Energy in Statistics, Lecture Notes, Budapest Institute of Technology, Technical University.
MultivariateDistribution, hclust and kmeans.
## Not run: 
## SOURCE("fPortfolio.101A-AssetsModelling")

## berndtInvest -
   xmpPortfolio("\nStart: Load monthly data set of returns > ")
   data(berndtInvest)
   # Exclude Date, Market and Interest Rate columns from data frame,
   # then multiply by 100 for percentual returns ...
   berndtAssets = berndtInvest[, -c(1, 11, 18)]
   rownames(berndtAssets) = berndtInvest[, 1]
   head(berndtAssets)

## assetsSelect -
   xmpPortfolio("\nNext: Select 4 most dissimilar assets from hclust > ")
   clustered = assetsSelect(berndtAssets, doplot = FALSE)
   myAssets = berndtAssets[, c(clustered$order[1:4])]
   colnames(myAssets)
   # Scatter and time series plot:
   par(mfrow = c(2, 1), cex = 0.7)
   plot(clustered)
   myPrices = apply(myAssets, 2, cumsum)
   ts.plot(myPrices, main = "Selected Assets",
     xlab = "Months starting 1978", ylab = "Price", col = 1:4)
   legend(0, 3, legend = colnames(myAssets), pch = "----", col = 1:4, cex = 1)

## assetsStats -
   if (require(fBasics)) assetsStats(myAssets)

## assetsSim -
   xmpPortfolio("\nNext: Fit a Skew Student-t > ")
   fit = assetsFit(myAssets)
   # Show Model Slot:
   fit@model
   # Simulate set with same properties:
   set.seed(1953)
   simAssets = assetsSim(n = 120, dim = 4, model = fit@model)
   head(simAssets)
   simPrices = apply(simAssets, 2, cumsum)
   ts.plot(simPrices, main = "Simulated Assets",
     xlab = "Number of Months", ylab = "Simulated Price", col = 1:4)
   legend(0, 3, legend = colnames(simAssets), pch = "----", col = 1:4, cex = 1)

## plot -
   xmpPortfolio("\nNext: Show Simulated Assets Plots > ")
   if (require(fExtremes)) {
     # Show Scatterplot:
     par(mfrow = c(1, 1), cex = 0.7)
     plot(fit, which = c(TRUE, FALSE, FALSE, FALSE, FALSE))
     # Show QQ and PP Plots:
     par(mfrow = c(2, 2), cex = 0.7)
     plot(fit, which = !c(TRUE, FALSE, FALSE, FALSE, FALSE))
   }

## End(Not run)