Portfolio optimization using historical stock data

This example is from Section VI of Ollila and Raninen (2019). Running this code you are able to reproduce the Figure 7 and Figure 8 in the aforementioned paper.

We investigate the out-of-sample performance of global mean variance portfolios (GMVP-s) when based on different RSCM estimators. We use the divident adjusted daily closing prices downloaded from the Yahoo! Finance (http://finance.yahoo.com) database to obtain the net returns for 50 stocks that are currently included in the Hang Seng Index (HSI) for two different time periods, from Jan. 4, 2010 to Dec. 24, 2011, and from Jan. 1, 2016 to Dec. 27, 2017 (excluding the weekends and public holidays). In both cases, the time series contain T=491 trading days. For the first period (2010-2011), we had full length time series for only p=45 stocks, whereas in the latter case we had full length time series for all stocks, so p=50.

Our third time series contains the net returns of p=396 stocks from Standard and Poor's 500 (SP500) index for the time period from Jan. 4, 2016 to Apr. 27, 2018 (excluding the weekends and public holidays). In this case, the time series contains T=583 trading days.

Contents

Choose the data set

Analyze the 3rd data set (i.e. the Standard & Poor 500 data set)

choosedata = 'SP500';

If you want to obtain the results for Hang Seng Index (HDS) data for years 2010-2011 or 2016-2018 then replace the line above by one of the following lines, respectively:

choosedata = 'HSI_y2010';
choosedata = 'HSI_y2016';
test_len = 20; % window length for keeping the portfolio
switch choosedata
    case 'HSI_y2010'
       load HSIstocks2010to2011.mat; % net returns of p=45 HSI stocks
        T0 = 50;
        train_len_arr=T0:test_len:200; % period 2010-2011 HSI data
    case 'HSI_y2016'
        load HSIstocks2016to2017.mat;
        T0 = 130;
        train_len_arr=T0:test_len:270; % period 2016-2017 HSI data
     case 'SP500'
        load SP500stocks2016to2018.mat
        T0 = 70;
        train_len_arr=T0:test_len:290; % this for 2010-2011
end

Process the data

portfolio_data = stocksdata.Net_returns;
[n, p]   = size(portfolio_data);
nwins    = length(train_len_arr);
beta_ave = zeros(nwins,4);
risk     = zeros(nwins,5);
w        = zeros(p,5);
computew = @(M) ( M\ones(p,1) ) / ( ones(1,p)*(M\ones(p,1)) );

Start the sliding window stuff..

for j=1:nwins
    % OUTERLOOP iterates over different training window lengths

    train_len=train_len_arr(j); % choose the window length of training data
    step_size   = test_len;     % step size

    % indices for sliding window
    ind_list    = 1:step_size:n-train_len;  % start points of training data
    ind_list2   = train_len:step_size:n;    % end points of training data
    tmp = (n-train_len)/step_size;
    if  tmp ~= round(tmp)
        ind_list2 = [ind_list2 n];
    end

    % initialize
    n_ind = length(ind_list);
    beta  = zeros(n_ind,4);
    rets  = zeros(n,5);

    fprintf('\n%d ',j);

    for i=1:n_ind
        % INNERLOOP iterates the time period using the sliding window
        ind = ind_list(i);

        % Compute the train set and covariance matrices
        X_train = portfolio_data(ind:ind_list2(i),:);
        [RSCM1,~,stat] = regscm(X_train,'approach','ell1','verbose','off'); %;
        beta(i,1) = stat.beta;
        [RSCM2,~,stat2] =regscm(X_train,'approach','ell2','verbose','off','kappa',stat.kappa);
        beta(i,2) = stat2.beta;
        [LWE,~,stat4] =regscm(X_train,'approach','lw','verbose','off');
        beta(i,4) = stat4.beta;

        w(:,1) = computew(RSCM1);
        w(:,2) = computew(RSCM2);
        if (stat.gamma - stat2.gamma) < 0
            w(:,3) = w(:,1);
            beta(i,3)=beta(i,1);
        else
            w(:,3) = w(:,2);
            beta(i,3)=beta(i,2);
        end
        w(:,4) = computew(LWE);

        % compute the portfolio net returns
        X_test  = portfolio_data(1+ind_list2(i):ind_list2(i+1),:);
        rets((1+ind_list2(i)):ind_list2(i+1),:) = X_test*w;
        fprintf('.');

    end

    beta_ave(j,:) = mean(beta);
    risk(j,:) = std(rets(1+train_len:end,:),1);

end
1 ..........................
2 .........................
3 ........................
4 .......................
5 ......................
6 .....................
7 ....................
8 ...................
9 ..................
10 .................
11 ................
12 ...............

The plot the figures

figure(1); clf
K = sqrt(250); %
plot(train_len_arr,risk(:,1)*K,'bo-','Linewidth',1.8,...
    'DisplayName', 'Ell1-RSCM','MarkerSize',14)
hold on;
plot(train_len_arr,risk(:,2)*K,'r^-','Linewidth',1.8,...
    'DisplayName', 'Ell2-RSCM','MarkerSize',14)
plot(train_len_arr,risk(:,3)*K,'s-','Linewidth',1.8,...
    'DisplayName', 'Ell3-RSCM','MarkerSize',17,'Color',[0.4 0.4 0.4])
plot(train_len_arr,risk(:,4)*K,'k*-','Linewidth',1.8,...
    'DisplayName', 'LW-RSCM','MarkerSize',14)
xlabel('$n$','FontSize',12,'Interpreter','Latex')
ylabel('Annualized realized risk')
set(gca,'FontSize',22,'LineWidth',1.5)
switch choosedata
    case 'HSI_y2016'
        h=legend('show', 'Location', 'best');
        xlim([128 272]);
        ylim([0.074 0.093]);
    case 'HSI_y2010'
        h=legend('show', 'Location', 'southeast');
        xlim([40 200])
        ylim([0.12 0.128]);
        yticks([0.075 0.078 0.081 0.084 0.087]);
   case 'SP500'
        h=legend('show', 'Location', 'northeast');
        xlim([60 300]);
        ylim([0.075 0.087]);
end
legend('boxoff')
set(h,'FontSize',22)

figure(2); clf;
plot(train_len_arr,beta_ave(:,1),'bo-','Linewidth',1.8,...
    'DisplayName', 'Ell1-RSCM','MarkerSize',14)
hold on;
plot(train_len_arr,beta_ave(:,2),'r^-','Linewidth',1.8,...
    'DisplayName', 'Ell2-RSCM','MarkerSize',14)
plot(train_len_arr,beta_ave(:,3),'s-','Linewidth',1.8,...
    'DisplayName', 'Ell3-RSCM','MarkerSize',17,'Color',[0.4 0.4 0.4])
plot(train_len_arr,beta_ave(:,4),'k*-','Linewidth',1.8,...
    'DisplayName', 'LW-RSCM','MarkerSize',14)
h=legend('show', 'Location', 'southeast');
legend('boxoff')
set(h,'FontSize',22)
xlabel('$n$','FontSize',14,'Interpreter','Latex')
ylabel('$\hat \beta_o$','FontSize',14,'Interpreter','Latex')
set(gca,'FontSize',22,'LineWidth',1.5)
switch choosedata
    case 'HSI_y2016'
        xlim([128 272]);
        ylim([0.75 0.96]);
    case 'HSI_y2010'
        xlim([40 200])
        ylim([0.75 0.96])
    case 'SP500'
        xlim([60 300])
end

for more details, see our paper Ollila and Raninen (2019)

References