Working with Datasets

Importing data

Often, we would like to process data in MATLAB that comes from an external source and which is supplied in either .csv or .xls/.xlsx format. MATLABs importdata, csvread and xlsread commands allow us to easily read and process data files.

Reading .csv files with importdata or csvread

For the example below, please download the INDPRO series from https://fred.stlouisfed.org/series/INDPRO as a .csv file

The importdata command has the following syntax.

importdata(file, delimiter, skiprows) loads data from file where the data columns are separated with the delimiter char and numeric data starts at line skiprows+1.

A quick look at the INDPRO.csv file reveals that the numeric data starts in line 2 and columns are separated by commas (,).

Therefore, we can use the following command to read the industrial production series.

mydata = importdata('./INDPRO.csv', ',', 1);

The command creates a struct object called mydata which contains two sub-objects, a matrix called data containing all numerical data from the file (in this case the values of the series) and a cell-array called textdata which contains the string from the file (here header names and dates). We can explore these objects in the workspace browser or by accessing them directly.

mydata.data(1:10) % get first 10 rows of INDPRO series
mydata.textdata(1,:) % look at header names of the data

Alternatively we can also read in only the numeric data, by using the csvread command. csvread has the following syntax.

csvread(file, rowindex, columnindex) reads numeric data from file starting in row rowindex and column colindex where 0 is the first row/column.

If we only want to read the values of the INDPRO series (which starts in the second row and second column), we can use the following command.

indpro = csvread('./INDPRO.csv', 1, 1)

Reading .xls/.xlsx files with xlsread

Reading excel files works best on Windows. If you are using macOS it is best practice to use .csv files if they are available or to open the .xls file in excel and to export it in a .csv format. Files in a .xlsx format usually tend to work.

For the example below, please download the UNRATE series from https://fred.stlouisfed.org/series/UNRATE as an .xls file and save it as an .xlsx file using excel.

xlsread has the following syntax.

xlsread(file, sheetname, excelrange) reads data from file and worksheet sheet where the data is in the block excelrange (which is supplied in the typical excel format e.g. B2:F18).

In order to read the UNRATE series, we can use the following command.

unrate = xlsread('./UNRATE.xls', 'FRED Graph', 'B12:B858')

Alternatively, we can read both the numerical data along with all the text in the worksheet. We do this by specifying two outputs from xlsread.

[unrate, urtext] = xlsread('./UNRATE.xls', 'FRED Graph');

This returns the unemployment rate in the vector unrate, and a cell that contains the text data where the dates corresponding to the unemployment rate start in column 1, row 12.

Converting string dates with datetime

Often when importing time series data, dates or datetimes are supplied as strings in various formats. When using the importdata command, dates are thus treated as textdata and imported as vectors of strings. In order to work with dates to e.g. plot a series against them, we need to convert the date strings to the datetime format which MATLAB understands.

The datetime function can be used to convert an array of string dates of a specific format into an array of datetime values. The general syntax is as follows.

datetime_vector = datetime(datestring_vector, 'InputFormat', formatString)

formatString is a string that describes the date format of the string representation. Here are some examples of typically used date formats.

Consider the csv import example above where we have imported the INDPRO.csv file. The first column of the mydata.textdata array contains string representations of the dates of the time series starting in line 2. We can convert these dates into the datetime format in the following way.

inddates = datetime(mydata.textdata(2:end,1), 'InputFormat','yyyy-MM-dd')

These converted dates are useful e.g. when plotting the INDPRO series. Even though we will cover this more in detail in the next section, try out the following command to show a time series plot of the INDPRO series data.

plot(inddates, indpro)

We will see the following plot.

Similarly, we can make a time series plot of the unemployment rate by first converting the dates into the datetime format

urdates = datetime(urtext(12:end,1),'InputFormat','dd.MM.yyyy')

Then we can plot the unemployment rate using

plot(urdates, unrate)

Which gives us

Exporting data

Exporting data as a .mat file

The easiest way to export/save data in MATLAB is to export them as MATLAB datafile (which has a .mat extension).

As you have seen before in this course, .mat files can easily be loaded using the load command.

load mydata.mat

To save data from your MATLAB workspace into a .mat file you can use the save command. It has the following syntax.

save(filename) stores all variables in the current workspace into a file called filename.mat. save(filename, variables) stores the variables whose name is in the string list variables into a file called filename.mat.

Here are two examples.

save('mydata')              % Exports all variables in the workspace
save('mydata', 'X', 'y')    % Exports variables X and y

Saving data as .mat files is the most efficient solution when you would like to export the data to use them in a different MATLAB script or to access them later again with MATLAB. The downside is that .mat files are only readable by MATLAB. If you would like to export data in a format such that they can be imported into other programs, please read the next section.

Exporting data as a .csv file

Sometimes we would like to export our data in a format such that it can be imported into other programs which are not capable of reading .mat files.

If you would like to export numeric data that can be represented in a large matrix, MATLABs csvwrite function is a good option to write your data into a .csv file. It has the following syntax.

csvwrite('filename.csv', matrix) writes matrix matrix into a comma-separated values (.csv) file that is named filename.csv.

Here is an example.

X = [1, 2, 3; 4, 5, 6];
csvwrite('mydata.csv', X)

Note that .csv files which are exported this way do not contain column headers.

Exporting data as a .xls file

Alternatively, we can export our data to a .xls or .xlsx file using the xlswrite function. In general, xlswrite behaves just like csvwrite.

X = [1, 2, 3; 4, 5, 6];
xlswrite('mydata.xls', X)

Furthermore, we can use xlswrite to write excel file that contain column headers by first converting the output matix into a cell. As an example, lets read the data from UNRATE.xls, convert the dates to datetime, plot the time series and then save the time series to an excel file with a column for the dates and a heading for the dates and the unemployment rate.

[unrate, urtext] = xlsread('./UNRATE.xls', 'FRED Graph');
urdates = datetime(urtext(12:end,1),'InputFormat','dd.MM.yyyy')
plot(urdates, unrate)

col_label = {'Date', 'UNRATE'};
celldates = cellstr(datestr(urdates));
outputmat = [col_label; celldates num2cell(unrate)];
xlswrite('UNRATEout.xls', outputmat)

Last updated