CSV (Comma Separated Value) File Format provider
Overview
CSV is a common, non-standardized file format that most spreadsheet programs can read. Samples stored in a CSV file are written as plain text in rows, with the values per channel separated by a comma.
This stream provider writes values to a file in one defined way, but it is able to read more diverse CSV formats.
This file format has severe limitations, see below.
Details
Settings
This stream interface offers settings that specify how the interface should work. These settings are available in the properties dialog as well as in variable parameters.
TimeFormat
Default value: h:m:s.3
Defines how times are formatted in the Time column. For more information about number format codes, see "Number format codes".
NumberFormat
Default value: >.4
Defines the format in which numerical values are stored. Please note that if you use a comma as separator, you cannot store numbers with a decimal comma! For more information about number format codes, see "Number format codes".
IncludeEvents
Default value: Yes
Set to Yes if event markers in the signal should be stored. They will be written to the Events column. Set to No if event markers should be skipped.
Separator
Default value: ,
Defines the character that is used to separate columns in the CSV file. Values allowed are: ',', ';', 'tab' or 'space'.
InvalidValue
Default value:
Specifies a value that you know is an invalid value in the signal. If the input value is equal to this value, then the InvalidValueText is written instead. If InvalidValueText is empty, then InvalidValue is not evaluated.
InvalidValueText
Default value:
Specifies the text that is displayed in case an invalid value is detected at the input. See also the documentation of InvalidValue.
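The interplay of InvalidValue and InvalidValueText can be sketched as follows. This is only an illustration of the rule described above, not Polybench's implementation; the function name render_value and the decimals parameter (a stand-in for NumberFormat) are assumptions.

```python
def render_value(value, invalid_value=None, invalid_text="", decimals=3):
    """Return the text that would be written for one sample value.

    invalid_value and invalid_text mirror the InvalidValue and
    InvalidValueText settings. decimals is a hypothetical stand-in
    for the NumberFormat setting.
    """
    # An empty InvalidValueText means InvalidValue is not evaluated at all.
    if invalid_text and invalid_value is not None and value == invalid_value:
        return invalid_text
    return f"{value:.{decimals}f}"
```

With invalid_value=-9999.0 and invalid_text="NaN", a sample of -9999.0 is written as NaN; with an empty invalid_text the same sample is written as a normal number.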
Description
Writing signals to a CSV file
When Polybench writes values to a CSV file, the resulting file starts with one row that specifies the columns in the file. The row looks like this (example):
"Time","Events","Channel 1","Channel 2","Channel 3","Channel 4"
The first column is the time stamp of the sample. It looks like this:
0:00:41.946
0:00:41.948
0:00:41.950
0:00:41.952
0:00:41.954
The second column "Events" stores the event markers that are read from the signal. If multiple events exist for one sample, the events are appended to one string and separated by + symbols, like this:
"Marker 1+Marker 2+Marker 3"
The Events column is optional. You may leave events out by specifying No for IncludeEvents in the File Settings (see properties of the Storage operator).
The third and following columns contain the sampled values per channel.
Here is another example of a CSV file that stores a signal from a measurement:
"Time","Events","Signal [unit]"
0:00:12.018,,0.113
0:00:12.020,,0.125
0:00:12.022,,0.138
0:00:12.024,,0.150
0:00:12.026,,0.163
0:00:12.028,,0.175
0:00:12.030,,0.187
0:00:12.032,,0.200
0:00:12.034,,0.212
0:00:12.036,,0.224
0:00:12.038,,0.236
If a channel has a unit, then the unit is written in square brackets after the channel name.
The CSV file is written using the UTF-8 character encoding. The file does not have a BOM (byte order mark).
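For readers who want to produce a compatible file from their own tools, the layout described above can be approximated as follows. This is a minimal sketch based only on the examples in this section; format_time, write_csv and their parameters are hypothetical names, and the fixed three decimals for values are an assumption taken from the example file, not from the NumberFormat rules.

```python
def format_time(seconds, digits=3):
    """Format seconds as h:mm:ss.fff, matching the Time column examples."""
    ticks = round(seconds * 10**digits)       # work in integer ticks
    whole, frac = divmod(ticks, 10**digits)
    hours, rest = divmod(whole, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}.{frac:0{digits}d}"


def write_csv(path, channels, samples, include_events=True):
    """Write a file in the layout shown above.

    channels: list of (name, unit) pairs, unit may be empty.
    samples:  list of (time_in_seconds, [event names], [values]).
    """
    names = [f"{name} [{unit}]" if unit else name for name, unit in channels]
    header = ["Time"] + (["Events"] if include_events else []) + names
    # Plain "utf-8" writes no BOM ("utf-8-sig" would add one).
    with open(path, "w", encoding="utf-8") as f:
        f.write(",".join('"' + h + '"' for h in header) + "\n")
        for t, events, values in samples:
            row = [format_time(t)]
            if include_events:
                # Multiple events on one sample are joined with '+'.
                joined = "+".join(events)
                row.append('"' + joined + '"' if joined else "")
            row += [f"{v:.3f}" for v in values]
            f.write(",".join(row) + "\n")
```

Calling write_csv("out.csv", [("Signal", "unit")], [(12.018, [], [0.113])]) produces the header row and first data row of the example file above.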
Reading signals from a CSV file
Polybench is able to read several CSV file formats. For Polybench to be able to interpret a CSV file, the following rules must be met:
First line: Header
The first line may, but does not have to, be a header line. The header line contains the name of each data column.
If a header line exists, then the first column must be named 'Time' (case insensitive, so it may also be 'time' or 'TIME'). If the name of the second column is 'Events', then that column is interpreted as an event marker column as described above. Otherwise, no event markers are assumed.
If no header is available, then the first column is interpreted as the Time column. The other columns are called C1, C2, C3, etc.
If a channel name is followed by a pair of square brackets, then the content between the brackets is used as the channel unit. For example:
Time,EMG [uV]
0.018,10.94
0.020,11.0322
0.022,10.912
is shown in viewers as a file with one channel, called EMG with unit uV at 500 Hz.
(Note: In Polybench 1.30.0 and earlier, the units of channels were not stored.)
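The header rules above can be summarized in a short sketch. It is not Polybench's parser; parse_header and default_names are hypothetical helpers, and comparing 'Events' case-sensitively is an assumption, since only the 'Time' column is documented as case insensitive.

```python
import re

# Channel name, optional whitespace, then a unit in square brackets.
_UNIT = re.compile(r"^(.*?)\s*\[([^\]]*)\]\s*$")


def parse_header(cells):
    """Interpret a header row that is already split into cells.

    Returns (has_events, [(channel_name, unit_or_None), ...]).
    """
    if not cells or cells[0].strip().lower() != "time":
        raise ValueError("first header column must be named 'Time'")
    # Assumption: 'Events' is matched exactly as written.
    has_events = len(cells) > 1 and cells[1].strip() == "Events"
    channels = []
    for cell in cells[2 if has_events else 1:]:
        match = _UNIT.match(cell.strip())
        if match:
            channels.append((match.group(1), match.group(2)))
        else:
            channels.append((cell.strip(), None))
    return has_events, channels


def default_names(column_count):
    """Channel names for a file without header: C1, C2, C3, ..."""
    # The first column is the Time column and gets no channel name.
    return [f"C{i}" for i in range(1, column_count)]
```

For the example above, parse_header(["Time", "EMG [uV]"]) reports no Events column and one channel named EMG with unit uV.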
Separating character
The name 'comma separated values' suggests that the columns in the data file are separated by a comma ',' character. However, they may also be separated by other characters:
- comma and white-spaces, for example:
0:10:25.14 , 11.23 , -3.25
-or-
0:10:25.14,11.23, -3.25
- semi-colon (with or without white-spaces), for example:
0:10:25.14; 11.23; -3.25
- tabs, for example:
0:10:25.14 11.23 -3.25
- spaces, for example:
0:10:25.14 11.23 -3.25
but not multiple spaces. Wrong is:
0:10:25.14  11.23  -3.25
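One way to handle the separator variants listed above is sketched below. The order in which separators are tried (comma, semi-colon, tab, then single space) is an assumption, split_row is a hypothetical name, and quoted fields that themselves contain a separator are not handled.

```python
def split_row(line):
    """Split one data line on the first separator kind found in it."""
    line = line.strip("\r\n")
    for sep in (",", ";", "\t"):
        if sep in line:
            break
    else:
        sep = " "  # fall back to single spaces
    fields = [field.strip() for field in line.split(sep)]
    # Remove one pair of enclosing quotation marks, if present.
    return [f[1:-1] if len(f) >= 2 and f[0] == f[-1] == '"' else f
            for f in fields]
```

With this sketch, a line that uses multiple spaces between values yields empty fields, which is one way to see why such a file cannot be interpreted.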
Text and values in quotation marks
Texts and values in the columns may be enclosed in quotation marks, for example: '0:10:25.14,"11.23","-3.25"'
Time column interpretation
The times in the first column must be formatted according to one of the following formats, where h=hour, m=minute, s=second and f=fraction of a second:
- h:m:s.f -or- h:m:s
- m:s.f -or- m:s
- s.f -or- s
The time must not contain a comma before the fraction (as is customary in some European countries): '10:25,14' is wrong, '10:25.14' is correct.
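The accepted time formats can be converted to seconds with a few lines of code. The following is a sketch of the rules above, not Polybench's own parser; parse_time is a hypothetical name.

```python
def parse_time(text):
    """Convert 'h:m:s.f', 'm:s.f' or 's.f' (fraction optional) to seconds."""
    if "," in text:
        raise ValueError(f"decimal comma is not accepted: {text!r}")
    parts = text.strip().split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unrecognized time: {text!r}")
    seconds = 0.0
    for part in parts:  # left to right: hours, minutes, seconds
        seconds = seconds * 60 + float(part)
    return seconds
```

For example, parse_time("10:25") returns 625.0, while parse_time("10:25,14") raises a ValueError.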
Evenly distributed time
Every line in the file is interpreted as the next sample. The time in the time column is assumed to be the time of the previous line plus the sample time interval. The file is not interpreted correctly if there are missing samples.
If the file has been recorded with a sample frequency greater than 1000 Hz, or with a sample interval that is not a whole number of milliseconds, then the time format for a newly recorded file should be set to have enough digits to describe the time. Otherwise the time difference between two lines may be equal to, or may differ from, the difference between two other lines. This is allowed.
So, for example, if you are storing a signal of 2000 Hz, then the time format must be set to at least h:m:s.4 (four digits, to be able to represent the 0.0005 s sample interval).
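The number of fraction digits needed for a given sample frequency can be worked out as in the sketch below. The helper name min_fraction_digits and the max_digits limit are assumptions made for illustration; the sample frequency is expected as a whole number of Hz.

```python
from fractions import Fraction


def min_fraction_digits(sample_rate_hz, max_digits=9):
    """Smallest number of fraction digits that represents the sample
    interval 1/sample_rate_hz exactly, or None if no count up to
    max_digits does (for example at 3 Hz)."""
    interval = Fraction(1, sample_rate_hz)
    for digits in range(max_digits + 1):
        # The interval fits if it is a whole number of 10**-digits steps.
        if (interval * 10**digits).denominator == 1:
            return digits
    return None
```

For 2000 Hz this gives 4, matching the h:m:s.4 recommendation above; for 500 Hz and 1000 Hz it gives 3, which the default h:m:s.3 covers.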
Technical file properties
For advanced users - the following file encodings can be interpreted:
- UTF-7
- UTF-8 (with or without BOM)
- UTF-16 little endian and big endian
- UTF-32 little endian and big endian
ANSI encoding is interpreted as UTF-8; if characters greater than ASCII 127 are used, they may be displayed as small squares. This may affect channel names, units and event marker codes.
Please note that Polybench 1.30.0 and lower only interprets UTF-8 without BOM!
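When preparing files with other tools, it can help to check which encoding a file will be recognized as. The sketch below guesses a Python codec name from the byte order mark. It is an assumption that detection works via the BOM; how Polybench itself detects encodings is not described here, and detect_encoding is a hypothetical name.

```python
import codecs


def detect_encoding(data):
    """Guess a Python codec name from the first bytes of a file."""
    # UTF-32 is tested before UTF-16 because the UTF-32 LE BOM starts
    # with the same two bytes as the UTF-16 LE BOM.
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"      # this codec reads the BOM to pick the byte order
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"   # UTF-8 with BOM
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"      # this codec reads the BOM to pick the byte order
    if data[:3] == b"+/v" and data[3:4] in (b"8", b"9", b"+", b"/"):
        return "utf-7"
    return "utf-8"           # no BOM: treated as UTF-8, ANSI files included
```

A file can then be read with open(path, "rb").read().decode(detect_encoding(...)). Files in UTF-16 or UTF-32 without a BOM are not recognized by this sketch.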