Temporal Preference Data Collection

Author: Benjamin Krenn
The data here was collected from multiple sources. The two main formats are .soi and .tsoi.
SOI was designed by PREFLIB.ORG and stands for incomplete strict orders. For a definition, see the format description on PREFLIB.ORG.

TSOI (temporal incomplete strict order) is based on the SOI format but is deanonymized, i.e. each voter is named in some form before their ranking. This format was designed to allow voting data to be used over a time frame. Therefore, each TSOI file in a data set uses IDs for the alternatives that are consistent across all TSOI files in that data set. This means the numbering does not necessarily run from 1 to the number of alternatives but can skip many numbers.
Each TSOI file is structured as follows:
It starts with the number of available alternatives, followed by a list of the alternatives/candidates:
Number of alternatives
1, first alternative name
2, second alternative name
5, fifth alternative name (the fifth in the full data set, but only the third in this file)
...
Afterwards, some data about the vote is given: the number of voters, the combined weight of all voters, and the number of unique orders. In most cases these are all the same number. In particular, the number of unique orders was chosen to equal the number of voters, as each voter gets their own row.
Number of voters, sum of vote count, number of unique orders
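
The header layout described above can be read with a few lines of Python. This is only a minimal sketch, not the tooling used to build the data sets: the function name parse_tsoi_header and the sample text are hypothetical, and it assumes that blank lines can be ignored, that alternative names may themselves contain commas (so only the first comma is used as a separator), and that all three counts are integers.

```python
def parse_tsoi_header(lines):
    """Parse the header of a TSOI file given as a list of text lines.

    Returns (alternatives, counts, vote_lines), where alternatives maps
    each alternative id to its name, counts is the tuple
    (number of voters, sum of vote count, number of unique orders) and
    vote_lines holds the remaining, still unparsed vote rows.
    """
    # Assumption: empty lines carry no meaning and can be dropped.
    lines = [ln.strip() for ln in lines if ln.strip()]
    num_alternatives = int(lines[0])

    alternatives = {}
    for ln in lines[1:1 + num_alternatives]:
        # Split on the first comma only, in case a name contains commas.
        alt_id, name = ln.split(",", 1)
        alternatives[int(alt_id)] = name.strip()

    counts_line = lines[1 + num_alternatives]
    num_voters, vote_sum, num_unique = (int(x) for x in counts_line.split(","))
    return alternatives, (num_voters, vote_sum, num_unique), lines[2 + num_alternatives:]


# Hypothetical sample in the layout described above.
sample = """3
1, first alternative name
2, second alternative name
5, fifth alternative name
2, 2, 2
mr.x: 1, 2, 1
mr.y: 1, 5, 1, 2
"""
alternatives, counts, vote_lines = parse_tsoi_header(sample.splitlines())
```

Because the ids may skip numbers, the alternatives are stored in a dictionary keyed by id rather than in a list indexed by position.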

The votes are structured as follows. As each voter has its own row, the value for "count" is 1 unless the voter has a higher weight:
voter: count, list of preferences
Example:
mr.x: 1, 2, 1
...

Every alternative a voter does not list is ranked last for that voter.
It is also allowed to give a weight to each alternative in a ranking, e.g.:
mr.y: 1, 5[400], 1[300], 2[5]
The weights need to be consistent with the ordering of the alternatives, from the highest weight in the first position to the lowest weight in the last position.
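
A single vote row, with or without weights, could be parsed and checked roughly as follows. This is a sketch under assumptions: parse_vote and weights_consistent are hypothetical names, weights are read as floating-point numbers, the voter name is taken to be everything before the last colon, and equal neighbouring weights are accepted as consistent (the text above does not say whether ties in weight are allowed).

```python
import re

# One ranking entry: an alternative id, optionally followed by "[weight]".
ENTRY = re.compile(r"^(\d+)(?:\[([^\]]+)\])?$")


def parse_vote(line):
    """Parse 'voter: count, alt[weight], ...' into (voter, count, ranking).

    ranking is a list of (alternative id, weight) pairs in ranked order;
    the weight is None when the entry carries no weight.
    """
    # Assumption: the ranking part never contains a colon, so the last
    # colon separates the voter name from the rest of the row.
    voter, _, rest = line.rpartition(":")
    fields = [f.strip() for f in rest.split(",")]
    count = int(fields[0])

    ranking = []
    for field in fields[1:]:
        match = ENTRY.match(field)
        if match is None:
            raise ValueError(f"unexpected ranking entry: {field!r}")
        weight = float(match.group(2)) if match.group(2) is not None else None
        ranking.append((int(match.group(1)), weight))
    return voter.strip(), count, ranking


def weights_consistent(ranking):
    """Return True if the given weights never increase along the ranking."""
    weights = [w for _, w in ranking if w is not None]
    return all(a >= b for a, b in zip(weights, weights[1:]))


voter, count, ranking = parse_vote("mr.y: 1, 5[400], 1[300], 2[5]")
```

For the example row, ranking holds the alternatives 5, 1 and 2 with the weights 400, 300 and 5, and weights_consistent(ranking) is True.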

A possible file format that is not provided here would be TTOI (temporal incomplete tied order). The difference is that, similar to TOI from PREFLIB.ORG, tied alternatives are written within "{}", e.g.:
mr.z: 1, {5,2}[100], 1[10]
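
Since TTOI is not provided, no reference reader exists for it. The following is a hypothetical sketch of how such a row could be read, assuming the syntax shown in the example: the function name parse_ttoi_vote is made up, and a plain comma split is avoided because commas also appear inside the braces.

```python
import re

# Either a tied group "{a,b,...}" or a single id, optionally with "[weight]".
GROUP = re.compile(r"(?:\{([^}]*)\}|(\d+))(?:\[([^\]]+)\])?")


def parse_ttoi_vote(line):
    """Parse 'voter: count, {a,b}[w], c[w], ...' into (voter, count, ranking).

    ranking is a list of (tuple of tied alternative ids, weight) pairs;
    a single alternative becomes a one-element tuple, and the weight is
    None when no weight is given.
    """
    voter, _, rest = line.rpartition(":")
    # The first comma separates the count from the ranking itself.
    count_text, _, ranking_text = rest.partition(",")

    ranking = []
    for match in GROUP.finditer(ranking_text):
        tied, single, weight = match.groups()
        if tied is not None:
            ids = tuple(int(x) for x in tied.split(","))
        else:
            ids = (int(single),)
        ranking.append((ids, float(weight) if weight is not None else None))
    return voter.strip(), int(count_text), ranking


voter, count, ranking = parse_ttoi_vote("mr.z: 1, {5,2}[100], 1[10]")
```

Here the tied alternatives 5 and 2 share the weight 100 and are followed by alternative 1 with weight 10.
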
As additional info: every file in all the collections has a date in its name to allow sorting from newest to oldest or the other way around.

Eurovision Song Contest

This collection represents the jury voting data for the finals of the yearly Eurovision Song Contest. These are in general top-10 votes.
Every file represents a different year with different candidates, but it was decided to consider only the country each contestant represents to allow for a more consistent alternative set.
This data was collected from https://data.world/datagraver/eurovision-song-contest-scores-1975-2019
The data set starts in the year 1975 and ends in 2019 for a total of 45 data points.

iPhone App Store Charts

These data sets represent the rankings of the Apple App Store for iPhone over a time span of about two months (2019-03-13 until 2019-05-15, 62 data points). The charts are per region, and each region has up to its top-200 apps as votes.
They were collected through https://appfollow.io/

iPhone Game Charts

Downloads:
Top Paid Games
tsoi (1.4 MB) soi (1.4 MB)
Top Free Games
tsoi (1.1 MB) soi (1.1 MB)
Top Grossing Games
tsoi (1.2 MB) soi (1.2 MB)

iPhone News Charts

Downloads:
Top Paid News Apps
tsoi (640 KB) soi (516 KB)
Top Free News Apps
tsoi (1.9 MB) soi (1.9 MB)
Top Grossing News Apps
tsoi (1.4 MB) soi (1.4 MB)

Spotify Charts

Spotify provides their charts on https://spotifycharts.com/regional.
They are categorised as daily or weekly, and each of these either by streaming numbers on the service or by how viral the songs are online.
All of the data sets start around the beginning of 2017 and run up to the end of November 2019. This means the daily data sets have over 1000 data points and the weekly ones around 150 each.

Viral Charts

These are top-50 charts (some votes have fewer than 50 alternatives ranked).
Downloads:
Viral Daily
tsoi (51.6 MB) soi (49.3 MB)
Viral Weekly
tsoi (7.6 MB) soi (7.3 MB)

Streaming Charts

These are top-200 charts (some votes have fewer than 200 alternatives ranked). For these data sets streaming numbers are available, but as they have no real place in SOI, they are provided as weights in the TSOI files.
Downloads:
Daily Charts
tsoi (166 MB) soi (135 MB)
Weekly Charts
tsoi (26.2 MB) soi (20.5 MB)

All the above TSOI files as one download

This collection of all data sets is used for https://github.com/martinlackner/perpetual to allow experiments on real data.
Download:
tsoi (259 MB)