The NYC MTA: “Keeping Up” in the Age of Information

The MTA has historically been super sensitive about sharing data with third-party developers…Who reaps the benefits as they transition to open data?

I love New York City’s public transportation system. Having lived for 10+ years in California, where getting from one side of a city to another takes at least an hour (and therefore most opt for cars instead), I’ve always marveled at NYC’s transit. One can get pretty much anywhere in NYC (from Brooklyn to the Financial District to Washington Heights) within a short amount of time. Amazing.

But I also loathe New York City’s public transportation system. Have you ever taken the G train? Don’t. Unless you have absolutely nowhere to be or no one to meet.

Bottom line: I can’t live with it, but I also can’t live without it. But I really, really, really wish I could live with it in a way that got me to work on time most days.

Disclaimer: I am not, nor will I ever claim to be, a transportation, operations, or data expert. But, in light of recent discussions on “open information” and “big data”, I decided to do some research. Given that the NYC subway alone carries over 1 billion riders each year, I figured there must be tons of data, and I was curious to see how the MTA is collecting and leveraging it, if at all, to make public transportation in NYC a better, safer, quicker ride for us all.


Finding #1:

First of all: Apparently I’m not the only one deeply entrenched in a love/hate relationship with the MTA. Case in point: Last year, 129,000 commuters requested “late-for-work” excuse slips from the MTA (and yes, that’s a thing).


Finding #2:

The MTA has a lot of data. A LOT of data. Think about it: every card swipe, every turnstile turn…the MTA has access to it all. But it was not until relatively recently in the organization’s history that it began to actually use and analyze it. For the majority of its history, the MTA has struggled to keep up in the age of information. Historically, the MTA maintained a very proprietary mindset and was involved in many lawsuits in the early 2000s as a result: bloggers, app developers, etc. worked to use MTA data to the benefit of the everyday consumer, and the MTA would sue for intellectual property infringement.

During the past ten years, however, the MTA has taken its first steps towards embracing open data. What this means:

  1. Real-time apps that will tell you when your bus / train will next arrive.
  2. Trip planner apps (e.g. on Google Maps) that allow you to plan your route, using public transportation, from one part of a metropolitan area to another.
  3. The list goes on.

Essentially, by opening up its data, the MTA has enabled third-party developers to analyze and use that data to the benefit of consumers, making for a much more transparent and streamlined commute. Theoretically, this also benefits the MTA, since consumers are more likely to use the transportation system (as opposed to other options, e.g. driving).


Finding #3:

Increasingly, the MTA has also started to use its data internally to measure operational investments and efficiencies, tracking things like commute speed, service frequency, and delay recovery time. The hope is that, with time, the MTA (whose resources and budget keep shrinking) will be able to identify which parts of the transit system most need work and where improvements will most benefit consumers. This will save the organization money, and time!


Student comments on The NYC MTA: “Keeping Up” in the Age of Information

  1. Thanks for your post! I wonder whether the MTA itself has leveraged its own data and done something with it, vs. just relying on 3rd party developers. I recall reading that Boston’s city government is extremely tech-savvy relative to other public bureaucracies. For example, it has an official app through which you can report any potholes you find while driving. The app also harnesses the phone’s GPS function so that the government knows immediately where the pothole is and can dispatch a team to fix it shortly. I think it is great to have 3rd parties involved, but given the MTA’s potential privacy concerns and the fact that it might not be able to share all of its data, it would not be a bad idea for the MTA to think about what it can do with the data itself.

  2. Hello! Every time I go to NYC I’m fairly shocked. I’m very spoiled by TFL’s tube in London. To get in and out, people swipe their Oyster cards, so TFL has a complete picture of the journey of each of its customers, which truly allows it to optimise its system by aggregating the data of all the individual journeys (not using an Oyster card is very taxing: tickets are twice as expensive). I was honestly surprised not to see the same amount of data tracking in NYC, which would allow the MTA to base any improvement on the real needs of those trying to cross the city.

  3. Great post! I actually worked for the MTA for a couple of years, and saw its plethora of issues firsthand. The organization loses millions of dollars every year and is in a continual deficit. Most of its efforts thus far have been focused on cleaning up internal operations, consolidating back-office functions, and dealing with its excess employee base that can’t be fired because of union issues. Any leveraging of data to save money was largely focused on optimizing train schedules (more trains during peak hours and more express trains to stops with high employment during peak hours) and analyzing workarounds or alternate options for re-routing trains during construction and maintenance periods. These uses of big data are basic, and the MTA can and should certainly improve its efforts to leverage data for internal and external purposes to save money and improve the customer experience. How about push notifications to your phone when there is a shutdown or delay on your regular morning train route? And internally, how can it use data to prevent costly mistakes? In 2008, the MTA removed station attendants at 100+ subway stops in an attempt to save money on salaries. It subsequently lost $30M in fares to people who “jumped the turnstiles” to get in without paying because there were no station attendants on duty. Such a fail. The MTA could have easily used big data to predict, track, and measure the impact of its decisions on consumer behavior and created instruments for increased surveillance and security within subway stations.

  4. Very interesting post! I totally agree with you and the other comments. It’s surprising that many cities seem to struggle with building strong data-driven models around their public transportation services. Although they have a huge infrastructure and the capability to collect huge amounts of data, we see the most innovative use of data only at younger companies like Uber or bike sharing services like Hubway in Boston. Given MTA’s data wealth and all the other data sources that are available in New York City, I believe that there is a lot to do in the future (e.g. dynamic and predictive schedules based on weather, special events, etc.). This smart use of data will probably be one of the most important steps to make the MTA profitable again in the near future.

  5. While I too am pleased to hear that the MTA is finally embracing the idea of third party app development, one other traditional concern that seems particularly relevant to providing unfettered access to transit data is security. The Port Authority of New York (including MTA) has always made it clear that its transit system is a target for acts of violence and terrorism, and there have been several incidents in the history of the transit system that strengthen the argument for keeping transit data proprietary. I question whether there will be any reversion towards privatizing this data to prioritize safety concerns over transparency of service times and optimized trip planning.

  6. I wonder how the MTA can learn from other major cities both in the U.S. and abroad on how to better leverage this data. An interesting thing we talked about in my Supply Chain Mgmt class was RFID tags and how in Paris they are included in Metro cards. Essentially it allows patrons to board the trains without even having to scan and it also logs where people are traveling to and from, at what times, etc. I wonder how that kind of technology could be even more beneficial for the MTA and other major urban transportation systems to collect even better data.
