Vehicle Probes and Crowdsourced Data

2020 Executive Briefing

Date posted: December, 2020
Last updated: February, 2023



  • Indiana DOT and Utah DOT’s use of crowdsourced data has helped reduce detour driving time and cut weather surveillance costs.
  • Although agencies detected major collisions first, Waze reported minor incidents, debris, and disabled vehicles an average of 3-16 minutes before state agencies.
  • The cost to design and implement a custom crowdsourcing tool can range from $50,000 to $250,000.



The effective management and control of transportation systems can be highly dependent on available data sources that can accurately reflect system performance and the state of system controls. Agencies that operate and maintain these networks continually seek new data acquisition technology and advanced processing techniques to improve data quality, expand coverage, and minimize costs. Agencies must acquire, process, and integrate data from a wide variety of sources including data manually collected by agency personnel, field equipment, partnering agencies, and private sector providers. Additionally, agencies must accommodate evolving strategies that leverage high-speed fiber-optic networks and advanced wireless communications that actively exchange data in real-time for connected vehicles (CV), travelers, roadside devices, and system operators.

Recent advances in wireless communications and data acquisition technology have expanded the potential design space for transportation planners and engineers who develop and implement innovative solutions using crowdsourced and probe data.

Screenshot of the interface of the Waze smartphone application, with menu options shown for Traffic, Police, Crash, Hazard, Map chat, Map issue, Place, Roadside Help, Camera and Closure.
Figure 1: Screenshot. Waze smartphone application. Source: Waze

Probe data can be collected from a diverse range of entities including a full range of light vehicles, roadway and rail transit vehicles, and freight carriers. Additionally, any person with a smartphone can act as a probe to improve coverage in areas where vehicle probe data are limited.

Crowdsourced data typically consist of traveler-reported traffic and incident data collected in real-time from social media platforms (e.g., Facebook and Twitter), mobile smartphone apps (e.g., Google Maps and Waze), and third-party crowdsource data providers (e.g., Amazon Mechanical Turk). These data can be passively or actively transmitted, quantitative or qualitative in nature, and impart information on traffic speeds, travel times, road conditions, incident types, and/or transit services.


The integration of traditional infrastructure-based data sources with vehicle probe and crowdsourced data is expected to enhance current forms of transportation system management while reducing costs for agency operations, improving traveler mobility and productivity, and reducing environmental impacts.

Mobility and Productivity. Crowdsourcing enables agency staff to provide better traveler information and develop more proactive and effective operational strategies that can lead to reduced traffic congestion.

  • In Indiana, dashboards built with real-time crowdsourced data are used to proactively identify congestion problems as well as measure impacts of mitigation measures. Indiana DOT’s use of real-time probe vehicle data to manage unplanned detour routes helped them reduce detour driving time by as much as 75 percent (2019-01425).

Cost Savings. Crowdsourcing can be cost-effective and limit the need for additional roadway sensors and equipment that require costly installation and maintenance.

  • Crowdsourcing has significantly expanded Utah's geographic coverage, density, and accuracy of data for road weather management and traveler information. Utah Department of Transportation’s (UDOT) Citizen Reporter app, which utilizes crowdsourced road condition data, has saved the state $250,000 annually in weather surveillance costs (2019-01424).
  • The Kentucky Transportation Cabinet (KYTC) uses Waze data and real-time speed data to monitor traffic speeds during severe winter weather events and gauge the effectiveness of snow and ice removal activities. In addition, operators use these data to support decision-making with respect to the content of warning messages or instructions that need to be posted on dynamic message signs. In 2016, KYTC phased out its telephone-based 511 system and partnered with Waze crowdsourced data services to provide traveler information, saving the agency $750,000 per year (2020-01436).



Many crowdsourcing smartphone applications such as Facebook, Twitter, Google Maps, Waze, OneBusAway, and Moovit are free, but agencies may consider creating their own. The cost to design and implement a custom crowdsourcing tool can range from $50,000 to $250,000. This typically includes the purchase and installation of system hardware, software, and other required equipment. Estimates for operations and maintenance (O&M) costs range from $5,000 to $50,000 per year (2020-00447).
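The implementation and O&M ranges quoted above can be combined into a rough multi-year cost range. The following is a minimal sketch only: the function name, the five-year horizon, and the pairing of low capital cost with low O&M cost (and high with high) are illustrative assumptions, and the calculation ignores discounting and inflation.

```python
def cumulative_cost_range(capital_range, annual_om_range, years):
    """Return (low, high) cumulative cost over `years`.

    Assumes constant annual O&M and no discounting or inflation.
    """
    capital_low, capital_high = capital_range
    om_low, om_high = annual_om_range
    return (capital_low + om_low * years, capital_high + om_high * years)


# Ranges quoted in the text above (2020-00447).
CAPITAL_RANGE = (50_000, 250_000)
ANNUAL_OM_RANGE = (5_000, 50_000)

# The five-year planning horizon is an illustrative assumption.
low, high = cumulative_cost_range(CAPITAL_RANGE, ANNUAL_OM_RANGE, years=5)
print(f"Five-year cost: ${low:,} to ${high:,}")
```

Under these assumptions, a custom tool would cost roughly $75,000 to $500,000 over five years.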

A number of state agencies have developed crowdsourcing data collection tools to support citizen reporter programs. These programs enable road weather observations to be reported in real-time by public and private sector volunteers that frequently travel on specific roadway segments. Example cost drivers for these systems are highlighted below (2020-00449).

  • As part of a statewide citizen reporter program, the Idaho Transportation Department (ITD) developed a web interface for $65,000 that enables citizens to enter reports. The Minnesota Department of Transportation (MnDOT) created a similar web interface for $63,700.
  • UDOT developed an Android and iOS smartphone application for $120,000 that enables citizens to enter reports.
  • The FHWA Pooled Fund North/West Passage project found that operations and maintenance costs for its citizen reporting program ranged from $5,000 to $13,000 per year.


Best Practices

Freeway management, incident management, road condition reporting, and traveler information services can all benefit from the introduction of crowdsourced data into transportation management center (TMC) operations. Organizations and agencies deploying these systems, however, should expect significant challenges. Several of these challenges are highlighted below.

  • Identify special skills needed to upgrade legacy data management systems. Agencies planning to upgrade legacy data management systems should invest in IT infrastructure and upgrade staff skills to handle large crowdsourced data sets (2020-00931).
  • Identify quality, availability, and latency requirements needed. Agencies planning to acquire crowdsourced data to support real-time traveler information systems should clearly specify temporal and spatial coverage and sample size requirements since proprietary products will likely have limited transparency (2020-00933).
  • Many social media contributions are not actively monitored in real-time by DOTs or TMCs. TMCs would need to establish new processes to manage information flow.
  • Be aware of copyright and data ownership limitations. To address federal, state, and local open records laws, agencies should specify a data-sharing plan when acquiring crowdsourced data (2020-00932).
  • Ensure data validity and reliability. Many TMCs rigorously validate information prior to making it available to the public via traveler information tools. Anonymous and non-agency-generated data will affect these processes. TMCs will need to focus on implementing standard validation procedures for these data [1].
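One possible form of validation procedure is corroboration: treating an anonymous report as confirmed only when independent users report the same event type nearby at about the same time. The sketch below illustrates that idea only. It is not a procedure described in the sources cited here, and the record fields, thresholds, and function names are all hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def is_corroborated(report, others, min_users=2, max_km=0.5, max_minutes=10):
    """True if at least `min_users` distinct users (including the reporter)
    reported the same event type within `max_km` and `max_minutes`.

    All thresholds are illustrative assumptions, not agency standards.
    """
    users = {report["user"]}
    for other in others:
        if other["type"] != report["type"]:
            continue
        if abs(other["minute"] - report["minute"]) > max_minutes:
            continue
        if haversine_km(report["lat"], report["lon"], other["lat"], other["lon"]) > max_km:
            continue
        users.add(other["user"])
    return len(users) >= min_users


# Hypothetical reports; "minute" is minutes since midnight.
report_a = {"user": "u1", "type": "debris", "minute": 480, "lat": 38.9000, "lon": -77.0300}
report_b = {"user": "u2", "type": "debris", "minute": 484, "lat": 38.9010, "lon": -77.0300}
report_c = {"user": "u3", "type": "debris", "minute": 482, "lat": 39.0000, "lon": -77.0300}
report_d = {"user": "u1", "type": "debris", "minute": 481, "lat": 38.9000, "lon": -77.0300}

print(is_corroborated(report_a, [report_b, report_c, report_d]))
print(is_corroborated(report_a, [report_c, report_d]))
```

Counting distinct users rather than raw reports keeps a single traveler's repeated submissions from confirming their own report, which addresses the redundancy concern as well as the reliability one.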


Case Study

Although crowdsourced data are susceptible to redundant and unreliable information such as false-positive congestion events, they can expand coverage for individual agencies that have limited resources available for traffic surveillance. In addition, crowdsourced data can be timelier than agency-generated data because travelers are more likely to come across an incident than an agency that must limit its placement of sensors, cameras, and field patrols. With improved timeliness, countermeasures can be implemented more quickly to reduce the likelihood of secondary crashes and improve travel reliability across the network. The ICC conducted a study comparing how quickly agencies and Waze detect and report an event. The findings below show the average time by which Waze reports preceded DOT reports, and the percentage of Waze events that were captured by DOTs (includes events in California, Florida, and Virginia). They suggest that crowdsourced data could enable TMCs to respond to minor incidents and developing congestion 3 to 16 minutes sooner (2020-01437).

Google map of Washington DC, with Waze data overlaid.
Figure 2: Screenshot. Real-time Waze data integrated into the Regional Integrated Transportation Information System platform. Source: UMD CATT Lab
Table 1: Comparison of Waze versus DOT event reporting [2]
Type of Event | Average Time a Waze Event was Reported Before DOT Reporting | Percentage of Waze Events Included in the DOT ATMS* Logs
Freeways/Ramps Crashes | 3 minutes | 40%
Primary/Secondary Crashes | 3 minutes | 12%
Freeways/Ramps Disabled Vehicles | 14 minutes | 37%
Primary/Secondary Disabled Vehicles | 16 minutes | 4%
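The two measures in Table 1 can be computed from Waze events that have been matched against DOT logs. The sketch below shows one way to do that, assuming the matching has already been done. The tuple layout, the function name, and the sample timestamps are hypothetical and are not data from the study.

```python
from datetime import datetime
from statistics import mean


def summarize(events):
    """Map each event type to (average Waze lead in minutes, percent captured by DOT).

    `events` holds (event_type, waze_time, dot_time) tuples, where dot_time is
    None when the Waze event never appeared in the DOT logs.
    """
    by_type = {}
    for event_type, waze_time, dot_time in events:
        by_type.setdefault(event_type, []).append((waze_time, dot_time))

    summary = {}
    for event_type, pairs in by_type.items():
        # Lead time is only defined for events that the DOT also logged.
        leads = [(dot - waze).total_seconds() / 60 for waze, dot in pairs if dot is not None]
        capture_pct = 100 * len(leads) / len(pairs)
        summary[event_type] = (mean(leads) if leads else None, capture_pct)
    return summary


# Hypothetical sample events for illustration only.
sample_events = [
    ("crash", datetime(2020, 1, 6, 8, 0), datetime(2020, 1, 6, 8, 3)),
    ("crash", datetime(2020, 1, 6, 9, 0), None),
    ("disabled_vehicle", datetime(2020, 1, 6, 10, 0), datetime(2020, 1, 6, 10, 14)),
    ("disabled_vehicle", datetime(2020, 1, 6, 11, 0), datetime(2020, 1, 6, 11, 18)),
    ("disabled_vehicle", datetime(2020, 1, 6, 12, 0), None),
    ("disabled_vehicle", datetime(2020, 1, 6, 13, 0), None),
]

for event_type, (avg_lead, capture_pct) in summarize(sample_events).items():
    print(event_type, avg_lead, capture_pct)
```

Note that the average lead time describes only events that both sources reported, so a low capture percentage means the lead-time figure rests on a small share of the Waze events.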



  1. Noblis. “Estimate Benefits of Crowdsourced Data from Social Media.” Project Work Plan (Internal Report), Noblis. February 24, 2014.
  2. USDOT. “Considerations of Current and Emerging Transportation Management Center Data.” USDOT FHWA, Report No. FHWA-HOP-18-084. July 2019.