Disclosure: The views and opinions expressed here belong solely to the author and do not represent the views and opinions of crypto.news’ editorial.
For decades, self-driving vehicles have been the stuff of sci-fi movies. Now that we’ve outfitted vehicles with all sorts of sensors, chips, and software, well, they still kind of are.
That’s not to say the industry is not moving forward: it is producing headlines about exciting new initiatives and concerning incidents alike. But even some of its pioneers question this progress, and the stories under these headlines feature all-too-familiar headliners: Google’s Waymo, Apple, and General Motors, among others. One would, and should, expect more actual disruptors in a market as disruptive as this.
The truth is some of the underlying technologies in the stack of a driverless car heavily favor centralization and huge centralized players. Or so it seems at first sight.
Obviously, duct-taping a camera to a car won’t magically teach it to drive, but neither will hooking this camera up to the onboard computer. As far as the computer is concerned, the camera feed is just another data flow. A human brain has an intricate system of neural connections extracting actionable insights from visual signals, and the computer needs something similar. It needs its own vision.
Computer vision is a subfield of artificial intelligence (AI), or, to be more precise, of machine learning (ML), that enables a driverless car to “see” the world around it. AI algorithms are also often used to process other sensor feeds, such as LiDAR, enhancing the vehicle’s overall ability to navigate physical space. The problem with such models is that they take absolutely gargantuan amounts of data to train.
Companies have long struggled to acquire real-world data to train their models, and the driverless vehicle industry is no exception. Simulations, a lot like what you see in video games, can be used to record various scenarios and bootstrap a dataset, but they only get you so far. From weather conditions to regional specifics, real-world data is crucial for making self-driving cars safe and reliable. That’s why San Francisco residents can see driverless taxis cruise around with no passengers for hours. They’re not looking for passengers; they’re collecting data.
The challenge of collecting datasets of sufficient scale and quality, fast enough to stay in business, is an obstacle for the self-driving car industry, and one that will keep the playing field uneven, tilting it toward large centralized entities. Centralized giants get to collect troves of data, while newcomers face a data challenge that hinders their progress. It casts the shadow of an oligopoly over a nascent and promising market, and we all know what that means for everyday people.
The solution is already out there, in the thousands of vehicles that drive down the roads of every city and every country every day. Most of them capture heaps of data on the go, and with the right incentive, the drivers would likely do the labeling themselves. Just look at CAPTCHA—the tests with pedestrians, motorcycles, and traffic lights are all exercises in data labeling that people perform to simply access a website or a service.
Accumulating all this data into vast sets will give up-and-coming startups and enterprises alike all the real-world learning materials their models may ever need. These datasets can be as diverse or location-specific as they need, rooted in real-world scenarios, conditions, and specifics. To unlock access to the data, though, the industry first needs an entirely new data paradigm.
This paradigm must leverage blockchain as a shared and vendor-neutral core infrastructure and transaction layer to prevent the rise of another siloed ecosystem. It must also leverage self-sovereign data and identities for both drivers and vehicles, handing them back control over their data and privacy.
Self-sovereign identities will work as web3 wallets storing cryptographic proofs of various user attributes issued by trusted bodies, such as authorities or car manufacturers. Data consumers will be able to use those to verify the data that the sellers—who are the drivers in this case—can choose to put up for sale. The prospect is not nearly as far-fetched as it might seem, with both web2 and web3 companies already working on blockchain-powered mobility infrastructures such as Europe’s Gaia-X moveID.
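To make the issue, hold, and verify flow described above more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any real wallet or Gaia-X moveID works: every name in it (ISSUER_KEY, Credential, issue, verify, the vehicle_id and has_lidar attributes) is hypothetical, and a shared-secret HMAC stands in for the issuer's public-key signature so that the example runs on the standard library alone. In a real deployment the data consumer would check the proof against the issuer's public key and would never hold the issuer's secret.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass

# ASSUMPTION: a shared-secret HMAC stands in for the issuer's public-key
# signature so this sketch needs nothing beyond the standard library.
ISSUER_KEY = b"demo-manufacturer-key"  # hypothetical issuer secret


def _canonical(claims: dict) -> bytes:
    """Serialize claims deterministically so both sides hash the same bytes."""
    return json.dumps(claims, sort_keys=True, separators=(",", ":")).encode()


@dataclass(frozen=True)
class Credential:
    claims: dict  # attributes the issuer attests to
    proof: str    # hex tag computed over the canonical claims


def issue(claims: dict, key: bytes = ISSUER_KEY) -> Credential:
    """Issuer side: attach a proof to a set of attributes."""
    tag = hmac.new(key, _canonical(claims), hashlib.sha256).hexdigest()
    return Credential(claims=dict(claims), proof=tag)


def verify(cred: Credential, key: bytes = ISSUER_KEY) -> bool:
    """Data consumer side: recompute the proof and compare in constant time."""
    expected = hmac.new(key, _canonical(cred.claims), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, cred.proof)


# The driver's "wallet" is modeled as a plain list of credentials.
wallet = [issue({"vehicle_id": "demo-123", "has_lidar": True})]
offered = wallet[0]

# A tampered copy: the attribute is changed but the old proof is reused.
forged = Credential(claims={**offered.claims, "has_lidar": False},
                    proof=offered.proof)

print(verify(offered))  # True
print(verify(forged))   # False
```

The point of the sketch is only the shape of the exchange: the issuer attests to attributes once, the driver carries the result, and a buyer can reject data whose attributes no longer match the proof.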
This self-sovereign data paradigm will turn drivers into active stakeholders in the digital mobility space, enabling them to monetize the data they generate during their daily commutes. It will also solve the dataset challenge across the entirety of the self-driving vehicle industry, giving all of its participants equal access to a shared market for raw data and the industry as a whole a much-needed boost.
For all of their promise, truly autonomous self-driving vehicles remain elusive, partly because of the challenge associated with collecting the datasets to train the AI models that will help drive such cars. Embracing the web3 data paradigm is the industry’s best chance of unlocking access to a virtually unlimited pool of training data while also keeping up a healthy spirit of open competition.