The ride-hailing company Uber has grown extraordinarily worldwide and now supports around 14 million trips each day. Until now, the company had not publicly discussed the architecture of its self-driving cars. For the first time, Uber has disclosed their inner workings in detail, along with the complexities of developing a self-driving car in general.
VerCD – Tool to support ML workflow!
Because ML models have deep dependencies and are complex to develop, the company has created VerCD, a set of tools and microservices to support its ML workflow.
According to the company, VerCD lets the team use automated continuous delivery (CD) to track and manage versioned dependencies of ML artifacts.
Uber wrote in a blog post:
“ML teams developing models at scale may find that the practices and tools presented here as our five-step model life cycle and VerCD, developed at Uber ATG for self-driving vehicles, can apply to several use cases, helping them iterate on their infrastructure.”
According to the company, the bulk of the engineering effort went into adding company-specific integrations that let existing orchestrators interact with the heterogeneous set of systems across the end-to-end ML workflow. VerCD tracks all dependencies of each ML component, which often include data and model artifacts in addition to code.
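To make the idea of versioned dependency tracking concrete, here is a minimal sketch in Python. It is not VerCD's actual implementation, which Uber has not published in this form; the `Artifact` and `DependencyGraph` names, the version strings, and the example artifacts are all hypothetical. It only illustrates how code, data, and model artifacts can be recorded together so that everything a model was built from can be looked up later.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Artifact:
    """A versioned ML component (hypothetical type, not a VerCD class)."""
    name: str
    version: str
    kind: str  # "code", "data", or "model"


@dataclass
class DependencyGraph:
    # Maps each artifact to the artifacts it was built from.
    edges: dict = field(default_factory=dict)

    def add(self, artifact, depends_on=()):
        """Record an artifact together with its direct dependencies."""
        self.edges[artifact] = tuple(depends_on)

    def transitive_dependencies(self, artifact):
        """Return every artifact needed, directly or indirectly, to rebuild `artifact`."""
        seen = set()
        stack = list(self.edges.get(artifact, ()))
        while stack:
            dep = stack.pop()
            if dep not in seen:
                seen.add(dep)
                stack.extend(self.edges.get(dep, ()))
        return seen


# Illustrative artifacts; names and versions are made up.
logs = Artifact("sensor_logs", "2020-03", "data")
code = Artifact("detector_training_code", "a1b2c3", "code")
data = Artifact("detection_dataset", "1.0.0", "data")
model = Artifact("detector_model", "1.0.0", "model")

graph = DependencyGraph()
graph.add(logs)
graph.add(code)
graph.add(data, depends_on=[logs])
graph.add(model, depends_on=[code, data])

deps = graph.transitive_dependencies(model)
```

In this sketch, asking for the dependencies of the model returns the training code, the data set, and the sensor logs the data set was built from, which is the kind of lineage a delivery pipeline would need in order to rebuild a model exactly.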
Logs collected from ATG’s self-driving cars!
The vast data sets that VerCD manages come from logs collected by ATG's self-driving cars and include camera images, lidar points, radar information, vehicle state, and map data. These are divided into training, testing, and validation data: 75% goes to training, 15% to testing, and 10% to validation.
A tool called GeoSplit selects logs and splits them into training, testing, and validation sets based on their geographical location. “Once we’ve divided the data, we extract it from our data generation logs using Petastorm, Uber ATG’s open-source data access library for deep learning.”
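The article does not describe how GeoSplit works internally, so the following Python sketch is only one hypothetical way to split logs by location. It assumes each log carries an `id`, a `lat`, and a `lon` (all assumed field names), snaps each coordinate to a grid cell of an arbitrary size, and assigns whole cells to a split so that logs from the same place never land in different sets. The thresholds target the 75/15/10 proportions quoted above, but only approximately, because the assignment is made per cell rather than per log.

```python
import hashlib


def cell_of(lat, lon, cell_deg=0.01):
    """Snap a coordinate to a grid cell (the cell size is an assumption)."""
    return (int(lat // cell_deg), int(lon // cell_deg))


def split_for_cell(cell):
    """Deterministically map a grid cell to train, test, or validation.

    Hashing the cell gives a stable bucket in [0, 100); the thresholds
    aim at roughly 75% train, 15% test, and 10% validation of the cells.
    """
    digest = hashlib.sha256(f"{cell[0]}:{cell[1]}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    if bucket < 75:
        return "train"
    if bucket < 90:
        return "test"
    return "validation"


def geo_split(logs):
    """Group log IDs by the split assigned to their grid cell."""
    splits = {"train": [], "test": [], "validation": []}
    for log in logs:
        cell = cell_of(log["lat"], log["lon"])
        splits[split_for_cell(cell)].append(log["id"])
    return splits


# Made-up logs: the first two are a few metres apart, the third is elsewhere.
example_logs = [
    {"id": "log-a", "lat": 40.4406, "lon": -79.9959},
    {"id": "log-b", "lat": 40.4407, "lon": -79.9958},
    {"id": "log-c", "lat": 37.7700, "lon": -122.4100},
]
example_splits = geo_split(example_logs)
```

Splitting by cell rather than by individual log is the point of a geographic split: a model evaluated on roads it has already seen in training would look better than it really is.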
Uses a hybrid approach to ML computing resources!
When a user registers a new data set, the VerCD Data set Service stores the dependency metadata in its database. Data sets are identified by name and version number, and the dependencies tracked by VerCD make it possible to reproduce exactly the sensor log IDs from autonomous vehicles, the metadata describing the data set lifecycle, and more.
Furthermore, Uber ATG takes a hybrid approach to ML computing resources, running training jobs both in on-premise data centers powered by GPU and CPU clusters and in the cloud. Peloton, an open-source unified resource scheduler developed by Uber, orchestrates training jobs in the on-premise data centers with GPUs.
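As a rough illustration of registering a data set under a name and version, here is a small Python sketch. The `DatasetRecord` and `DatasetRegistry` classes, the field names, and the in-memory dictionary standing in for a database are all assumptions made for the example; they do not reflect the real VerCD service or its schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Metadata stored at registration time (hypothetical schema)."""
    name: str
    version: str
    sensor_log_ids: tuple
    lifecycle: str = "registered"


class DatasetRegistry:
    """In-memory stand-in for a metadata database."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        # A (name, version) pair is written once and never overwritten,
        # so looking it up later always yields the same log IDs.
        key = (record.name, record.version)
        if key in self._records:
            raise ValueError(f"{record.name} {record.version} is already registered")
        self._records[key] = record

    def get(self, name, version):
        """Return the record for an exact name and version."""
        return self._records[(name, version)]


registry = DatasetRegistry()
registry.register(
    DatasetRecord("detection_dataset", "1.0.0", ("log-a", "log-b", "log-c"))
)
record = registry.get("detection_dataset", "1.0.0")
```

Refusing to overwrite an existing name and version is the design choice that makes replication possible in this sketch: any change to the underlying logs has to be registered as a new version.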
“Our ML model life cycle process and the tools that we’ve built to streamline it, such as VerCD, help us manage the many different models we use and iterate on models faster. These practices and solutions emerged from our need to work efficiently while developing an accurate self-driving vehicle system.”
“We have been able to make much progress by establishing the various workflow stages in ML development, and in turn, developing the supporting systems such as VerCD to manage the increased complexity of the ML workflow. As our technology continues to mature and increase in complexity and sophistication, relying on manual, human intervention to manage the ML workflow stages becomes increasingly infeasible. These tools enable engineers to make faster iterations of ML components, leading to higher performance of our self-driving vehicles.”
Main image picture credits: Uber