Seeing is believing, especially when you are looking for your next car. That's why Edmunds—which helps more than 20 million web and mobile users each month browse automobile dealer inventory, read vehicle reviews, and consume other automobile-related content—considers its library of vehicle photos uploaded by auto dealers to be one of its most important assets.
"The mission of Edmunds is to make car buying easy," says Nitin Mahajan, the executive director for service engineering at Edmunds. He explains that the company wants its website and mobile apps to be the best place for consumers to complete the key steps in vehicle purchasing: identifying the right vehicle, evaluating trade-ins, analyzing price and financing offers, and finalizing the purchase. "But if vehicle photos don't look right or take too long to load, customers will disengage and move on,” Mahajan says.
Edmunds was planning a key update with even better image quality and faster load times on the company's website and mobile apps. Mahajan and his team had just a few weeks to figure out how to process the company's library of 50 million vehicle images into several new aspect ratios and resolutions, which would result in more than half a billion new images.
The company's existing image-handling solution, based on Cloudera MapReduce clusters, wasn't the right tool for the job, according to Mahajan. "Development time would have taken too long, and achieving sufficient scale and processing power for this project would have required us to manage new clusters and incur new monthly costs of at least $10,000."
Edmunds is all-in on Amazon Web Services (AWS) and has been expanding its use of serverless solutions orchestrated by AWS Lambda, a function-based service that runs code in response to events. "What we love about AWS Lambda is that it is so easy to experiment with and see results quickly, and this project is a case in point," says Mahajan. "We tasked an engineer with seeing how well a serverless solution directed by AWS Lambda would work for this project. He came up with the basic solution within hours, and we were evaluating results by the end of the day."
The serverless solution the team created uses Amazon Simple Storage Service (Amazon S3) for highly available object storage, with AWS Lambda functions triggered by an Amazon API Gateway endpoint. It also includes Amazon Athena, a serverless, interactive service that uses standard SQL to query large data sets in place in Amazon S3, avoiding the need to move the data to a separate analytics platform. Amazon S3 Standard storage is designed for 99.999999999 percent data durability and stores data redundantly across at least three AWS Availability Zones (AZs), so the company's data remains available even if the data centers of an entire AZ are destroyed.
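To make the architecture concrete, here is a minimal sketch of how a Lambda function behind an API Gateway endpoint might plan the work for one photo. It is an illustration only, not the code Edmunds wrote: the variant names and sizes, the request fields (`key`, `width`, `height`), and the output key layout are all assumptions, and the actual Amazon S3 reads and writes and the image resizing are left out.

```python
import json

# Hypothetical target variants (name, width, height). The article does not
# list the real aspect ratios or resolutions Edmunds used.
VARIANTS = [
    ("thumb_4x3", 320, 240),
    ("card_16x9", 640, 360),
    ("hero_16x9", 1280, 720),
]


def crop_box(src_w, src_h, target_w, target_h):
    """Return the largest centered (left, top, right, bottom) box in the
    source image that matches the target aspect ratio."""
    # Compare aspect ratios by integer cross-multiplication to avoid
    # floating-point rounding.
    if src_w * target_h > src_h * target_w:
        # Source is wider than the target: trim the sides.
        new_w = src_h * target_w // target_h
        left = (src_w - new_w) // 2
        return (left, 0, left + new_w, src_h)
    # Source is taller than (or equal to) the target: trim top and bottom.
    new_h = src_w * target_h // target_w
    top = (src_h - new_h) // 2
    return (0, top, src_w, top + new_h)


def handler(event, context):
    """Lambda entry point, assuming an API Gateway proxy event whose JSON
    body carries the hypothetical fields key, width, and height."""
    body = json.loads(event.get("body") or "{}")
    key = body.get("key")
    width, height = body.get("width"), body.get("height")
    if not key or not width or not height:
        error = {"error": "key, width, and height are required"}
        return {"statusCode": 400, "body": json.dumps(error)}

    stem, dot, ext = key.rpartition(".")
    if not dot:
        # Assumption: keys without an extension are treated as JPEGs.
        stem, ext = key, "jpg"

    plan = [
        {
            "output_key": f"{stem}/{name}.{ext}",
            "crop": crop_box(width, height, tw, th),
            "resize_to": [tw, th],
        }
        for name, tw, th in VARIANTS
    ]
    # A real function would now read the object from Amazon S3, apply each
    # crop and resize with an image library, and write the results back.
    return {"statusCode": 200, "body": json.dumps({"source": key, "variants": plan})}
```

Because each invocation handles one photo and keeps no state, many copies of a function like this can run in parallel, which is the property the team relied on to scale.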
By using a serverless image-processing solution built on AWS, the Edmunds service engineering team beat the release deadline and avoided the higher costs of a traditionally architected solution. "The project was completed on deadline for a one-time cost of $6,000, compared to the monthly charges of at least $10,000 we would have incurred if we had needed to purchase, provision, and configure new resources," says Mahajan. "Also, because we could scale to thousands of AWS Lambda function invocations and increase the memory allocation with just a few clicks, we needed only eight days to process all 50 million images into 700 million new images."
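The figures Mahajan quotes can be turned into rough per-image numbers. The short calculation below uses only the values stated above and assumes, for simplicity, that processing ran at a steady rate for the full eight days, which the article does not actually say.

```python
SECONDS_PER_DAY = 86_400


def batch_summary(source_images, new_images, days, cost_usd):
    """Derive rough averages from the totals quoted in the article."""
    seconds = days * SECONDS_PER_DAY
    return {
        "variants_per_source": new_images / source_images,
        "new_images_per_second": new_images / seconds,
        "cost_per_million_new_images_usd": cost_usd / (new_images / 1_000_000),
    }


# 50 million originals, 700 million new images, eight days, $6,000 one-time cost.
summary = batch_summary(50_000_000, 700_000_000, 8, 6_000)
for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

Under that assumption, the job works out to 14 new images per original, a sustained average of roughly 1,000 new images per second, and a cost of under nine dollars per million images produced.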
The simplicity and flexibility of connecting services on AWS played a key role in how quickly Mahajan's team was able to put the solution into production. "What took us just a few days to build using a serverless solution based on AWS Lambda would have taken us six months to build from scratch," says Mahajan. "Our CTO and the rest of the project stakeholders were really happy with how much money and time we saved."
Time savings on the project’s primary goal enabled the team to build in additional value, such as a new archiving feature. "We didn't have an efficient way to get rid of images of vehicles that are no longer listed for sale," says Mahajan. "It was fast and easy to add an AWS Lambda function to automate ongoing deletions of all the various sizes and aspect ratios of inactive photos, retaining only the originals. In just a week, we went from 1.5 billion to 800 million photos in our Amazon S3 bucket, a decrease from 60 to 40 terabytes."
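The selection logic behind a cleanup function like the one Mahajan describes could look something like the sketch below. The key layout (`originals/<vehicle_id>/<file>` and `derived/<variant>/<vehicle_id>/<file>`) and the function names are hypothetical, since the article does not describe how Edmunds organizes its bucket. The batch size of 1,000 reflects the maximum number of keys that a single Amazon S3 DeleteObjects request accepts.

```python
def keys_to_delete(keys, active_vehicle_ids):
    """Pick derived images of vehicles that are no longer listed.

    Assumes a hypothetical layout in which derived images live under
    derived/<variant>/<vehicle_id>/<file>. Originals are never selected.
    """
    doomed = []
    for key in keys:
        parts = key.split("/")
        is_derived = len(parts) == 4 and parts[0] == "derived"
        if is_derived and parts[2] not in active_vehicle_ids:
            doomed.append(key)
    return doomed


def batches(items, size=1000):
    """Yield consecutive slices of at most `size` items, sized so that each
    slice fits in one S3 DeleteObjects request."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


listing = [
    "originals/v1/front.jpg",
    "derived/thumb_4x3/v1/front.jpg",
    "originals/v2/front.jpg",
    "derived/thumb_4x3/v2/front.jpg",
    "derived/hero_16x9/v2/front.jpg",
]
# Only vehicle v1 is still for sale, so v2's derived images are selected
# and both originals are kept.
for batch in batches(keys_to_delete(listing, active_vehicle_ids={"v1"})):
    print(batch)
```

A production version would page through the bucket listing and pass each batch to the Amazon S3 delete call; the sketch only shows which keys would be chosen and how they would be grouped.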
Mahajan and his team are also now ready to supply any new image transformations that the business side requests. "With a serverless solution that uses AWS Lambda and Amazon S3, we are well prepared for future eventualities," says Mahajan. "If the business side requests new sizes of the library of vehicle photos, we can increase our AWS Lambda pool and get the job done in just a few days."
Edmunds will continue building similar solutions because the simplicity and flexibility of serverless on AWS provide the business with vital agility. "Being able to move fast is the key to success in business today," says Mahajan. "Using AWS Lambda, I can test my code really quickly, without worrying about servers."
The next phase of this project will take advantage of the ease of integrating AWS services. "There is information in our photos that is too time-consuming to obtain manually," says Mahajan, listing examples like vehicle colors, the orientation of vehicles in given images, and visible indicators of vehicle condition. "One of the best things about a serverless solution on AWS is how easily we could hook it to machine-learning platforms like Amazon SageMaker and frameworks like TensorFlow or Caffe so we can automatically capture new image metadata to reduce processing time and get to market even faster."
Mahajan's ultimate takeaway is simple to state but underlines the fundamental value proposition of AWS serverless. "With a serverless solution built on Amazon S3, we don't need to worry about infrastructure and can stay focused on products, code, and solving business problems."