Date: 2016-05-18

Cloud Tensor Processing Units (TPUs) are now available in beta on the Google Cloud Platform (GCP).



Cloud TPUs are a family of Google-designed hardware accelerators optimized to speed up and scale up specific machine learning (ML) workloads programmed with TensorFlow.



Each Cloud TPU is built with four custom ASICs and packs up to 180 teraflops of floating-point performance and 64 GB of high-bandwidth memory onto a single board. The boards can be connected via an ultra-fast network to form multi-petaflop ML supercomputers. These "TPU pods" will be available on GCP later this year.
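To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (180 teraflops, 64 GB, four ASICs per board). The even split across ASICs and the helper names `per_asic_share` and `boards_for` are illustrative assumptions, not published specifications, and the result is a lower bound on peak figures that ignores interconnect overhead.

```python
import math

# Figures quoted in the text for a single Cloud TPU board.
BOARD_TERAFLOPS = 180   # peak ("up to") floating-point performance
BOARD_MEMORY_GB = 64    # high-bandwidth memory
ASICS_PER_BOARD = 4


def per_asic_share():
    """Assume compute and memory are divided evenly among the ASICs."""
    return (BOARD_TERAFLOPS / ASICS_PER_BOARD,
            BOARD_MEMORY_GB / ASICS_PER_BOARD)


def boards_for(target_petaflops):
    """Smallest number of boards whose combined peak reaches the target."""
    target_teraflops = target_petaflops * 1000
    return math.ceil(target_teraflops / BOARD_TERAFLOPS)


if __name__ == "__main__":
    print(per_asic_share())  # (45.0, 16.0)
    print(boards_for(1))     # 6
    print(boards_for(2))     # 12
```

Under these assumptions, a single petaflop of peak performance already requires at least six networked boards, which is why "multi-petaflop" systems depend on the fast interconnect between boards.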



The Cloud TPUs can be programmed with high-level, open source TensorFlow APIs. GCP is making a number of reference Cloud TPU model implementations available, including:





ResNet-50 and other popular models for image classification

Transformer for machine translation and language modeling

RetinaNet for object detection









In a blog post, Norm Jouppi, Distinguished Hardware Engineer at Google, discloses that TPUs have already been deployed in Google data centers for over a year, where they "deliver an order of magnitude better-optimized performance per watt for machine learning." The stealthy project to develop in-house silicon has been underway for several years.







Google plans to deliver machine learning as a service on its Google Cloud Platform by providing APIs for tasks such as computer vision, speech, and human language translation.



Google adds that its Cloud TPUs can also simplify the planning and management of ML computing resources. By purchasing the service, users benefit from the large-scale, tightly integrated ML infrastructure that Google has heavily optimized over many years. Cloud TPUs are protected by the same security mechanisms and practices that safeguard all Google Cloud services.