According to https://cloud.google.com/tpu, each individual TPU v3 device delivers 420 teraflops, and TPU v4 is supposed to double that performance, so if that guess is correct, it should take a few seconds to do inference. Quite impressive really. - Source: Hacker News / about 2 years ago
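As a rough sanity check on that "few seconds" estimate, a back-of-envelope calculation divides a model's forward-pass FLOPs by the device's peak throughput. The model size, token count, and utilization below are assumptions for illustration, not figures from the comment:

```python
# Back-of-envelope inference-time estimate (assumed numbers, not from the comment).
PEAK_TFLOPS_V3 = 420      # per-device peak cited for TPU v3
UTILIZATION = 0.3         # assumed fraction of peak actually achieved
MODEL_PARAMS = 175e9      # assumed parameter count for a large model
TOKENS = 500              # assumed tokens processed per request

# Roughly 2 FLOPs per parameter per token for a forward pass.
total_flops = 2 * MODEL_PARAMS * TOKENS
seconds = total_flops / (PEAK_TFLOPS_V3 * 1e12 * UTILIZATION)
print(f"~{seconds:.1f} s per request")  # on the order of a second or two
```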
You can also rent a Cloud TPU v4 pod (https://cloud.google.com/tpu), which has 4096 TPU v4 chips with fast interconnect, amounting to around 1.1 exaflops of compute. It won't be cheap though (in excess of $20M/year, I believe). - Source: Hacker News / over 2 years ago
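The 1.1 exaflops figure is consistent with simple multiplication if each TPU v4 chip peaks at roughly 275 bf16 teraflops; that per-chip number is an assumption drawn from Google's published specs, not from the comment itself:

```python
# Aggregate pod throughput from per-chip peak (per-chip figure is an assumption).
CHIPS_PER_POD = 4096
PEAK_TFLOPS_PER_CHIP = 275  # approximate bf16 peak for one TPU v4 chip

pod_exaflops = CHIPS_PER_POD * PEAK_TFLOPS_PER_CHIP * 1e12 / 1e18
print(f"~{pod_exaflops:.2f} exaflops")  # ~1.13, close to the quoted 1.1 exaflops
```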
Actually, that's done with TPUs, which are more efficient: https://cloud.google.com/tpu. Source: over 2 years ago
TPU training uses Google silicon and is thus a true deep learning alternative to Nvidia. Source: almost 3 years ago
The server choice really depends on how much CPU and RAM the requests take, how many users will be hitting the server, etc. You can start with a $5/month DigitalOcean server (or AWS or Google) and see if that works for you. Or you can outsource the server administration to Amazon or Google if you don't want to deal with it or need specialized TPU hardware. Source: about 3 years ago
Do you know an article comparing Google Cloud TPU to other products?
Suggest a link to a post with product alternatives.
This is an informative page about Google Cloud TPU. You can review and discuss the product here. The primary details have not been verified within the last quarter, and they might be outdated. If you think we are missing something, please use the options on this page to comment or suggest changes. All reviews and comments are highly encouraged and appreciated, as they help everyone in the community make an informed choice. Please always be kind and objective when evaluating a product and sharing your opinion.