Microsoft & OpenAI Come Together To Build Massive AI Supercomputer In Azure

Navin Bondade 0 Comments Artificial Intelligence, Data Science, Deep Learning, Machine Learning

Last year in the month of July Microsoft has invested around 1 billion USD in the OpenAI, the startup trying to build a general artificial intelligence. At that time, Microsoft and OpenAI said they’d engage in an exclusive, multi-year partnership to build new Azure artificial intelligence supercomputing technologies.

At its 2020 Build developer conference, Microsoft has announced that it has made a partnership with OpenAI and has created one of the world’s top supercomputers on top of Azure’s infrastructure.

Microsoft officials said they’ve built the fifth most powerful publicly recorded supercomputer in collaboration with and exclusively for OpenAI. Microsoft further said that the 285,000-core machine would have ranked in the top five of the TOP500 supercomputer rankings.

This supercomputer is specifically for training massive distributed AI models. AI researchers believe that single, massive models will perform better than the smaller, separate AI models of the past.

Since Microsoft’s massive investment, OpenAI has made Azure its cloud of choice and this supercomputer was developed “with and exclusively for OpenAI.”

The companies say the Azure supercomputer will be used by OpenAI to train powerful new artificial intelligence models, the process of wiring up the virtual brain of an autonomous system.

“The computer is connected to Azure but is a dedicated resource of OpenAI’s. They paid for the system, paying both Microsoft and other suppliers. The total cost is not being disclosed,” a Microsoft spokesperson told me when asked for details. We’re also being told that the system is still running today.

Microsoft said the supercomputer built for OpenAI is a single system with more than 285,000 CPU cores; 10,000 GPUs and 400 gigabits per second of network connectivity for each GPU server. The supercomputer is hosted in Azure and has access to Azure services.

Microsoft has built its own family of large AI models, which it calls the Microsoft Turing models. These models have been used to improve language understanding across Bing, Office, Dynamics, and other products.

Microsoft has made publicly available what is believed to be the largest publicly available AI language model in the world: The Turning model for natural language generation.

Microsoft says the capabilities of the system allow it to process large amounts of data across many different areas, resulting in sophisticated models that go beyond traditional machine learning approaches focused on individual domains of knowledge.

“This is about being able to do a hundred exciting things in natural language processing at once and a hundred exciting things in computer vision, and when you start to see combinations of these perceptual domains, you’re going to have new applications that are hard to even imagine right now,” says Kevin Scott.

Officials said at Build that they are going to begin open-sourcing the Microsoft Turing models “soon,” as well as recipes for training them using Azure Machine Learning. Microsoft also is adding support for distributed training to its ONNX Runtime, an open-source library for making models portable across hardware and OS.

The ultimate goal is to use these new systems to solve major challenges in areas such as healthcare, climate, and education. In one high-profile example, Microsoft has joined with Amazon and others in a White House initiative that seeks to use supercomputers, AI and the cloud to address the coronavirus outbreak.

OpenAI describes itself as “research laboratory discovering and enacting the path to safe artificial general intelligence.” CEO Sam Altman says the new supercomputer is the company’s “dream system.”

Microsoft competes against Amazon, Google, and others that offer AI capabilities as an extension of their cloud platforms. Although the supercomputer is for OpenAI’s exclusive use, the Redmond company is looking to further burnish its AI credentials in the eyes of corporate customers.

The company calls it “a first step toward making the next generation of very large AI models and the infrastructure needed to train them available as a platform for other organizations and developers to build upon.”

OpenAI was formed in 2016 by Elon Musk, the Tesla and SpaceX CEO; Altman, the former Y Combinator president; Ilya Sutskever, OpenAI’s chief scientist; and Greg Brockman, the former Stripe CTO. Musk, who has sounded the alarm over the risks of AI, said last year that he was no longer involved in OpenAI.