Moving Machine Learning into the Next-Generation Cloud

As AI grows increasingly popular, Machine Learning (ML) has become essential to many research areas, including computer vision, speech recognition, and natural language processing. Deploying ML models at scale typically requires knowledge of virtualization, networking, and infrastructure. The scale and complexity of ML workflows make resources hard to provision and manage – a burden that hinders both the productivity and the effectiveness of ML practitioners. Meanwhile, cloud computing has changed the way we build software and solutions. Various first-generation serverful cloud services (e.g., IaaS, PaaS and SaaS) provide efficient, economical and intelligent hosting for ML models. More recently, following in the footsteps of traditional cloud computing, the next-generation cloud – "serverless" architectures, represented by AWS Lambda and Google Cloud Functions – has emerged as a burgeoning computation model that further reduces costs and improves manageability.

Machine learning is powering the next breed of intelligent software, while serverless computing is redefining how we use cloud platforms to attain new levels of simplicity, efficiency and productivity in application development. Owing to its lightweight nature, ease of management, and ability to scale rapidly, serverless computation has become the trend for building next-generation ML services and applications. In this project, we focus on an open challenge: how to exploit the serverless paradigm to deploy machine learning models, which enables simplified deployment, removes the need for infrastructure maintenance, and offers built-in scalability and cost control. Current ML frameworks are generally specialized for coarse-grained VM-based clouds and lack the flexibility required by serverless infrastructures. We therefore propose a unified serverless computing framework that aims to move ML into the next-generation cloud flexibly, agilely and efficiently, achieving better simplicity, manageability and productivity. In particular, to bridge the semantic gap between serverful ML models and serverless cloud platforms in terms of computation, communication and cost, we identify three major goals for this project: fine-grained computation management, an efficient communication strategy, and a cost-effective service model.
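To make the deployment model concrete, the sketch below shows a minimal AWS Lambda-style inference handler in Python. It is an illustrative assumption, not the proposed framework: the linear "model" with fixed weights stands in for a real ML model that would be loaded from object storage at cold start, and the `handler`/`predict` names and the event shape (an API Gateway-style JSON body) are hypothetical.

```python
import json

# Toy stand-in for a trained model; a real deployment would load
# serialized weights from object storage during the cold start.
WEIGHTS = [0.4, -0.2, 0.1]
BIAS = 0.05

def predict(features):
    """Score a single feature vector with the toy linear model."""
    score = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    return {"score": score, "label": int(score > 0)}

def handler(event, context=None):
    """Lambda-style entry point: stateless, per-request inference."""
    body = json.loads(event["body"])
    result = predict(body["features"])
    return {"statusCode": 200, "body": json.dumps(result)}

# Local invocation mimicking an API Gateway event.
event = {"body": json.dumps({"features": [1.0, 2.0, 3.0]})}
response = handler(event)
print(response["body"])
```

Because each invocation is stateless and independently scheduled, the platform can scale such a function from zero to thousands of concurrent instances, which is exactly the fine-grained computation model this project targets.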