AnalytiCup: Alibaba E-Commerce AI Challenge
Alibaba Session Day Schedule
9:30AM – 6:30PM, Nov 5, Room 302B
Chaired by Dr. Hongxia Yang (Alibaba Group, DAMO Academy)
9:30AM - 10:30AM
Knowledge Graph Construction and Application in Practice
Yanghua Xiao (Fudan University)
Since its birth in 2012, the knowledge graph has made enormous progress and become one of the most important knowledge representation methods in the era of big data. It has been recognized as a typical knowledge representation for knowledge engineering in big data and as a core technology for realizing cognitive intelligence, which significantly helps promote the development and application of artificial intelligence. In the past few years, knowledge graphs have been widely used in a variety of large-scale applications. In this talk, I will introduce the main endeavors and key technologies in the automated construction and application of large-scale knowledge graphs. In particular, I will focus on low-cost automated knowledge acquisition and on knowledge-guided applications, including semantic search and intelligent recommendation, in specific scenarios such as e-commerce.
10:30AM - 11:00AM
Large-scale Image Search and Recognition
Pan Pan (Alibaba Group, DAMO Academy)
Large-scale image search and recognition are fundamental topics in computer vision and have a wide range of applications in industry. In this talk, we will first introduce the motivation and challenges in large-scale image search and recognition. We will then illustrate our technology innovations. Finally, we will show the applications of those technologies in Alibaba.
11:00AM - 11:30AM
Multi-View Multi-Label Learning with View-Specific Information Extraction
Xiaobo Wang (Alibaba Group, Youku)
Multi-view multi-label learning serves as an important framework for learning from objects with diverse representations and rich semantics. Existing multi-view multi-label learning techniques focus on exploiting a shared subspace to fuse multi-view representations, and they usually ignore view-specific information that is helpful for discriminative modeling. We propose a novel multi-view multi-label learning approach named SIMM, which leverages both shared subspace exploitation and view-specific information extraction. For shared subspace exploitation, SIMM jointly minimizes a confusion adversarial loss and a multi-label loss to utilize shared information from all views. For view-specific information extraction, SIMM enforces an orthogonal constraint w.r.t. the shared subspace to utilize view-specific discriminative information. Extensive experiments on Youku and several public data sets clearly show the favorable performance of SIMM against other state-of-the-art multi-view multi-label learning approaches.
11:30AM - 12:00PM
The Technology and Application of Machine Translation Technology for E-Commerce in Alibaba
Weihua Luo (Alibaba Group, DAMO Academy)
Machine translation is an essential service for cross-border e-commerce. Alibaba's international businesses, such as Alibaba.com, AliExpress, and LAZADA, have diverse demands for the translation service, which is developed by the translation technology team of Alibaba. In this talk, the speaker will explain the algorithms and engineering methods of machine translation developed by Alibaba, as well as best practices for typical application scenarios in different business units. Specifically, he will introduce innovative methods for specific problems arising in e-commerce and explain the comprehensive solutions applied in product translation, search translation, IM, etc. Finally, he will introduce the main challenges and future work.
12:00PM - 12:30PM
When AI meets security: How do we successfully leverage this synergy in Alibaba
Quan Lu (Alibaba Group, Security Department)
Artificial intelligence is a major technology that leads innovation and is changing almost every industry in the world. Cybersecurity is the key technology used in day-to-day business practices to protect systems, networks, data, and programs from digital attacks. In the security department of Alibaba, we have seen a real revolution happening in the interaction between these two areas. In particular, we build smarter security and risk management systems by leveraging newly developed AI technologies, and we also protect AI systems more effectively with advanced security concepts and tactics. In this talk, I will describe how we apply the latest developments in computer vision, natural language processing, and knowledge representation to bot traffic detection, online fraud identification, and intellectual property protection in the e-commerce environment. I will also talk about our recent research on improving the robustness of AI systems to attacks through self-supervised adversarial training.
12:30PM - 2:00PM
2:00PM - 3:00PM
Stable Learning: The Convergence of Causal Inference and Machine Learning
Peng Cui (Tsinghua University)
Predicting future outcome values from observed features, using a model estimated on a training data set, is a common machine learning problem. Many learning algorithms have been proposed and shown to be successful when the test data and training data come from the same distribution. However, the best-performing models for a given distribution of training data typically exploit subtle statistical relationships among features, making them potentially more prone to prediction error when applied to test data whose distribution differs from that of the training data. Developing learning models that are stable and robust to shifts in data is of paramount importance for both academic research and real applications. Causal inference, which refers to the process of drawing a conclusion about a causal connection based on the conditions of the occurrence of an effect, is a powerful statistical modeling tool for explanatory and stable learning. In this talk, we focus on causal inference and stable learning, aiming to explore causal knowledge from observational data to improve the interpretability and stability of machine learning algorithms.
3:00PM - 3:30PM
XDL+AI·OS: An End-to-End AI Platform in Alibaba for Search, Recommendation and Ads
Ralph Shen (Alibaba Group, Search and Alimama Department)
XDL is an open-source DL framework designed for high-dimensional sparse data. AI·OS (AI Online Serving) is a big data serving platform with DL inference capabilities. XDL+AI·OS is being developed into an end-to-end AI experiment, training, and serving platform.
3:30PM - 4:00PM
AliGraph: An Industrial Graph Neural Network Platform
Kun Zhao (Alibaba Group, AliCloud)
Graph Neural Networks (GNNs) have become an effective way to address the graph learning problem. To bring graph data into the framework of deep learning, several issues need to be addressed: new programming patterns and interfaces, large-scale heterogeneous graph data, and diversified neural network representations. We introduce AliGraph, a distributed computation system that bridges graphs and neural networks. It empowers end-to-end solutions for GNN researchers. As an independent and portable system, AliGraph offers interfaces that can be integrated with any tensor engine used for expressing neural network models. Because the flexible Gremlin-like interfaces are co-designed for both graph query and sampling, users can customize data access patterns freely. Moreover, AliGraph shows excellent performance and scalability. It allows pluggable operators to adapt to the fast development of the GNN community and outperforms existing systems by an order of magnitude in terms of graph building and sampling. A heterogeneous and attributed graph with tens of billions of edges and billions of nodes can be trained well based on AliGraph.
4:00PM - 4:30PM
TDM: Learned Tree-based Index and Deep Retrieval Model for Large-scale Recommendation
Han Li (Alibaba Group, Alimama Department)
For major Internet services such as search engines and recommender systems, exponentially increasing data volume brings big challenges for accurate matching between users and content. Large-scale recommender systems, for example, are usually confronted with computational problems due to the enormous corpus size. To retrieve and recommend the most relevant items to users under response time limits, resorting to an efficient index structure is an effective and practical solution. In the Alibaba display advertising team, we propose a framework that uses a learned tree-based index along with powerful deep retrieval models to provide fast and accurate user interest matching over the entire enormous corpus. This progress makes it possible to perform one-stage recommendation from a large corpus with advanced deep models, without limitations on network structure.
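The abstract does not spell out the retrieval procedure. One common way to use a tree index, shown here purely as an illustrative sketch, is a level-wise beam search that scores only the children of the nodes kept at the previous level instead of every item in the corpus. The tree `TREE`, the score table `SCORES`, and the functions `score` and `beam_search` are all hypothetical; the fixed score table stands in for a learned deep model.

```python
import heapq

# Hypothetical tree index: internal nodes map to children, leaves are items.
TREE = {
    "root": ["n1", "n2"],
    "n1": ["item_a", "item_b"],
    "n2": ["item_c", "item_d"],
}

# Stand-in for a learned model: a fixed table of user-to-node preference scores.
SCORES = {
    "n1": 0.9, "n2": 0.4,
    "item_a": 0.2, "item_b": 0.8, "item_c": 0.7, "item_d": 0.1,
}


def score(user, node):
    """Placeholder scorer; a real system would call a trained model here."""
    return SCORES[node]


def beam_search(user, tree, root="root", beam_width=1):
    """Descend the tree level by level, keeping the best `beam_width` nodes per level."""
    frontier = [root]
    leaves = []
    while frontier:
        # Nodes without children are leaves (items) and become results.
        leaves.extend(n for n in frontier if n not in tree)
        children = [c for node in frontier for c in tree.get(node, [])]
        frontier = heapq.nlargest(beam_width, children, key=lambda n: score(user, n))
    return heapq.nlargest(beam_width, leaves, key=lambda n: score(user, n))
```

With `beam_width=1`, the search in this toy tree follows `root`, then `n1`, then `item_b`, and never scores the items under `n2`. That pruning is what keeps the cost far below a scan of all items as the corpus grows.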
4:30PM - 5:00PM
A Large Scale Distributed Graph Learning System (Euler) and Its Applications in Alibaba’s Sponsored Search Platform
Liang Wang (Alibaba Group, Alimama Department)
Graphs are an important data representation that has been extensively studied in the literature. However, existing methods are mostly deployed on small-scale data containing fewer than millions of edges. In this talk, we will present our effort to apply graph learning methods to Alibaba’s sponsored search platform. The graph data contains billions of nodes representing users, products (and advertisers), and search queries, and hundreds of billions of edges linking them together, reflecting various relations (e.g. click/purchase behaviour, semantic similarity, relevance) between these nodes. We propose an efficient sampling method to make the training/inference procedure computationally feasible on such a large graph. A parallel deep neural network architecture with an attention mechanism is leveraged to make full use of the heterogeneous graph structure. By exploiting the graph structural information, we substantially improve the performance of our sponsored search system. Our work has been published as the open source project Euler, available at https://github.com/alibaba/euler.
5:00PM - 6:00PM
AnalytiCup: Alibaba E-Commerce AI Challenge
Award Ceremony and Finalists Reports
Chaired by Prof. Fei Wang (Cornell University)
User Behavior Diversities Prediction
If users and items are embedded into a heterogeneous graph, user-to-item behaviors can be treated as directed edges and item-to-item similarities can be encoded into the graph. The recommendation task can then be transformed into a graph link prediction problem. Such a transformation may bring a new view of recommender systems and further trigger novel algorithms to solve the bottleneck of the behavior prediction problem.
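A minimal sketch of this framing follows, assuming a toy graph and a simple neighborhood-overlap score. The data, the function names, and the scoring rule are illustrative assumptions; they are not the challenge data set or any participant's method.

```python
from collections import defaultdict

# Toy data (hypothetical): user -> item behaviors and item-item similarities.
user_item_edges = [("u1", "i1"), ("u1", "i2"), ("u2", "i2"), ("u2", "i3")]
item_item_edges = [("i1", "i3"), ("i2", "i3")]


def build_graph(user_item, item_item):
    """Return directed user->item edges and undirected item-item similarity edges."""
    out_edges = defaultdict(set)
    item_sim = defaultdict(set)
    for user, item in user_item:
        out_edges[user].add(item)
    for a, b in item_item:
        item_sim[a].add(b)
        item_sim[b].add(a)
    return out_edges, item_sim


def score_link(user, item, out_edges, item_sim):
    """Count the user's existing items that are similar to the candidate item."""
    return len(out_edges[user] & item_sim[item])


def predict_links(user, candidates, out_edges, item_sim, k=1):
    """Rank candidate items the user has no edge to yet, highest score first."""
    unseen = [c for c in candidates if c not in out_edges[user]]
    ranked = sorted(unseen, key=lambda c: (-score_link(user, c, out_edges, item_sim), c))
    return ranked[:k]


out_edges, item_sim = build_graph(user_item_edges, item_item_edges)
top_for_u1 = predict_links("u1", ["i1", "i2", "i3"], out_edges, item_sim)
```

Here `u1` already has edges to `i1` and `i2`, both of which are similar to `i3`, so the missing edge from `u1` to `i3` is the predicted link. A learned model would replace the overlap count with a trained scoring function.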
Efficient User Interests Retrieval
In recent years, considerable effort has been made in both academia and industry to improve the accuracy of user preference prediction. However, a large-scale recommender system has to trade off model effectiveness against efficiency because of response time limits. Our competition focuses on this problem, i.e., how to retrieve the top-k items for each user while avoiding exhaustive computation, which is both important and challenging.
- Jun 12, 2019 - Aug 13, 2019
- The Qualification
- Jul 5, 2019 - Aug 15, 2019
- The Semi-Finals
- Aug 22, 2019 - Sep 25, 2019
- The Finals
- Sep, 2019
- First Prize
  - Track 1 (one team): $10,000 USD
  - Track 2 (one team): $10,000 USD
- Second Prize
  - Track 1 (two teams): $5,000 USD for each
  - Track 2 (two teams): $5,000 USD for each
- Third Prize
  - Track 1 (two teams): $2,500 USD for each
  - Track 2 (two teams): $2,500 USD for each