TUTORIALS

Enterprise Knowledge Graph: From Specific Business Task to Enterprise Knowledge Management

Traditional Tutorial
Contact: Rong Duan
Room 301A, 8:30AM-12:00PM, Nov 3

  • Presenters
      Rong Duan (Huawei Technology)
      Yanghua Xiao (Fudan University)
  • Description

    Data-driven knowledge graphs are being rapidly adopted by different communities. Many open-domain and domain-specific knowledge graphs have been constructed, and many industries have benefited from them. Currently, enterprise-related knowledge graphs are classified as domain-specific, but their applications span from solving a narrow, specific problem to supporting an enterprise knowledge management system. With the digital transformation of traditional industries, enterprise knowledge is becoming increasingly complicated: in general, it involves knowledge from the common domain, from multiple specific domains, and from the corporation itself. This tutorial provides an overview of the current Enterprise Knowledge Graph (EKG). It distinguishes the EKG from domain-specific knowledge graphs according to the knowledge it covers, and provides examples to illustrate the difference between the two. The tutorial further summarizes EKGs into three types: Specific Business Task Enterprise KG, Specific Business Unit Enterprise KG, and Cross Business Unit Enterprise KG, and illustrates the characteristics, steps, challenges, and future research directions in constructing and consuming each of these three types.


Taming Social Bots: Detection, Exploration and Measurement

Traditional Tutorial
Contact: Abdullah Mueen
Room 301B, 8:30AM-12:00PM, Nov 3

  • Presenters
      Abdullah Mueen (University of New Mexico)
      Nikan Chavoshi (Oracle Corporation)
      Amanda Minnich (Lawrence Livermore National Laboratories)
  • Description

    Social bots have been around for over a decade now. They are capable of swaying political opinion, spreading false information, and recruiting for terrorist organizations. Social bots use various sophisticated techniques, such as adopting emotions, sympathy following, synchronous deletions, and profile molting.

    Several approaches have been proposed in the literature for detecting, exploring, and measuring social bots. We will provide a comprehensive overview of the existing work from a data mining and machine learning perspective, discuss the relative strengths and weaknesses of various methods, make recommendations for researchers and practitioners, and propose novel directions for future research in taming social bots. The tutorial will also discuss pitfalls in collecting and sharing data on social bots.


Learning and Reasoning on Graph for Recommendation

Traditional Tutorial
Contact: Wang Xiang
Room 302A, 8:30AM-12:00PM, Nov 3

  • Presenters
      Xiang Wang (National University of Singapore)
      Xiangnan He (University of Science and Technology of China)
      Tat-Seng Chua (National University of Singapore)
  • Description

    Recommendation methods construct predictive models to estimate the likelihood of a user-item interaction. Previous models largely follow a general supervised learning paradigm, treating each interaction as a separate data instance and performing prediction based on an “information isolated island”. Such methods, however, overlook the relations among data instances, which may result in suboptimal performance, especially in sparse scenarios. Moreover, models built only on separate data instances can hardly reveal the reasons behind a recommendation, making the recommendation process opaque.

    In this tutorial, we revisit the recommendation problem from the perspective of graph learning. Common data sources for recommendation can be organized into graphs, such as user-item interactions (bipartite graphs), social networks, and item knowledge graphs (heterogeneous graphs), among others. Such a graph-based organization connects the isolated data instances, making it possible to exploit high-order connectivities that encode meaningful patterns for collaborative filtering, content-based filtering, social influence modeling, and knowledge-aware reasoning. Together with the recent success of graph neural networks (GNNs), graph-based models have shown the potential to become the technology behind next-generation recommendation systems. The tutorial reviews graph-based learning methods for recommendation, with a special focus on recent developments in GNNs and knowledge graph-enhanced recommendation. By introducing this emerging and promising area, we expect the audience to gain a deep understanding of and accurate insight into the field, and we hope to stimulate more ideas and discussions and promote the development of these technologies.
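
    To make the idea of high-order connectivity concrete, the following is a minimal sketch, not taken from the tutorial material. It organizes a hypothetical toy set of user-item interactions as a bipartite graph and counts user -> item -> user -> item paths that end at items the user has not yet interacted with. The data, the function names, and the use of raw path counts as a score are all illustrative assumptions; GNN-based recommenders learn such signals rather than counting them.

```python
from collections import defaultdict

# Hypothetical toy interactions (user, item); illustrative only.
interactions = [
    ("u1", "i1"), ("u1", "i2"),
    ("u2", "i2"), ("u2", "i3"),
    ("u3", "i3"), ("u3", "i4"),
]


def build_bipartite(pairs):
    """Index the interactions in both directions of the bipartite graph."""
    user_items = defaultdict(set)
    item_users = defaultdict(set)
    for user, item in pairs:
        user_items[user].add(item)
        item_users[item].add(user)
    return user_items, item_users


def three_hop_scores(user, user_items, item_users):
    """Count user -> item -> user -> item paths ending at unseen items."""
    seen = user_items[user]
    scores = defaultdict(int)
    for item in seen:
        for neighbor in item_users[item]:
            if neighbor == user:
                continue
            for candidate in user_items[neighbor]:
                if candidate not in seen:
                    scores[candidate] += 1
    return dict(scores)


user_items, item_users = build_bipartite(interactions)
scores = three_hop_scores("u1", user_items, item_users)
print(scores)
```

    In this toy graph, u1 reaches i3 through the shared item i2 and the user u2, while i4 lies further away and is only reachable through longer paths.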


Synergy of Database Techniques and Machine Learning Models for String Similarity Search and Join

Traditional Tutorial
Contact: Jiaheng Lu
Room 302B, 8:30AM-12:00PM, Nov 3

  • Presenters
      Jiaheng Lu (University of Helsinki)
      Chunbin Lin (Amazon AWS)
      Jin Wang (University of California Los Angeles)
      Chen Li (University of California, Irvine)
  • Description

    String data is ubiquitous, and string similarity search and join are critical to applications in information retrieval, data integration, data cleaning, and big data analytics. To support these operations, many techniques have been proposed independently in the database and machine learning areas. More precisely, in the database research area, there are techniques based on the filtering-and-verification framework that not only achieve high performance but also provide guaranteed result quality for given similarity functions. In the machine learning research area, string similarity processing is modeled as a problem of identifying similar text records. Specifically, deep learning approaches use embedding techniques that map text to a low-dimensional continuous vector space.
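
    As a rough illustration of the filtering-and-verification framework, the sketch below performs a string similarity join under an edit-distance threshold k. It is a simplified example under stated assumptions, not one of the specific methods covered by the tutorial: the filter is a plain length filter (two strings whose lengths differ by more than k cannot be within edit distance k), and the verification step is the standard dynamic-programming edit distance. The function names and the sample strings are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity_join(left, right, k):
    """Return all pairs (a, b) whose edit distance is at most k."""
    results = []
    for a in left:
        for b in right:
            # Filter: a length gap larger than k rules the pair out
            # without computing the full edit distance.
            if abs(len(a) - len(b)) > k:
                continue
            # Verify: compute the exact distance for surviving candidates.
            if edit_distance(a, b) <= k:
                results.append((a, b))
    return results


pairs = similarity_join(["kitten", "graph"],
                        ["sitten", "sitting", "grape", "join"], 1)
print(pairs)
```

    Because the filter never discards a true match, the result is exact; practical systems replace the nested loop and the length filter with index-based filters that prune far more candidates.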

    In this tutorial, we review a large number of studies of string similarity search and join in these two research areas. We divide the studies in each area into different categories. For each category, we provide a comprehensive review of the relevant works, and present the details of these solutions. We conclude this tutorial by pinpointing promising directions for future work to combine techniques in these two areas.


Learning-Based Methods with Human-in-the-Loop for Entity Resolution

Traditional Tutorial
Contact: Kun Qian
Room 301A, 1:30PM-5:00PM, Nov 3

  • Presenters
      Sairam Gurajada (IBM Research)
      Lucian Popa (IBM Research)
      Kun Qian (IBM Research)
      Prithviraj Sen (IBM Research)
  • Description

    This tutorial is intended for researchers and practitioners working in the data integration area and, in particular, in entity resolution (ER), the sub-area focused on linking entities across heterogeneous datasets. We outline the ideal requirements of modern ER systems: (1) capture domain knowledge via (minimal) human interaction, (2) provide as much automation as possible via machine learning techniques, and (3) achieve high explainability. We describe recent research trends towards bringing such ideal ER systems closer to reality. We first give an overview of human-in-the-loop methods based on techniques such as crowdsourcing and active learning. We then dive into recent trends that involve deep learning techniques, such as representation learning to automate feature engineering, and combinations of transfer learning and active learning to reduce the number of required user labels. We also discuss how explainable AI relates to ER and review some recent advances towards explainable ER.


Recent Developments of Deep Heterogeneous Information Network Analysis

Traditional Tutorial
Contact: Shi Chuan
Room 301B, 1:30PM-5:00PM, Nov 3

  • Presenters
      Chuan Shi (Beijing University of Posts and Telecommunications)
      Philip Yu (University of Illinois at Chicago)
  • Description

    Most real systems consist of a large number of interacting, multi-typed components, yet most contemporary research models them as homogeneous information networks, without distinguishing the different types of objects and links in the networks. Recently, more and more researchers have begun to treat these interconnected, multi-typed data as heterogeneous information networks (HINs) and to develop structural analysis approaches that leverage the rich semantics of the typed objects and links in the networks. Furthermore, recent advances in deep learning and network embedding pose new opportunities and challenges for mining HINs, and heterogeneous network embedding, and even heterogeneous graph neural networks, are becoming hot topics.
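
    The sketch below illustrates how typed links can carry semantics that a homogeneous view would lose. It is an assumption-laden toy example rather than a method from the tutorial: it stores a small hypothetical bibliographic network as typed edges and counts Author-Paper-Author path instances (often called a meta-path in the HIN literature) to relate authors. The edge types, node names, and function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical typed edges: (source, edge_type, target); illustrative only.
edges = [
    ("alice", "writes", "p1"),
    ("bob", "writes", "p1"),
    ("bob", "writes", "p2"),
    ("carol", "writes", "p2"),
    ("p1", "published_in", "CIKM"),
    ("p2", "published_in", "KDD"),
]


def index_by_type(edges, edge_type):
    """Group the edges of one type by target: target -> set of sources."""
    target_sources = defaultdict(set)
    for source, etype, target in edges:
        if etype == edge_type:
            target_sources[target].add(source)
    return target_sources


def coauthor_counts(edges):
    """Count Author-Paper-Author path instances for each author pair."""
    paper_authors = index_by_type(edges, "writes")
    counts = defaultdict(int)
    for authors in paper_authors.values():
        ordered = sorted(authors)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                counts[(first, second)] += 1
    return dict(counts)


counts = coauthor_counts(edges)
print(counts)
```

    Only "writes" edges contribute to this relation; following "published_in" edges instead would relate papers through shared venues, which is a different semantic relation over the same network.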


Recommendation for Multi-Stakeholders and through Neural Review Mining

Hands-on Tutorial
Contact: Muthusamy Chelliah
Room 302A, 1:30PM-5:00PM, Nov 3

  • Presenters
      Muthusamy Chelliah (Flipkart, India)
      Yong Zheng (Illinois Institute of Technology, USA)
      Sudeshna Sarkar (IIT Kharagpur, India)
      Vishal Kakkar (Flipkart, India)

Machine Learning on Graphs with Kernels

Hands-on Tutorial
Contact: Michalis Vazirgiannis
Room 302B, 1:30PM-5:00PM, Nov 3

  • Presenters
      G. Nikolentzos (LIX, École Polytechnique)
      M. Vazirgiannis (LIX, École Polytechnique)
      Giannis Siglidis (UPMC, France)

Realtime object detection via deep learning-based pipelines

Hands-on Tutorial
Contact: James G. Shanahan
Room 301B, 1:30PM-5:00PM, Nov 4

  • Presenters
      James G. Shanahan (Bryant University, RI; University of California, Berkeley)
      Liang Dai (Facebook, University of California Santa Cruz)
  • Description

    Ever wonder how the Tesla Autopilot works (or why it fails)? In this tutorial we will look under the hood of self-driving cars and other applications of computer vision, and review state-of-the-art tech pipelines, their challenges, and their implementations in Jupyter Notebooks using Python, OpenCV, Keras, and TensorFlow.

    Computer vision (CV) has been revolutionized by deep learning in the past 7-8 years. Exciting real-world deployments of computer vision are appearing in the cloud and on the edge. Examples include autonomous vehicles, face detection, checkout-less shopping, security systems, cancer detection, and more.

    In this tutorial, we will briefly review the basics of computer vision before focusing on object detection and other computer vision areas from the following perspectives: state-of-the-art research, key algorithms, applications, and open challenges. We also present state-of-the-art pipelines that are used in application areas such as advanced driver assistance systems (ADAS), driver monitoring systems (DMS), disease detection (e.g., lung cancer and heart disease), and security and surveillance systems. These pipelines are based on complex deep convolutional neural network (CNN) architectures (often 50-60 layers deep or more) and multi-task loss functions, and are either two-stage (e.g., Faster R-CNN) or single-stage (e.g., YOLO/SSD) in nature. We will demonstrate in a Jupyter notebook how to build, train, and evaluate computer vision applications, with a primary focus on building an object detection application from scratch to detect objects such as logos in images and video. Recent developments in object detection, such as panoptic segmentation, will also be reviewed.
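
    As a small taste of what evaluating an object detector involves, the sketch below computes intersection over union (IoU), a standard measure of overlap between a predicted and a ground-truth bounding box. It is an illustrative sketch rather than code from the tutorial notebooks; the (x_min, y_min, x_max, y_max) box format and the function name are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is assumed to be (x_min, y_min, x_max, y_max).
    """
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes yield no intersection area.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


predicted = (0, 0, 2, 2)
ground_truth = (1, 1, 3, 3)
print(iou(predicted, ground_truth))
```

    A detection is commonly counted as correct when its IoU with a ground-truth box exceeds a chosen threshold such as 0.5; the threshold itself is a design choice of the evaluation protocol.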
