Task Sensitive Feature Exploration and Learning for Multi-Task Graph Classification: Source Code and Datasets

 

This page contains a technical report on "Task Sensitive Feature Exploration and Learning for Multi-Task Graph Classification", and the source codes and datasets used in the project.

Technical Report: Task Sensitive Feature Exploration and Learning for Multi-Task Graph Classification [pdf]

Abstract:

Multi-task learning (MTL) is commonly used for jointly optimizing multiple tasks for learning. To date, all existing MTL methods are designed for tasks with feature-vector represented instances, but cannot be applied to structured data, such as graphs. More importantly, when carrying out MTL, existing methods mainly focus on exploring overall commonality or disparity between tasks for learning, but cannot explicitly capture task relationships in the feature space, so they are unable to answer some fundamental questions, such as what exactly is shared between tasks and what is the uniqueness of one task differentiating from others?

In this paper, we formulate a new multi-task graph learning problem, and propose a task sensitive feature exploration and learning algorithm for multi-task graph classification. As graphs do not have features available, we advocate a task sensitive feature exploration and learning paradigm to jointly discover discriminative subgraph features across different tasks. In addition, a learning process in the feature space is carried out to categorize each subgraph feature into three categories: common feature, task auxiliary feature, and task specific feature, indicating whether the feature is shared by all tasks, by a subset of tasks, or by only one specific task, respectively. The learned features and multiple tasks are iteratively co-regularized to form a multi-task graph classification model with a global optimization goal. Experiments on real-world functional brain analysis and chemical compound categorization demonstrate the algorithm's performance. Results confirm that our method can be used to explicitly capture task correlations and uniqueness in the feature space, and explicitly answer what are shared between tasks and what is the uniqueness of a specific task.

Keywords: Graph Classification, Feature Selection, Multi-task Learning

Source Code: [zip]

The code is writen in Matlab and Java. The main function is writen in Matlab, while the subgraph mining part is writen in Java. The package contains nine NCI graph datasets, which are used for multi-task graph classification in the report.

Description:

FelMuG algorithm can be applied to generic multi-task classification for vector data as well as graph data. In general, FelMuG iteratives solves two subproblems:

          (1) Multi-kernel Learning Problem

          (2) Most-violated constraint selection

For the first subproblem, we employ the [MKMT solver] to solve the multi-kernel multi-task problem. For the second subproblem, FelMuG will solve a simple integer linear problem (ILP) to get the most-violated constraint, and employ a Top K subgraph miner in Java as a first step followed by solving a simple ILP problem in the second step.

Folders and Files:

         FelMuG/ : core scripts for FelMuG algorithm;

         MKL/ : MKMT solver for solving the multi-kernel problem;

      FelMuGMiner : Top-K subgrpah mining written in JAVA. It also provides source codes for subgraph based graph classification, i.e., first mine a set of frequent subgraphs, and then employ SVMs for graph classification;

         NCI_data/ : NCI data used in the report

Demos:

We provides two examples for using FelMuG for multi-task classification, in both generic vector data and graph data.

        (1) demo_FelMuG_vector.m, FelMuG algorithm for generic vector data with synthetic data, which is used in our experiment

        (2) demo_FelMuG_graph.m, FelMuG algorithm for multi-task graph classification in NCI graphs, which are also used in our experiment

Last updated: March 2015