(Post)-doc seminars of Group STA on Wednesday, January 16, 2013

Sylvain Robiano and Charanpal Dhanjal will give two seminars on Wednesday, January 16, 2013, in room DB312 between 10 AM and noon. Titles and abstracts follow.

First talk:

Title: Ranking Ordinal Data: Optimality and Pairwise Aggregation

Abstract: In this talk, we describe key insights into the nature of K-partite ranking. On the theoretical side, the various characterizations of optimal elements are fully described, as well as the "likelihood ratio monotonicity" condition on the underlying distribution, which guarantees that such elements exist. Then, a pairwise aggregation procedure based on Kendall's tau is introduced to relate learning rules dedicated to bipartite ranking to solutions of the K-partite ranking problem. Criteria reflecting ranking performance under these conditions, such as the ROC surface and its natural summary, the volume under the ROC surface, are then considered as targets for empirical optimization. The consistency of pairwise aggregation strategies is studied under these criteria, and the strategies are shown to be efficient under reasonable assumptions. Finally, numerical results illustrate the relevance of the proposed methodology.
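
As a rough illustration of the statistic mentioned in the abstract, the sketch below computes Kendall's tau between two scoring rules evaluated on the same items. It does not reproduce the aggregation procedure of the talk, which the abstract does not spell out; the function name `kendall_tau` and the choice of the tie-ignoring "tau-a" variant are assumptions made for this example.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau (tau-a) between two score lists over the same items.

    Hypothetical helper for illustration: a pair of items is concordant when
    both scoring rules order it the same way, discordant when they disagree.
    Tied pairs count as neither.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both score lists must cover the same items")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(scores_a)), 2):
        product = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(scores_a) * (len(scores_a) - 1) // 2
    return (concordant - discordant) / n_pairs if n_pairs else 0.0
```

A value of 1 means the two scoring rules rank every pair of items identically, and -1 means they disagree on every pair.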

Second talk:

Title: An Empirical Comparison of V-fold Penalisation and Cross Validation for Model Selection in Distribution-Free Regression

Abstract: Model selection is a crucial issue in machine learning, and a wide variety of penalisation methods (with possibly data-dependent complexity penalties) have recently been introduced for this purpose. However, their empirical performance is generally not well documented in the literature. The goal of this paper is to investigate to what extent such recent techniques can be successfully used for tuning both the regularisation and kernel parameters in support vector regression (SVR) and the complexity measure in regression trees (CART). This task is traditionally solved via V-fold cross-validation (VFCV), which gives efficient results at a reasonable computational cost. A disadvantage of VFCV, however, is that the procedure is known to provide an asymptotically suboptimal risk estimate as the number of examples tends to infinity. Recently, a penalisation procedure called V-fold penalisation has been proposed to improve on VFCV, supported by theoretical arguments. Here we report on an extensive set of experiments comparing V-fold penalisation and VFCV for SVR/CART calibration on several benchmark datasets. We highlight cases in which VFCV and V-fold penalisation, respectively, provide poor estimates of the risk, and introduce a modified penalisation technique to reduce the estimation error.
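
For readers unfamiliar with VFCV, here is a minimal sketch of how a V-fold cross-validation risk estimate can drive model selection. It is only an illustration under stated assumptions: it covers plain VFCV, not V-fold penalisation or the modified technique of the talk, and it uses a simple piecewise-constant regressor on [0, 1) as a stand-in for SVR or CART. The names `fit_regressogram`, `vfcv_risk` and `select_n_bins` are hypothetical.

```python
def fit_regressogram(xs, ys, n_bins):
    """Fit a piecewise-constant regressor on [0, 1) with n_bins equal-width bins."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    fallback = sum(ys) / len(ys)  # used for bins that received no training point
    means = [s / c if c else fallback for s, c in zip(sums, counts)]
    return lambda x: means[min(int(x * n_bins), n_bins - 1)]


def vfcv_risk(xs, ys, n_bins, v=5):
    """Average held-out squared error over v interleaved folds."""
    n = len(xs)
    if not 2 <= v <= n:
        raise ValueError("need 2 <= v <= number of examples")
    fold_risks = []
    for fold in range(v):
        test = set(range(fold, n, v))  # every v-th example is held out
        train_x = [xs[i] for i in range(n) if i not in test]
        train_y = [ys[i] for i in range(n) if i not in test]
        predict = fit_regressogram(train_x, train_y, n_bins)
        fold_risks.append(sum((ys[i] - predict(xs[i])) ** 2 for i in test) / len(test))
    return sum(fold_risks) / v


def select_n_bins(xs, ys, candidates, v=5):
    """Pick the candidate complexity (number of bins) with the lowest VFCV risk."""
    return min(candidates, key=lambda d: vfcv_risk(xs, ys, d, v))
```

The same loop applies to any learner with a tunable complexity parameter: replace the fitting function and the candidate list, and keep the fold-averaged held-out error as the selection criterion.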
