
PhD Position in machine learning and information retrieval at Hubert Curien Laboratory (Saint-Etienne, France) and the Laboratoire d'Informatique de Grenoble (Grenoble, France)

Country/Region : France

Website : http://perso.univ-st-etienne.fr/me63854h

Description

A fully funded three-year Ph.D. position is open at the Hubert Curien Laboratory (Saint-Etienne, France) and the Laboratoire d'Informatique de Grenoble (Grenoble, France) with Prof. Massih-Reza Amini (http://ama.liglab.fr/~amini/) and Dr. Emilie Morvant (http://perso.univ-st-etienne.fr/me63854h/).
Ph.D. Title: Annotations transfer in a domain adaptation framework
Keywords: Machine Learning, Information Retrieval, Transfer Learning, Representation Learning
Starting date: September or October 2015
Application deadline: May 18th, 2015
Decision announcement date: June 15th, 2015
# Application:
The application should include, in a single PDF file:
- Letter of intent
- Grades and ranking during Master 1 and Master 2
- Scientific CV
- List of publications (if any)
- Referees
# Contact:
Emilie Morvant: emilie.morvant-AT-univ-st-etienne.fr
Massih-Reza Amini: massih-reza.amini-AT-imag.fr
# Profile
For this position, we are looking for highly motivated candidates with a passion for machine learning and the skills to develop prediction algorithms for real-life applications. We are looking for an inquisitive mind with the curiosity to use new and challenging technology. The applicant must have a Master of Science in Computer Science, Statistics, or a related field, possibly with a background in information retrieval and/or optimization. The working language in the lab is English; good written and oral communication skills are required.
# Description
Nowadays, due to the expansion of the web, large amounts of data are available, and many applications need supervised machine learning methods that can take different information sources into account. However, such methods rely on the availability of annotated data, which can be difficult and costly to obtain. The objective of this thesis is to tackle the issue of transferring annotations from different source datasets to a non-annotated target dataset: the goal is to learn a model for the target dataset using the source annotations. This issue is known as domain adaptation, and one solution consists of (a) finding a common representation space for the source and target data; (b) learning a well-performing model in this space; (c) applying the model to new target data.
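As a purely illustrative sketch of steps (a) to (c), and not the method the thesis will develop, the toy example below uses per-domain mean centering as the common representation and a nearest-centroid classifier as the model. The synthetic data, the helper names (`center`, `fit_centroids`, `predict`, `make_domain`) and the choice of mean centering are all assumptions made for illustration.

```python
import numpy as np

def center(X):
    """Step (a): map a domain into a shared space by removing its own mean."""
    return X - X.mean(axis=0)

def fit_centroids(X, y):
    """Step (b): learn one centroid per class from annotated source data."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict(X, classes, centroids):
    """Step (c): assign each point the class of its nearest centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

def make_domain(rng, n_per_class, shift):
    """Hypothetical toy data: two Gaussian classes, translated by `shift`."""
    X0 = rng.normal(loc=[-2.0, 0.0], scale=0.5, size=(n_per_class, 2))
    X1 = rng.normal(loc=[2.0, 0.0], scale=0.5, size=(n_per_class, 2))
    X = np.vstack([X0, X1]) + np.asarray(shift)
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y

rng = np.random.default_rng(0)
X_src, y_src = make_domain(rng, 100, shift=[0.0, 0.0])
# Target labels are generated only to evaluate the result; they are never used for training.
X_tgt, y_tgt = make_domain(rng, 100, shift=[5.0, 5.0])

# Baseline: train on the raw source data and apply directly to the raw target data.
classes, centroids = fit_centroids(X_src, y_src)
acc_no_adapt = (predict(X_tgt, classes, centroids) == y_tgt).mean()

# Adapted: align both domains first, then train on source and apply to target.
classes_a, centroids_a = fit_centroids(center(X_src), y_src)
acc_adapt = (predict(center(X_tgt), classes_a, centroids_a) == y_tgt).mean()

print("accuracy without adaptation:", acc_no_adapt)
print("accuracy with mean alignment:", acc_adapt)
```

Mean centering only corrects a pure translation between domains; realistic shifts between document collections would call for a richer learned representation, which is the kind of question the thesis addresses.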
From a theoretical standpoint, the guarantees of learning a good model are usually not precise. This implies that one has no principled means of validating the chosen representation space and the learned model. The first objective of this thesis is to exploit the recent PAC-Bayesian domain adaptation framework to propose new theoretical analyses that take into account (1) the representation space explicitly and (2) the dependencies between the features of the considered data.
As a practical application of our new theory, this thesis will tackle domain adaptation for information retrieval tasks. A typical example is the problem of learning the parameters of models on a non-annotated dataset consisting of a set of documents and a set of queries with no relevance judgements. Rather than building relevance judgements for the new collection, we will exploit already annotated data to learn the best values of the parameters of the information retrieval models on the targeted dataset. This scenario is common in information retrieval, but also in other domains such as text or image classification, where new collections need to be classified into existing taxonomies even though no annotation is available for these new collections.

Last modified: 2015-04-17 23:07:35