
Postdoctoral position (Strasbourg): NLP, ML, KE for named entity extraction in databases

Country/Region : France

Website : http://www.unistra.fr

Description

The BFO team of ICube offers a 12-month post-doctoral position in Strasbourg (France), starting in January 2015, within the NERD (Named Entities in Relational Databases) research project.
Study of techniques coming from natural language processing, machine learning and knowledge engineering for the extraction of named entities in sparse texts in databases
Project Description
The overall goal of this research is to derive knowledge from unstructured and, more importantly, unlabelled data in an unsupervised manner. This, for example, will allow the content of a table in a database to be categorised, which will allow higher-level processes to be developed which can use this knowledge. Many enterprise level databases are very large, and deriving such knowledge through manual means would be prohibitively expensive and the information needed would be difficult to specify.
First we define the key difference between natural language documents and a typical enterprise database. A webpage, or book, is generally characterised by paragraphs of text, each containing natural language that follows specific topics. In contrast to this, databases often contain fragmented information such as addresses, descriptions, colours, labels, telephone numbers, etc. As such, they contain information that is not accompanied by the additional context that would be present within a natural language document and would be exploited by current state-of-the-art algorithms (for example the contextual information that would be derived from a sentence's structure or surrounding words). Of course, databases may also contain natural language documents, but this is the exception (in commercial databases) rather than the norm. Nevertheless, the developed algorithms should be applicable to the spectrum of documents that range between sparse (traditional business) databases and dense (natural language) databases. Experiments on real-world databases are expected, and access to an extensive corpus of data will be provided.
Expected Outcomes:
Proposal of a new method for identifying named entities in sparse text. This method will be validated through publication in a high level international conference.
Final report including a technical description of the method and results of experimental validation.
The partners of the project are:
the BFO group of ICube, specialized in data mining and knowledge engineering;
the FDT group of LiLPa, specialized in natural language processing;
the Laboratoire Quantup, experienced in the application of machine learning and pattern recognition methods to commercial database systems.
Candidates applying for this position should have a PhD in Computer Science with a good background in Natural Language Processing, Machine Learning and formal Knowledge Representation. Experience in Java programming and distributed systems (for example using the Map-Reduce framework and/or Spark) is desired. A good knowledge of English is required along with an intermediate level of French.
Candidates should send an academic curriculum vitae (including a list of publications and the names and contact details of two referees), along with a cover letter.
Deadline for application: 31st October 2014
Expected starting date: January 2015 (flexible)
Contact: Delphine Bernhard (dbernhard-AT-unistra.fr) and Tom Lampert (t.lampert-AT-laboquantup.eu)
Salary: 2200 € per month (net, not including income tax).
Location
The University of Strasbourg traces its roots back to 1538 and is the second largest university in France. It is amongst Europe's best in the League of European Research Universities, it consistently features on world university rankings, and is well known for its international level research output. The ICube research group brings together researchers of the University of Strasbourg, the CNRS (Centre National de la Recherche Scientifique), the ENGEES and the INSA of Strasbourg in the fields of engineering science, computer science and medical science. With around 500 members and 14 research groups, ICube is a major driving force for research in Strasbourg. The work will take place at ICube’s offices in Illkirch, approximately fifteen minutes by public transport from the centre of Strasbourg -- a historic university city, well connected to Paris, Switzerland and Germany.

Last modified: 2014-09-21 22:33:20