Variable selection for data aggregated from different sources with group of variable structure

  1. Broc, Camilo Lucien
supervised by:
  1. Borja Calvo Molinos Thesis supervisor

Defense university: Universidad del País Vasco - Euskal Herriko Unibertsitatea

Defense date: 14 November 2019

Committee:
  1. Christophe Ambroise Chair
  2. Stéphane Robin Secretary
  3. Borja Calvo Molinos Member
  4. Astrid Jourdan Member
  5. Hélène Jacquemin Gadda Member

Type: Doctoral thesis

Teseo: 154306

Abstract

During the last decades, the amount of available genetic data on populations has grown drastically. On the one hand, refinements in chemical technologies have made it possible to extract the genome of individuals at an accessible cost. On the other hand, consortia of institutions and laboratories around the world have enabled the collection of data on a variety of individuals and populations. This amount of data has raised hopes about our ability to understand the deepest mechanisms involved in the functioning of our cells. Notably, genetic epidemiology is a field that studies the relation between genetic features and the onset of a disease. These analyses have required specific statistical methods, especially because of the dimensions of the available data: in genetics, information is contained in a large number of variables compared to the number of observations. In this dissertation, two contributions are presented. The first project, called PIGE (Pathway Interaction Gene Environment), deals with the assessment of gene-environment interactions. The second aims at developing variable selection methods for data that have group structures in both the variables and the observations.