The goal of this challenge is to predict adverse drug reactions (ADRs) from gene expression data. The gene expression data was generated with the L1000 technology at the Broad Institute by the Connectivity Map Team. Expression of 978 genes was measured directly in human cancer cell lines before and after drug treatment. The expression data provided for this mega-challenge consists of gene expression signatures of drug treatments, calculated using the Characteristic Direction method. This data should be treated as the feature set (the attributes). The class matrix contains the ADRs for the same drugs that were profiled for gene expression. The ADRs were extracted from the SIDER database and coded into their MedDRA High Level Group Terms (HLGT).
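To make the layout of the two matrices concrete, here is a minimal sketch of the setup described above: one 978-value signature per drug as the features, and one binary row of ADR labels per drug as the classes. Everything in it is an assumption for illustration. The drug names, the number of ADR columns, and the values are random placeholders rather than real L1000 or SIDER data, and the 1-nearest-neighbour rule with cosine similarity is only a simple stand-in baseline, not a method prescribed by the challenge.

```python
import math
import random

N_GENES = 978  # number of directly measured genes, per the description


def make_toy_data(n_drugs=6, n_adrs=3, seed=0):
    """Build random placeholder matrices; these are NOT real L1000 or SIDER values."""
    rng = random.Random(seed)
    drugs = [f"drug_{i}" for i in range(n_drugs)]  # hypothetical drug IDs
    # Feature matrix: one expression signature (978 values) per drug.
    features = {d: [rng.gauss(0.0, 1.0) for _ in range(N_GENES)] for d in drugs}
    # Class matrix: one binary indicator per ADR group (HLGT) per drug.
    labels = {d: [rng.randint(0, 1) for _ in range(n_adrs)] for d in drugs}
    return features, labels


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_adrs(query, features, labels):
    """1-nearest-neighbour: copy the ADR row of the most similar known drug."""
    nearest = max(features, key=lambda d: cosine(query, features[d]))
    return labels[nearest]


features, labels = make_toy_data()

# Hold one drug out, then predict its ADR row from the remaining drugs.
held_out = "drug_0"
train_features = {d: v for d, v in features.items() if d != held_out}
predicted = predict_adrs(features[held_out], train_features, labels)

print(len(features[held_out]), "features ->", len(predicted), "ADR labels")
```

Because each drug can have several ADRs at once, this is a multi-label problem: the prediction for a drug is a whole row of the class matrix, not a single class.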


