########################################################################################################
# The Broad Institute
# SOFTWARE COPYRIGHT NOTICE AGREEMENT
# This software and its documentation are copyright 2007 by the
# Broad Institute/Massachusetts Institute of Technology.
# All rights are reserved.
#
# This software is supplied without any warranty or guaranteed support
# whatsoever. Neither the Broad Institute nor MIT can be responsible for
# its use, misuse, or functionality.
########################################################################################################
#
# An erythroid differentiation signature predicts response to lenalidomide in Myelodysplastic Syndrome
# Benjamin L. Ebert, Naomi Galili, Pablo Tamayo, Jocelyn Bosco, Raymond Mak, Jennifer Pretz,
# Christine Ladd-Acosta, Richard Stone.
#
# Author: Pablo Tamayo  -  April 12, 2007 tamayo@broad.mit.edu
#
# The R script below implements the regression model described in the paper to
# predict response to lenalidomide in patients with Myelodysplastic Syndrome.
#
# Additional auxiliary functions are included in file: Rev.msig.library.1.R
#
# The program uses a control gene signature defined on 5 housekeeping genes to normalize the expression
# data. It also performs column-rank normalization and row-standardization to make the model more robust and
# less dependent on platform idiosyncrasies. The script computes a score that summarizes the
# genes in a signature.
# This score is used as input to a probit linear regression model that is trained on the train set and
# applied to the test set.
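#
# The normalization and scoring steps above can be sketched roughly as follows. This is an
# illustration only, not the code used in the paper: the sketch.* function names are hypothetical,
# and both the use of the mean as the summary statistic and the subtraction of the control
# signature are assumptions made for this example.
#
sketch.col.rank.normalize <- function(m) {
   # Assumed form: replace each column (sample) by the ranks of its values, scaled to (0, 1].
   r <- apply(m, 2, rank, ties.method = "average") / nrow(m)
   dimnames(r) <- dimnames(m)
   r
}
sketch.row.standardize <- function(m) {
   # Center each row (gene) to mean 0 and scale it to standard deviation 1.
   # Rows that are constant across samples produce NaN.
   t(scale(t(m)))
}
sketch.signature.score <- function(m, signature, control) {
   # Assumed form: per-sample mean over the signature genes minus the mean over the control genes.
   colMeans(m[signature, , drop = FALSE]) - colMeans(m[control, , drop = FALSE])
}
#
# Possible use, where m is a genes x samples matrix with gene names as row names and
# sig.genes and ctrl.genes are character vectors of gene names (all three are placeholders):
#
#   m.norm <- sketch.row.standardize(sketch.col.rank.normalize(m))
#   score  <- sketch.signature.score(m.norm, signature = sig.genes, control = ctrl.genes)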
#
# If you want to run this program, make sure you change the file pathnames to the appropriate locations
# on your computer. Also make sure the input datasets are in the right locations and that the
# parameters are set properly for your purposes. Cut and paste the script into the
# R GUI to run the program. With the default settings you should be able to reproduce the heatmaps and
# prediction results reported in the paper and shown in figure 4. The program is set up to run the
# Affymetrix files, but it can easily be changed to run the Luminex or qPCR datasets by just uncommenting the
# corresponding file pathnames below. By default the program uses a classic fit to a
# probit regression model using R's glm function, but it can also use a Bayesian fit using the MCMCpack package.
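#
# A minimal sketch of the classic probit fit with R's glm function, for illustration only.
# The sketch.* function names and the 0/1 coding of the response are assumptions of this
# example, not taken from the paper. The Bayesian alternative is not shown here.
#
sketch.probit.fit <- function(score, response) {
   # score: numeric signature score per train sample
   # response: 0/1 vector with one entry per train sample (assumed coding: 1 = responder)
   glm(response ~ score, family = binomial(link = "probit"),
       data = data.frame(score = score, response = response))
}
sketch.probit.predict <- function(fit, score) {
   # Predicted probability of response for new samples, given their signature scores.
   predict(fit, newdata = data.frame(score = score), type = "response")
}
#
# Possible use, where the four vectors are placeholders for the train and test set values:
#
#   fit   <- sketch.probit.fit(train.score, train.response)
#   probs <- sketch.probit.predict(fit, test.score)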
#
# A more general version of this program will be made available as part of a forthcoming publication.
#
# This program comes in a ZIP file with the following datasets:
#
# Revlimid_Affy_train_5qm.gct  Affymetrix train set 5q- samples
# Revlimid_Affy_train_5qm.cls
#
# Revlimid_Affy_train_non_5qm.gct  Affymetrix train set non-5q- samples
# Revlimid_Affy_train_non_5qm.cls
#
# Revlimid_Affy_test_5qm.gct  Affymetrix test set 5q- samples
# Revlimid_Affy_test_5qm.cls  
#
# Revlimid_Affy_test_non_5qm.gct  Affymetrix test set non-5q- samples
# Revlimid_Affy_test_non_5qm.cls
#
# Revlimid_Affy_all.gct  Affymetrix all samples
# Revlimid_Affy_all.cls
#
# Revlimid_Affy_all_5qm.gct  Affymetrix all 5q- samples
# Revlimid_Affy_all_5qm.cls
#
# Revlimid_Affy_all_non_5qm.gct  Affymetrix all non-5q- samples
# Revlimid_Affy_all_non_5qm.cls
#
# Revlimid_Luminex_train_non_5qm.gct  Luminex train set non-5q- samples
# Revlimid_Luminex_train_non_5qm.cls
#
# Revlimid_Luminex_test_non_5qm.gct  Luminex test set non-5q- samples
# Revlimid_Luminex_test_non_5qm.cls
#
# Revlimid_qPCR_train_non_5qm.gct   qPCR train set non-5q- samples
# Revlimid_qPCR_train_non_5qm.cls
#
# signatures.gmt  file containing the erythroid and control signatures
#
# Upon completion the program produces the following output files:
#
# Revlimid.Affy.paper.plot.1.train.set.jpeg  heatmap for signature in the train set 
# Revlimid.Affy.paper.plot.2.train.set.jpeg  prediction scores in the train set 
# Revlimid.Affy.paper.plot.1.test.set.jpeg  heatmap for signature in the test set
# Revlimid.Affy.paper.plot.2.test.set.jpeg  prediction scores in the test set
#
# Additional output files contain the regression model scores and the predictive probabilities:
#
# Revlimid.Affy.train.set.zscore.jpeg  
# Revlimid.Affy.train.set.zscore.gct
# Revlimid.Affy.train.set.probs.gct 
# Revlimid.Affy.train.set.post.prob.jpeg
# Revlimid.Affy.test.set.zscore.jpeg
# Revlimid.Affy.test.set.zscore.gct
# Revlimid.Affy.test.set.probs.gct
# Revlimid.Affy.test.set.post.prob.jpeg
