Gene Expression Profiling and Supervised Machine Learning," Nature, January 2002.

This document summarizes the files available on this supplemental information website.

Last updated on 12/18/01

----------------------------------------------------------------------

Files:

Shipp_et_al_Supplementary_Information_v5.doc - A Microsoft Word document containing supporting information and details about methods for results presented in the paper.

Shipp_et_al_Supplementary_Information_v5.pdf - A PDF-formatted document containing supporting information and details about methods for results presented in the paper.

lymphoma_8_lbc_fscc2_rn.res - Diffuse Large B-Cell Lymphoma (DLBCL) versus Follicular Lymphoma (FL) morphology res file. This file contains expression values in Affymetrix's scaled average difference units (as described in the supplemental information document) for all of the DLBCL and FL samples used in this study. These average difference values were generated by Affymetrix's GeneChip software. Associated with each average difference expression number is a P, M, or A label that indicates whether RNA for the gene is present, marginal, or absent, respectively (as determined by the GeneChip software), based upon the matched and mismatched probes for the gene. The file is organized such that columns contain data for samples and rows contain data for genes.

lymphoma_8_lbc_fscc2.cls - Diffuse Large B-Cell Lymphoma (DLBCL) versus Follicular Lymphoma (FL) morphology class file. This file contains the class designations in the format used by our GeneCluster software, where a 1 indicates a DLBCL sample and a 0 indicates an FL sample.

lymphoma_8_lbc_outcome_rn.res - Diffuse Large B-Cell Lymphoma (DLBCL) outcome prediction res file. This file contains expression values in Affymetrix's scaled average difference units (as described in the supplemental information document) for all of the DLBCL samples used for outcome prediction in this study.
These average difference values were generated by Affymetrix's GeneChip software. Associated with each average difference expression number is a P, M, or A label that indicates whether RNA for the gene is present, marginal, or absent, respectively (as determined by the GeneChip software), based upon the matched and mismatched probes for the gene. The file is organized such that columns contain data for samples and rows contain data for genes.

lymphoma_8_lbc_outcome.cls - Diffuse Large B-Cell Lymphoma (DLBCL) outcome prediction class file. This file contains the class designations in the format used by our GeneCluster software, where a 0 indicates a DLBCL sample from a cured patient and a 1 indicates a DLBCL sample from a patient with fatal or refractory disease.

lymphoma_clinical_011127.xls - A Microsoft Excel worksheet containing clinical information for the lymphoma samples used in the study. This table cross-references the sample identifier with the full International Prognostic Index (Full IPI), the patient's survival time in months from diagnosis to the latest follow-up (SURTIME), the patient's disease status at last follow-up (STATUS), and the outcome class (OUTCOME). The definitions for these data types are described in detail in the supplemental information document.

lymphoma_common_unigene.xls - A Microsoft Excel worksheet containing the details of the mapping between the Alizadeh et al. cell-of-origin markers and the Affymetrix HU6800 markers. This worksheet has three sheets:
  1) a sheet that gives all Unigene tags that map from both the HU6800 set of genes and the Alizadeh et al. genes in the GC B-like versus the activated PB-like distinction,
  2) a sheet with the Lymphochip clone numbers for those Unigene tags in both sets, and
  3) a sheet with HU6800 probe ids for those Unigene tags in both sets.
Lymph_LBC_1-29.CEL.tar.gz - A gzipped tar file containing 29 CEL files with raw data from the Affymetrix GeneChip software for DLBCL samples DLBC1 through DLBC29. These files contain the measurements for each of the probes on the Affymetrix HU6800 microarrays that are used to calculate the average difference values.

Lymph_LBC_30-58.CEL.tar.gz - A gzipped tar file containing 29 CEL files with raw data from the Affymetrix GeneChip software for DLBCL samples DLBC30 through DLBC58. These files contain the measurements for each of the probes on the Affymetrix HU6800 microarrays that are used to calculate the average difference values.

Lymph_FSCC_1-19.CEL.tar.gz - A gzipped tar file containing 19 CEL files with raw data from the Affymetrix GeneChip software for Follicular Lymphoma samples FSCC1 through FSCC19. These files contain the measurements for each of the probes on the Affymetrix HU6800 microarrays that are used to calculate the average difference values.

Lymphoma_Shipp_et_al_Fig5.xls - A Microsoft Excel worksheet containing expanded versions of Figure 5a and 5b from the supplemental information document and the paper. The supplemental information contains expansions of the paper figures with all of the gene and sample labels and corresponding descriptions. The versions in this worksheet are more easily readable than those contained in the supplemental information.

----------------------------------------------------------------------

Notes:

- The raw data (CEL files) for the scans are contained in compressed archive files created with the Unix commands tar and gzip. The raw data files can be extracted using WinZip on the PC (http://www.winzip.com) or StuffIt Expander on the Mac (http://www.stuffit.com/expander/).
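
- The same archives can also be unpacked programmatically. The following is only a minimal sketch and is not part of the original distribution: the function name extract_cel_archive and the output folder cel_files are hypothetical, and the .CEL suffix of the archive members is assumed from the file descriptions above. It uses Python's standard tarfile module and writes every regular file into a single folder, discarding any directory structure stored in the archive.

```python
import shutil
import tarfile
from pathlib import Path


def extract_cel_archive(archive_path, dest_dir):
    """Unpack a gzipped tar archive into dest_dir.

    Returns the sorted paths of the extracted files whose suffix is .CEL
    (compared case-insensitively). The function name is hypothetical.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive.getmembers():
            # Skip directories, links, and other non-regular entries.
            if not member.isfile():
                continue
            # Keep only the base name so nothing is written outside dest.
            target = dest / Path(member.name).name
            source = archive.extractfile(member)
            with source, open(target, "wb") as out:
                shutil.copyfileobj(source, out)
    return sorted(p for p in dest.iterdir() if p.suffix.upper() == ".CEL")


if __name__ == "__main__":
    # Archive name taken from the file list above; "cel_files" is an
    # arbitrary output folder chosen for this example.
    archive_name = "Lymph_FSCC_1-19.CEL.tar.gz"
    if Path(archive_name).exists():
        cel_files = extract_cel_archive(archive_name, "cel_files")
        print(len(cel_files), "CEL files extracted")
```

  According to the file descriptions above, each of the two DLBCL archives should yield 29 CEL files and the Follicular Lymphoma archive 19, so the printed count is a quick check that a download is complete.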