Case Studies (GEO Datasets)

Here we use the Gene Expression Omnibus (GEO) to provide example transcriptomics datasets for analysis. Each entry in the table below is a GEO Dataset, a curated gene-expression study. The following tasks will help you become familiar with basic analyses of this type of data:

  • Select a GEO dataset
  • Each CSV file is a matrix (gene x sample) of gene-expression data.
  • From the sample descriptions and the experimental design described on the GEO dataset web page, select groups of samples for differential gene-expression analysis (for example, control vs treated samples). Note that these groups may cover all of the samples in the file or only a subset.
  • For each gene, calculate the fold-change (the ratio between your sample groups, e.g. treated/control) and express it as a base-2 logarithm.
  • For each gene, you could also calculate a p-value (e.g. with a t-test) to test whether the difference between your sample groups is statistically significant.
  • Sort the data according to the calculated ratio (and/or p-value), and identify those genes where log2 ratio > 1 or log2 ratio < -1 or p < 0.05.
  • For the differential genes, use gene set enrichment and/or functional network analysis to uncover the pathways or processes that may be represented.
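The fold-change, p-value and filtering steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, not a prescribed method: the gene names and numbers are made up, the values are assumed to be positive and on a linear scale (if a file is already log2-transformed, the difference of group means would be used instead of the log of their ratio), and the test is Welch's unequal-variance variant of the t-test with the p-value obtained by numerically integrating the t density. In practice a statistics library such as SciPy would normally be used for the test.

```python
import math
from statistics import mean, variance


def log2_fold_change(treated, control):
    """log2 of the ratio of group means (treated / control)."""
    return math.log2(mean(treated) / mean(control))


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def welch_t_test(a, b, steps=2000):
    """Two-sided Welch t-test; returns (t statistic, p-value)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    if va + vb == 0:  # no variation in either group: test is undefined
        return float("nan"), float("nan")
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    # Simpson's rule for the integral of the density from 0 to |t|.
    h = abs(t) / steps
    area = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    return t, max(0.0, 1 - 2 * area)


# Toy expression matrix (made-up numbers): gene -> one value per sample.
expression = {
    "geneA": [10.0, 11.0, 9.0, 40.0, 42.0, 38.0],
    "geneB": [20.0, 21.0, 19.0, 20.5, 19.5, 20.0],
    "geneC": [80.0, 82.0, 78.0, 20.0, 21.0, 19.0],
}
control_idx = [0, 1, 2]  # hypothetical control sample columns
treated_idx = [3, 4, 5]  # hypothetical treated sample columns

results = []
for gene, values in expression.items():
    control = [values[i] for i in control_idx]
    treated = [values[i] for i in treated_idx]
    lfc = log2_fold_change(treated, control)
    _, p = welch_t_test(treated, control)
    results.append((gene, lfc, p))

# Sort by log2 ratio, then apply the thresholds from the task list.
results.sort(key=lambda r: r[1], reverse=True)
differential = [g for g, lfc, p in results if abs(lfc) > 1 or p < 0.05]
```

The filter follows the task list literally (either threshold is enough). Requiring both conditions, by replacing `or` with `and`, is a common stricter alternative. With thousands of genes tested at once, the p-values would usually also be corrected for multiple testing before interpretation.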
    Study Title                                                         GEO Link  Data
    PTEN deletion mutation effect on colon cancer cells                 GDS2446   mat.GDS2446.csv
    Colorectal adenoma formation                                        GDS2947   mat.GDS2947.csv
    Beta-catenin inactivation effect on intestinal crypts               GDS2984   mat.GDS2984.csv
    Beta-catenin deficiency model: kidney                               GDS3322   mat.GDS3322.csv
    Colon carcinoma response to butyrate and aspirin                    GDS332    mat.GDS332.csv
    Beta-catenin depletion effect on pancreatic cancer cell line        GDS3578   mat.GDS3578.csv
    Colorectal cancer tumors                                            GDS4382   mat.GDS4382.csv
    Colorectal cancer tumors                                            GDS4379   mat.GDS4379.csv
    Ls174T colon cancer cell line response to Wnt signaling inhibition  GDS4386   mat.GDS4386.csv
    5-aza-2-deoxycytidine effect on colorectal cancer cell lines        GDS4397   mat.GDS4397.csv
    Model of beta-catenin overexpression in embryonic kidney            GDS4449   mat.GDS4449.csv
    Sporadic medulloblastomas                                           GDS4469   mat.GDS4469.csv
    Medulloblastomas in children                                        GDS4471   mat.GDS4471.csv
    MYC-driven medulloblastoma model                                    GDS4478   mat.GDS4478.csv
    Colorectal cancer cell line SW480 response to Snail overexpression  GDS4596   mat.GDS4596.csv
    Central nervous system primitive neuroectodermal tumors             GDS4838   mat.GDS4838.csv
    Beta-catenin depletion effect on pancreatic cancer cell line        GDS5324   mat.GDS5324.csv
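
Each file listed above is described as a gene x sample matrix. Reading one file and picking out sample groups might look like the sketch below. The exact layout is an assumption, not something stated here: a header row of sample identifiers, the gene identifier in the first column, and numeric values elsewhere. The real files may contain extra annotation columns, so check one before relying on this. The helper names `read_expression_matrix` and `columns_for`, the sample names `S1` to `S4`, and the in-memory demo data are all hypothetical.

```python
import csv
import io


def read_expression_matrix(handle):
    """Read a gene x sample CSV into (sample_names, {gene: [values]})."""
    reader = csv.reader(handle)
    header = next(reader)
    samples = header[1:]  # assumed: first column holds the gene identifier
    matrix = {}
    for row in reader:
        if not row:
            continue
        gene, raw = row[0], row[1:]
        try:
            matrix[gene] = [float(v) for v in raw]
        except ValueError:
            continue  # skip genes with missing or non-numeric values
    return samples, matrix


def columns_for(samples, wanted):
    """Column indices of the sample names listed in `wanted`."""
    return [samples.index(name) for name in wanted]


# Stand-in for open("mat.GDS2446.csv", newline=""); the contents are made up.
demo = io.StringIO(
    "gene,S1,S2,S3,S4\n"
    "geneA,10.0,11.0,40.0,42.0\n"
    "geneB,20.0,,20.5,19.5\n"
)
samples, matrix = read_expression_matrix(demo)

# Group membership comes from the sample descriptions on the GEO dataset page.
control_idx = columns_for(samples, ["S1", "S2"])
treated_idx = columns_for(samples, ["S3", "S4"])
```

Dropping genes with missing values is the simplest choice for a first pass. Depending on the dataset, keeping them and testing only the available samples may be preferable.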