Affymetrix Power Tools (APT)
There are two kinds of Affymetrix expression arrays: 3' Gene Expression Arrays and Whole-Transcript (WT) Expression Arrays.
See all Affymetrix expression arrays.
3' Gene Expression Arrays can be normalized only at gene level, while Whole-Transcript Expression Arrays can be normalized at exon or gene level (the arrays called Gene Level can in fact also be normalized at exon level).
In general, Babelomics has to detect the type of chip the user submits and classify it as 3' or Whole-Transcript. To do this we need a list of all the array names Babelomics accepts, together with their classification.
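The classification step could be sketched as a simple lookup from array name to chip type. This is only an illustration: the function name classify_chip and the labels "3prime" and "wt" are hypothetical, and only the two array names used on this page are listed, whereas a real table would have to cover every array Babelomics accepts.

```shell
#!/bin/sh
# Hypothetical lookup: map an array name to "3prime" or "wt".
# Only the two arrays used on this page are listed here.
classify_chip() {
    case "$1" in
        HG-U133A_2)       echo "3prime" ;;
        MoGene-1_0-st-v1) echo "wt" ;;
        *)                echo "unknown"; return 1 ;;
    esac
}

classify_chip "HG-U133A_2"
classify_chip "MoGene-1_0-st-v1"
```

An unknown name returns a non-zero status, so a caller can reject the upload before any APT command is run.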
Several "Library Files" are needed to normalize each kind of chip using APT. This is the complete list of Library Files provided by Affymetrix. In general, we will have to download almost all of them.
In general we will need the following analysis options.
For all chip types:
-a rma
or, when the dataset is big
-a rma-sketch
If the chip type is Whole-Transcript:
-a plier-gcbg
or, when the dataset is big
-a plier-gcbg-sketch
If the chip type is 3':
-a plier-mm
or, when the dataset is big
-a plier-mm-sketch
For detection calls, if the chip type is 3':
-a pm-mm,mas5-detect.calls=1.pairs=1
or, if the chip type is Whole-Transcript
-a dabg
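As a rough sketch, the choice of analysis options could be made by a small helper. The function name apt_analyses, the chip type labels "3prime" and "wt", and the BIG_DATASET cut-off are all assumptions of this example; this page does not say how many CEL files make a dataset "big". The pairing of options with chip types follows the apt-probeset-summarize commands further down this page.

```shell
#!/bin/sh
# Print the -a options for apt-probeset-summarize.
# $1: chip type, "3prime" or "wt" (labels assumed for this sketch)
# $2: number of CEL files in the dataset
# BIG_DATASET is a hypothetical cut-off, not a value taken from APT.
BIG_DATASET=${BIG_DATASET:-100}

apt_analyses() {
    chip_type=$1
    n_cel=$2
    suffix=""
    # Switch to the sketch variants above the cut-off
    if [ "$n_cel" -gt "$BIG_DATASET" ]; then
        suffix="-sketch"
    fi
    case "$chip_type" in
        3prime) echo "-a rma$suffix -a plier-mm$suffix -a pm-mm,mas5-detect.calls=1.pairs=1" ;;
        wt)     echo "-a rma$suffix -a plier-gcbg$suffix -a dabg" ;;
        *)      echo "unknown chip type: $chip_type" >&2; return 1 ;;
    esac
}

apt_analyses wt 12
```

The detection options (mas5-detect and dabg) have no sketch variant on this page, so the helper leaves them unchanged whatever the dataset size.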
See APT complete documentation for more details.
The following sections explain how to carry out each of these three steps using APT.
The same code is used for all kinds of arrays, either 3' or WT.
Indicating a directory where to find the CEL files
apt-cel-convert -f text \
-o data_processed/txt_converted_cel_files \
data_raw/expression/*.CEL
Indicating a text file with the paths to the CEL files
apt-cel-convert -f text \
-o data_processed/txt_converted_cel_files \
--cel-files cell_paths_file.txt
Note:
--cel-files: file specifying cel files to process, one per line with the first line being 'cel_files'.
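A list file in that format could be generated with a few lines of shell. This is a minimal sketch: the function name make_cel_list is hypothetical, and it assumes the CEL files sit directly in one directory and use the upper-case .CEL extension, as in the commands above.

```shell
#!/bin/sh
# Write a --cel-files list: the header line 'cel_files',
# then one CEL file path per line.
# $1: directory containing the CEL files
# $2: list file to write
make_cel_list() {
    cel_dir=$1
    out_file=$2
    {
        echo "cel_files"
        for f in "$cel_dir"/*.CEL; do
            # Skip the unexpanded pattern when no CEL file is found
            if [ -e "$f" ]; then
                echo "$f"
            fi
        done
    } > "$out_file"
}

# Example call (paths as used on this page):
# make_cel_list data_raw/expression cell_paths_file.txt
```

The resulting file can then be passed with --cel-files as in the command above.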
Extracting raw intensities from 3' arrays, using the CDF file
apt-cel-extract -o data_processed/raw_intensities_informed.txt \
-d data_raw/annotation/HG-U133A_2.cdf \
data_raw/expression/*.CEL
Extracting raw intensities from Whole-Transcript arrays, using the CLF, PGF and BGP files
apt-cel-extract -o data_processed/raw_intensities_informed.txt \
-c data_raw/annotation/MoGene-1_0-st-v1.r4.analysis-lib-files/MoGene-1_0-st-v1.r4.clf \
-p data_raw/annotation/MoGene-1_0-st-v1.r4.analysis-lib-files/MoGene-1_0-st-v1.r4.pgf \
-b data_raw/annotation/MoGene-1_0-st-v1.r4.analysis-lib-files/MoGene-1_0-st-v1.r4.bgp \
data_raw/expression/*.CEL
Normalizing 3' arrays, using the CDF file
apt-probeset-summarize -o data_processed/data_normalized/apt/ \
-d data_raw/annotation/HG-U133A_2.cdf \
-a pm-mm,mas5-detect.calls=1.pairs=1 \
-a rma \
-a rma-sketch \
-a plier-mm \
-a plier-mm-sketch \
data_raw/expression/*.CEL
Normalizing Whole-Transcript arrays at exon level, using the CLF, PGF, BGP and QCC files
apt-probeset-summarize -o data_processed/data_normalized/exon_level \
-c data_raw/annotation/MoGene-1_0-st-v1.r4.analysis-lib-files/MoGene-1_0-st-v1.r4.clf \
-p data_raw/annotation/MoGene-1_0-st-v1.r4.analysis-lib-files/MoGene-1_0-st-v1.r4.pgf \
-b data_raw/annotation/MoGene-1_0-st-v1.r4.analysis-lib-files/MoGene-1_0-st-v1.r4.bgp \
--qc-probesets data_raw/annotation/MoGene-1_0-st-v1.r4.analysis-lib-files/MoGene-1_0-st-v1.r4.qcc \
-a dabg \
-a rma \
-a rma-sketch \
-a plier-gcbg \
-a plier-gcbg-sketch \
data_raw/expression/*.CEL
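In the commands above, 3' arrays take a single CDF file (-d) while Whole-Transcript arrays take CLF, PGF and BGP files (-c, -p, -b). A small helper could build these options from the chip type. This is only a sketch: the function name apt_library_args and the labels "3prime" and "wt" are hypothetical, and it assumes that all library files of a chip share one base name, as they do in the examples on this page.

```shell
#!/bin/sh
# Print the library file options for apt-cel-extract or
# apt-probeset-summarize.
# $1: chip type, "3prime" or "wt" (labels assumed for this sketch)
# $2: library file path without extension (shared base name assumed)
apt_library_args() {
    chip_type=$1
    prefix=$2
    case "$chip_type" in
        3prime) echo "-d $prefix.cdf" ;;
        wt)     echo "-c $prefix.clf -p $prefix.pgf -b $prefix.bgp" ;;
        *)      echo "unknown chip type: $chip_type" >&2; return 1 ;;
    esac
}

apt_library_args 3prime data_raw/annotation/HG-U133A_2
```

The output is meant to be spliced unquoted into the command line, so this sketch does not handle paths that contain spaces.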
Affymetrix Power Tools (APT) for exon expression... that is a different story.