FCS files contain raw flow cytometry data (often referred to as "listmode
data") and can be accessed by most flow cytometry software. This
document describes some basic guidelines to use for annotating your
data at
the time of sample acquisition that will prevent confusion when you
or your colleagues wish to examine the data at some future date. This
document was written with the program CellQuest in mind (because that
was all that we had at the time), but I assume that the principles
will apply to other software packages as well.
The goal of FACS data annotation should be to include enough information
within the FCS file to allow someone else to make sense of the data
without requiring cross-references to your notebook.
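As a rough illustration of where this annotation lives, the sketch below pulls the keyword/value pairs out of an FCS file in Python. It relies on the layout described in the FCS standard: the HEADER gives the byte offsets of the TEXT segment, and the first character of TEXT is the delimiter. The function names are mine, and I assume that the descriptions typed in at acquisition are stored under the $PnS keywords. Doubled delimiters inside a value are not handled, so treat this as a starting point rather than a full parser.

```python
def read_fcs_text(data: bytes) -> dict:
    """Return the keyword/value pairs from the TEXT segment of an FCS file."""
    # The HEADER stores the offsets of the first and last byte of TEXT
    # as right-justified ASCII numbers in bytes 10-17 and 18-25.
    first = int(data[10:18].decode("ascii"))
    last = int(data[18:26].decode("ascii"))
    text = data[first:last + 1].decode("latin-1")

    # The first character of TEXT is the delimiter used between every
    # keyword and value. Simplification: a doubled delimiter inside a
    # value (the standard's escape mechanism) is not handled here.
    delimiter = text[0]
    fields = text[1:].split(delimiter)
    if fields and fields[-1] == "":
        fields.pop()  # drop the empty piece after the closing delimiter
    return {key.upper(): value for key, value in zip(fields[0::2], fields[1::2])}


def parameter_descriptions(keywords: dict) -> dict:
    """Map each parameter's short name ($PnN) to its description ($PnS)."""
    count = int(keywords["$PAR"])
    return {
        keywords[f"$P{n}N"]: keywords.get(f"$P{n}S", "")
        for n in range(1, count + 1)
    }
```

To try it, read a file in binary mode, pass the bytes to read_fcs_text, and hand the result to parameter_descriptions. A parameter that comes back with an empty description was never annotated.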
Filenames and Directories
The name of a FCS file can contain significant information that will
give someone an idea about what is in the file without even opening
it. Therefore, you are strongly urged to eschew default filenames such
as "Data.001", which are completely uninformative. Although filenames
often cannot provide a complete description of the contents of the
file, there are a number of different systems that can be used for naming
files.
Filenames should include your initials. I
strongly urge you to develop a system in which the filenames for
your FACS data begin with your initials. This immediately indicates
who created the file (in fact, as far as I know, there is no parameter
in the standard FCS specification to store the "creator" of the
file; I consider this to be an oversight).
Here are a small number of instances where
you might consider breaking this rule: 1) you are collecting data
for a core laboratory; 2) you are collecting data for a large study
that has developed its own file naming conventions; 3) you have
other significant information that you'd like to indicate in the
file name, and you do not have enough characters available due
to limitations of the operating system.
Filenames that reference notebook pages.
You might consider establishing a system for naming of your FCS
files that include a reference to your laboratory notebook. For
example, you might use a name such as "JDA-IV-07-M3.001", which
refers to data collected for "mouse #3" of experiment 07 contained
in JDA's notebook IV.
Filenames that contain dates.
You might consider a filenaming system that includes a reference
to the date in the filename. For example, you might use a name such
as "JDA_010826", where the first two digits indicate the year (2001),
the second two indicate the month (08 indicates August), and the
last two indicate the day of the month (26). I strongly recommend
this format for the date because a computer will properly sort filenames
or directories in chronological order if they are named this way
(note that this won't happen if you put the year last, or if you
use alphabetic abbreviations for the month). I also recommend that
if you use filenames that contain dates, you also include in the
filename additional information that tells you something about the
experiment, distinguishing the data from data that you collected
on other dates.
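The sorting claim is easy to check for yourself. Here is a small Python sketch that names three acquisition dates both ways and sorts the names as a computer would, character by character. The dates and the "JDA" prefix are only illustrative.

```python
from datetime import date

# Three acquisition dates, deliberately out of order.
acquired = [date(2001, 8, 26), date(2000, 12, 30), date(2001, 1, 15)]

# Year first (YYMMDD): alphabetical order equals chronological order.
year_first = sorted(f"JDA_{d.strftime('%y%m%d')}" for d in acquired)

# Day first (DDMMYY): alphabetical order scrambles the dates.
day_first = sorted(f"JDA_{d.strftime('%d%m%y')}" for d in acquired)

print(year_first)  # ['JDA_001230', 'JDA_010115', 'JDA_010826']
print(day_first)   # ['JDA_150101', 'JDA_260801', 'JDA_301200']
```

With the year first, December 2000 sorts ahead of January and August 2001. With the day first, December 2000 lands last. Note that a two-digit year only sorts correctly within a single century.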
Filenames that contain subject identifiers.
This scheme is particularly useful and highly recommended for longitudinal
studies. For example, we have carried out a study of T cell responses
in a number of rhesus macaques for over a year, and the filenames
always included the name of the monkey (e.g. "MH.000814.RLc5.001",
where the name of the monkey is "RLc5").
Filenames that contain "study names".
Suppose you are collecting data as part of a clinical trial for
the HVTN. Most of these trials are assigned a study number (e.g. HVTN203).
You might consider a file naming system that includes the study
name.
Filenames that contain
other information. Anything
significant is fair game for inclusion in a file name. You might
find it informative to include strain names (e.g. B6,
C3H, or Balb), or tissue source (e.g. PBMC, "WB" for "whole
blood", "spl" for spleen, etc.)
Remember the following additional guidelines.
Make sure that your filenames
are unique, even over time. For example, in a longitudinal study, do not give a data file a name
that carries only a subject identifier (e.g. "RLc5.001")
since you are likely to collect data files for that animal on another
date. While it is true that in most cases these files will be stored
in different directories, why risk inviting confusion?
Finally, directory names should be informative
as well. While it is unlikely that your directory names will include
subject identifiers, tissue sources, or even strain names, they should
include your initials and an indication of either the date or a reference
to your notebook.
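If you generate filenames from a script or a spreadsheet, the conventions above can be sketched in a few lines of Python. The helper names and the dot-separated layout are my own choices, modeled on the "MH.000814.RLc5.001" example. The second helper flags names that occur more than once across a set of files, whichever directories they sit in.

```python
from collections import Counter
from datetime import date


def fcs_filename(initials: str, acquired: date, subject: str, tube: int) -> str:
    """Build a name of the form initials.YYMMDD.subject.tube."""
    return f"{initials}.{acquired:%y%m%d}.{subject}.{tube:03d}"


def duplicated(names: list) -> list:
    """Return every filename that appears more than once."""
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


# Two visits for the same animal, one tube each.
visits = [date(2000, 8, 14), date(2000, 9, 11)]
good = [fcs_filename("MH", d, "RLc5", 1) for d in visits]
bad = ["RLc5.001" for _ in visits]  # subject identifier only

print(good)              # ['MH.000814.RLc5.001', 'MH.000911.RLc5.001']
print(duplicated(good))  # []
print(duplicated(bad))   # ['RLc5.001']
```

The subject-only names collide as soon as the animal is sampled a second time, which is exactly the confusion the guideline is meant to avoid.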
Parameter Descriptions
The parameter descriptions that are entered in
the Parameter Description dialog
of CellQuest are stored in the FCS file and are available
to any program designed to process the data in those files. The more
careful you are in entering informative descriptions, the easier it
will be for collaborators, colleagues, advisors, or your future self
to use your data. The following guidelines for annotating parameters
should be followed.
Include the fluorophore name as well as
antigen. The name of the fluorophore
will then be carried through to the output. This is important because
it often allows people examining the data to determine whether
the data was well compensated, or if there were other problems.
To include or not to include "anti"? Your
reagents are usually antibodies against something
(e.g. CD4), but they are measuring the density of something
(e.g. CD4) on the
cell surface. In most circumstances, I'd recommend not including "anti".
On inclusion of clone names.
This is a bit of a conundrum. Obviously, inclusion of the antibody
clone names in the parameter description (e.g. "CD45RO
(UCHL-1) FITC") provides significant and sometimes crucial information.
I will never discourage this practice, but I recognize that it
often makes things unwieldy, and in most cases it is ok if the
clone names are omitted. However, if the properties of certain
clones are critical to the experiment (e.g. the
difference between the anti-mouse CD8 antibodies 53-6.7 and CT-CD8a),
then the clone name must be included in the parameter description.
On inclusion of titers.
In most cases, reagents should have been carefully titered beforehand
and it should be safe to assume that the reagents have been
used at the optimal concentration (oh, if this were only the case!).
Therefore, in most cases it is permissible to omit the quantity
of the reagent used to stain the cells. However, when performing
the original titration experiments, the quantity of the reagent
used should be included in the parameter description (e.g. "CD45RO
(UCHL-1) 1:100 FITC").
Indirect stains. When
using indirect stains, it is probably best to include both the
primary and secondary reagent in the parameter description. For
example, "anti-Qa1b / GAM-PE" is preferable to the less accurate "anti-Qa1b-PE" (GAM
is a standard abbreviation for "goat anti-mouse").
Antigens defined by antibody names.
Perhaps the best-known example of this is the antibody Ki-67, which
binds an antigen that had not been identified at the time the antibody
was isolated. The protein was subsequently identified, but it is
still referred to as the Ki-67 protein.
MHC Tetramers.
These are most often referred to by including the name of the MHC
allele and the name of the peptide. It would be more correct to
include some indication that the reagent is a tetramer (e.g. "(A2/HIV-pol)4"),
but this is unwieldy and probably not necessary to make clear what
reagent was used.
Reagents which are ligands for a receptor.
A good example of this is the MIP-3b "chemotetramer".
Simply referring to this as "MIP-3beta PE" risks confusing it with
an antibody directed against MIP-3b.
One way to avoid the confusion would be to refer to the parameter
as "CCR7 (MIP-3beta) PE".
Stains specific for cytokines.
Naming of stains that are specific for cytokines produced in response
to specific stimulation presents a unique set of "problems" that
merit an entire section of their own within this document. This
is included below.
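The conventions in this section can be collected into one small helper, so that every description in an experiment is assembled the same way. This is only a sketch in Python. The function name, its arguments, and the choice to place the dilution after the clone are mine, following the examples given above.

```python
def stain_label(antigen, fluorophore, clone=None, dilution=None, secondary=None):
    """Compose a parameter description from its parts.

    antigen     -- what is being measured, e.g. "CD8"
    fluorophore -- e.g. "FITC" or "PE"
    clone       -- optional antibody clone name, shown in parentheses
    dilution    -- optional titer, e.g. "1:100" (for titration experiments)
    secondary   -- optional secondary reagent for indirect stains, e.g. "GAM"
    """
    parts = [antigen]
    if clone:
        parts.append(f"({clone})")
    if dilution:
        parts.append(dilution)
    primary = " ".join(parts)
    if secondary:
        # Indirect stain: name the primary and the labeled secondary.
        return f"{primary} / {secondary}-{fluorophore}"
    return f"{primary} {fluorophore}"


print(stain_label("CD4", "FITC"))                                # CD4 FITC
print(stain_label("CD8", "PE", clone="53-6.7"))                  # CD8 (53-6.7) PE
print(stain_label("CD8", "PE", clone="53-6.7", dilution="1:100"))  # CD8 (53-6.7) 1:100 PE
print(stain_label("anti-Qa1b", "PE", secondary="GAM"))           # anti-Qa1b / GAM-PE
```

Keeping the clone and the dilution optional matches the advice above: leave them out for routine stains, and add them when the clone matters or when you are titrating.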
Proper naming of stains specific for cytokines
What is wrong with "IFNg FITC"? According
to the guidelines above, many investigators might find it appropriate
to simply include labels such as "IFNg FITC". While this is not
strictly incorrect, it is possible to include significant information
that would permit someone examining the data to better interpret
its meaning.
Include stimulus in parameter name. The
data contained in the FCS files will be much more informative if
you include the stimulus in the parameter name. For example "SEA
/ IFNg FITC" or "LCMV.GP33 / IFNg FITC" are much preferable to "IFNg
FITC".
Include concentrations
in parameter name. In all
cases, the responses that you are measuring are dose-dependent,
whether it is explicitly acknowledged or not. Therefore, the best
names for cytokine parameters include the concentration of the
stimulus as well. For example "CM9 (10 µg/ml) / IFNg FITC" is the
preferred name.
What about names for CD69 or other activation
antigens? This is a bit of a problem. Obviously the comments in the preceding
paragraph would appear to apply to CD69 as well as IFNg. However,
there is often more non-specific activation of CD69, so I recommend
that you do not include antigen-specific information in the parameter
description for CD69.
Why not include time of stimulation as
well? While this carries significant
information, in most cases it is too unwieldy. However, if you are
doing a time course experiment, the time of stimulation should
be included in the parameter description (e.g. "CM9
(10 µg/ml; 2 hr) / IFNg FITC").
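For a dose-response or time course experiment, typing these descriptions by hand for every tube invites typos. Here is a minimal Python sketch that assembles them in the format used in the examples above. The function and argument names are my own, and the time points in the loop are hypothetical.

```python
def cytokine_label(cytokine, fluorophore, stimulus=None, concentration=None, hours=None):
    """Compose a description such as "CM9 (10 µg/ml; 2 hr) / IFNg FITC"."""
    label = f"{cytokine} {fluorophore}"
    if stimulus is None:
        return label  # the bare form, e.g. "IFNg FITC"

    # Optional details go in parentheses after the stimulus name.
    details = []
    if concentration:
        details.append(concentration)
    if hours is not None:
        details.append(f"{hours:g} hr")
    if details:
        stimulus = f"{stimulus} ({'; '.join(details)})"
    return f"{stimulus} / {label}"


print(cytokine_label("IFNg", "FITC", stimulus="SEA"))
# SEA / IFNg FITC
print(cytokine_label("IFNg", "FITC", stimulus="CM9", concentration="10 µg/ml"))
# CM9 (10 µg/ml) / IFNg FITC

# Hypothetical time course: one description per time point.
for h in (1, 2, 4, 6):
    print(cytokine_label("IFNg", "FITC", "CM9", "10 µg/ml", hours=h))
```

The stimulation time stays optional, so routine experiments keep the shorter description and only a time course picks up the extra field.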
Patient ID
For the most part, I don't use these because I've
included most of the important information elsewhere. It may be the
case that I'm missing something and these can really be an important
part of optimal annotation of FCS files. Please let me know your thoughts
about this in the context of the comments in the document above.
Sample ID
For the most part, I don't use these because I've
included most of the important information elsewhere. It may be the
case that I'm missing something and these can really be an important
part of optimal annotation of FCS files. Please let me know your thoughts
about this in the context of the comments in the document above.