                                  mwfilter



Wiki

   The master copies of EMBOSS documentation are available at
   http://emboss.open-bio.org/wiki/Appdocs on the EMBOSS Wiki.

   Please help by correcting and extending the Wiki pages.

Function

   Filter noisy data from molecular weights file

Description

   mwfilter removes unwanted (noisy) data from mass spectrometry output in
   proteomics. It reads an input file with a list of experimental
   molecular weights and writes the same data to an output file but with
   unwanted (noisy) data removed, at a specified ppm tolerance. Optionally
   the deleted molecular weights are shown. It removes those molecular
   weights which are:
     * Contaminating trypsin or keratin
     * Modified oxy-methionine or oxy-threonine
     * Peaks associated with sodium ions.

   The last two operations can be done as most peaks are reported in both
   modified and unmodified forms. Removal of modified peaks aids in
   database searching for protein identification.

Usage

   Here is a sample session with mwfilter


% mwfilter
Filter noisy data from molecular weights file
Molecular weights file: molwts.dat
Ppm tolerance [50.0]:
Molecular weights output file [molwts.mwfilter]:


   Go to the input files for this example
   Go to the output files for this example

Command line arguments

Filter noisy data from molecular weights file
Version: EMBOSS:6.4.0.0

   Standard (Mandatory) qualifiers:
  [-infile]            infile     Molecular weights file
   -tolerance          float      [50.0] Ppm tolerance (Any numeric value)
  [-outfile]           outfile    [*.mwfilter] Molecular weights output file

   Additional (Optional) qualifiers:
   -showdel            boolean    [N] Output deleted mwts

   Advanced (Unprompted) qualifiers:
   -datafile           datafile   [Emwfilter.dat] Molecular weight standards
                                  data file

   Associated qualifiers:

   "-outfile" associated qualifiers
   -odirectory2        string     Output directory

   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write first file to standard output
   -filter             boolean    Read first file from standard input, write
                                  first file to standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options and exit. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages
   -version            boolean    Report version number and exit


Input file format

  Input files for usage example

  File: molwts.dat

874.364756
927.450380
1045.572
1068.397129
1121.431124
1163.584593
1305.660840
1428.596448
1479.748341
1502.549157
1554.591658
1567.686209
1576.708354
1639.868056
1748.611920
1753.745298
1880.841178
2383.99

   The input file is a simple list of the experimental molecular weights.
   There should be one weight per line.

   Comments in the data file start with a '#' character in the first
   column.
   Blank lines are ignored.

Output file format

   The output is a list of the molecular weights with the unwanted (noisy)
   data removed.

  Output files for usage example

  File: molwts.mwfilter

874.364756
927.450380
1068.397129
1121.431124
1163.584593
1305.660840
1428.596448
1479.748341
1502.549157
1554.591658
1567.686209
1576.708354
1639.868056
1748.611920
1753.745298
1880.841178

Data files

   The program reads the data file Emwfilter.dat for the molecular weights
   of items to be deleted from the experimental data.

   EMBOSS data files are distributed with the application and stored in
   the standard EMBOSS data directory, which is defined by the EMBOSS
   environment variable EMBOSS_DATA.

   To see the available EMBOSS data files, run:

% embossdata -showall

   To fetch one of the data files (for example 'Exxx.dat') into your
   current directory for you to inspect or modify, run:

% embossdata -fetch -file Exxx.dat


   Users can provide their own data files in their own directories.
   Project specific files can be put in the current directory, or for
   tidier directory listings in a subdirectory called ".embossdata". Files
   for all EMBOSS runs can be put in the user's home directory, or again
   in a subdirectory called ".embossdata".

   The directories are searched in the following order:
     * . (your current directory)
     * .embossdata (under your current directory)
     * ~/ (your home directory)
     * ~/.embossdata

Notes

   None.

References

   None.

Warnings

   None.

Diagnostic Error Messages

   None.

Exit status

   It always exits with status 0.

Known bugs

   None.

See also

   Program name     Description
   backtranambig    Back-translate a protein sequence to ambiguous nucleotide
                    sequence
   backtranseq      Back-translate a protein sequence to a nucleotide sequence
   compseq          Calculate the composition of unique words in sequences
   emowse           Search protein sequences by digest fragment molecular weight
   freak            Generate residue/base frequency table or plot
   mwcontam         Find weights common to multiple molecular weights files
   oddcomp          Identify proteins with specified sequence word composition
   pepdigest        Reports on protein proteolytic enzyme or reagent cleavage
                    sites
   pepinfo          Plot amino acid properties of a protein sequence in parallel
   pepstats         Calculates statistics of protein properties
   wordcount        Count and extract unique words in molecular sequence(s)

Author(s)

   Alan Bleasby
   European Bioinformatics Institute, Wellcome Trust Genome Campus,
   Hinxton, Cambridge CB10 1SD, UK

   Please report all bugs to the EMBOSS bug team
   (emboss-bug (c) emboss.open-bio.org) not to the original author.

History

   Written (Jan 2002) - Alan Bleasby.

Target users

   This program is intended to be used by everyone and everything, from
   naive users to embedded scripts.

Comments

   None
