Feedback sought / Pregap updates

Bonfield James jkb at mrc-lmb.cam.ac.uk
Fri Feb 17 06:12:27 EST 1995


	We are seeking people's comments on the Pregap program distributed with
recent versions of the Staden Package. We are particularly interested in
uses of the pregaprc files to configure the system for use with any
"home-brew" software and techniques, such as external databases of reading and
template information.

	On a related note, please remember that some of the programs in the
package (e.g. vepe and gap) will not work with spaces in filenames. See below
for more information on how this affects Pregap.

	Also, we've noticed a few problems which may have been affecting people
using Pregap. Lengthy descriptions of Pregap and these problems follow. The
easiest way to fix all of them is to obtain new versions by anonymous ftp
from al.mrc-lmb.cam.ac.uk. See the
directory pub/staden/pregap for a new system-pregaprc file and the file
pub/staden/MACHINE/init_exp (where MACHINE is one of "alpha", "sgi",
"solaris" or "sun") for the init_exp program.

The first two problems are major: they prevent Pregap from working at all.

	The most obvious is that the init_exp program is missing from the first gap
release. A compiled version can be obtained by anonymous ftp from the
location described above.

	Also, on these earlier tapes a definition was missing from the
$STADENROOT/pregap/system-pregaprc file. Check that the line
"getABISampleName=getABISampleName" is in the file, and add it if not.
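
	As a rough sketch, the check could be scripted along these lines. The
ensure_definition function name is made up for illustration and is not part of
the package; the sketch assumes a grep that supports the -q, -x and -F options
and that the file already ends with a newline.

```shell
# Hypothetical helper: make sure a "name=value" line exists in an rc file.
ensure_definition() {
    rcfile=$1
    line=$2
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if ! grep -qxF -- "$line" "$rcfile"
    then
        # Append the definition only when it is not already present.
        printf '%s\n' "$line" >> "$rcfile"
    fi
}

# Example use, with the path and line given in the text:
# ensure_definition "$STADENROOT/pregap/system-pregaprc" \
#     "getABISampleName=getABISampleName"
```

Running it a second time leaves the file unchanged, so it is safe to repeat.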

	The next two problems are minor and have been reported by people using
ABI 373A machines. These probably won't cause problems for most people, but
are described just in case.

	Firstly, the sample name should not contain spaces. The raw
trace/sample data files may have spaces in their names (as is the default for
the ABI "Sample ??" files), but the experiment filenames may not. The reason is
that vepe and gap (and xgap) only process the first 'word' from each line when
reading a file of filenames. This is a desired feature as it allows comments
to be included in files, and allows the output from gap functions (such as
"show relationships") to be reused as input.
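
	The effect can be imitated with awk. This is only an illustration of
the behaviour described above, not the code used by vepe or gap; first_words
and the sample lines are invented for the example.

```shell
# Illustration only: keep the first whitespace-separated word of each
# non-blank line, which is how a file of filenames is described as being read.
first_words() {
    awk 'NF { print $1 }' "$@"
}

# The first line carries a trailing comment; the second is a name with a space.
printf '%s\n' 'read1.exp  good quality reading' 'Sample 01' | first_words
```

The comment after read1.exp is dropped as intended, but "Sample 01" is cut
down to "Sample", which is why such names cannot be used for experiment files.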

	The fix is simply to disallow spaces in the filenames. This can be
easily achieved by using sed to replace spaces with underscores. So for
example the abi_SCF_com and abi_Exp_com definitions in the system-pregaprc
file could be extended as follows:

abi_SCF_com='echo `${getABISampleName} "${file}"`.scf | sed "s/ /_/g"'
abi_Exp_com='${getABISampleName} "${file}" | sed "s/ /_/g"'
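
	To see what these definitions produce, they can be tried outside
Pregap with a stand-in for the sample name program. In this sketch
fake_sample_name and the value of file are invented, and it is an assumption
that the command strings are evaluated by the shell in the manner of eval.

```shell
# Hypothetical stand-in for the real program named by ${getABISampleName};
# it ignores its argument and prints a fixed name containing a space.
fake_sample_name() {
    echo "Sample 01"
}
getABISampleName=fake_sample_name
file="trace01"    # hypothetical raw trace filename

# The two definitions from the text, unchanged.
abi_SCF_com='echo `${getABISampleName} "${file}"`.scf | sed "s/ /_/g"'
abi_Exp_com='${getABISampleName} "${file}" | sed "s/ /_/g"'

# Assumption: the strings are run through the shell, as eval does here.
eval "$abi_SCF_com"    # prints Sample_01.scf
eval "$abi_Exp_com"    # prints Sample_01
```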

	The final problem arises when you have no sample names at all. For ABI
data, as far as I'm aware, this should only arise when the default names have
been deliberately removed. Hence it's unlikely that this problem will crop up.
However, it is possible to make Pregap more robust against this by redefining
the abi_SCF_com and abi_Exp_com definitions further:

find_abi_name() {
    n=`${getABISampleName} "$1"`
    if [ -z "$n" ]
    then
        echo "s_$1"
    else
        echo "${n}"
    fi
}

abi_SCF_com='echo `find_abi_name "${file}"`.scf | sed "s/ /_/g"'
abi_Exp_com='find_abi_name "${file}" | sed "s/ /_/g"'

	This simply generates sample names of "s_INPUTNAME" where INPUTNAME is
the filename of the raw trace data. This will only be done when the real
sample name cannot be derived.
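
	The fallback can be checked in isolation with a stand-in program that
returns no name. In this sketch no_sample_name and the value of file are
invented for the test, and find_abi_name is written out in full so that the
example runs on its own.

```shell
# Hypothetical stand-in that prints nothing, i.e. no sample name is available.
no_sample_name() {
    :
}
getABISampleName=no_sample_name

# Use the real sample name when there is one, otherwise "s_" plus the input name.
find_abi_name() {
    n=`${getABISampleName} "$1"`
    if [ -z "$n" ]
    then
        echo "s_$1"
    else
        echo "${n}"
    fi
}

file="Sample 07"    # hypothetical raw trace filename containing a space
find_abi_name "${file}" | sed "s/ /_/g"               # prints s_Sample_07
echo `find_abi_name "${file}"`.scf | sed "s/ /_/g"    # prints s_Sample_07.scf
```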


James Bonfield (jkb at mrc-lmb.cam.ac.uk)   Tel: 0223 402266   Fax: 0223 412282
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
