% File: fingerprint/man/linefunc.Rd
% Source: http://github.com/rajarshi/cdkr
\name{cdk.lf, moe.lf, bci.lf}
\alias{cdk.lf}
\alias{moe.lf}
\alias{bci.lf}
\alias{ecfp.lf}
\alias{fps.lf}
\alias{jchem.binary.lf}
\title{
    Functions to parse lines from fingerprint files
}
\description{
These functions take a single line and parse it to produce
a vector of integers which represents the positions of the 'on' bits in
a fingerprint. This allows the user to use \code{read.fp} with arbitrary fingerprint
files. A new file format can be handled by defining a new line parser function.
Currently, \code{cdk.lf}, \code{moe.lf}, \code{bci.lf} and \code{fps.lf} process
fingerprint files obtained from the
CDK (\url{http://cdk.sourceforge.net}), MOE (\url{http://chemcomp.com}), BCI
(\url{http://www.digitalchemistry.co.uk/}) and the FPS format
(\url{http://code.google.com/p/chem-fingerprints/wiki/FPS}), respectively.
\code{ecfp.lf} can be used
for any fingerprint that generates hashed features (such as ECFPs or other
circular fingerprints). For these cases, it is assumed that features are unsigned
integers, so string features are not handled.

Note that when the \code{fps.lf} function is specified, items such as the number of bits
or the header flag do not need to be specified, as the format requires a header block
containing some of these items.
}
\usage{
    cdk.lf(line)
    moe.lf(line)
    bci.lf(line)
    ecfp.lf(line)
    fps.lf(line)
    jchem.binary.lf(line)
}
\arguments{
    \item{line}{
        The line to parse
    }
}
\value{
A list with three components. The first is the name associated with the fingerprint
(if available). The second is a vector of integers representing bits set to 1 (for the
case of the first three methods) or a vector of characters representing hashed features
(characteristic of circular fingerprints) or, more generally, any string feature. The
third component is a (possibly empty) list, which contains the remaining components of
a line, when the format allows items other than a title and the fingerprint (such as
the FPS format). The content of the third component is dependent on the line function
that is being used.
}
\author{Rajarshi Guha \email{rajarshi.guha@gmail.com}}
\keyword{logic}
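\details{
As a minimal sketch of how a new file format might be handled, the example on
this page defines a line parser for a made-up format. The name \code{simple.lf}
and the format itself (a title, a single space, then comma-separated bit
positions) are assumptions chosen for illustration and are not part of the
package. The function returns a three-component list of the kind described in
the value section; the third component is left empty because this hypothetical
format carries no extra fields.
}
\examples{
## Hypothetical line parser (not part of the package).
## Assumed line format: "<title> <pos1>,<pos2>,...", e.g. "mol1 3,17,42"
simple.lf <- function(line) {
    ## Split the title from the bit list at the single separating space
    parts <- strsplit(trimws(line), " ", fixed = TRUE)[[1]]
    name <- parts[1]
    ## Convert the comma-separated positions to integers (none if absent)
    bits <- if (length(parts) > 1) {
        as.integer(strsplit(parts[2], ",", fixed = TRUE)[[1]])
    } else {
        integer(0)
    }
    ## No extra fields in this format, so the third component is empty
    list(name, bits, list())
}

parsed <- simple.lf("mol1 3,17,42")
parsed[[1]]  ## the title
parsed[[2]]  ## positions of the 'on' bits
}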