#### /rcdk/man/match.Rd

http://github.com/rajarshi/cdkr
\name{matches}
\alias{match}
\alias{matches}
\alias{smarts}
\alias{substructure}
\alias{is.subgraph}
\alias{get.mcs}
\alias{mcs}
\title{
  Perform Substructure Searching & MCS Detection
}
\description{
These functions perform substructure searches of a query, specified
in SMILES or SMARTS form, over one or more target molecules, as well as
maximum common substructure (MCS) searches for pairs of molecules.
}
\usage{
matches(query, target, return.matches=FALSE)
is.subgraph(query, target)
get.mcs(mol1, mol2, as.molecule = TRUE)
}
\arguments{
  \item{query}{A SMILES or SMARTS string}
  \item{target}{A single IAtomContainer object or a list of IAtomContainer objects}
  \item{mol1}{An IAtomContainer}
  \item{mol2}{An IAtomContainer}
  \item{return.matches}{If \code{TRUE}, the lists of atom indices that correspond to the matching substructure are returned}
  \item{as.molecule}{If \code{TRUE}, the MCS is returned as a new \code{IAtomContainer}
  object. Otherwise, an atom index mapping between the two molecules is returned as a 2D
  array of integers}
}
\details{
For \code{is.subgraph}, the query must be a single \code{IAtomContainer} or a
valid SMILES string. This function can be significantly faster than
\code{matches}, but it does not accept SMARTS patterns. It uses the
"TurboSubStructure" SMSD method and so only searches for the first substructure
match.

For MCS detection, the default SMSD algorithm is employed and the best scoring MCS is
returned by default. The resultant MCS can be obtained in one of two forms: as an
\code{IAtomContainer} in which the atoms and bonds are clones of the corresponding
matching atoms and bonds in one of the molecules, or as a 2D array of dimensions N x 2
of atom index mappings. Here N is the size of the MCS; the first column holds the
atom index from the first molecule and the second column the atom index from the second
molecule.

Because the CDK SMARTS matcher performs aromaticity perception and atom typing
internally, the target molecules need not have these operations done on them
beforehand when using \code{matches}. However, if \code{is.subgraph} or \code{get.mcs}
is being used, the molecules should have aromaticity detected and atom typing performed
explicitly.

If the atom indices of the matching substructures (in the target molecule) are desired, use the
\code{matches} function directly.
}
\examples{
smiles <- c('CCC', 'c1ccccc1', 'C(C)(C=O)C(CCNC)C1CC1C(=O)')
mols <- sapply(smiles, parse.smiles)
query <- '[#6]=O'
doesMatch <- matches(query, mols)

## get mappings
mappings <- matches("CCC", mols, TRUE)
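
## The remaining lines are a hedged sketch, not part of the original examples.
## They illustrate the return structure described under Value and the
## is.subgraph / get.mcs calls shown under Usage. Assumptions: do.aromaticity()
## and do.typing() are the rcdk helpers available in this version for the
## explicit preprocessing that Details asks for, and the molecules below are
## arbitrary illustrations. The block is wrapped in \dontrun so it is not
## executed during package checks.
\dontrun{
## each element of 'mappings' should carry a 'match' flag and a 'mapping' list
first <- mappings[[1]]
if (first$match) print(first$mapping)   # 0-indexed target atom indices

## parse.smiles returns a list, so take the first element of each result
mol1 <- parse.smiles('c1ccccc1CC(=O)O')[[1]]
mol2 <- parse.smiles('c1ccccc1CCN')[[1]]

## is.subgraph and get.mcs expect aromaticity and atom types to be set already
for (m in list(mol1, mol2)) {
  do.aromaticity(m)
  do.typing(m)
}

## SMILES query only; SMARTS is not accepted here
hasPhenyl <- is.subgraph('c1ccccc1', list(mol1, mol2))

## MCS as a new IAtomContainer, then as an N x 2 array of atom index mappings
mcs <- get.mcs(mol1, mol2, as.molecule = TRUE)
mcsMap <- get.mcs(mol1, mol2, as.molecule = FALSE)
}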
}
\value{
For \code{matches} with \code{return.matches = FALSE}, a boolean vector where each element is \code{TRUE} or \code{FALSE} depending on whether
the corresponding element in \code{target} contains the query or not. If \code{return.matches = TRUE}, the return value
is a list of lists. The top level list has one element per target molecule. Each element is a list of two elements, named
"match" and "mapping". The first element is \code{TRUE} if the query matched the target. If so, the second element is a list of numeric
vectors, giving the atom indices (0-indexed) of the target atoms that matched the query. If there was no match for this target molecule, the
second element will be \code{NULL}.

For \code{is.subgraph}, a boolean vector, where each element is \code{TRUE} or
\code{FALSE} depending on whether the corresponding element in \code{target} contains the query or not.

For \code{get.mcs}, an \code{IAtomContainer} object or a 2D array of atom index mappings
between the two molecules.
}
\keyword{programming}
\seealso{