% Source: /rcdk/man/loadmol.Rd (http://github.com/rajarshi/cdkr)
\name{load.molecules}
\alias{load.molecules}
\alias{iload.molecules}
\title{
  Load Molecular Structures From Disk
}
\description{
The CDK can read a variety of molecular structure formats. These functions
encapsulate the calls to the CDK API that load a structure given its filename.
}
\usage{
load.molecules(molfiles=NA, aromaticity = TRUE, typing = TRUE, isotopes = TRUE,
               verbose=FALSE)
iload.molecules(molfile, type="smi", aromaticity = TRUE, typing = TRUE, isotopes = TRUE,
                skip=TRUE)
}
\arguments{
  \item{molfiles}{A \code{character} vector of filenames. Note that the full
  path to the files should be provided. URLs can also be used as
  paths. In such a case, the URL should start with "http://"}
  \item{molfile}{A string containing the filename to load. Must be a local file}
  \item{type}{Indicates whether the input file is SMILES or SDF. Valid values are
  "smi" or "sdf"}
  \item{aromaticity}{If \code{TRUE} then aromaticity detection is
  performed on all loaded molecules. If this fails for a given
  molecule, then the molecule is set to NA in the return list}
  \item{typing}{If \code{TRUE} then atom typing is
  performed on all loaded molecules. The assigned types will be CDK
  internal types. If this fails for a given
  molecule, then the molecule is set to NA in the return list}
  \item{isotopes}{If \code{TRUE} then atoms are configured with isotopic masses}
  \item{verbose}{If \code{TRUE}, verbose output (such as file download progress)
  is printed}
  \item{skip}{If \code{TRUE}, then the reader will continue reading even when faced with an
  invalid molecule. If \code{FALSE}, the reader will stop at the first invalid molecule}
}
\value{
  \code{load.molecules} returns a list of CDK \code{Molecule} objects, which can be
  used in other rcdk functions.

  \code{iload.molecules} is an iterating version of the loader and is applicable to
  large SMILES or SDF files. In contrast to \code{load.molecules}, it does not load
  all the molecules into memory at once, and as a result lets you process arbitrarily
  large structure files.
}
\details{
Note that if molecules are read in from formats that do not have rules for
handling implicit hydrogens (such as MDL MOL), the molecule will not have
implicit or explicit hydrogens. To add explicit hydrogens, make sure that the molecule
has been typed (\code{typing} is \code{TRUE} by default for these functions) and then call
\code{\link{convert.implicit.to.explicit}}. For a format such as SMILES, on the
other hand, implicit or explicit hydrogens will be present.
}
\examples{
\dontrun{

## load a single file
amol <- load.molecules('foo.sdf')

## load multiple files
mols <- load.molecules(c('mol1.sdf', 'mol2.smi',
          'https://github.com/rajarshi/cdkr/blob/master/data/set2/dhfr00008.sdf?raw=true'))

## iterate over a large file
moliter <- iload.molecules("big.sdf", type="sdf")
while(hasNext(moliter)) {
  mol <- nextElem(moliter)
  print(get.property(mol, "cdk:Title"))
}
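
## a minimal sketch of the hydrogen handling described in the details:
## add explicit hydrogens to a molecule read from an MDL MOL file
## ('foo.mol' is a placeholder filename; typing = TRUE is the default,
## so the molecule is assumed to have been typed on load)
molmols <- load.molecules('foo.mol')
convert.implicit.to.explicit(molmols[[1]])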
}
}
\seealso{
  \code{\link{view.molecule.2d}}, \code{\link{convert.implicit.to.explicit}}
}
\keyword{programming}

\author{Rajarshi Guha (\email{rajarshi.guha@gmail.com})}