\name{load.molecules}
\alias{load.molecules}
\alias{iload.molecules}
\title{
 Load Molecular Structures From Disk
}
\description{
The CDK can read a variety of molecular structure formats. These functions
encapsulate the calls to the CDK API to load a structure given its filename.
}
\usage{
load.molecules(molfiles=NA, aromaticity = TRUE, typing = TRUE, isotopes = TRUE,
               verbose=FALSE)
iload.molecules(molfile, type="smi", aromaticity = TRUE, typing = TRUE, isotopes = TRUE,
                skip=TRUE)
}
\arguments{
  \item{molfiles}{A \code{character} vector of filenames. Note that the full
    path to the files should be provided. URLs can also be used as
    paths. In such a case, the URL should start with "http://"}
  \item{molfile}{A string containing the filename to load. Must be a local file}
  \item{type}{Indicates whether the input file is SMILES or SDF. Valid values are
    "smi" or "sdf"}
  \item{aromaticity}{If \code{TRUE} then aromaticity detection is
    performed on all loaded molecules. If this fails for a given
    molecule, then the molecule is set to \code{NA} in the return list}
  \item{typing}{If \code{TRUE} then atom typing is
    performed on all loaded molecules. The assigned types will be CDK
    internal types. If this fails for a given
    molecule, then the molecule is set to \code{NA} in the return list}
  \item{isotopes}{If \code{TRUE} then atoms are configured with isotopic masses}
  \item{verbose}{If \code{TRUE}, detailed output (such as file download progress)
    is printed}
  \item{skip}{If \code{TRUE}, then the reader will continue reading even when faced with an
    invalid molecule. If \code{FALSE}, the reader will stop at the first invalid molecule}
}
\value{
  \code{load.molecules} returns a list of CDK \code{Molecule} objects, which can be
  used in other rcdk functions.

  \code{iload.molecules} is an iterating version of the loader and is applicable to
  large SMILES or SDF files. In contrast to \code{load.molecules}, it does not load
  all the molecules into memory at once, and as a result lets you process arbitrarily
  large structure files.
}
\details{
Note that if molecules are read in from formats that do not have rules for
handling implicit hydrogens (such as MDL MOL), the molecule will not have
implicit or explicit hydrogens. To add explicit hydrogens, make sure that the molecule
has been typed (\code{typing} is \code{TRUE} by default for these functions) and then call
\code{\link{convert.implicit.to.explicit}}. On the other hand, for a format
such as SMILES, implicit or explicit hydrogens will be present.
}
\examples{
\dontrun{

## load a single file
amol <- load.molecules('foo.sdf')

## load multiple files
mols <- load.molecules(c('mol1.sdf', 'mol2.smi',
                         'https://github.com/rajarshi/cdkr/blob/master/data/set2/dhfr00008.sdf?raw=true'))

## iterate over a large file
moliter <- iload.molecules("big.sdf", type="sdf")
while(hasNext(moliter)) {
  mol <- nextElem(moliter)
  print(get.property(mol, "cdk:Title"))
}
}
}
\seealso{
  \code{\link{view.molecule.2d}}, \code{\link{convert.implicit.to.explicit}}
}
\keyword{programming}

\author{Rajarshi Guha (\email{rajarshi.guha@gmail.com})}