\name{load.molecules}
\alias{load.molecules}
\alias{iload.molecules}
\title{
Load Molecular Structures From Disk
}
\description{
The CDK can read a variety of molecular structure formats. These functions
encapsulate the calls to the CDK API that load a structure given its filename.
}
\usage{
load.molecules(molfiles=NA, aromaticity = TRUE, typing = TRUE, isotopes = TRUE,
               verbose=FALSE)
iload.molecules(molfile, type="smi", aromaticity = TRUE, typing = TRUE, isotopes = TRUE,
                skip=TRUE)
}
\arguments{
  \item{molfiles}{A \code{character} vector of filenames. Note that the full
    path to the files should be provided. URLs can also be used as
    paths. In such a case, the URL should start with "http://"}
  \item{molfile}{A string containing the filename to load. Must be a local file}
  \item{type}{Indicates whether the input file is SMILES or SDF. Valid values are
    "smi" or "sdf"}
  \item{aromaticity}{If \code{TRUE}, aromaticity detection is
    performed on all loaded molecules. If this fails for a given
    molecule, that molecule is set to \code{NA} in the return list}
  \item{typing}{If \code{TRUE}, atom typing is
    performed on all loaded molecules. The assigned types will be CDK
    internal types. If this fails for a given
    molecule, that molecule is set to \code{NA} in the return list}
  \item{isotopes}{If \code{TRUE}, atoms are configured with isotopic masses}
  \item{verbose}{If \code{TRUE}, detailed output (such as file download progress)
    is printed}
  \item{skip}{If \code{TRUE}, the reader will continue reading even when faced with an
    invalid molecule. If \code{FALSE}, the reader will stop at the first invalid molecule}
}
\value{
\code{load.molecules} returns a list of CDK \code{Molecule} objects, which can be
used in other rcdk functions.

\code{iload.molecules} is an iterating version of the loader and is intended for
large SMILES or SDF files. In contrast to \code{load.molecules}, it does not load
all the molecules into memory at once, and as a result lets you process arbitrarily
large structure files.
}
\details{
Note that if molecules are read in from formats that do not have rules for
handling implicit hydrogens (such as MDL MOL), the molecule will not have
implicit or explicit hydrogens. To add explicit hydrogens, make sure that the molecule
has been typed (\code{typing} is \code{TRUE} by default for this function) and then call
\code{\link{convert.implicit.to.explicit}}. For a format such as SMILES, on the
other hand, implicit or explicit hydrogens will be present.
}
\examples{
\dontrun{
## load a single file
amol <- load.molecules('foo.sdf')

## load multiple files
mols <- load.molecules(c('mol1.sdf', 'mol2.smi',
                         'https://github.com/rajarshi/cdkr/blob/master/data/set2/dhfr00008.sdf?raw=true'))

## iterate over a large file
moliter <- iload.molecules("big.sdf", type="sdf")
while(hasNext(moliter)) {
  mol <- nextElem(moliter)
  print(get.property(mol, "cdk:Title"))
}
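
## A minimal sketch, not part of rcdk: drop.failed is a hypothetical helper
## that removes the entries load.molecules sets to NA when aromaticity
## detection or atom typing fails. It assumes failures appear as a plain NA
## in the returned list, as the argument descriptions state.
drop.failed <- function(mols) {
  ## identical() is FALSE for anything that is not exactly a logical NA,
  ## so molecule objects are kept and only the NA placeholders are dropped
  Filter(function(m) !identical(m, NA), mols)
}
## Possible usage with the list loaded above:
##   good <- drop.failed(mols)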
}
}
\seealso{
\code{\link{view.molecule.2d}}, \code{\link{convert.implicit.to.explicit}}
}
\keyword{programming}
\author{Rajarshi Guha (\email{rajarshi.guha@gmail.com})}