/lib/pkp/classes/filter/Filter.inc.php

https://github.com/lib-uoguelph-ca/ocs
Possible License(s): GPL-2.0
<?php

/**
 * @file classes/filter/Filter.inc.php
 *
 * Copyright (c) 2000-2012 John Willinsky
 * Distributed under the GNU GPL v2. For full terms see the file docs/COPYING.
 *
 * @class Filter
 * @ingroup filter
 *
 * @brief Class that provides the basic template for a filter. Filters are
 *  generic data processors that take in a well-specified data type
 *  and return another well-specified data type.
 *
 *  Filters enable us to re-use data transformations between applications.
 *  Generic filter implementations can sequence, (de-)multiplex or iterate
 *  over other filters. Thereby filters can be nested and combined in many
 *  different ways to form complex and easy-to-customize data processing
 *  networks or pipelines.
 *
 *  NB: This also means that filters only make sense if they accept and
 *  return standardized formats that are understood by other filters. Otherwise
 *  the extra implementation effort for a filter won't result in improved code
 *  re-use.
 *
 *  Objects from different applications (e.g. Papers and Articles) can first be
 *  transformed by an application specific filter into a common format and then
 *  be processed by application agnostic import/export filters or vice versa.
 *  Filters can be used to pre-process data before it is indexed for search.
 *  They also provide a framework to customize the processing applied in citation
 *  parsing and lookup (i.e. which parsers and lookup sources should be applied).
 *
 *  Filters can be used stand-alone outside PKP applications.
 *
 *  The following is a complete list of all use-cases that have been identified
 *  for filters:
 *  1) Decode/Encode
 *     * import/export: transform application objects (e.g. an Article object)
 *       into structured (rich) data formats (e.g. XML, OpenURL KEV, CSV) or
 *       vice versa.
 *     * parse: transform unstructured clob/blob data (e.g. a Word Document)
 *       into application objects (e.g. an Article plus Citation objects) or
 *       into structured data formats (e.g. XML).
 *     * render: transform application objects or structured clob/blob data into
 *       an unstructured document (e.g. PDF, HTML, Word Document).
 *
 *  2) Normalize
 *     * lookup: compare the data of a given entity (e.g. a bibliographic
 *       reference) with data from other sources (e.g. CrossRef) and use this
 *       to normalize data or improve data quality.
 *     * harvest: cleanse and normalize incoming meta-data.
 *
 *  3) Map
 *     * cross-walk: transform one meta-data format into another. Meta-data
 *       can be represented as structured clob/blob data (e.g. XML) or as
 *       application objects (i.e. a MetadataRecord instance).
 *     * meta-data extraction: retrieve meta-data from OO entities
 *       (e.g. an Article) into a standardized meta-data record (e.g. NLM
 *       element-citation).
 *     * meta-data injection: inject data from a standardized meta-data
 *       record into application objects.
 *
 *  4) Convert documents
 *     * binary converters: wrap binary document converters (e.g. antidoc) in
 *       a well-defined and re-usable way.
 *
 *  5) Search
 *     * indexing: pre-process data (extract, tokenize, remove stopwords,
 *       stem) for indexing.
 *     * finding: pre-process queries (parse, tokenize, remove stopwords,
 *       stem) to access the index.
 */

// $Id$

class Filter {
	/** @var string */
	var $_displayName;

	/**
	 * Constructor
	 */
	function Filter() {
	}

	//
	// Setters and Getters
	//
	/**
	 * Set the display name
	 * @param $displayName string
	 */
	function setDisplayName($displayName) {
		$this->_displayName = $displayName;
	}

	/**
	 * Get the display name
	 * @return string
	 */
	function getDisplayName() {
		return $this->_displayName;
	}

	//
	// Abstract template methods to be implemented by subclasses
	//
	/**
	 * Returns true if the given input and output
	 * objects represent a valid transformation
	 * for this filter.
	 *
	 * This check must be type based. It can
	 * optionally include an additional stateful
	 * inspection of the given object instances.
	 *
	 * If the output is null then only check
	 * whether the given input is of a type
	 * accepted by this filter.
	 *
	 * NB: sub-classes must implement this method.
	 *
	 * @param $input mixed
	 * @param $output mixed
	 * @return boolean
	 */
	function supports(&$input, &$output) {
		assert(false);
	}

	/**
	 * This method performs the actual data processing.
	 * NB: sub-classes must implement this method.
	 * @param $input mixed validated filter input data
	 * @return mixed non-validated filter output or null
	 *  if processing was not successful.
	 */
	function &process(&$input) {
		assert(false);
	}

	//
	// Public methods
	//
	/**
	 * Returns true if the given input is supported
	 * by this filter. Otherwise returns false.
	 *
	 * NB: sub-classes will not normally override
	 * this method.
	 *
	 * @param $input mixed
	 * @return boolean
	 */
	function supportsAsInput(&$input) {
		// Passing null as output restricts the check to the input type.
		$nullVar = null;
		return $this->supports($input, $nullVar);
	}

	/**
	 * Filters the given input.
	 *
	 * Input and output of this method will
	 * be tested for compliance with the filter
	 * definition.
	 *
	 * NB: sub-classes will not normally override
	 * this method.
	 *
	 * @param $input mixed an input value that is supported
	 *  by this filter
	 * @return mixed a valid return value or null
	 *  if the input is not supported or an error
	 *  occurred during processing
	 */
	function &execute(&$input) {
		// Validate the filter input
		if (!$this->supportsAsInput($input)) {
			$output = null;
			return $output;
		}

		// Process the filter
		$output =& $this->process($input);

		// Validate the filter output
		if (is_null($output) || !$this->supports($input, $output)) $output = null;

		// Return processed data
		return $output;
	}
}
?>
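
The template methods above are easiest to follow with a concrete subclass. The following is a minimal sketch, not part of the PKP code base: `UppercaseFilter` is a hypothetical name invented for illustration, and the guarded `Filter` stand-in is a condensed copy of the listing, included only so that the snippet runs on its own. In a real installation you would load `Filter.inc.php` instead.

```php
<?php
// Stand-in for the Filter base class so this sketch is self-contained.
// It condenses the listing above; skipped if the real class is loaded.
if (!class_exists('Filter')) {
	class Filter {
		var $_displayName;
		function setDisplayName($displayName) { $this->_displayName = $displayName; }
		function getDisplayName() { return $this->_displayName; }
		function supports(&$input, &$output) { assert(false); }
		function &process(&$input) { assert(false); }
		function supportsAsInput(&$input) {
			$nullVar = null;
			return $this->supports($input, $nullVar);
		}
		function &execute(&$input) {
			$output = null;
			if (!$this->supportsAsInput($input)) return $output;
			$output =& $this->process($input);
			if (is_null($output) || !$this->supports($input, $output)) $output = null;
			return $output;
		}
	}
}

// Hypothetical example filter: accepts a string, returns it in upper case.
class UppercaseFilter extends Filter {
	function supports(&$input, &$output) {
		// Type-based input check
		if (!is_string($input)) return false;
		// A null output means "check the input only"
		if (is_null($output)) return true;
		// Otherwise the output must be a string, too
		return is_string($output);
	}

	function &process(&$input) {
		// The method returns by reference, so return a variable
		$output = strtoupper($input);
		return $output;
	}
}

$filter = new UppercaseFilter();
$filter->setDisplayName('Uppercase (example)');

// execute() takes its argument by reference, so pass variables, not literals.
$text = 'filters are composable';
$result =& $filter->execute($text);     // 'FILTERS ARE COMPOSABLE'

$number = 42;
$rejected =& $filter->execute($number); // null: integers fail supports()
```

Callers go through `execute()` rather than `process()`, so both the input check and the output check in `supports()` are applied. A rejected input and a failed transformation both come back as `null`.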