PageRenderTime 434ms CodeModel.GetById 416ms RepoModel.GetById 0ms app.codeStats 0ms

/docs/dbmodel.rtf

https://code.google.com/p/parsets/
Unknown | 143 lines | 130 code | 13 blank | 0 comment | 0 complexity | 5738de54ca8da9ae7d3aa90290737d8b MD5 | raw file
  1. {\rtf1\ansi\ansicpg1252\cocoartf949\cocoasubrtf460
  2. {\fonttbl\f0\fswiss\fcharset0 ArialMT;}
  3. {\colortbl;\red255\green255\blue255;}
  4. {\*\listtable{\list\listtemplateid1\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc2\leveljcn2\levelfollow0\levelstartat1\levelspace360\levelindent0{\*\levelmarker \{disc\}}{\leveltext\leveltemplateid0\'02\'05.;}{\levelnumbers\'01;}}{\listname ;}\listid1}
  5. {\list\listtemplateid2\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc2\leveljcn2\levelfollow0\levelstartat1\levelspace360\levelindent0{\*\levelmarker \{disc\}}{\leveltext\leveltemplateid0\'02\'05.;}{\levelnumbers\'01;}}{\listname ;}\listid2}
  6. {\list\listtemplateid3\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc2\leveljcn2\levelfollow0\levelstartat1\levelspace360\levelindent0{\*\levelmarker \{disc\}}{\leveltext\leveltemplateid0\'02\'05.;}{\levelnumbers\'01;}}{\listname ;}\listid3}
  7. {\list\listtemplateid4\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc2\leveljcn2\levelfollow0\levelstartat1\levelspace360\levelindent0{\*\levelmarker \{disc\}}{\leveltext\leveltemplateid0\'02\'05.;}{\levelnumbers\'01;}}{\listname ;}\listid4}
  8. {\list\listtemplateid5\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc2\leveljcn2\levelfollow0\levelstartat1\levelspace360\levelindent0{\*\levelmarker \{disc\}}{\leveltext\leveltemplateid0\'02\'05.;}{\levelnumbers\'01;}}{\listname ;}\listid5}
  9. {\list\listtemplateid6\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc2\leveljcn2\levelfollow0\levelstartat1\levelspace360\levelindent0{\*\levelmarker \{disc\}}{\leveltext\leveltemplateid0\'02\'05.;}{\levelnumbers\'01;}}{\listname ;}\listid6}}
  10. {\*\listoverridetable{\listoverride\listid1\listoverridecount0\ls1}{\listoverride\listid2\listoverridecount0\ls2}{\listoverride\listid3\listoverridecount0\ls3}{\listoverride\listid4\listoverridecount0\ls4}{\listoverride\listid5\listoverridecount0\ls5}{\listoverride\listid6\listoverridecount0\ls6}}
  11. \margl1440\margr1440\vieww11040\viewh11660\viewkind0
  12. \pard\tx560\tx1120\tx1680\tx2240\tx2800\tx3360\tx3920\tx4480\tx5040\tx5600\tx6160\tx6720\ql\qnatural\pardirnatural
  13. \f0\fs22 \cf0 \
  14. \i This document is slightly out of sync with the code. It does describe the overall idea, though.
  15. \i0 \
  16. \
  17. \pard\pardeftab720\ql\qnatural
  18. \b \cf0 ParSets NG Local Database Model
  19. \b0 \
  20. ParSets's local db is stored in a single SQLite database file called local.db. This document describes the layout of this database and the Java classes used to interact with it.\
  21. \
  22. The overall structure of the db is similar to the OLAP star model. There are two kinds of tables: administration tables and data tables. Each data set has entries in the administration tables, and one or two data tables.\
  23. \
  24. \b Administration Tables
  25. \b0 \
  26. \
  27. \b Admin_DataSets
  28. \b0 \
  29. Contains a list of all data sets in the data base, their names, etc.\
  30. \pard\tx220\tx720\pardeftab720\li720\fi-720\ql\qnatural
  31. \ls1\ilvl0\cf0 {\listtext \'95 }
  32. \i Name
  33. \i0 - human-readable name of the data set\
  34. {\listtext \'95 }
  35. \i Handle
  36. \i0 - internal name of the data set, derived from the filename or the human-readable name. Can only contain lowercase a-z and 0-9, no punctuation or spaces\
  37. {\listtext \'95 }
  38. \i Section
  39. \i0 - text field that can contain the name of a section the dataset is to be listed in. Not currently used in the ParSets program.\
  40. {\listtext \'95 }
  41. \i DateAdded
  42. \i0 - date and time the data set was added to the data base\
  43. \pard\tx560\pardeftab720\ql\qnatural
  44. \cf0 \
  45. \b Admin_Dimensions
  46. \b0 \
  47. List all dimensions in all data sets. The handle of the data set is the foreign key to get the list of dimensions.\
  48. \pard\tx220\tx720\pardeftab720\li720\fi-720\ql\qnatural
  49. \ls2\ilvl0\cf0 {\listtext \'95 }
  50. \i DataSet
  51. \i0 - handle of the data set the dimension belongs to\
  52. {\listtext \'95 }Name - human-readable name of the dimension\
  53. {\listtext \'95 }Handle - internal name of the dimension, perhaps taken from the column header of the imported file. Follows the same rules as data set handle. Has to be unique within the data set.\
  54. {\listtext \'95 }Type - type of the dimension. Dimensions that are considered dimensions in the OLAP model are called "categorical" here. See list of types for measures below.\
  55. \pard\pardeftab720\ql\qnatural
  56. \cf0 \
  57. \b Admin_Categories
  58. \b0 \
  59. List all categories in all data sets. The handles of the data set and the dimension are the foreign keys here.\
  60. \pard\tx220\tx720\pardeftab720\li720\fi-720\ql\qnatural
  61. \ls3\ilvl0\cf0 {\listtext \'95 }DataSet - handle of the data set the category belongs to\
  62. {\listtext \'95 }Dimension - handle of the dimension the category belongs to\
  63. {\listtext \'95 }Name - human-readable name of the category\
  64. {\listtext \'95 }Handle - handle of the category, either from the file or derived from name. Same rules as data set handle.\
  65. {\listtext \'95 }Number - number of the category within the dimension. This number has to be greater than 0 and is used for all internal calculations, as well as for indexing into the dimensions and data tables\
  66. \pard\pardeftab720\ql\qnatural
  67. \cf0 \
  68. \b Admin_DerivedDimensions
  69. \b0 \
  70. Table for specifying user-defined derived dimensions. Not currently implemented.\
  71. \
  72. \b Data Tables
  73. \b0 \
  74. Each data set has one or two data tables associated with it, named
  75. \i handle
  76. \i0 _dims and
  77. \i handle
  78. \i0 _measures. The dims table has to exist for each data set, the measures table is optional.\
  79. \
  80. \pard\pardeftab720\ql\qnatural
  81. \i\b \cf0 handle
  82. \i0 _dims
  83. \b0 \
  84. This table contains all the categorical dimensions as dimensions, and has an entry for each combination of categories that exists in the data set. In addition to the dimensions, the table has one column for the key of that combination (stored as a hexadecimal string because it can be larger than 64 bits) and the count of how many times each combination of categories exists.\
  85. There are also pre-aggregated entries that use 0 for each combination of categories, including one entry with all zeros giving the size of the entire data set.\
  86. \pard\tx220\tx720\pardeftab720\li720\fi-720\ql\qnatural
  87. \ls4\ilvl0\cf0 {\listtext \'95 }Key - the key value combining the numerical category values for all categories. Also used to connect to the measures.\
  88. {\listtext \'95 }Count - the number of times that combination of values occurs in the data set\
  89. {\listtext \'95 }Dimensions - one column for each dimension, using the dimension's handle as its name. The type of these columns is numerical.\
  90. \pard\pardeftab720\ql\qnatural
  91. \cf0 \
  92. \pard\pardeftab720\ql\qnatural
  93. \i\b \cf0 handle
  94. \i0 _measures
  95. \b0 \
  96. This table contains all numerical data dimensions, or measures. The only column all these tables have in common is the key, which is the same as in the
  97. \i handle
  98. \i0 _dims table. All other columns represent numerical data dimensions, as specified in the Admin_Dimensions table. The type of the column depends on the numerical type (see below).\
  99. \pard\tx220\tx720\pardeftab720\li720\fi-720\ql\qnatural
  100. \ls5\ilvl0\cf0 {\listtext \'95 }Key - the key from
  101. \i handle
  102. \i0 _dims specifying which combination of categories the row belongs to. This does not have to be unique, many numerical rows can correspond to the same combination of categorical values. Also includes the pre-aggregatd keys, values in each dimension are summed up.\
  103. {\listtext \'95 }Measures - one column for each numerical dimension in the data set.\
  104. \pard\pardeftab720\ql\qnatural
  105. \cf0 \
  106. \
  107. \b Measure Data Types
  108. \b0 \
  109. \pard\tx220\tx720\pardeftab720\li720\fi-720\ql\qnatural
  110. \ls6\ilvl0\cf0 {\listtext \'95 }Integer - whole numbers, using the range of a 32-bit integer (Java type int)\
  111. {\listtext \'95 }Real - floating-point numbers, stored as single-precision floats (Java type float)\
  112. {\listtext \'95 }Date - Dates (potentially including time) stored as a string, using SQLIte's "native" time format, which mostly conforms to RFC 3339, e.g. "1990-12-31T23:59:60Z" - time zone information is ignored (Java type Calendar)\
  113. \pard\pardeftab720\ql\qnatural
  114. \cf0 \
  115. \
  116. \b Useful Resources
  117. \b0 \
  118. \
  119. SQLite, {\field{\*\fldinst{HYPERLINK "http://sqlite.org/"}}{\fldrslt http://sqlite.org/}}\
  120. \
  121. Good description of OLAP Star Schema,\
  122. \pard\pardeftab720\ql\qnatural
  123. {\field{\*\fldinst{HYPERLINK "http://ciobriefings.com/Publications/WhitePapers/DesigningtheStarSchemaDatabase/tabid/101/Default.aspx"}}{\fldrslt \cf0 http://ciobriefings.com/Publications/WhitePapers/DesigningtheStarSchemaDatabase/tabid/101/Default.aspx}}\
  124. \
  125. SQLite Date and Time Functions, {\field{\*\fldinst{HYPERLINK "http://sqlite.org/lang_datefunc.html"}}{\fldrslt http://sqlite.org/lang_datefunc.html}}\
  126. \
  127. RFC 3339, Date and Time on the Internet: Timestamps\
  128. \pard\pardeftab720\ql\qnatural
  129. {\field{\*\fldinst{HYPERLINK "http://www.faqs.org/rfcs/rfc3339.html"}}{\fldrslt \cf0 http://www.faqs.org/rfcs/rfc3339.html}}\
  130. }