/trunk/src/kangmodb/table.h
Possible License(s): BSD-3-Clause
/**
 *  table.h
 *  kangmodb
 *
 *  Created by 강모 김 on 11. 5. 1..
 *  Copyright 2011 강모소프트. All rights reserved.
 *
 *  Design of "Restart Recovery":
 *  (1) Restart Recovery Speed Optimization
 *      - To speed up restart recovery, stgTable keeps (key, data) pairs in shared memory chunks allocated from stgSharedRegion.
 *      - Log records of active transactions are also kept in stgTransLogBuffer, which allocates shared memory chunks from stgSharedRegion.
 *        c.f.> stgTableMgr manages the stgSharedRegion object for all tables.
 *
 *  (2) Power Failure or OS crash - We don't have shared memory regions.
 *      - Redo all log records in the log stream. Only committed transactions send their log buffers to the log stream.
 *      - No need to undo any log records, because all transaction logs in the log stream are from committed transactions only.
 *
 *  (3) Process Failure - We have shared memory regions.
 *  (3.1) Phase I - Reconstruct stgTableMgr and stgTable objects from the shared memory region.
 *      - Iterate shared memory chunks for stgTable objects to find the root chunk of each table.
 *      - The root chunk is one of the shared memory chunks allocated by stgSharedRegion, and becomes the starting point for searching (key, data) pairs.
 *      - At this point, we can open a cursor for each table, but some tables may have uncommitted modifications by active transactions.
 *  (3.2) Phase II - Reconstruct stgTransLogBuffer objects of active transactions from the shared memory region, and roll back all logs.
 *      - Iterate shared memory chunks for stgTransLogBuffer to find the root chunk of each transaction log buffer.
 *      - The root chunk is one of the shared memory chunks allocated by stgSharedRegion, and becomes the starting point for traversing log records.
 *      - Revert all new versions of (key, data) pairs made by these active transactions by traversing the keys stored in log records.
 *  (3.3) Phase III - Get maxCommitVersion from the shared memory region, and set it to the maxCommitVersion_ member in stgTransMgr.
 *      - stgTransMgr stores maxCommitVersion in the shared memory region.
 *
 *  Design of "DATA Versioning on a KEY":
 *      - stgTable stores (KEY, DATA) pairs. Duplicate key values are not allowed, so a KEY value is unique in a table.
 *      - Each DATA has the following flags:
 *          - (01bit) deleted
 *              - Indicates the (KEY, DATA) pair is deleted.
 *          - (32bit) savepointID
 *              - The savepoint ID. A new version of DATA is created only when the savepoint ID changes.
 *              - An update or delete with the same savepoint ID does not create a new version; the existing version is updated in place.
 *          - (64bit) commitVersion
 *              - The commit version number. 0 indicates it is not committed yet. A number greater than 0 indicates it is committed.
 *              - The version number increases monotonically. Transactions allocate the commit version number from stgTransMgr.
 *          - savepointID can share data storage with commitVersion to optimize memory space.
 *
 *      - Insertion : (KEY, DATA).
 *          - stgTable inserts the KEY into an access method such as a skip list, and then attaches the DATA to the key with the default savepoint 0 (s0),
 *            but the commit version is set to 0 (c0), indicating it is not committed yet.
 *              - (KEY, DATA-s0-c0)
 *          - Upon transaction commit, the transaction allocates a commit version (say 1234) from stgTransMgr and sets it on the DATA, indicating it is committed.
 *          - After setting the commit version on all (KEY, DATA) pairs that the transaction modified, it calls stgTransMgr::commitVersion to signal that it has finished setting the commit version on all modified (KEY, DATA) pairs.
 *              - This allows other transactions to read the committed (KEY, DATA) pairs.
 *              - (KEY, DATA-c1234)
 *      - When the key is updated, a new version of data, data_s0, is added. (key, data-s0-c0, data )
 *          - When the transaction commits, it sets the commit version on the new version. (key, data-c1234, data )
 *      - When the key is deleted, a new version of data, data_s0, is added with its delete bit set. (key, data-deleted-s0-c0, data )
 *          - When the transaction commits, it sets the commit version on the new version. (key, data-deleted-c1234, data )
 *
 *  Design of "Rollback to Savepoint":
 *      - Only one transaction can create a new version of DATA on a KEY. (Say, the modifying transaction.)
 *      - If the modifying transaction updates, deletes, or inserts DATA on a KEY multiple times,
 *        no new version is created; the first new version of the data is updated in place.
 *      - However, when the savepoint number of a transaction increases because the transaction allocated a new savepoint,
 *        a new version of data is created for the savepoint when the transaction changes DATA on the KEY.
 *          - (KEY, DATA-s1-c0, DATA-s0-c0 ) ; Savepoint number increases from s0 to s1. A dedicated new version for s1, DATA-s1-c0, is created.
 *      - This is to help "Rollback to Savepoint". Rolling back to a savepoint simply removes all new versions created after the savepoint.
 *          - (KEY, DATA-s0-c0) ; Rollback to Savepoint s0 removed the version DATA-s1-c0.
 *      - How to iterate all KEYs that the transaction touched?
 *          - We can iterate log records in stgTransLogBuffer in reverse order until we meet the log for Savepoint s1.
 *          - Each INSERT, UPDATE, DELETE log has the KEY value, so we can search the access method with the KEY.
 *          - After removing all versions created since Savepoint s1, stgTransLogBuffer is truncated at the position of the Savepoint s1 log.
 *
 *  Design of "Concurrency control for updating transactions on the same KEY":
 *      - Other transactions that want to access a KEY whose DATA is being modified by another transaction need to roll back.
 *      - After the rollback, they need to wait until the modifying transaction commits or rolls back.
 *      - Then they get a new viewVersion and start accessing the (key, data) pair again.
 *      - If no transaction has created a new version of the pair yet, a transaction can create one for it.
 *      - A CAS (Compare and Swap) operation is used to implement the concurrency control,
 *        checking whether another transaction has already created a new version.
 *      - Applications need to be aware of this process.
 *          - Application programmers need to write code that begins the transaction again and modifies the same set of tables and (key, data) pairs again.
 */

#ifndef _KD_TABLE_H_
#define _KD_TABLE_H_ (1)

#include "kdInfra.h"
#include "types.h"

#include "transMgr.h"
#include "transaction.h"
#include "set.h"
#include "data.h"
#include "chunkList.h"

/** @brief The table that stores the DATA versions for each KEY. This is the interface called by both normal processing and restart recovery.
 */
class stgTable
{
private :
    set_t * set_;
    stgChunkList chunks_;
public :
    stgTable()
    {
    }
    ~stgTable()
    {
    }

    /** @brief Initialize the table object with the given tableId.
     */
    KD_VOID initialize(int tableId)
    {
        KD_TRY
        {
            set_ = set_alloc();
            // TODO : Check if set allocation failed.
        }
        KD_CATCH
        KD_FINALLY
        KD_END
    }

    /** @brief Destroy the table object.
     */
    KD_VOID destroy()
    {
        KD_TRY
        {
            // TODO : Destroy set_ object.
        }
        KD_CATCH
        KD_FINALLY
        KD_END
    }

    /** @brief Find the data descriptor associated with the given key.
     * @param dataDesc *dataDesc is set to the found data descriptor. Set *dataDesc to NULL if the key is not found.
     */
    KD_VOID getDataDesc(const stgKey & key, stgDataDesc ** dataDesc ) const
    {
        KD_TRY
        {
            stgDataDesc * foundData;

            foundData = (stgDataDesc*) set_lookup(set_, key);

            *dataDesc = foundData;
        }
        KD_CATCH
        KD_FINALLY
        KD_END
    }

    /** @brief If the key exists, simply return the associated data descriptor. Otherwise allocate a new data descriptor, associate it with the given key, and put it into an access method.
     * @param dataDesc The data descriptor associated with the given key.
     */
    KD_VOID putDataDesc(const stgKey & key, stgDataDesc ** dataDesc )
    {
        KD_TRY
        {
            stgDataDesc * newDataDesc = NULL;
            stgDataDesc * existingDataDesc ;

            // TODO : Allocate newDataDesc from the shared memory chunk.
            KD_ASSERT(0);

            // TODO : Need to allocate memory for stgKey::key_ and copy the key content. Need to think about where we can free the memory.
            KD_ASSERT(0);

            // TODO : Think about what to do when two transactions concurrently try to call this function.
            existingDataDesc = (stgDataDesc*) set_update( set_, key, newDataDesc, 0 /* overwrite */ );

            // If the key already exists, return the existing descriptor instead of the new one.
            if ( existingDataDesc )
            {
                // TODO : Free newDataDesc from the shared memory chunk.
                KD_ASSERT(0);
                *dataDesc = existingDataDesc;
            }
            else
            {
                *dataDesc = newDataDesc;
            }
        }
        KD_CATCH
        KD_FINALLY
        KD_END
    }

    /** @brief Find the latest data version whose version is less than or equal to the given viewVersion.
     * @param viewVersion : The maximum version that the transaction can view. Set it to MAX_DATA_VERSION to see uncommitted changes.
     */
    KD_VOID seekData(const stgVersion viewVersion, stgDataDesc * dataDesc, stgData * data )
    {
        KD_TRY
        {
            stgDataVersion * dataVersion;

            KD_CALL( dataDesc->findLatestVersion( viewVersion, & dataVersion) );
            if ( dataVersion == NULL )
                // The key is found, but no version of it is visible to this transaction.
                KD_THROW( KD_EXCP_KEY_NOT_FOUND );
            if ( dataVersion->deleted )
                // The latest visible data version is marked as deleted.
                KD_THROW( KD_EXCP_KEY_NOT_FOUND );

            // No data copy happens. Simply copy the data pointer and set the length.
            *data = stgData( dataVersion->data, dataVersion->dataLength );
        }
        KD_CATCH
        KD_FINALLY
        KD_END
    }

    /** @brief Update the latest data version within the given data descriptor.
     * If the latest uncommitted data version matches the current savepoint ID in tx, update the data in place.
     * Otherwise create a new data version, set the savepoint ID of the new version to the current one of tx, and copy the data into the new version.
     */
    KD_VOID updateData(const stgVersion viewVersion, const stgSavepointId spID, stgDataDesc * dataDesc, const stgData & data )
    {
        KD_TRY
        {
            // TODO : Continue to implement.
            KD_ASSERT(0);
        }
        KD_CATCH
        KD_FINALLY
        KD_END
    }

    /** @brief Set the deleted bit of the latest data version within the given data descriptor.
     * If the latest uncommitted data version matches the current savepoint ID in tx, set the deleted bit.
     * Otherwise create a new data version, set the savepoint ID of the new version to the current one of tx, and set the deleted bit.
     * @param deletedBit : true if deleted, false otherwise.
     */
    KD_VOID setDeletedBit(const stgVersion viewVersion, const stgSavepointId spID, stgDataDesc * dataDesc, bool deletedBit )
    {
        KD_TRY
        {
            // TODO : Implement
            KD_ASSERT(0);
        }
        KD_CATCH
        KD_FINALLY
        KD_END
    }

};

#endif /* _KD_TABLE_H_ */
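
The versioning and savepoint rules described in the header comment can be sketched as a small standalone model. This is only an illustration under stated assumptions, not the kangmodb implementation: `SketchVersion`, `SketchDataDesc`, and all of their members are hypothetical names, an `int` stands in for the data pointer, a `std::vector` stands in for the shared memory version chain, and commit is assumed to stamp every uncommitted version with the same commit version. The special MAX_DATA_VERSION case for viewing uncommitted changes is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one version of DATA on a KEY.
struct SketchVersion {
    bool     deleted;        // delete bit
    uint32_t savepointId;    // savepoint that created this version
    uint64_t commitVersion;  // 0 means not committed yet
    int      value;          // stand-in for the data pointer and length
};

// Hypothetical stand-in for the per-key version chain (oldest first).
struct SketchDataDesc {
    std::vector<SketchVersion> versions;

    // Write by the single modifying transaction: update in place when the
    // newest version is uncommitted and has the same savepoint ID,
    // otherwise append a new uncommitted version.
    void write(uint32_t savepointId, int value, bool deleted = false) {
        if (!versions.empty()) {
            SketchVersion &top = versions.back();
            if (top.commitVersion == 0 && top.savepointId == savepointId) {
                top.value = value;
                top.deleted = deleted;
                return;
            }
        }
        SketchVersion v{deleted, savepointId, 0, value};
        versions.push_back(v);
    }

    // Rollback to savepoint: drop uncommitted versions created after it.
    void rollbackToSavepoint(uint32_t savepointId) {
        while (!versions.empty() &&
               versions.back().commitVersion == 0 &&
               versions.back().savepointId > savepointId) {
            versions.pop_back();
        }
    }

    // Commit (assumption): stamp every uncommitted version.
    void commit(uint64_t commitVersion) {
        for (SketchVersion &v : versions) {
            if (v.commitVersion == 0) v.commitVersion = commitVersion;
        }
    }

    // Newest committed version with commitVersion <= viewVersion,
    // or nullptr when no version is visible.
    const SketchVersion *findLatest(uint64_t viewVersion) const {
        for (auto it = versions.rbegin(); it != versions.rend(); ++it) {
            if (it->commitVersion != 0 && it->commitVersion <= viewVersion) {
                return &*it;
            }
        }
        return nullptr;
    }
};
```

A reader built on this model would still have to check the `deleted` flag of the returned version, as `seekData` does, before treating the key as found.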