Store/Data Driver Algorithms used

  • RAM & Disk virtualization - SOP's store virtualizer optimizes usage of memory & disk storage, giving applications the best chance to cache in memory the objects that reside on disk. SOP does fine-grained data page swapping between RAM & Disk on demand, based on measured utilization (see the MRU algorithm below). Fine-grained means SOP will not suddenly swap big data segments in or out of disk when parts of them are not demanded by the application, which would degrade performance through unnecessary latencies. SOP knows from application usage which chunk of data the app requests and needs loaded into memory. The overall result of this technique is fully optimized (threaded) RAM & Disk utilization and in-memory dictionary performance levels, because data is retrieved from a standard in-memory dictionary with no mapping (of Ids or anything else) in between. This seamless virtualization also provides a single Store (an extended IDictionary) & Object context, removing the need for application developers to translate contexts between the application & the backend DB stores. For example, lambda expression parameters for querying objects operate in the context of the application POCO; there is no context switching between LINQ to Objects and LINQ to SQL expressions. Developers author expressions that manipulate/compare POCOs using all the framework, app & 3rd party library functionality available to the application, with no translation required. This makes it the simplest to use & most powerful feature for manipulating the data Store, including querying or processing Store POCOs.
  • MUM induced MRU - Memory Utilization Metering (MUM) induced MRU. Items are stored in memory and offloaded to disk based on utilization. Most Recently Used items tend to stay in memory, and the least used items are offloaded to disk at times of heavy usage/load. The Store's MUM drives the MRU module to keep the number of items loaded in memory at a manageable level, as set in the preferences.
  • Bulk I/O optimized algorithm - B-Tree nodes tend to occupy contiguous space on disk and serve as a grouping unit for high speed, async, bulk I/O. Multiple Nodes tend to get grouped together and inserted/saved to disk in bulk. Reading a non-fragmented, contiguous Node "pageful" of data in async mode has proved highly performant.
  • Flexible, low-level data block structure - SOP employs a low-level, basic block structure that can accommodate any higher level constructs/structures and, at the same time, allows SOP's data driver to implement simple, high performance recycling of "deleted items space". This recycling has proved to give non-degrading performance under numerous/frequent management actions such as Inserts and Deletes. E.g. - this feature proved critical in implementing a "scalable, non-degrading!" EF 2nd level caching provider using SOP.
  • Open persistence method - SOP comes with three built-in persistence methods: basic data type persistence using BinaryReader/BinaryWriter, Xml Serialized POCOs, and SOP's proprietary method via IPersistent interface implementation. However, programmers can define their own persistence mechanism if needed. The persistence implementation dictates whether auto-versioning of POCOs is supported. The Xml Serialized POCOs persistence method, for example, supports version independence to the extent Xml Serialization does.
  • Designed to capture, re-surface and allow customization of data persistence behaviors. Today, the SOP data Store API constructs are laid out to give programmers fine-grained control over data management and queries. We've also successfully re-surfaced "profile" artifacts that allow programmers to fine-tune Store performance vs. resource utilization to address application domain requirements. Upcoming releases of SOP will allow further refinements in this area. I'd like to offer full fidelity (via configuration and/or programmability) over the persistence data flows that affect scaling.
  • Modular and Open architecture - we've worked hard to keep the SOP architecture open and flexible so we can add higher level implementations with ease. Today we've solved the most fundamental issues in data persistence, and tomorrow we'd like to implement high level constructs to address Enterprise scalability use-cases. We've designed something in the spirit of "framework modelling" in data persistence, in hopes of keeping the data persistence "stack" open and flexible. For example, to my mind an optimization/scaling feature such as horizontal sharding is very simple functionality given the availability of a good and open data persistence framework. Programmers should be able to implement such functionality, specific to a domain or for general purpose, quite easily.
  • Application level caching - unlike traditional embedded database systems, SOP caches POCOs directly in your application. We've removed unnecessary translation layers, which keeps serialization overhead minimal and allows SOP to break performance barriers never before achieved in this kind of solution.
  • High speed transaction - all changes are tracked by a high speed transaction module, which makes Copy On Write (COW) backups of data segments to a transaction backup file that is used during rollback. SOP data files are protected to the fullest: the host PC (or your application) can crash at any point within the transaction, and SOP guarantees being able to restore the data files to the previous committed state.
  • Object Store pooling - SOP's StoreFactory provides object Store pooling to minimize the latencies of opening and closing data files and of re-populating the in-memory MRU cache.
  • SOP's .Net optimizations in the area of "unbuffered I/O" were inspired by the findings and techniques described by Jim Gray (& team) in the paper "Sequential File Programming Patterns and Performance with .NET" http://research.microsoft.com/apps/pubs/default.aspx?id=64538.
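
The "MUM induced MRU" behavior in the list above can be pictured with a small sketch. This is not SOP's code or API: it is a simplified illustration written in Python for brevity, and the names MruStore, max_in_memory, memory and disk are hypothetical. A plain dict stands in for the data file, and the in-memory item limit plays the role of the preference setting.

```python
from collections import OrderedDict


class MruStore:
    """Dictionary-like store that keeps at most `max_in_memory` items in RAM.

    Hypothetical illustration only: `disk` is a plain dict standing in for
    a real data file, and all names here are assumptions.
    """

    def __init__(self, max_in_memory):
        self.max_in_memory = max_in_memory
        self.memory = OrderedDict()  # ordered from least to most recently used
        self.disk = {}               # stand-in for on-disk storage

    def __setitem__(self, key, value):
        self.disk.pop(key, None)      # drop any stale offloaded copy
        self.memory[key] = value
        self.memory.move_to_end(key)  # mark as most recently used
        self._offload_if_needed()

    def __getitem__(self, key):
        if key not in self.memory:
            # Load only the requested item from "disk", on demand.
            # Raises KeyError if the key is not stored anywhere.
            self.memory[key] = self.disk.pop(key)
        self.memory.move_to_end(key)
        value = self.memory[key]
        self._offload_if_needed()
        return value

    def _offload_if_needed(self):
        # Offload least recently used items until the limit is respected.
        while len(self.memory) > self.max_in_memory:
            old_key, old_value = self.memory.popitem(last=False)
            self.disk[old_key] = old_value


store = MruStore(max_in_memory=2)
store["a"] = 1
store["b"] = 2
store["c"] = 3         # "a" is the least recently used item and is offloaded
first = store["a"]     # "a" is loaded back on demand, pushing out "b"
```

The application only ever talks to one dictionary-like object; whether an item currently sits in memory or on disk is hidden behind the indexer, which is the point of the virtualization described above.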

Combined, these algorithms and architecture designs give the SOP data Store its performance, a very low footprint in memory, CPU and disk utilization, AND an openness that can prove to be a great addition to your enterprise data solution toolbox.
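
To make the Copy On Write (COW) transaction item in the list above more concrete, here is a minimal sketch of the general technique. It is an assumption-laden illustration in Python, not SOP's implementation: the class CowTransaction, its block_size parameter and its methods are hypothetical, a bytearray stands in for the data file, and a dict stands in for the transaction backup file.

```python
class CowTransaction:
    """Copy-on-write backup of fixed-size blocks, restored on rollback.

    Hypothetical sketch: `data` is a bytearray standing in for a data file
    and `backup` is a dict standing in for the transaction backup file.
    """

    def __init__(self, data, block_size=4):
        self.data = data
        self.block_size = block_size
        self.backup = {}  # block index -> original bytes of that block

    def write(self, offset, payload):
        if offset < 0 or offset + len(payload) > len(self.data):
            raise ValueError("this sketch only supports in-place writes")
        first = offset // self.block_size
        last = (offset + len(payload) - 1) // self.block_size
        for block in range(first, last + 1):
            # Back up each block once, before its first modification.
            if block not in self.backup:
                start = block * self.block_size
                self.backup[block] = bytes(self.data[start:start + self.block_size])
        self.data[offset:offset + len(payload)] = payload

    def rollback(self):
        # Copy every backed-up block over its modified version.
        for block, original in self.backup.items():
            start = block * self.block_size
            self.data[start:start + len(original)] = original
        self.backup.clear()

    def commit(self):
        # The changes stand, so the backups are no longer needed.
        self.backup.clear()


data = bytearray(b"AAAABBBBCCCC")
tx = CowTransaction(data, block_size=4)
tx.write(2, b"xxxx")   # touches blocks 0 and 1, both get backed up first
tx.rollback()          # data is restored to its state before the write
```

In a real store the backup would be flushed to disk before the data file is modified, so that a crash in the middle of a transaction still leaves enough information to restore the last committed state.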

Last edited Jan 9, 2013 at 6:33 AM by grecinto, version 1
