C# Syntax Highlighter
Apr 15


---
  Overview of the main situations, techniques and aspects of data extraction. Abstract layers where data extraction occurs. Split-Merge, Compress-Decompress and other complementary extract-recover actions. Model transformations and translations of object topologies. Extraction of parts from a composite document.
---

Overview

We'll give an overview and briefly describe the kinds of methods, operations and levels involved in the most common data extraction situations in the modern software industry.

Simply stated, data extraction is a non-destructive operation that returns parts of existing computer data. Non-destructive means that the content and format of the original data source are not altered in any way. All our projects save extracted data into new storage units; we never replace or alter the source from which data is extracted.
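As a minimal sketch of the non-destructive rule, the Python snippet below copies a byte range from a source file into a new file and never opens the source for writing. The helper names `extract_slice` and `sha256_of` are hypothetical and chosen only for this illustration; they are not part of any project described here.

```python
import hashlib
import tempfile
from pathlib import Path


def extract_slice(source: Path, target: Path, start: int, length: int) -> bytes:
    """Copy `length` bytes starting at offset `start` from source into a new file.

    Hypothetical helper: the source is opened read-only and an existing
    target is never overwritten.
    """
    if target.exists():
        raise FileExistsError(f"refusing to overwrite {target}")
    with source.open("rb") as src:  # read-only access to the source
        src.seek(start)
        chunk = src.read(length)
    target.write_bytes(chunk)  # extracted data goes to a new storage unit
    return chunk


def sha256_of(path: Path) -> str:
    """Return a checksum that can confirm a file was left unchanged."""
    return hashlib.sha256(path.read_bytes()).hexdigest()
```

Comparing `sha256_of(source)` before and after a call to `extract_slice` is one simple way to check that the source was left untouched.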

Data extraction may require, or be followed by, additional transformation operations. If data needs to be exposed in a different format, we'll try to avoid adding new information. Depending on the format in which it is stored or transferred over the network, data may first need a restore operation, such as decoding, decompression or decryption. More on this in the Transform and Restore Pairs chapter.
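The restore-then-extract order can be sketched as follows. This is only an illustration under stated assumptions: the stored form is assumed to be zlib-compressed and then Base64-encoded, the payload is assumed to be simple `key=value` text lines, and the function names `transform`, `restore` and `extract_field` are hypothetical.

```python
import base64
import zlib
from typing import Optional


def transform(payload: bytes) -> bytes:
    """Compress, then encode: the assumed storage or transfer form."""
    return base64.b64encode(zlib.compress(payload))


def restore(stored: bytes) -> bytes:
    """Undo transform() in reverse order: decode first, then decompress."""
    return zlib.decompress(base64.b64decode(stored))


def extract_field(stored: bytes, key: str) -> Optional[str]:
    """Restore the payload, then return the value of one key=value line."""
    text = restore(stored).decode("utf-8")
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None  # key not present in the payload
```

Note that `restore` only reverses the transformation and adds no new information; the extraction step works on exactly the bytes that were originally stored.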

Layered Architectures

OSI Model

The OSI (Open Systems Interconnection) Model defines seven abstract layers involved in data transfer over a network:

  • Physical layer relates to the hardware that carries the raw signal, such as network adapters, cables and hubs.
  • Data Link layer adds error detection (and, in some protocols, correction) to data transferred between directly connected network adapters.
  • Network layer groups data into packets, which are addressed and routed across the network. For TCP/IP, this layer corresponds to the IP (Internet Protocol) part.
  • Transport layer provides transparent transfer of data between end systems, including re-sending lost data where the protocol is reliable. For TCP/IP, this layer corresponds to the TCP (Transmission Control Protocol) part.
  • Session layer controls the dialogues or connections between the computers involved.
  • Presentation layer is where most of the complementary "transform and restore" pairs of operations described below are performed.
  • Application layer provides a common network interface to computer applications.

For data extraction, it is important to know which network layers your application controls, which operations you need to configure, and in which format a specific layer returns your data.

Client-Server Architecture

Client-Server application modules are also commonly organized into the following three abstract layers:

  • Data layer handles specific data extraction operations from the data source, which is usually a relational database, file or component.
  • Logical or Business Logic layer presents extracted data as a specific object model, whose classes map to the conceptual types of your application rather than to the way data is stored.
  • Presentation layer handles the way data is exposed to the end-user.

In data extraction, all three layers are important. At the presentation layer, we'll focus essentially on generic containers, HTML lists, object hierarchies and other common elements that are not particular to one kind of application but shared by a very large range.
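The three layers can be illustrated with a small Python sketch. Everything in it is assumed for the example: the `products` table, its `name` and `price_cents` columns, the `Product` class and the function names are hypothetical, and SQLite stands in for whatever relational database an application actually uses.

```python
import sqlite3
from dataclasses import dataclass
from html import escape
from typing import List, Tuple


def fetch_rows(conn: sqlite3.Connection) -> List[Tuple[str, int]]:
    """Data layer: read rows exactly as they are stored."""
    query = "SELECT name, price_cents FROM products ORDER BY name"
    return conn.execute(query).fetchall()


@dataclass
class Product:
    """Business logic layer: a conceptual type, not a storage row."""
    name: str
    price: float  # in currency units, while storage keeps integer cents


def to_products(rows: List[Tuple[str, int]]) -> List[Product]:
    """Map stored rows to the application's object model."""
    return [Product(name, cents / 100) for name, cents in rows]


def to_html_list(products: List[Product]) -> str:
    """Presentation layer: expose the objects as a generic HTML list."""
    items = "".join(
        f"<li>{escape(p.name)}: {p.price:.2f}</li>" for p in products
    )
    return f"<ul>{items}</ul>"
```

Each function only knows about the layer directly below it, so the storage format could change without touching the HTML rendering.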

Data Extraction Layers

From a data extraction perspective, we'll also present a layered model that may better reflect how data is usually stored, extracted and transformed:

  • System layer relates to the raw data storage and transfer format, which depends on the operating system or network type. Regardless of its actual meaning, data is seen here as a sequence of bytes and handled by basic system functions.
  • Structure layer adds a first level of meaning to the business data, which is seen as structured in blocks with specific business-related significance. It's still a very abstract level, with headers, delimiters and cross-reference tables that contain the actual end-user data.
  • Model layer is the object-oriented level of the business data. The storage structures of the previous layer are translated into actual end-user objects, with business meaning.
  • Presentation layer is not necessarily the way an end-user looks at data in an application, but the way data CAN be exposed and presented. This layer adds functionality to translate the hierarchy or network of objects stored at the Model level into different presentation formats: lists, layouts, diagrams...
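To make the four layers concrete, here is a small Python sketch that walks one piece of data through all of them. The binary format is invented purely for this example (a `DEMO` header, a 2-byte record count, then records made of a 1-byte name length, the name, and a 2-byte quantity), and all function and class names are hypothetical.

```python
import struct
from dataclasses import dataclass
from typing import List, Tuple

MAGIC = b"DEMO"  # header of the invented example format


def pack(items: List[Tuple[str, int]]) -> bytes:
    """System layer: produce the raw byte sequence (used to build sample data)."""
    raw = MAGIC + struct.pack(">H", len(items))
    for name, qty in items:
        encoded = name.encode("utf-8")
        raw += bytes([len(encoded)]) + encoded + struct.pack(">H", qty)
    return raw


def split_records(raw: bytes) -> List[Tuple[bytes, int]]:
    """Structure layer: check the header and split the bytes into blocks."""
    if raw[:4] != MAGIC:
        raise ValueError("unexpected header")
    (count,) = struct.unpack_from(">H", raw, 4)
    offset, records = 6, []
    for _ in range(count):
        name_len = raw[offset]
        offset += 1
        name = raw[offset:offset + name_len]
        offset += name_len
        (qty,) = struct.unpack_from(">H", raw, offset)
        offset += 2
        records.append((name, qty))
    return records


@dataclass
class StockItem:
    """Model layer: an end-user object with business meaning."""
    name: str
    quantity: int


def to_model(records: List[Tuple[bytes, int]]) -> List[StockItem]:
    """Translate storage blocks into model objects."""
    return [StockItem(name.decode("utf-8"), qty) for name, qty in records]


def render(items: List[StockItem]) -> str:
    """Presentation layer: one possible way the objects can be exposed."""
    return "\n".join(f"- {item.name}: {item.quantity}" for item in items)
```

A call such as `render(to_model(split_records(raw)))` goes from bytes to a text list; replacing only `render` would give a different presentation of the same model.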
