Data Warehouse Tools
Tools that accurately source data contents and formats from external data stores into the data warehouse must perform several essential tasks, including:
- Data consolidation and integration.
- Data conversion from one format to another.
- Data transformation and calculation based on the business rules that drive the transformation.
- Metadata synchronization and management, which includes storing or updating metadata about source files, transformation actions, loading formats, and events.
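The tasks above can be illustrated with a minimal sketch. It assumes two hypothetical sources (`crm_rows` and `billing_rows`), an invented business rule (tax by country), and a plain list standing in for a metadata repository. None of these names come from a specific product.

```python
from datetime import datetime, timezone

# Hypothetical raw records from two source systems with different formats.
crm_rows = [{"id": "1", "amount": "100.50", "country": "us"}]
billing_rows = [{"id": "2", "amount": "75,25", "country": "DE"}]


def to_common_format(row, decimal_sep):
    """Convert one raw row from its source format to a common format."""
    amount = float(row["amount"].replace(decimal_sep, "."))
    return {"id": int(row["id"]), "amount": amount, "country": row["country"].upper()}


def apply_business_rule(row, tax_rates):
    """Calculate a derived field from an assumed business rule (tax by country)."""
    rate = tax_rates.get(row["country"], 0.0)
    return {**row, "tax": round(row["amount"] * rate, 2)}


def consolidate(tax_rates):
    """Integrate both sources and record metadata about each load."""
    consolidated, metadata = [], []
    sources = (("crm", crm_rows, "."), ("billing", billing_rows, ","))
    for name, rows, decimal_sep in sources:
        converted = [
            apply_business_rule(to_common_format(r, decimal_sep), tax_rates)
            for r in rows
        ]
        consolidated.extend(converted)
        # Metadata entry: source, transformation actions, and load event.
        metadata.append({
            "source": name,
            "transformations": ["to_common_format", "apply_business_rule"],
            "rows_loaded": len(converted),
            "loaded_at": datetime.now(timezone.utc).isoformat(),
        })
    return consolidated, metadata


rows, load_metadata = consolidate({"US": 0.10, "DE": 0.19})
```

A real tool would persist the metadata rather than keep it in memory, but the sketch shows how the conversion, the business rule, and the metadata update fit together in a single pass.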
Several selection criteria should be considered when choosing these tools for a data warehouse implementation:
- The ability to identify which data in the source environment the tool can read is necessary.
- Support for flat files, indexed files, and legacy DBMSs is critical.
- The capability to merge records from multiple data stores is required in many installations.
- A specification interface to indicate the information to be extracted and the conversions to apply is essential.
- The ability to read information from repository products or data dictionaries is desired.
- The code generated by the tool should be completely maintainable.
- Selective data extraction of both data items and records enables users to extract only the required data.
- A field-level data examination for the transformation of data into information is needed.
- The ability to perform data-type and character-set translation is a requirement when moving data between incompatible systems.
- The ability to create aggregation, summarization, and derivation fields and records is necessary.
- Vendor stability and support for the products are components that must be evaluated carefully.
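Several of these criteria can be made concrete with a short sketch: character-set and data-type translation, selective extraction of fields and records, merging records from multiple stores, and a derived summary field. The flat file contents, the `orders` store, and all function names are hypothetical and chosen only for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical flat-file export from a legacy system, encoded in Latin-1.
legacy_bytes = "cust_id,name,region,notes\n1,Jos\u00e9,EU,vip\n2,Anna,US,\n".encode("latin-1")

# Hypothetical second data store keyed by the same customer id.
orders = [
    {"cust_id": 1, "total": 40.0},
    {"cust_id": 1, "total": 60.0},
    {"cust_id": 2, "total": 25.0},
]


def extract_customers(raw, encoding, wanted_fields, keep):
    """Decode the file, then keep only the selected fields and records."""
    text = raw.decode(encoding)  # character-set translation: bytes -> Unicode
    selected = []
    for row in csv.DictReader(io.StringIO(text)):
        row["cust_id"] = int(row["cust_id"])  # data-type translation: text -> int
        if keep(row):  # selective extraction of records
            selected.append({f: row[f] for f in wanted_fields})  # ...and of fields
    return selected


def merge_with_totals(customers, order_rows):
    """Merge records from both stores and add a derived summary field."""
    totals = defaultdict(float)
    for order in order_rows:
        totals[order["cust_id"]] += order["total"]
    return [{**c, "order_total": totals[c["cust_id"]]} for c in customers]


customers = extract_customers(
    legacy_bytes, "latin-1", ["cust_id", "name"], lambda r: r["region"] == "EU"
)
merged = merge_with_totals(customers, orders)
```

Passing the field list and the record filter as arguments plays the role of a specification interface: the caller states what to extract without changing the extraction code.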
Data Warehouse Software Components
A warehousing team will require different types of tools during a warehouse project. These software products usually fall into one or more of the categories shown in the figure.
Extraction and Transformation
The warehouse team needs tools that can extract, transform, integrate, clean, and load information from a source system into one or more data warehouse databases. Middleware and gateway products may be needed for warehouses that extract data from a host-based source system.
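As a rough sketch of the extract, clean, transform, and load steps, the following uses Python's built-in `sqlite3` module as a stand-in for a warehouse database. The `source_rows` data, the `sales` table, and the cleaning rule (skip rows with an unparseable price) are assumptions made for the example.

```python
import sqlite3

# Hypothetical rows as they might arrive from a source system.
source_rows = [
    ("  Widget ", "19.99", "2024-01-05"),
    ("Gadget", "n/a", "2024-01-06"),  # unparseable price: dropped during cleaning
    ("widget", "5.00", "2024-01-07"),
]


def transform(rows):
    """Clean and standardize rows, skipping any whose price cannot be parsed."""
    for name, price, day in rows:
        try:
            value = float(price)
        except ValueError:
            continue
        yield (name.strip().lower(), value, day)


def load(conn, rows):
    """Load the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (product TEXT, price REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(source_rows))
loaded = conn.execute(
    "SELECT product, price FROM sales ORDER BY sale_date"
).fetchall()
```

Silently dropping bad rows keeps the sketch short. A production pipeline would more likely route rejected rows to an error table so they can be reviewed.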
Software products are also needed to store warehouse data and their accompanying metadata. Relational database management systems are well suited to large and growing warehouses.
Data Access and Retrieval
Different types of software are needed to access, retrieve, distribute, and present warehouse data to its end users.
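A minimal sketch of access, retrieval, and presentation might look like the following. It again assumes an in-memory `sqlite3` database in place of a real warehouse, with a hypothetical `sales` table and a plain-text report as the presentation layer.

```python
import sqlite3

# Hypothetical warehouse table, populated here only for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EU", 120.0), ("US", 80.0), ("EU", 30.0)],
)


def sales_by_region(connection):
    """Access and retrieve: summarize sales per region."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    return connection.execute(query).fetchall()


def format_report(rows):
    """Present: render the result as a fixed-width text report."""
    lines = [f"{'Region':<8}{'Total':>10}"]
    lines += [f"{region:<8}{total:>10.2f}" for region, total in rows]
    return "\n".join(lines)


report = format_report(sales_by_region(conn))
print(report)
```

In practice, dedicated query, reporting, and analysis tools fill this role. The sketch only shows the separation between retrieving data and presenting it.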