Tuesday, February 19, 2013

Virtualisation for SPI & SPF

In the previous post I discussed the effect that database structure has on reporting, explaining how normalisation and optimisation often results in a more complex database, making it more difficult to extract the required data. Today I'm going to look more deeply at the SPI and SPF databases and how we can simplify reporting using the Schema Virtualisation technology in APS Reporter/SP.

Warning! SQL Statements Ahead....Proceed at your own risk!


Most applications recognise that users will want to produce reports in a wide variety of formats, with different combinations of data, and they will usually provide a rudimentary reporting system. In the good ol' days, databases were generally simple and if the application didn't provide the reports you needed it was easy enough to grab your copy of MS-Access or Crystal Reports and produce your own reports, perhaps after an hour or two to get your mind around the database.

The ubiquitous Query Builder window, found in almost all database management applications (except perhaps those favoured by certain Oracle command-line die-hards), was the standard tool for extracting data. The Query Builder, and the SQL statements produced, were effectively standardised and once you'd learnt how to produce a Query for one database, moving to another database was straight-forward and even switching applications and database platforms (eg Oracle to MS-SQL) was not usually a problem.

Progressively, as databases evolved and become more complex, options for extracting data and producing reports became more limited. External tools that had been useful became more difficult to use, users required a deeper understanding of the applications databases, and had to rely entirely on the tools provided by the application. Specialists with expert-level skills and knowledge of the arcane secrets of each database were required.
Ironically, as more and more data is stored in databases, the data became effectively hidden behind behind a wall of it's own complexity.
With APS Reporter/SP we wanted to open up these databases again. SQL, and the Query Builder, are still the best way to manipulate large data sets, whether you're extracting data for reports or exporting. The challenge was to hide the complexity of each database from the user, while still allowing them to work with the traditional Query Builder window and SQL.

Schema Virtualisation technology presents the user with a view of the database that appears to be a standard relational database "de-normalised". Rather than a design optimised for performance, it is optimised for reporting. Fields appear in tables where the user would intuitively expect to find them, or where they would be most convenient, and tables are arranged to represent items the user is familiar with, rather than the software developer.

SPI Instrument Index Query simplified with Virtualisation
For example, for an Instrument Index in SPI, users would require:
Instrument Tag, Service Description, Instrument Type, Status, I/O Type, Manufacturer, Model, Location, Certification, Loop Name, P&ID
These are all logically related to the Instrument, but are actually stored in 10 separate tables in SPI (for excellent integrity & performance reasons). To produce a simple report requires a Query as shown on the left above, or in the SQL statement below:
Select
  COMPONENT.CMPNT_NAME,
  COMPONENT.CMPNT_SERV,
  COMPONENT_FUNCTION_TYPE.CMPNT_FUNC_TYPE_NAME,
  COMPONENT_FUNCTION_TYPE.CMPNT_FUNC_TYPE_DESC,
  COMPONENT_HANDLE.CMPNT_HANDLE_NAME,
  COMPONENT_SYS_IO_TYPE.CMPNT_SYS_IO_TYPE_NAME,
  COMPONENT_MFR.CMPNT_MFR_NAME,
  COMPONENT_MOD.CMPNT_MOD_NAME,
  COMPONENT_LOCATION.CMPNT_LOC_NAME,
  COMPONENT_CERTIFICATION.CMPNT_CERTIF_NAME,
  LOOP.LOOP_NAME,
  DRAWING.DWG_NAME
From
  COMPONENT
Inner Join COMPONENT_FUNCTION_TYPE
  On COMPONENT.CMPNT_FUNC_TYPE_ID = COMPONENT_FUNCTION_TYPE.CMPNT_FUNC_TYPE_ID
Inner Join COMPONENT_MFR
  On COMPONENT.CMPNT_MFR_ID = COMPONENT_MFR.CMPNT_MFR_ID
Inner Join COMPONENT_MOD
  On COMPONENT.CMPNT_MOD_ID = COMPONENT_MOD.CMPNT_MOD_ID
Inner Join COMPONENT_SYS_IO_TYPE
  On COMPONENT.CMPNT_SYS_IO_TYPE_ID = COMPONENT_SYS_IO_TYPE.CMPNT_SYS_IO_TYPE_ID
Inner Join COMPONENT_LOCATION
  On COMPONENT.CMPNT_LOC_ID = COMPONENT_LOCATION.CMPNT_LOC_ID
Inner Join COMPONENT_HANDLE
  On COMPONENT.CMPNT_HANDLE_ID = COMPONENT_HANDLE.CMPNT_HANDLE_ID
Inner Join COMPONENT_CERTIFICATION
  On COMPONENT.CMPNT_CERTIF_ID = COMPONENT_CERTIFICATION.CMPNT_CERTIF_ID
Inner Join LOOP
  On COMPONENT.LOOP_ID = LOOP.LOOP_ID
Inner Join DRAWING
  On COMPONENT.DWG_ID = DRAWING.DWG_ID
The red text indicates the links between the primary COMPONENT table of SPI (containing the master list of Instruments) and the various related tables. Schema Virtualisation removes this complexity, users can treat the various related fields like there are actually part of the primary table and leave APS Reporter/SP to work out how to relate the additional tables. The result is the Query shown on the right above, and in the image below:
Virtualised Component Table
APS Reporter/SP also provides meaningful names for the fields, rather than the cryptic database field names selected by the developer. The resulting SQL Statement is also much simpler:
Select
  COMPONENT.CMPNT_NAME,
  COMPONENT.CMPNT_SERV,
  COMPONENT.CMPNT_FUNC_TYPE_NAME,
  COMPONENT.CMPNT_SYS_IO_TYPE_NAME,
  COMPONENT.CMPNT_LOC_NAME,
  COMPONENT.CMPNT_MFR_NAME,
  COMPONENT.CMPNT_MOD_NAME,
  COMPONENT.LOOP_NAME,
  COMPONENT.LOOP_DWG_NAME,
  COMPONENT.CMPNT_FUNC_TYPE_DESC,
  COMPONENT.PID_DWG_NAME
From
  COMPONENT
(Note of course that this SQL Statement could not be used in any other application, it would not know how to resolve the various field names into their respective tables)

For SPI, the improvement is dramatic, but for SPF it is simply amazing.

The SPF database does not contain tables that most users could recognise. For example, if you're looking for a list of documents, there is no Document table. In fact, there is no table representing any particular object type, and writing a query to extract even a simple list is incredibly complex (so much so, I don't even have an example to show you...perhaps someone could volunteer one?).

On the other hand, the Schema Virtualisation in APS Reporter/SP lets us work with the SPF database just like it was a conventional relational database! We can ignore the underlying complexity, and just select what we need:
SPFDesignDocMaster Vitualised Table
In this case we're working the the SPFDesignDocMaster Class in SPF, Virtualised as a Table. The SQL Statement is as follows:
Select
  SPFDesignDocMaster.SPFDesignDocMaster_Name,
  SPFDesignDocMaster.Description,
  SPFDesignDocMaster.SPFDocCategory_Desc,
  SPFDesignDocMaster.SPFDocState,
  SPFDesignDocMaster.SPFTitle,
  SPFDesignDocMaster.SPFDocType
From
  SPFDesignDocMaster
Again, the Query Builder provides us with clear descriptions for each field, and handles the translation to SQL.

With the Virtualised Schema available, the full power of SQL can be used. Sophisticated Queries can manipulate the data in creative ways to extract just the information needed. For example, in SPI, producing an Instrument Index which includes the I/O data for each tag, or the corresponding Junction Box & Marshalling terminals, becomes simple.

Another example, an SPF client had a problem. Their old Document Control system had a huge number of scanned hard-copy documents in TIF format, and when they were migrated to SPF a copy of the scanned document in PDF format was to be added to each document. After a short time they discovered that a few of the documents were missing the PDF file, but they had no idea just how many. Somehow they had to identify just how many documents were affected (how big was the problem), and get a list of the documents they had to fix. Extracting such a report from SPF would typically require some specialist skills and possibly some custom software, but a straight-forward SQL Statement answered the question in a few minutes (a few hundred documents).

This is the key to a good reporting tool, and it should be expected from any database tool. Providing quick and easy access to the data for the routine reports required regularly, plus also the one-off adhoc queries that come up from time to time.

You're not getting full value from your data while it is hidden inside a complex database that prevents access. Having the full power of SQL available, while avoiding the complexity of the underlying database, allowing you to access your data anyway you wish, is the key.

In this post I've had a deeper look at some specific examples of working with the SPI & SPF databases, and highlighted the advantages of working with Schema Virtualisation. Next time I'm going to have a look at some of the clever things that can be done with a good reporting engine, including exception reporting to identify errors in data, and some statistical reporting for analysis and KPI's.

Regards,

Brian.
View Brian Raynor's profile on LinkedIn

No comments:

Post a Comment

UA-38428324-2