Saturday, March 2, 2013

SPF as a Source of Data

I'll apologise in advance, this is going to turn into a long post with lots of text and some obscure concepts, but I promise it will be worthwhile. I'm already planning for this to be the first in a trilogy covering APSR and the SPF database (sort of an APSR: A New Hope), with this one setting the stage and the next couple diving deeper into the technical stuff. If after reading this you have any comments or would like any further information, please feel free to use the comment box below, or contact me through one of links to the right.


When we're designing reports we work with a variety of databases and mostly it's simple enough to connect to the database and identify the major tables, find the fields we need for a report, and build a query to extract the data. Of course, some databases are more complex (eg SPEL, SPPID) and even relatively simple reports require a good understanding of the underlying database, but it can be done without too much trouble.

Most databases are highly normalised and are optimised for performance. This usually means they have many tables, in a highly structured form, where each table has a unique key (id) field. This allows the database engine (Oracle or MS-SQL Server) to find the most efficient way to retrieve data when requested by the application.

Unfortunately, the trade-off for us is that a highly normalised and optimised database is usually more complex to produce reports. The required data fields are spread across a lot of tables, and the designer needs to understand how they're all related.

SPF Is Different

Then there are databases like SPF. If you've ever looked at the SPF database, at first glance it doesn't seem too complicated as there are just a handful of tables, however the complexity of the SPF database lies not in the number or structure of the tables but in how the data is stored in the tables. A single application Object is not stored as a row in a table, but is spread across many rows and often many tables. This design is not optimised for performance, instead it is designed for flexibility, allowing the SPF Developer to add new Classes, Interfaces and Relations without modifying the underlying database structure. This allows SPF to support a catalog of 2000+ Classes, with 70,000+ Properties, with less than 20 database tables.

For the Report Designer, however, this structure is a major problem. Extracting meaningful reports directly from the SPF Database is incredibly difficult, and those who have tried do not always succeed. Just working out how the data is stored is a challenge, but then creating SQL Queries to piece together all of the required data requires specialist skills. It is possible to manually write queries that provide simple lists of data, such as a document list, or a tag list, but creating queries that join the data together in meaningful ways is almost impossible.

For example, if you were working with a conventional relational database, and you wanted to get a list of documents that were associated with transmittals that are overdue, it typically wouldn't be very complex (probably two or three major tables, plus some supporting tables). A similar Query from the SPF database would be enormously complex, and even if it worked, just validating the query would be difficult (how many people could even understand it!).

This is one of the trade-off's made by the Intergraph developers in implementing SPF, it would be flexible, but working with the database would be complex. SPF users cannot use conventional database tools, like Toad or Crystal Reports, to extract data or create reports, they are almost cut-off from their data and must use the SPF software itself for all queries and reports. Standard SQL queries became impractical.

Simplifying Databases

APS Reporter/SP was designed to simplify the creating of Reports, by making it easier to extract the data (query), easy to produce either simple tabular reports, or complexly formatted reports, and easier to keep revisions, add title blocks, etc.

The Queries are simplified by using a technique known as Schema Virtualisation, which uses a Model of the database that is optimised for Reports. The user doesn't work directly with the underlying database, they work with a Virtual Model that appears to be a standard relational database with Tables and Fields that are easy to understand (but not necessarily optimised for the database server).

The first Virtualiser we developed was for SmartPlant Instrumentation. SPI has a database that is relatively large (300+ tables), but is not overly complex. It's not a very difficult database to use for reports, but often even a simple report (such as an Instrument Index) can require 10 or more tables, and complex queries could include many more. Using the SPI Virtual Model the same query can include just 1 table, and complex queries often just 3 or 4.

SPI Queries using the native database, and the Virtual Model
The SPI Virtualiser was a great success. It creates a model of the SPI database that is very similar to the original, so users who have mastered the SPI database structure can use that knowledge, but it simplifies all of the common relationships. Novice users can easily generate simple queries , and just as importantly, expert users can leverage the Virtual Model to create complex queries much more easily. The SPPID & SPEL Virtualisers were similarly successful.

But what about SPF?

SPF was challenging. Could we produce a Virtual Model that made it easy to create queries, with no knowledge of the SPF database structure?

Initially, we tackled SPF 3.8 (Business Objects), and it worked beautifully. Users could build SQL Queries, working with the various Business Objects, while completely ignoring the cryptic table names and field names of the native database.

The real challenge was SPF 4.x. The Virtual Model had to fully support the ClassDef's, InterfaceDef's and RelDef's of the Schema model, allowing Report Designers to write Queries using the Objects they were familiar with. It also had to adapt to each client's database, including all of the custom schema objects defined in each system, and for the custom objects to work seamlessly with the out-of-the-box objects.

SPF Query with DocMaster & DocRevision
The SPF 4.x Virtual Model demonstrates the power of Schema Virtualisation. The native SPF database is completely hidden from the user, while Virtual Tables representing of Classes, Interfaces and Relations are available to build SQL Queries. For example, the query above joins the SPFDesignDocMaster and SPFDesignDocRevision Virtual tables, using the RelDef SPFDocumentRevisions.

It's now possible to write quick and easy adhoc SQL queries for the SPF database using standard SQL syntax. Complex queries can be written without any specialist knowledge of the SPF database structure, and Reports can be produced in a fraction of the time usually required.

In the next post I'll look at the process of writing queries for SPF with some tips and tricks to make life even easier.

If you have any comments, please feel free to use the comment box below. If you would like any further information, a free trial license, or a demonstration of APS Reporter/SP, please feel free to use the links to the right.

No comments:

Post a Comment

UA-38428324-2