Friday, July 26, 2013

Loading Data into SPF

SPF Projects often require large volumes of legacy data to be imported, and this initial loading can account for a significant portion of the time and expense of establishing new SPF systems. With staff involved in a variety of SPF projects, we believed that the data migration process could be significantly improved using our reporting application, APS Reporter/SP. In this blog post I discuss some of the new features of APSR for generating SPF Load Files directly from source data.


In addition to producing Reports for various SmartPlant applications, and general databases, APS Reporter/SP can also be used to extract data in a variety of formats (CSV, XLS, HTML). Recently we have added support for the file formats used to load data into SPF (XML & NLF).

As a result, using APS Reporter/SP can simplify the data conversion process and significantly reduce the time required to process and generate the SPF Load Files (sometimes by a factor of 10 or more!).

The Usual Approach

Most SPF Implementations use a similar procedure to process and import data into SPF.

Typically, Data is:
  1. Extracted from the source databases into a standard file format (often Excel)
  2. Processed by Excel macros or custom utilities to map the source data to the required SPF structure (Classes, Interfaces, Properties, Rels)
  3. Exported to the standard SPF Load File 
  4. Imported by the SPF Loader.
Generating SPF Load Files using Excel
This approach can be quite time consuming, both in the time required to setup and configure the process, and also in the time required to actually execute the conversion.

Setting up for the conversion requires analysing the source data and developing appropriate routines (in Excel macros or another tool) to process the data into the structures required by SPF. This can take a considerable amount of time, often weeks or months, however it is not always time-critical as it can usually be performed in parallel with other project setup tasks.

The actual conversion process requires the data to be extracted to the Excel File, processed by the Excel macros exported to the SPF XML Load Files, and imported by the SPF Loader. This process can be very time-critical as conversion from a live system requires that updates be suspended in the live system, and not resumed again until the conversion to SPF is complete. Unfortunately, Excel is not a high-performance data processing tool, and for large volumes of complex data, the process may take many days, leading to unacceptable downtime of critical systems.

There's not much that can be done about the performance of the SPF Loader and importing of XML files into SPF, however the other steps can be dramatically improved.

Load Files from APSR

Removing Excel from the procedures, and using APS Reporter/SP to process the data and generate the Load Files, can provide a huge improvement in conversion processing time.

When using APSR, Data is:
  1. Extracted from the source databases, processed to map to the SPF Structures, and exported into the SPF Load File (single step)
  2. Imported by the SPF Loader.
Generating SPF Load Files using APS Reporter/SP
This process is both conceptually simpler, and also leads to dramatic performance improvements.

Setting up the conversion involves designing appropriate SQL Queries, VB Scripts and Grid Layouts to process the source data into the structures (Classes, Interfaces, Properties, Rels) required by SPF. This is still the most time consuming part of the process, but can be much simpler than the Excel approach.

The actual execution of the conversion is where the real performance improvements can be found. Source data is extracted, processed and exported to SPF Load Files in a single step. Overall, performance to generate the SPF Load File can sometimes be improved by a factor of 10 or more!

Of course, the performance improvement also has other additional benefits. Testing of the end-to-end process is much faster, alternative approaches can be evaluated, and changes/fixes can be implemented quickly.

Data Conversion

Rarely does the source data map smoothly and easily into the SPF Data Model, some restructuring & manipulation is almost always required. APSR provides various tools to manipulate the data, including the initial SQL Query extracting the data from the source, VB or C# Scripts which can manipulate the extracted data as required, and Calculations in the Grid for final formatting.

Mapping to Classes, Interfaces, Properties & Rels

The key to producing SPF Load Files is to map the source data to the required objects in SPF. APSR makes this very easy using the Grid window and a special Band structure, as shown below:
Grid layout for SPF XML Load Files
The Bands across the top of the data columns represent the SPF Load File Container, the Classes, Rels and Interfaces. Each column represents a Property.

Container, Class, Interface & Properties
The Screenshot to the right shows:

  • Class APSRDocumentMaster 
  • Interface IObject 
  • Properties UID & DWG_NAME 




SPF Load File Generation

Once the Grid Layout is complete for the required file format (XML or NLF), the Load Files can then be generated.

If the XML file format is used, APSR can automatically split the output based on the Container properties (ie for multiple containers to be generated).

For both XML & NLF formats, APSR can split the output files based on the Row Count of the Grid. For example, a new output file could be created for every 1000 rows of data.

Results

When executed, the expected Load Files are produced:
Sample SPF XML Load File from APS Reporter/SP

Sample SPF NLF Load File from APS Reporter/SP

Scheduled Processing

In addition to bulk importing large volumes of data when migrating to SPF, this approach can also be used to perform scheduled imports automatically. APS Reporter/SP provides a scripting mechanism for generating SPF Load Files, which can then be scheduled with the Windows Scheduler. Linking to the SPF Loader can provide a fully automated process.

SPF to SPF

Finally, a note about a special case. Occasionally there is a requirement to migrate data from one SPF system to another. Fortunately, APS Reporter/SP makes this particularly easy as the SPF Virtualisation in APSR allows SQL Queries to be written for the source system, which then usually map directly to the destination system.
Refer to my previous blog entry discussing SPF as a source of data for a discussion of SPF Data Virtualisation.


For more information, contact me at one of the links to the right. Also, the SPF Load File Generation process is discussed on the APS Reporter/SP Wiki (still very much a work in progress!).

Regards,

Brian Raynor.
View Brian Raynor's profile on LinkedIn

No comments:

Post a Comment

UA-38428324-2