More than a decade ago, Thermo Fisher Scientific's Galactic business unit was
established on the premise that data interchange formats such as JCAMP would allow
scientists to easily move data from an instrument vendor's software onto
cost-effective PC-based workstations for efficient data sharing, visualization, and
processing. However, early implementations of these file formats were poor, inefficient,
and cumbersome. So Thermo Fisher took it upon itself to facilitate the translation of
analytical data formats into the Thermo Scientific SPC data format
by developing a file converter technology. This was an immediate success and
enabled users to easily bring data from multiple instruments into a common format. This
common format, combined with the powerful processing capabilities of Thermo Scientific
software, allowed users to easily share data and solve problems as never before.
In the Beginning
This file converter technology first appeared in Thermo Fisher's original DOS product,
Spectra Calc. File converters were small programs that the main
application called to convert "foreign" file formats into a format that it could read.
The advantage of this approach was that new or updated converters could easily be added to
an existing installed product. As vendors introduced new software for their instruments,
Thermo Fisher worked hard to make sure Thermo Scientific software was compatible with customers'
data files. From this, a library of file converters supporting most of the popular
analytical instruments began to grow. When Microsoft introduced Windows 3.0, Thermo Fisher
migrated the converter technology into the GRAMS product line and continued to update the
converter library.
Although the development of the original file converter technology was an
impressive feat, there were some drawbacks. First and foremost, the user had
to know what format the data was in and manually select the appropriate
converter, which was not always an easy task if the user had not collected the data.
Then, in a separate step, the user had to import the data into a Thermo
Scientific SPC file and then open it. Some early attempts in GRAMS to automate the
process allowed the user to select a limited number of file converters that would appear
in the File / Open dialog box. The "Files of Type" field would list those
converters, and when a file was opened, GRAMS would call the specified converter
automatically. This worked to some degree, but it did not allow all the converters to be
invoked, and it relied solely on the file extension to identify the file type.
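The limitation of extension-only selection can be illustrated with a minimal Python sketch. It is not GRAMS code: the table, the converter names, and the function `converters_for` are hypothetical and exist only to show why an extension alone is a weak identifier.

```python
from pathlib import Path

# Hypothetical mapping from file extension to converter names.
# The vendor and converter names are invented for illustration.
EXTENSION_TABLE = {
    ".dx": ["jcamp"],
    ".dat": ["vendor_a", "vendor_b"],  # one extension shared by two vendors
}


def converters_for(path):
    """Return candidate converter names based only on the file extension."""
    return EXTENSION_TABLE.get(Path(path).suffix.lower(), [])


print(converters_for("sample.DX"))  # a single candidate
print(converters_for("run01.dat"))  # two candidates, so the choice is ambiguous
print(converters_for("spectrum"))   # no extension, so no candidate at all
```

With a lookup like this, a shared or missing extension leaves the software unable to pick a converter without asking the user.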
The Countless Possibilities
The ultimate goal of Thermo Fisher was to have a product that could automatically identify and
convert data files without any input from the user. In order to achieve this goal, it
would be necessary for the software to uniquely identify data files. Thermo Fisher engineers
examined the Thermo Scientific file format archives and tried to identify the parameters that could
be used to easily distinguish among all the existing data file formats. This was not a
trivial task. Every potential solution had its share of drawbacks. One possibility was to
use the three-letter file extension as a unique identifier. This would work for some
instrument vendor data formats. However, many instrument vendors use the same file
extension, such as .DAT, and many others allow any extension or no
extension at all. It is also likely that non-scientific applications use the same
extensions that many instrument data files use. Another mechanism they explored was to
perform some sort of file check. This involved opening the data file itself and looking
for a unique identifying pattern. For some formats, this was relatively easy as the files
contained identifiers such as "This is vendor XXX file" or a unique pattern of
bytes. In others, it was much more difficult, and sometimes impossible to find a unique
file "fingerprint". Other methods involved actually attempting to convert the
file and, if the conversion failed, assuming that it was not the correct file format.
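The two content-based checks described above, looking for an identifying byte pattern and attempting a trial conversion, might look roughly like the following Python sketch. The "VENDOR-X" header, the file layout, and all function names are assumptions made up for illustration. They do not describe any real instrument format or the actual Thermo Scientific converters.

```python
def has_signature(data: bytes, signature: bytes, offset: int = 0) -> bool:
    """Check whether `signature` appears at `offset` in the file contents."""
    return data[offset:offset + len(signature)] == signature


def parse_vendor_x(data: bytes) -> list:
    """Invented converter: expects a text header line followed by numbers."""
    header, _, body = data.partition(b"\n")
    if header != b"VENDOR-X":
        raise ValueError("missing VENDOR-X header")
    # float() raises ValueError on anything that is not a number.
    return [float(v) for v in body.decode("ascii").split()]


def converts_cleanly(data: bytes, converter) -> bool:
    """Trial conversion: a ValueError is taken to mean 'not this format'."""
    try:
        converter(data)
        return True
    except ValueError:
        return False


sample = b"VENDOR-X\n1.0 2.5 3.75"
print(has_signature(sample, b"VENDOR-X"))
print(converts_cleanly(sample, parse_vendor_x))
print(converts_cleanly(b"unrelated bytes", parse_vendor_x))
```

The signature check is cheap but only works when a format has a reliable fingerprint. The trial conversion works without one, at the cost of running the converter on files that may not belong to it.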
A New Technology
Through all their findings, Thermo Fisher engineers determined that they needed to build a
software component that would utilize a database of hierarchical rules that combined all
of the above techniques for identifying data file types. Within the database, there would
be a set of file checking rules for each and every data type. This database can be easily
extended and updated as new converters are introduced or older ones are updated. When a
user selects a data file to load, the rules stored in the database are applied to the data
file to automatically identify the data type. The file is then automatically converted
into SPC format and loaded into the application. This operation is extremely fast, so
loading and converting most data files seems instantaneous.
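As a rough illustration of how a rule database could combine these checks, here is a minimal Python sketch. It is not the actual implementation: the `Rule` structure, the two example formats, and the function `identify_and_convert` are hypothetical. "Hierarchical" is interpreted here simply as an ordered list in which more specific rules are tried first, and the converters return plain lists of numbers rather than SPC data.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


def parse_numbers(data: bytes) -> list:
    """Invented converter: whitespace-separated numbers, no header."""
    return [float(v) for v in data.decode("ascii").split()]


def parse_tagged(data: bytes) -> list:
    """Invented converter: one header line, then numbers."""
    _, _, body = data.partition(b"\n")
    return parse_numbers(body)


@dataclass
class Rule:
    format_name: str
    converter: Callable[[bytes], list]
    extensions: tuple = ()   # empty tuple means "any extension"
    signature: bytes = b""   # empty bytes means "no byte check"


# Ordered from most specific to least specific (an assumption of this sketch).
RULES = [
    Rule("tagged", parse_tagged, signature=b"VENDOR-X"),
    Rule("plain-dat", parse_numbers, extensions=(".dat",)),
]


def identify_and_convert(path: str, data: bytes):
    """Apply each rule in turn and return (format name, converted data)."""
    suffix = Path(path).suffix.lower()
    for rule in RULES:
        if rule.extensions and suffix not in rule.extensions:
            continue  # extension check failed
        if rule.signature and not data.startswith(rule.signature):
            continue  # byte-pattern check failed
        try:
            return rule.format_name, rule.converter(data)
        except ValueError:
            continue  # trial conversion failed, try the next rule
    raise ValueError("no rule matched " + path)


print(identify_and_convert("a.dat", b"VENDOR-X\n1 2"))
print(identify_and_convert("b.dat", b"3 4.5"))
```

Because the rules are data rather than code paths, supporting a new format in this sketch means appending one `Rule` entry, which mirrors the extensibility described above.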
This technology is called SmartConvert, a software component that can automatically
identify a data file, select the correct file converter, and convert it for loading into
any Thermo Scientific software application. Users can see the results in GRAMS/32 (Version 5.1 or
higher) or Spectral ID. One can select any file on one's system to open and SmartConvert
will identify, convert, and load it into the Thermo Scientific application.
It’s as if the file were already in the Thermo Scientific SPC format. With SmartConvert, Thermo
Scientific has essentially "normalized" analytical data once again, but this time in a more powerful and
convenient way for the user.
Indeed, the ability to read well over a hundred analytical instrument data formats is one of
the most attractive features of Thermo Scientific software. From the start, this capability
revolutionized the way laboratories work with analytical data, making it seamless. The
library of over 150 formats we support today is the cumulative result of years of
continuous effort to update and maintain the file converters. Like the early file
converter technology, SmartConvert will continue to improve and expand its library.
Thermo Fisher has invested many years of programming to provide an efficient, open, universal
file format for analytical data and is committed to continuing to provide new and innovative
solutions.