The GEDCOM Testbook Project
The purpose of this exercise is to evaluate how well a number of programs transfer the results of a typical genealogical search using the current GEDCOM 5.5 specification, as available on the GEDCOM FTP site and dated December 11th, 1995. A number of draft versions have appeared since, but the Family History Department of the LDS Church considers this version the official one.
Phase One:
A story representing three generations, complete with source information, was created and entered into several popular programs. A series of reports was then produced. The database for each program was then exported as a GEDCOM file. These files were imported into the other programs in the test group and the original report formats were generated again. The resulting differences in the reports were then documented. The results confirmed that problems existed and indicated that further research was required.
Phase Two:
The intent of phase two was to identify precisely where the data transfer problems originated. This required a common data set for all programs in the test group and a control file to serve as a standard when evaluating the import capabilities of each program. The GEDCOM files created by the test programs would be evaluated using the GEDCHK program from the LDS Family History Department. It reports on tag exceptions, syntax errors, and cross-reference errors, and identifies tag extensions created by a software developer.
The common data set provided a list of names, events, dates, sources, repositories, and the submitter information. This was intended to ensure consistent entry of the facts. When data entry was complete, the volunteers submitted the GEDCOM output files to the Project Leader for evaluation.
The control file is based on the GEDCOM grammar as formulated by Jed Allen of the LDS Family History Department for use with the GEDCHK program. It uses the test data and all but a few of the legitimate GEDCOM tags. While it does not use the full GEDCOM grammar structure, it does include a representative sampling. Naturally, it passes the GEDCHK program test. It should not be perceived as the "correct" version, as some of the GEDCOM 5.5 specifications are open to interpretation.
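The kinds of checks listed above can be illustrated with a short Python sketch. This is not GEDCHK and does not reproduce its behavior; it assumes a simplified form of the GEDCOM 5.5 line (level, optional cross-reference ID, tag, optional value), treats any tag beginning with an underscore as a developer extension, and does not validate tags against the grammar. The function name `check_gedcom` and the sample records are hypothetical.

```python
import re

# Assumed simplification of a GEDCOM 5.5 line: level [@xref@] TAG [value]
LINE_RE = re.compile(r"^(\d+) (?:(@[^@ ]+@) )?([A-Za-z0-9_]+)(?: (.*))?$")
POINTER_RE = re.compile(r"^@[^@ ]+@$")


def check_gedcom(text):
    """Report malformed lines, dangling pointers and underscore-prefixed tags."""
    defined = set()   # cross-reference IDs that head a record
    pointers = []     # (line number, target) for values that are pointers
    report = {"syntax_errors": [], "xref_errors": [], "extensions": []}

    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match is None:
            report["syntax_errors"].append(number)
            continue
        _level, xref, tag, value = match.groups()
        if xref:
            defined.add(xref)
        if tag.startswith("_"):
            report["extensions"].append((number, tag))
        if value and POINTER_RE.match(value):
            pointers.append((number, value))

    # Pointers are resolved last because a record may be defined later in the file.
    for number, target in pointers:
        if target not in defined:
            report["xref_errors"].append((number, target))
    return report


# Hypothetical sample, not taken from the project's test data.
SAMPLE = "\n".join([
    "0 HEAD",
    "0 @I1@ INDI",
    "1 NAME John /Smith/",
    "1 _MILT Army",
    "1 FAMS @F1@",
    "1 FAMC @F9@",
    "BIRT 1 JAN 1900",
    "0 @F1@ FAM",
    "1 HUSB @I1@",
    "0 TRLR",
])

report = check_gedcom(SAMPLE)
```

In this sample, line 7 has no level number, line 6 points at a family record that is never defined, and line 4 uses an extension tag.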
Four programs were selected for this phase: Generations Grande Suite 8, Family Tree Maker 7.5, Master Genealogist 4 Gold, and Ultimate Family Tree 3. The data was entered by volunteers and GEDCOM files were created. These files were then imported, along with the control file, into the other programs in the test group. The results were interesting. In addition to the expected problems, it was discovered that two individuals using the same data set and the identical program could create two slightly different GEDCOM files. Another difficulty is created by developers who, for varying reasons, offer export and import options. The test results are documented in summary form but, unless the reader is conversant with the GEDCOM specification, they are not particularly useful.
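One way to see how two exports of the same data can differ is a line-by-line comparison. The sketch below uses Python's standard `difflib` module; the two fragments are invented for illustration and are not output from any of the tested programs, and the helper name `gedcom_differences` is hypothetical.

```python
import difflib


def gedcom_differences(first, second):
    """Return only the lines that differ between two GEDCOM texts."""
    a = [line.strip() for line in first.splitlines() if line.strip()]
    b = [line.strip() for line in second.splitlines() if line.strip()]
    # ndiff marks removed lines with "- " and added lines with "+ ";
    # unchanged lines and "? " hint lines are dropped here.
    return [row for row in difflib.ndiff(a, b) if row.startswith(("- ", "+ "))]


# Hypothetical exports: the same birth date entered in two slightly different ways.
export_a = "0 @I1@ INDI\n1 NAME John /Smith/\n1 BIRT\n2 DATE 1 JAN 1900\n"
export_b = "0 @I1@ INDI\n1 NAME John /Smith/\n1 BIRT\n2 DATE 01 JAN 1900\n"

differences = gedcom_differences(export_a, export_b)
```

A plain text comparison like this only shows that the files differ; deciding whether a difference matters still requires reading it against the GEDCOM specification.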
Phase Three:
The results of phase two make it evident that it is impractical to attempt to document all the possible combinations when using a GEDCOM file to transfer information between programs. It is also obvious that a more useful reporting method is required. Phase three is an attempt to address these concerns.
The test data is now input by one person. This ensures consistent data entry. A test scorecard, based on the complete GEDCOM 5.5 grammar, is used as a base to evaluate the importation of the control file and the exportation of a program's GEDCOM file. The results are presented in numerical and percentage formats. A detailed description of how the scoring is done is available by clicking on the Chart Desc. button. Instructions for interpreting the results are linked to the Chart Usage button. Clicking on the Chart button will display the scorecard summaries in a textual table and also show three charts comparing the nine tested programs for import, export, and a combination of both in a percentage format. The text summaries can be read by clicking on the Summaries button. The scorecard tables are MS Excel 2000 files and can be downloaded from the chart page. They are approximately 120 KB each. The story line and the data used for the test can be accessed using the Story and Data buttons.
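As a rough illustration of how a numerical result might be turned into a percentage, the sketch below assumes one point per grammar item handled correctly. That rule is an assumption made for this example only; the project's actual scoring method is the one described on the Chart Desc. page. The function name `score_summary` and the item labels are hypothetical.

```python
def score_summary(results):
    """Return (points earned, points possible, percentage) for one scorecard.

    `results` maps a grammar item to True when the program handled it.
    Assumption: every item is worth exactly one point.
    """
    earned = sum(1 for handled in results.values() if handled)
    possible = len(results)
    percent = round(100.0 * earned / possible, 1) if possible else 0.0
    return earned, possible, percent


# Hypothetical import results for four grammar items.
import_results = {
    "INDI.NAME": True,
    "INDI.BIRT.DATE": True,
    "SOUR.REPO": False,
    "SUBM.ADDR": True,
}

summary = score_summary(import_results)
```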
© 2001 Gentech Inc.
Last Updated 22 July 2001