The Pemberton Family World Wide

Pemberton Puzzlers Piecing Pedigrees into the Perfect Pemberton Picture

Genealogical Data Quality

The Convenience/Accuracy Tradeoff

This is an unfortunate but entirely natural circumstance, and one that we all need to keep in mind as we gather our historical data. Here's how it is:

The most accurate records are the originals, but they are also the most inconvenient, if not downright inaccessible. For the sake of efficiency, millions of records have been extracted from the originals. Almost without exception, these extractions drop vital information and clues that exist in the original document. As the data is processed further to make it easier to search, it undergoes further degradation, even in the best of circumstances. For example, your author was involved in a small genealogical project using baptisms from 20 adjacent English parishes. The data was independently extracted by two different individuals, and a computer was used to match and merge the two data streams. Mismatches were directed to a third person who had more expertise in reading old documents, and a decision was made as to how the record would read from then on. These decisions are often judgement calls; there is no way around that fact. So as the data is made more accessible and more convenient, it is also more or less corrupted.
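The match-and-merge step in that project can be sketched roughly as follows. This is a minimal illustration in Python, not the program the project actually used. The record fields (parish, date, child, father), the page-and-line keys, the sample entries and the function name compare_extractions are all assumptions made for the example.

```python
from typing import NamedTuple


class Baptism(NamedTuple):
    """One extracted baptism entry (field names are hypothetical)."""
    parish: str
    date: str    # as written down by the extractor
    child: str
    father: str


def compare_extractions(first, second):
    """Compare two independent extractions of the same register entries.

    Both arguments map an entry key (here: register page and line number)
    to the Baptism that one extractor read there. Entries on which both
    extractors agree are accepted; everything else goes to a third reader.
    """
    accepted = {}
    for_review = {}
    for key in sorted(set(first) | set(second)):
        a = first.get(key)
        b = second.get(key)
        if a is not None and a == b:
            accepted[key] = a
        else:
            # A disagreement, or an entry that only one extractor recorded.
            for_review[key] = (a, b)
    return accepted, for_review


# Invented sample data: the two extractors disagree on one father's name,
# and the second extractor recorded an entry that the first one missed.
extractor_a = {
    ("p12", 3): Baptism("Example Parish", "1743-05-12", "John", "George"),
    ("p12", 4): Baptism("Example Parish", "1743-06-02", "Mary", "Gregory"),
}
extractor_b = {
    ("p12", 3): Baptism("Example Parish", "1743-05-12", "John", "George"),
    ("p12", 4): Baptism("Example Parish", "1743-06-02", "Mary", "George"),
    ("p12", 5): Baptism("Example Parish", "1743-06-20", "Ann", "Thomas"),
}

accepted, for_review = compare_extractions(extractor_a, extractor_b)
for key, (a, b) in for_review.items():
    print(key, "needs a third reader:", a, "versus", b)
```

Note that agreement between the two extractors only shows that they read the entry the same way, not that they read it correctly, and whatever the third reader decides about the mismatches is still a judgement call.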

This underscores the absolute necessity of going back to the most original record available before deciding to adopt its data. Even then, there will be errors as described below:

  1. The testator doesn't know the facts, or fabricates them for the recorder
  2. The recorder doesn't understand, or incorrectly interprets the words of the testator, or records later from a flawed memory, or his personal mores bias or corrupt his recording
  3. The recorder inadvertently mixes data from two events
  4. The recorder makes an error while writing: omission, exchanges, wrong dates, wrong names ("I thought the father's name was George not Gregory."), etc.
  5. The extractor/indexer misreads or can't read the record, or when in doubt knowingly adds his own idea of "how it must have been"
  6. The extractor/recorder makes a copying error. (We know by statistical analysis that this is usually about 2% of the data.)
  7. Computer programming errors drop, add or mix data.
  8. Data presentation to the genealogist is counterintuitive and is misinterpreted
  9. The genealogist makes a copying error
  10. The reader misreads
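
To see how quickly small error rates add up along the chain above, here is a back-of-the-envelope calculation in Python. It assumes, purely for illustration, that each copying step independently alters about 2% of the records (the figure quoted in item 6) and that no error is ever corrected downstream. The function name and the step counts are hypothetical.

```python
def surviving_fraction(error_rate, steps):
    """Fraction of records still unaltered after `steps` independent
    copying steps, each of which alters `error_rate` of the records."""
    return (1.0 - error_rate) ** steps


# Assumed: 2% per copying step (item 6); the step counts are examples only.
for steps in (1, 2, 3):
    altered = 1.0 - surviving_fraction(0.02, steps)
    print(f"{steps} copying step(s): about {altered:.1%} of records altered")
```

Under these assumptions, two copying steps (items 6 and 9, say) already alter roughly 4% of the records, and that is before any of the other kinds of error in the list are counted.
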
Last Updated on Tuesday, 15 March 2011 11:09  