{"id":145,"date":"2010-07-30T22:41:03","date_gmt":"2010-07-30T17:41:03","guid":{"rendered":"http:\/\/www.papaworx.com\/Blog\/?p=145"},"modified":"2011-08-27T20:57:05","modified_gmt":"2011-08-27T15:57:05","slug":"a-genealogical-data-model","status":"publish","type":"post","link":"https:\/\/www.papaworx.com\/Blog\/2010\/07\/30\/a-genealogical-data-model\/","title":{"rendered":"A Genealogical Data Model"},"content":{"rendered":"<p>Sooner or later, most engaged genealogists outgrow their paper napkin diagrams, typed charts and even filing cards as\u00a0preferred method of record keeping; they move on to some form of genealogical software. In my case, that step happened relatively early. I did a survey of freeware and\u00a0commercially available software and was not impressed. No package offered all the features I desired, and none had adequate extension capabilities. Obviously, I had to design my own. But to tell the truth, necessity wasn&#8217;t the only reason for my DIY choice. Designing software gives me as much pleasure as watching porn movies or propagating rare orchids may give to others.<\/p>\n<p>The die was cast. I would write my own. But what about context? I wanted the program for myself, to meet my needs, and I wasn&#8217;t going to commercialize it. It had to run on a PC platform; after all, real men don&#8217;t use Macs! My choice for programming language was between C++ and Visual Basic. I am too lazy to keep track of pointers. So I was pretty much limited to VB6 for the first version.<\/p>\n<p>The next big issue was the basic data\u00a0model for genealogical records. Obviously, my research led me to <a href=\"http:\/\/en.wikipedia.org\/wiki\/GEDCOM\" target=\"_blank\">GEDCOM<\/a>, version 5.5.2 . GEDCOM goes back to the 1980s, when it was developed by the LDS church for public ancestry files. It has evolved since and become the de facto standard for serialized genealogical records.\u00a0 For several years there was talk\u00a0of GEDCOM XLM, but it never made a mark; and why should it? GEDCOM 5.5.2 is perfect &#8211; perfect for serializing genealogic records!<\/p>\n<p>And\u00a0there&#8217;s the rub. Efficient software design involving large data sets requires a <a href=\"http:\/\/en.wikipedia.org\/wiki\/Relational_database\" target=\"_blank\">relational database<\/a>. Yes, GEDCOM can be forced into the Procrustes bed of relational tables. But basing a genealogical\u00a0program entirely on the GEDCOM structure means sacrificing execution time &#8211; big time. Drawing a kinship tree\u00a0iterating through the basic GEDCOM\u00a0model is like sucking treacle through a straw. On the other hand, taking a steam roller to GEDCOM to force the data into a flat table is no answer either. Resulting rows would contain hundreds of columns, most of them empty.<\/p>\n<p>This conundrum gave me quite a few sleepless nights. But my epiphany arrived just in time: I would keep the information in a GEDCOM structure\u00a0in the form of\u00a0a simple, three column table corresponding to the three columns of a GEDCOM record: &#8220;Level&#8221;, &#8220;Tag&#8221; and &#8220;Value&#8221;. Actually, I lied. There are four more columns to make things work: &#8220;Node&#8221;, &#8220;Parent&#8221;, &#8220;Root&#8221; and &#8220;Cheat&#8221;. Each\u00a0record has a unique index number called &#8220;Node&#8221;. To maintain the hierarchical structure of GEDCOM, any record can act as a parent to child records. Child records carry the node number of their parent record in\u00a0their parent field. In the context of genealogy, this\u00a0terminology can be confusing. &#8216;Parent&#8217; and &#8216;Child&#8217; refer only to a relation between records, not people. Any record with level zero is considered a root record. Its value goes into the &#8220;Root&#8221; field of itself and all its descendants. That leaves us with the &#8220;Cheat&#8221; field. It is a boolean, usually &#8216;false&#8217;. I wanted to leave myself the option of cheating, i.e. to use non LDS-sanctioned tags. But I wanted to cheat honestly by\u00a0signalling where a tag wasn&#8217;t &#8216;kosher&#8217;\u00a0with a &#8216;Cheat&#8217; value of &#8216;true&#8217;!<\/p>\n<p>The last ingredient to the data model are the two index tables, one for individuals, the other for families. But having the same data in more than one table of a relational database would probably cause <a href=\"http:\/\/en.wikipedia.org\/wiki\/Edgar_F._Codd\" target=\"_blank\">Edgar Codd\u00a0<\/a>to turn in his grave. In fact, you can&#8217;t even call the database relational. OK, so I call it pseudo-relational. There is a\u00a0price, though, I have to pay for this act of transgression: every transaction involving writing to a root record or changing a family composition\u00a0must pass through a compulsory\u00a0integrity check. One can also wrap this operation into an SQL transaction.\u00a0But it is a small price to pay.<\/p>\n<p>The program worked like a charm. It was robust, fast and could perform all the functions I wanted. Since then I have written a second version in C#. It is even faster and easier to maintain and extend. It runs on all our networked computers. My wife and I both use it simultaneously. The database runs in an MS-SQL-Express server on one of our machines.\u00a0When we travel, we\u00a0create a temporary copy of the database on the laptop.<br \/>\n<!--nextpage--><!--pagetitle:Bragging about ChainPro--><\/p>\n<h3>Bragging rights<\/h3>\n<p><a href=\"http:\/\/www.papaworx.com\/Blog\/Big\/Indi_big.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"alignright size-medium wp-image-158\" title=\"Indi_small\" src=\"http:\/\/www.papaworx.com\/Blog\/wp-content\/uploads\/2010\/07\/Indi_small-300x220.jpg\" alt=\"\" width=\"300\" height=\"220\" srcset=\"https:\/\/www.papaworx.com\/Blog\/wp-content\/uploads\/2010\/07\/Indi_small-300x220.jpg 300w, https:\/\/www.papaworx.com\/Blog\/wp-content\/uploads\/2010\/07\/Indi_small.jpg 390w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a>Allow me to brag a little about our private genealogical database manager. When Chainpro comes up, it shows the individual page of the default person. The default person can be selected and stored. It shows not only the vital data but gives alo the closest relatives in appropriate drop downs. When one of these is selected, the page switches to this person. There is a large text box for biographical data. Any number of documents, be they pictures, pdf files, videos or URLs can be attached and watched.<br style=\"clear: both;\" \/><br \/>\n<a href=\"http:\/\/www.papaworx.com\/Blog\/wp-content\/uploads\/2010\/07\/Fam_big1.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"alignright size-medium wp-image-161\" title=\"Fam_big\" src=\"http:\/\/www.papaworx.com\/Blog\/wp-content\/uploads\/2010\/07\/Fam_big1-300x219.jpg\" alt=\"\" width=\"300\" height=\"219\" srcset=\"https:\/\/www.papaworx.com\/Blog\/wp-content\/uploads\/2010\/07\/Fam_big1-300x219.jpg 300w, https:\/\/www.papaworx.com\/Blog\/wp-content\/uploads\/2010\/07\/Fam_big1.jpg 773w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a>On each page there\u00a0are two\u00a0&#8216;Family&#8217; buttons, one\u00a0each for the partner displayed and the parents. It provides information on the nuclear family in question, including dates and places of engagement, marriage etc. It also shows the children in this nuclear family.<\/p>\n<p><a href=\"http:\/\/www.papaworx.com\/Blog\/wp-content\/uploads\/2010\/07\/Tree_big.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"alignright size-medium wp-image-162\" title=\"Tree_big\" src=\"http:\/\/www.papaworx.com\/Blog\/wp-content\/uploads\/2010\/07\/Tree_big-300x214.jpg\" alt=\"\" width=\"300\" height=\"214\" srcset=\"https:\/\/www.papaworx.com\/Blog\/wp-content\/uploads\/2010\/07\/Tree_big-300x214.jpg 300w, https:\/\/www.papaworx.com\/Blog\/wp-content\/uploads\/2010\/07\/Tree_big.jpg 775w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a>\u00a0Besides the individual page there is also a kinship page available that shows an ascendant\u00a0or descendant family tree. Nodes are expandable, either individually or globally, to show as much of the tree as desired.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignright size-full wp-image-163\" title=\"Find\" src=\"http:\/\/www.papaworx.com\/Blog\/wp-content\/uploads\/2010\/07\/Find.jpg\" alt=\"\" width=\"164\" height=\"170\" \/>ChainPro also has extensive &#8216;Find&#8217; capability to find a person by menu assisted name searches, by dates or place or by involvement in the holocaust.<\/p>\n<p>As I said, we like the program, we use it extensively, and it is not for sale or distribution. I am simply showing it here to brag and to illustrate the power inherent in the data model described.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A data model is described for genealogical software which combines the optimal features of the GEDCOM data structurem with those of a relational data base. The model is illustrated with a proprietary genealogical database manager. <a href=\"https:\/\/www.papaworx.com\/Blog\/2010\/07\/30\/a-genealogical-data-model\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_s2mail":"yes","footnotes":""},"categories":[4,3],"tags":[],"_links":{"self":[{"href":"https:\/\/www.papaworx.com\/Blog\/wp-json\/wp\/v2\/posts\/145"}],"collection":[{"href":"https:\/\/www.papaworx.com\/Blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.papaworx.com\/Blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.papaworx.com\/Blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.papaworx.com\/Blog\/wp-json\/wp\/v2\/comments?post=145"}],"version-history":[{"count":37,"href":"https:\/\/www.papaworx.com\/Blog\/wp-json\/wp\/v2\/posts\/145\/revisions"}],"predecessor-version":[{"id":152,"href":"https:\/\/www.papaworx.com\/Blog\/wp-json\/wp\/v2\/posts\/145\/revisions\/152"}],"wp:attachment":[{"href":"https:\/\/www.papaworx.com\/Blog\/wp-json\/wp\/v2\/media?parent=145"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.papaworx.com\/Blog\/wp-json\/wp\/v2\/categories?post=145"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.papaworx.com\/Blog\/wp-json\/wp\/v2\/tags?post=145"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}