Overview

The schema for the entities that actually collect, store and retrieve Assessment data parallels the hierarchical structure of the Metadata Data Model. In the antecedent "complex survey" and "questionnaire" systems, this schema was a simple two-level structure:

This suffices for one-shot surveys but doesn't support the fine granularity of user-action tracking, "save&resume" capabilities, and other requirements identified for the enhanced Assessment package. Consequently, we use a more extended hierarchy:

To support user modification of submitted data (of which "save&resume" is a special case), we base all these entities on the Content Repository (CR). In fact, we use both cr_items and cr_revisions in our schema, since for any given user's Assessment submission there is indeed a "final" or "live" version. (In contrast, recall that for any Assessment itself, different authors may be using different versions of the Assessment. While this situation may be unusual, the fact that it must be supported means that the semantics of cr_items don't fit the Assessment itself. They do, however, fit the semantics of a given user's Assessment "session".)

Note that because all these entities derive from the CR, they are also all acs_objects and thus automagically have the standard creation_user, creation_date, etc. attributes. We don't mention them separately here.

Also, while this doesn't impact the datamodel structure per se, we add an important innovation to Assessment that wasn't used in "complex survey" or questionnaire. When a user initiates an Assessment Session, an entire set of Assessment objects is created (literally, rows are inserted in all the relevant tables as defined by the structure of the Assessment). Then, when the user submits a form with one or more Items "completed", all database actions from there on consist of updates in the CR, not insertions. (In contrast, the systems to date all wait to insert into "survey_question_responses", for example, until the user submits the HTML form.) The big advantage of this is that determining the status of any given Item, Section or the entire Assessment is now trivial. We don't have to check whether an Item Data row for this particular Assessment Session already exists and then either insert or update it; we know that it's there and we just update it. More importantly, all of our reporting UIs that show Assessment admins the current status of users' progress through the Assessment are straightforward.
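The pre-populate-then-update pattern described above can be sketched roughly as follows. This is only an illustration using Python's sqlite3 module: the item_data table, its columns and the three helper functions are hypothetical names chosen for the example, and a plain table stands in for the CR-based storage the package actually uses.

```python
import sqlite3

def open_db():
    """Create an in-memory database with one hypothetical item_data table."""
    db = sqlite3.connect(":memory:")
    # answer stays NULL until the subject responds to the Item.
    db.execute(
        """CREATE TABLE item_data (
               session_id INTEGER NOT NULL,
               item_id    INTEGER NOT NULL,
               answer     TEXT,
               PRIMARY KEY (session_id, item_id)
           )"""
    )
    return db

def start_session(db, session_id, item_ids):
    """Insert one empty row per Item at the moment the session begins."""
    db.executemany(
        "INSERT INTO item_data (session_id, item_id) VALUES (?, ?)",
        [(session_id, item_id) for item_id in item_ids],
    )

def submit_answer(db, session_id, item_id, answer):
    """Record a response. The row is known to exist, so this is always an UPDATE."""
    cur = db.execute(
        "UPDATE item_data SET answer = ? WHERE session_id = ? AND item_id = ?",
        (answer, session_id, item_id),
    )
    return cur.rowcount  # number of rows changed

def progress(db, session_id):
    """Return (completed, total) Items for a session using a single query."""
    return db.execute(
        "SELECT COUNT(answer), COUNT(*) FROM item_data WHERE session_id = ?",
        (session_id,),
    ).fetchone()
```

Because every row exists from the start of the session, the status query needs no outer join against the Assessment's structure: unanswered Items are simply rows whose answer is still NULL.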

We distinguish here between "subjects", who are the people whose information is the primary source of the Assessment's responses, and "users", who are real OpenACS users able to log into the system. Subjects may be completing the Assessment themselves or may have completed some paper form that is being transcribed by staff people who are users. We thus account for both the "real" and one or more "proxy" respondents via this mechanism.

Note that we assume that there is only one "real" respondent. Only one student can take a test for a grade. Even if multiple clinical staff enter data about a patient, all those values still pertain to that single patient. 

One final note: we denormalize several attributes in these entities -- event_id, subject_id and staff_id. The reason for putting these foreign keys in each row of the "data" is to produce a "star topology" of fact tables and dimension tables. This will facilitate data retrieval and analysis. (Are there other dimension keys that we should include besides these?)
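To illustrate why carrying these keys in every data row helps, here is a small sketch, again using Python's sqlite3 module. The table layouts, the sample rows and the answers_per_subject helper are all assumptions made up for the example; only the key names event_id, subject_id and staff_id come from the text above.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Hypothetical dimension tables.
    CREATE TABLE subjects (subject_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE staff    (staff_id   INTEGER PRIMARY KEY, name TEXT);
    -- Hypothetical fact table: each row carries its own dimension keys.
    CREATE TABLE item_data (
        item_data_id INTEGER PRIMARY KEY,
        event_id     INTEGER,
        subject_id   INTEGER REFERENCES subjects,
        staff_id     INTEGER REFERENCES staff,
        answer       TEXT
    );
    INSERT INTO subjects VALUES (1, 'patient A'), (2, 'patient B');
    INSERT INTO staff VALUES (7, 'nurse X');
    INSERT INTO item_data (event_id, subject_id, staff_id, answer) VALUES
        (100, 1, 7, 'yes'), (100, 1, 7, 'no'), (100, 2, 7, 'yes');
""")

def answers_per_subject(db, event_id):
    """Count data rows per subject for one event with a single join."""
    rows = db.execute(
        """SELECT s.name, COUNT(*)
             FROM item_data d
             JOIN subjects s ON s.subject_id = d.subject_id
            WHERE d.event_id = ?
            GROUP BY s.name
            ORDER BY s.name""",
        (event_id,),
    )
    return rows.fetchall()
```

The fact table joins directly to each dimension it needs, so a report does not have to walk up through the section and session levels just to find out which subject or event a value belongs to.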

Synopsis of Data-Collection Datamodel

Here's the schema for this subsystem:

Data Model

Specific Entities

This section addresses the attributes of the most important entities in the data-collection data model -- principally the various design issues and choices we've made. We omit literal SQL snippets here since that's what the web interface to CVS is for. ;-)