CollectiveAccess

Summary

  1. Resolve the 21 collision clusters via physical survey. Assign final identifiers to the 68 affected records.
  2. Build the data model in CA: object type list, metadata elements, entity relationship types.
  3. Build the vocabulary lists in CA: admc_material_category (hierarchical) and admc_material_terms (flat), built from cleaned and merged DSpace subject fields.
  4. Build the storage location hierarchy in CA.
  5. Pre-process the export: parse dc.description into holdings flags, split subject strings on ||, map rights to access values, and load the new idno column.
  6. Write the import mapping: two passes, entities first then objects.
  7. Test import on a 20-record slice. Iterate.
  8. Production import of the 310 clean records.
  9. Import the 68 collision records after physical survey.
  10. Establish media naming convention for future digitisation; no media ingest yet.

Data model  →  Lists/vocabularies  →  Import mapping  →  Test import  →  Production import  →  Media (later)
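
As a rough illustration of step 5, the pre-processing of a single exported row might look like the sketch below. The column names `dc.subject` and `dc.rights`, the flag names, the rights-to-access values, and the example idno are assumptions made for illustration; the real export and the final access vocabulary may differ.

```python
# Sketch of step 5: pre-process one exported row before import.
# Column names, holdings phrases and the rights-to-access mapping are
# assumptions for illustration, not taken from the actual export.

HOLDINGS_PHRASES = {
    "sample available": "sample",
    "booklet available": "booklet",
}

RIGHTS_TO_ACCESS = {  # hypothetical values
    "open": "public",
    "restricted": "staff_only",
}


def split_subjects(raw):
    """Split a multi-value string on '||', trimming and de-duplicating."""
    seen = []
    for part in raw.split("||"):
        term = part.strip()
        if term and term not in seen:
            seen.append(term)
    return seen


def holdings_flags(description):
    """Return the holdings flags whose phrase appears in dc.description."""
    text = description.lower()
    return sorted(flag for phrase, flag in HOLDINGS_PHRASES.items() if phrase in text)


def preprocess(row, new_idno):
    """Build the cleaned values for one record; unknown rights are flagged."""
    rights = row.get("dc.rights", "").strip().lower()
    return {
        "idno": new_idno,
        "subjects": split_subjects(row.get("dc.subject", "")),
        "holdings": holdings_flags(row.get("dc.description", "")),
        "access": RIGHTS_TO_ACCESS.get(rights, "unmapped"),
    }


example = preprocess(
    {
        "dc.subject": "Metal || Brass||Metal",
        "dc.description": "Sample available, Booklet available",
        "dc.rights": "Open",
    },
    "ADMC-0001",  # hypothetical identifier format
)
```

Returning "unmapped" for unrecognised rights values, rather than guessing, keeps those rows visible for manual review during the 20-record test import.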

Resources

https://manual.collectiveaccess.org/providence/user/editing/lists_and_vocab.html
https://camanual.whirl-i-gig.com/providence/user/dataModelling/listsAuthorities
https://manual.collectiveaccess.org/providence/user/dataModelling/primaryTables.html

Primary tables

CA is structured around several primary tables, with editors that can be enabled or disabled depending on project requirements. The ones most relevant to the ADMC:

| Table | What it holds |
|---|---|
| ca_objects | The physical or born-digital items themselves |
| ca_object_lots | Accession events grouping multiple objects acquired together |
| ca_entities | People and organisations (creators, donors, manufacturers) |
| ca_collections | Intellectual groupings of objects (series, fonds, donor collections) |
| ca_storage_locations | A hierarchical map of physical storage |
| ca_places | Geographic locations (hierarchical, linkable to GeoNames) |
| ca_occurrences | Flexible: events, exhibitions, publications, activities |
| ca_loans | Outgoing or incoming loan records |
| ca_movements | Optional: object movement history |

An object record does not contain the entity's name as a text string. It contains a relationship to a separate entity record. This is the relational model in practice. If "Hansgrohe AG" is a manufacturer of 40 objects in the ADMC, there is one entity record for Hansgrohe, and 40 relationships from 40 objects to that record. Correct it once; it updates everywhere.
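
The same idea can be sketched in plain Python, which also mirrors the two-pass import (entities first, then objects). This is not CollectiveAccess code; the row keys `idno` and `manufacturer` and the relationship type code `manufacturer` are hypothetical names chosen for the example.

```python
# Sketch: derive one entity record per manufacturer, then link objects to it.
# Row keys and the relationship type code are hypothetical.

def normalise(name):
    """Collapse whitespace and ignore case so name variants share one key."""
    return " ".join(name.split()).casefold()


def build_entities(rows):
    """Pass 1: one entity per distinct manufacturer name."""
    entities = {}
    for row in rows:
        key = normalise(row["manufacturer"])
        if key and key not in entities:
            entities[key] = {
                "entity_id": len(entities) + 1,
                "name": " ".join(row["manufacturer"].split()),
            }
    return entities


def build_relationships(rows, entities):
    """Pass 2: each object stores a reference to an entity, not the name."""
    links = []
    for row in rows:
        key = normalise(row["manufacturer"])
        if key in entities:
            links.append({
                "object_idno": row["idno"],
                "entity_id": entities[key]["entity_id"],
                "relationship_type": "manufacturer",
            })
    return links


rows = [
    {"idno": "A1", "manufacturer": "Hansgrohe AG"},
    {"idno": "A2", "manufacturer": "hansgrohe  AG"},
    {"idno": "A3", "manufacturer": "Other Co"},
]
entities = build_entities(rows)
links = build_relationships(rows, entities)
```

Because the objects hold only `entity_id`, correcting the name in the single entity record is enough; none of the relationship rows need to change.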

Metadata elements to create (or verify)

| CA element code | Maps from | Data type | Notes |
|---|---|---|---|
| admc_idno_legacy | dc.identifier.other | Text | Preserve the old location code as a legacy field, non-searchable by default |
| admc_description | dc.description (substantive only) | Text (long) | Only for the ~20 records with real text |
| admc_physical_holdings | dc.description (parsed) | List (multi-value) | "Sample available," "Booklet available" as checkboxes |
| admc_manufacturer_url | dc.publisher.uri | URL | Product page; can also live on the entity record |
| admc_subject_classification | dc.subject.classification | List (hierarchical) | See vocabulary design below |
| admc_dspace_handle | dc.identifier.uri | URL | Preserve the DSpace handle for provenance |
| admc_intake_year | dc.date.issued | Text or Date | Record the upload year if at all; do not call it "date issued" |
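
One way to apply the one-to-one mappings during pre-processing is a simple lookup, sketched below. It assumes each exported row is available as a dictionary keyed by DSpace field name, and that dc.date.issued holds a date string from which only the year is kept; both are assumptions about the export, not confirmed details.

```python
import re

# One-to-one mappings taken from the element table. dc.description is left
# out because it feeds two elements and needs its own parsing step.
FIELD_MAP = {
    "dc.identifier.other": "admc_idno_legacy",
    "dc.publisher.uri": "admc_manufacturer_url",
    "dc.subject.classification": "admc_subject_classification",
    "dc.identifier.uri": "admc_dspace_handle",
    "dc.date.issued": "admc_intake_year",
}


def intake_year(value):
    """Extract a four-digit year (1800-2099) from a date string, else ''."""
    match = re.search(r"\b(?:1[89]|20)\d{2}\b", value)
    return match.group(0) if match else ""


def map_row(row):
    """Rename mapped DSpace columns to CA element codes; drop empty values."""
    out = {}
    for source, target in FIELD_MAP.items():
        value = (row.get(source) or "").strip()
        if target == "admc_intake_year":
            value = intake_year(value)
        if value:
            out[target] = value
    return out


mapped = map_row({
    "dc.identifier.other": " B-12 ",          # hypothetical location code
    "dc.date.issued": "2019-03-14T10:22:05Z",
    "dc.publisher.uri": "",
    "dc.title": "ignored here",
})
```

Unmapped columns such as the title are deliberately ignored here, since they go to existing fields rather than to the new admc_ elements.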

VRA Core fields that should already exist and need no new elements: title (preferred_labels), dimensions, material, technique, condition.

Sandbox -- AS