CSV Import in AtoM
Understanding how to prep a CSV file for importing descriptions into AtoM
CSV Import Documentation
https://www.accesstomemory.org/docs/latest/user-manual/import-export/csv-import/
CSV templates on AtoM wiki
https://wiki.accesstomemory.org/Resources/CSV_templates
CSV templates in AtoM
Stored in: lib/task/import/example/
Archival descriptions: overview
• 2 main templates: ISAD(G) and RAD
• Column headers are the same in each – there are just more in RAD.
• You could combine columns from the 2 CSVs and it would import successfully – it just may not display all data on a single template!
• Column header names are generic, based on database names.
Header names are generally written in camelCase:
• scopeAndContent
   – Scope and content (ISAD 3.3.1)
   – Scope and content (RAD 1.7D)
• physicalCharacteristics
   – Physical characteristics and technical requirements (ISAD 3.4.4)
   – Physical condition note (RAD 1.8B9a)
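In practice, a minimal set of column headers for a description CSV might look like this (the column selection here is illustrative – start from the full template files rather than typing headers by hand):

legacyId,parentId,identifier,title,levelOfDescription,scopeAndContent,eventDates,eventTypes,eventActors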
Crosswalking with CSV files
https://wiki.accesstomemory.org/Resources/CSV_templates
In the example CSV files from v2.2 on, we have included the relevant content standard name and number in the sample data field. This means you can import the CSV template to produce a sort of “crosswalk” or key, showing you how fields in AtoM map to the column headers, and to other standards templates (such as DACS, MODS, DC).
Crosswalking with CSV files
You can preview data in another standard without having to edit, by adding a semi-colon and the standard’s abbreviation to the end of its URL:

Standard template | Add to URL
ISAD(G)           | ;isad
RAD               | ;rad
DACS              | ;dacs
Dublin Core       | ;dc
MODS              | ;mods
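For example, a description normally viewed at https://example.com/my-fonds (a hypothetical URL) could be previewed in the RAD template at https://example.com/my-fonds;rad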
Character encoding and line endings
AtoM expects CSV files to use:
• UTF-8 character encoding
• Unix/Linux-style end-of-line characters
UTF-8 (short for Unicode Transformation Format – 8-bit) is a standard for encoding characters (such as letters, numbers, punctuation, and special characters) in digital formats. UTF-8 is backwards compatible with ASCII, the standard that predated it, and has become the de facto encoding standard used on the web and in email.
“In computing, a newline, also known as a line ending, end of line (EOL), or line break, is a special character or sequence of characters signifying the end of a line of text. The actual codes representing a newline vary across operating systems, which can be a problem when exchanging text files between systems with different newline representations” (Wikipedia). These characters are not typically made visible to users in the document.
Here are some of the differences:
• Unix / Linux / FreeBSD / OS X use LF (line feed, \n, 0x0A)
• Macs prior to OS X use CR (carriage return, \r, 0x0D)
• Windows / DOS use CR+LF (carriage return followed by line feed, \r\n, 0x0D0A)
Failure to use these can lead to import errors, such as strange characters or thousands of blank descriptions being created!
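If you need to fix a file that was saved with the wrong settings, a small script can normalize it before import. A minimal sketch in Python, assuming the source file was saved as Windows-1252 (a common Excel default) – the filenames and source encoding are illustrative:

# normalize_csv.py – convert a CSV to UTF-8 with Unix (LF) line endings
source_encoding = "cp1252"  # adjust if your file uses a different encoding

with open("descriptions.csv", "r", encoding=source_encoding, newline="") as f:
    text = f.read()

# Replace Windows (CRLF) and old Mac (CR) line endings with LF
text = text.replace("\r\n", "\n").replace("\r", "\n")

with open("descriptions-utf8.csv", "w", encoding="utf-8", newline="") as f:
    f.write(text)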
Character encoding and line endings
We strongly recommend against using MS Excel as your local spreadsheet application when preparing AtoM CSV data!
Microsoft likes to use its own custom character encoding (Windows-1252), and makes it very difficult to change the default encoding to UTF-8. Similarly, Excel tends to use different line endings, which can cause import issues.
LibreOffice Calc is an open source spreadsheet application that makes a good alternative for AtoM CSV preparation. When first opening a CSV file, it will allow you to set the character set, separators, and text delimiter, and to see a preview of the fields with your selection to ensure everything is rendering properly.
Hierarchical data: legacyId and parentId
• Can be used to manage hierarchies within a file
• legacyId values can be arbitrary alphanumeric values, but should be unique within the CSV file
• parentId values should use the legacyId value of the target parent record. For example:
   – IF a collection has a legacyId of 001
   – AND the next row is a series that should be a child of the collection
   – THEN the parentId value of the series should be 001
• If a record is a top-level description (such as a fonds or collection, etc.), then leave the parentId column blank
Hierarchical data: legacyId and parentId
IMPORTANT: Child records MUST appear AFTER (i.e. below) their parent records in the CSV row order, otherwise the import will fail! Child and sibling order doesn’t matter, so long as the parent row is ABOVE its children. For example:
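A fonds with a child series and a grandchild file could be represented like this (identifiers and titles are illustrative):

legacyId,parentId,title,levelOfDescription
001,,Example fonds,Fonds
002,001,Correspondence,Series
003,002,Letters 1900-1910,File

The top-level row leaves parentId blank, and every child row appears below its parent.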
Hierarchical data: qubitParentSlug
• Can be used to link imported descriptions to existing descriptions in AtoM as children
• qubitParentSlug values should be the slug (or permalink – the unique part of the URL after the slash) of the target parent description in AtoM
• WARNING: do not use parentId and qubitParentSlug in the same row!
   – If both are present, AtoM will default to using the qubitParentSlug
   – You can still use a mix of parentId and qubitParentSlug throughout the CSV – just not in the same row
• All descriptions should still have a legacyId value added
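For example, to import a new series as a child of an existing description whose URL ends in /example-fonds (a hypothetical slug):

legacyId,qubitParentSlug,title,levelOfDescription
004,example-fonds,Photographs,Series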
Linking to Accession records on import
If you want to link incoming descriptions to an existing accession record, you can add a new column to the description CSV, with the header: accessionNumber
NOTE: if the accession number does not exist, AtoM will create a blank stub accession with that number! Make sure you add the correct accession number :)
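For example (the accession number shown is illustrative):

legacyId,accessionNumber,title
001,2019-003,Example fonds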
Dates and creators (events and actors)
Description edit templates have 3 date fields. The Display date is what the end user will see – it is free text. The start and end dates must follow ISO 8601 (YYYY-MM-DD, etc.) formatting. These fields are used to support AtoM’s date range search.
• Display date
• Start date
• End date
Dates and creators (events and actors)
During CSV import, creators and dates are paired.
Use the | pipe character to add multiple creators/dates.
You can use a literal NULL value in your CSV file to keep the spacing correct for dates without actors, or vice versa, as in the example below:
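A row with a creation event that has a known creator and an accumulation event with dates only might look like this (names and dates are illustrative; check your template for the exact event column headers):

eventActors,eventDates,eventStartDates,eventEndDates,eventTypes
Jane Smith|NULL,1900-1950|1960-1975,1900|1960,1950|1975,Creation|Accumulation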
eventTypes: sets the type of event. The available options depend on the default template used in AtoM:
• ISAD: Creation, Accumulation
• RAD: Creation, Accumulation, Contribution, Collection, Broadcasting, Manufacturing, Custody, Publication, Reproduction, Distribution
• DC: Creation, Contribution, Publication
• DACS: Creation, Publication, Broadcasting, Record-keeping activity
• MODS: all of the above
Dates and creators (events and actors)
eventActors and eventActorHistories are related to authority records. AtoM will first try to find an existing matching authority record, and will link to it if found. If not, AtoM will create a new stub authority record. Matching criteria:
#  | Import type           | Match on name | Match on repository | Match on history | Outcome
1  | Any                   | Yes           | Yes (or blank)      | Yes (or blank)   | Linked (no updates)
2A | New or delete/replace | Yes           | Yes                 | No               | New actor created
2B | Update                | Yes           | Yes (or blank)      | No               | Linked AND existing history updated
3  | Any                   | Yes           | No                  | Yes (or blank)   | Linked (no updates)
4  | Any                   | Yes           | No                  | No               | New actor created
5  | Any                   | No            | Yes (or blank)      | Yes (or blank)   | New actor created
Dates and creators (events and actors)
Remember, if you pipe values to include more than 1 creator per row, you still need to pipe across all event fields! Use NULL as a placeholder where needed.
IMPORTANT: AtoM includes creator inheritance! Meaning if you have added a creator at the top level, you do NOT need to add them again in the CSV at lower levels – in fact, this can cause performance and display issues later! Instead, simply add the dates and use the NULL placeholder for the eventActors and eventActorHistories fields:
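For example, a lower-level row that inherits its creator from the top level but records its own dates might look like this (dates are illustrative):

eventActors,eventActorHistories,eventDates,eventStartDates,eventEndDates,eventTypes
NULL,NULL,1923-1925,1923,1925,Creation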
Dates and creators (events and actors)
eventPlaces and eventDescriptions: these fields are visible only in the RAD and MODS templates in AtoM.
You could add these columns to an ISAD CSV template and it would import without errors, but they will not show in the ISAD template user interface!
These values also need to be piped to match up with the correct event when importing multiple events per row. For example:
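With two events in one row, the places are piped in the same order as the other event columns (values are illustrative):

eventTypes,eventDates,eventPlaces
Creation|Publication,1900-1950|1951,Vancouver|Toronto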
Digital object columns in the description CSV
You can import digital objects at the same time you create new descriptions.
Note that AtoM has a 1:1 relationship between a digital object and an information object (i.e. archival description) – you can only attach one object per description.
For local objects on the same server, use the digitalObjectPath column (you can add it to the CSV if it is not in the template).
Digital object columns in the description CSV
You can also import remote digital objects available on the web, using the digitalObjectURI column. AtoM will create local derivatives, but will only store the path to the master, rather than retaining a local copy.
Requirements for a remote digital object import to succeed:
• Must be publicly available on the web (no logins, firewalls, VPN requirements, etc.)
• Must be an HTTP or HTTPS link (FTP links will not work)
• The URI must end in a file extension (e.g. .jpg). You can’t link to a landing page such as a YouTube video this way – AtoM needs to be able to fetch the actual object!
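For example, one row using a local path and another using a remote URI (paths and URLs are illustrative; each row here uses only one of the two columns):

title,digitalObjectPath,digitalObjectURI
Scanned letter,/home/user/images/letter-001.jpg,
Remote photograph,,https://example.com/images/photo-042.jpg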
Physical storage columns
Similarly to creators, you can link to or create new physical storage containers and locations in a description CSV import.
• For the import to succeed, all 3 fields must be populated for each row: physicalObjectName, physicalObjectLocation, and physicalObjectType
• You can pipe values in a row to add multiple locations
• AtoM will first attempt to match based on all 3 fields – if no match is found, a new container and location will be created on import
• physicalObjectType is linked to the Physical Object Type taxonomy in AtoM
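For example, a row linked to two containers (names, locations, and types are illustrative):

physicalObjectName,physicalObjectLocation,physicalObjectType
Box 001|Box 002,Shelf 4A|Shelf 4B,Box|Box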
Culture, language and script fields
• culture: determines the source language of the description in AtoM. We recommend populating it for all rows. Expects ISO 639-1 two-letter language code values (e.g. en, fr, es, etc.). The culture must be added under Admin > Settings > i18n languages for proper display!
   – See a list of AtoM-supported languages and codes here: bit.ly/AtoM-langs
• language (of material) and languageOfDescription both also expect ISO 639-1 two-letter language code values. Can accept piped values for multiple languages, e.g. en|fr|es
• script (of material) and scriptOfDescription expect four-letter ISO 15924 script code values – e.g. Latn, Cyrl, etc. Can also pipe to enter multiple values per row. Note that these codes are capitalized while language and culture ones are not!
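For example, an English-language description of material in English and French (values are illustrative):

culture,language,languageOfDescription,script,scriptOfDescription
en,en|fr,en,Latn,Latn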
Other controlled value fields (linked to taxonomies)
• levelOfDescription: you can enter new terms and they will be created on import (e.g. sub-sub-subseries, etc.)
• subjectAccessPoints, placeAccessPoints, genreAccessPoints: can enter new terms or link to existing ones. Can pipe these fields for multiple terms
• descriptionStatus: expects Draft, Revised, or Final
   – New terms *may* import correctly, but it is better to add them to the taxonomy first
• levelOfDetail: expects Full, Partial, or Minimal
   – New terms *may* import correctly, but it is better to add them to the taxonomy first
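For example (terms are illustrative):

levelOfDescription,subjectAccessPoints,placeAccessPoints,descriptionStatus,levelOfDetail
Series,Agriculture|Land use,Vancouver,Final,Partial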
Other fields of note
• The nameAccessPoints field links to authority records, much like the eventActors for creators. First it will look for an exact match on name, and then it will create a new stub authority record if none is found. Be sure to check for typos, and use consistent naming conventions with your authorities!
• The publicationStatus column sets the public visibility of your descriptions on import. Expects the values Draft or Public. If left blank, the default publication status (in Admin > Settings > Global) will be used. Note that you can’t have a published description as a child of a Draft description – this can cause display issues in AtoM!
• Alternative identifiers and their display labels can be imported using the alternativeIdentifiers and alternativeIdentifierLabels columns. Use pipe separators to add multiple values. There should be a 1:1 relationship between the number of identifier values in the alternativeIdentifiers column and the corresponding labels in the alternativeIdentifierLabels column.
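For example, two alternative identifiers with matching labels (values are illustrative):

alternativeIdentifiers,alternativeIdentifierLabels
RG-75-001|OLD-2310,Record group number|Legacy database ID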
Questions?
info@artefactual.com