We are requesting comments on a draft of an upcoming NARA Bulletin on minimum metadata requirements for transferring permanent electronic records to NARA. There are three parts to the draft (links are to PDF versions): the draft bulletin, Metadata Element Definitions, and Recommended file and folder naming conventions.
The Bulletin applies to all electronic records that have been scheduled as permanent and describes the minimum set of metadata elements that must accompany those records when they are transferred to NARA. Please submit your comments and suggestions on the draft Bulletin by August 22, 2014. We will review all the comments we receive.
Thank you for your input!
Metadata Appendix A starts out: The tables below provide the minimum list of metadata terms necessary for describing permanent electronic records. Will this mean that some records, which may be of great interest, cannot be ingested, because of a lack in one or more of these “minimum” terms?
I can foresee instances where the data are not available to the person trying to offer the records to NARA. Would this mean that the agency would eliminate them as non-viable records, because of a lack of metadata elements? I will assume that PERM records which are defective, such as rotting film or in this case those with flawed metadata, will be destroyed?
The specific elements listed in the Draft Bulletin do not match those in the Appendix section. However, both say they are mandatory?
Hello,
I have about 6 inches of paper files on air pollution information that I may transfer to the Archives. Do you have a web link or a person I can discuss this with? Do all records to be submitted have to be in Adobe? Thanks for the info.
Bruce – Thanks for your comments. I will see that the development team gets them. Andy – Someone from our staff will reach out to you.
Thank you for posting this for public comment. I recommend the following:
1. Directly consuming the Open Data Policy structure and schema – project-open-data.github.io/schema/. It is standards-based (DCAT) and is also seeing widespread adoption by agencies in related arenas.
2. Open source the schema as well as other related documents and instructions in the model of Project Open Data – http://project-open-data.github.io and https://github.com/project-open-data/project-open-data.github.io. There are substantial benefits to allowing all interested parties to collaborate with your efforts via issues and pull requests – https://github.com/project-open-data/project-open-data.github.io/issues.
I would love to look at the bulletins but I’ve been trying to download them without success since yesterday morning! If someone could email me a copy, I would be hugely grateful.
For the term RecordID, include an example of the unique identifier.
For the term “Creator,” examples that include a person’s name should also include the person’s position title, as there could be more than one person with the same name.
Please define the term “Unambiguous date.” It could be misleading, since it could mean when a draft was created or when a memo was approved.
Most of our records contain Confidential Business Information (CBI), so there should be safeguards in place so that the information is accessed only by those who have been cleared for TSCA CBI. Are there any special provisions for electronic records that contain CBI?
The metadata requirements and the naming conventions are extremely important. The 11-day turnaround for comments is not sufficient for us to involve the affected groups: IT, CIO, Webmaster, Bureaus creating files. Can we have an extension?
I am hoping that the antiquated approach to dates in file names (month_day_year) can finally be updated to (yyyymmdd). Any chance of that?
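As a rough illustration of the difference, here is a minimal Python sketch that converts a month_day_year stamp to yyyymmdd. The function name `to_yyyymmdd` and the sample stamps are hypothetical and are not taken from the draft naming conventions.

```python
from datetime import datetime


def to_yyyymmdd(stamp: str) -> str:
    """Convert a month_day_year stamp such as '08_22_2014' to '20140822'."""
    # Parse the old layout, then re-emit it with the year first.
    return datetime.strptime(stamp, "%m_%d_%Y").strftime("%Y%m%d")


# Hypothetical stamps, deliberately out of chronological order.
old_stamps = ["08_22_2014", "01_15_2015", "12_01_2013"]
new_stamps = sorted(to_yyyymmdd(s) for s in old_stamps)
```

One practical reason to prefer the year-first form is that a plain alphabetical sort of the names is also a chronological sort, which is not true of month_day_year.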
In section 6, where you designate a pipe-delimited CSV as an index, is a header row intended that lists all the metadata element names, so that it is possible to know which pipe-separated value corresponds to which metadata element? Or is there some other way to designate the order of the elements in the CSV? (Or is this the standard order, and is that how it is to be defined?) Is the first value in every line to be the Transfer Request number, or is there some other way to designate that, since it should be the same for the whole file?
A sample of an entire indexing file in addition to the examples of the individual elements provided would make that much clearer, at least to me.
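To make the header-row question concrete, here is a minimal Python sketch of what a pipe-delimited index with a header row might look like and how it could be read back. This is an assumption for illustration only: the column names in `FIELDS`, the sample values, and the choice to repeat the transfer request number on every line are all hypothetical, and the draft Bulletin may specify something different.

```python
import csv
import io

# Hypothetical column names; the real element list and order would come
# from the Bulletin's Metadata Element Definitions.
FIELDS = ["TransferRequestNumber", "RecordID", "Creator", "Date"]


def write_index(rows):
    """Return a pipe-delimited index as text, with a header row first."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, delimiter="|",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def read_index(text):
    """Parse the index; the header row maps each value to its element name."""
    return list(csv.DictReader(io.StringIO(text), delimiter="|"))


# Made-up sample rows, not real transfer data.
rows = [
    {"TransferRequestNumber": "TR-0001", "RecordID": "rec-001",
     "Creator": "Department of State", "Date": "2014-08-11"},
    {"TransferRequestNumber": "TR-0001", "RecordID": "rec-002",
     "Creator": "Department of State", "Date": "2014-08-12"},
]

index_text = write_index(rows)
parsed = read_index(index_text)
```

With a header row, a reader does not need to know a fixed column order in advance; without one, the order would have to be defined in the guidance itself.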
First, thank you for taking the time to conduct public outreach. The Sunlight Foundation would like to suggest the following:
1. The schema as currently laid out does not provide for documents that may already be available to the general public via the internet (or other means). Consider adding a metadata field describing where publicly available documents reside (a link to their webpage).
2. We add our support for the comments posted by Gray Brooks at GSA, particularly his second suggestion. Seeking out public feedback is important. Open sourcing related documents and viewing this metadata guidance as a living document would be a valuable step forward, ensuring that all feedback is heard and helping the guidance stay fresh and useful moving forward. Project Open Data (POD) provides an excellent example of the utility of this approach. The POD Schema has grown more robust since its introduction thanks to its presence on Github and continued efforts at public collaboration by the POD team. With this in mind, we would also urge you to consider adopting the POD schema or adapting it to the particular needs of the National Archives.
Finally, it is worth pointing out another government that is approaching its metadata review in a similarly public manner. In July, San Francisco, CA released (http://datasf.org/blog/?p=36) a draft of its new metadata standard as well as related documents (http://datasf.org/blog/?page_id=48), and asked for public feedback using a short survey (https://docs.google.com/forms/d/1u0SSeBUEdKiWAssYEAW-bFgsp0CNol7PeG_aHUE7yvY/viewform).
It is heartening to see governments on multiple levels thinking strategically and openly about their metadata policies.
Maybe the Danish executive order on SIPs will come in handy at some point:
http://sa.dk/media(3367,1033)/Executive_Order_on_Submission_Information_Packages.pdf
In order to make the metadata usable for the entire lifecycle of the records, why are you not aligning your metadata requirements/standards with NARA’s already existing descriptive metadata standards in NARA’s Lifecycle Data Requirements Guide (LCDRG)? These standards are used in its public-facing online catalog and elsewhere. The definitions and examples you provide for many fields are different from those found in the LCDRG, and using the guidance for these fields as provided in your draft would result in metadata being received from agencies that NARA would need to update or change to meet its own description standards and needs. For instance, in your examples under creator you say that Department of State or John Kerry could be used. Using the LCDRG standards, the specific office that created materials should be provided rather than simply the agency’s name or the name of a person within that agency. For example, “Department of State. Office of the Under Secretary of State for Security Assistance. (1972 – 08/22/1977)” rather than simply “Department of State.”
We have gone into a bit more detail in a blogpost http://sunlightfoundation.com/blog/2014/08/19/nara-seeks-public-comment-on-metadata-guidance/
We think that this set of metadata is good as minimum requirements for transferring permanent electronic records to NARA. At the same time, we have some questions. Often, federal agencies’ documents have several versions, redactions, etc. Do you think it would be useful to add more “relation” elements, i.e. Relation[isReplacedBy] and Relation[isVersionOf]? Also, taking into consideration that for some official documents the publishing date is important, do you think it would be useful to add Date[issued]?
I would like to reiterate what other commenters have noted about OMB’s Project Open Data (http://project-open-data.github.io/schema/), where they set guidance on ‘common core’ metadata. They call it guidance, but it will be treated as a requirement for anything on agency.gov/data.
At a federal level, it is foreseeable that groups in addition to Open Data, National Archives, and even the National Institute of Standards and Technology (NIST) will want to add metadata standards (possibly in the near future). I see the need for federal-level governance for the metadata that provides a single set of guidance for the agencies. The drive is for one set of federal-wide core metadata standards, rather than two, or three, or more as others jump on the metadata wagon. If we keep NARA metadata guidance and Project Open Data guidance separate, in places where the two converge and conflict (things happen), we at the implementation level of the agencies are not in a position to broker a deal between NARA and Project Open Data without that governance board. Think about how this would get even more complicated if other groups add on additional metadata guidance.
FYI, coming out with the metadata guidance is quite timely and appreciated.