Bloomsbury Conference UCL, London 6.25.10 Fourth Bloomsbury Conference

Fourth Bloomsbury Conference on e-Publishing and e-Publications
Valued Resources: Roles and Responsibilities of Digital Curators and Publishers
UCL, London, 6.25.10

Conceptualizing Library Data Curation and Publishing Services at Purdue University
  • Charles Watkinson, Director, Purdue University Press
  • D. Scott Brandt, Assoc Dean for Research, Purdue University Libraries

Structure of the Presentation
  I. Some Background & Context
  II. Exploring the Library's Role in the Data Deluge
  III. Data Curation Profiles: what we're learning
  IV. What a Publisher can learn from the Profile

Data curation is the activity of managing and promoting the use of data from the point of creation, to ensure its fitness for contemporary purposes and availability for discovery and reuse.

Purdue University and Purdue Libraries
  • ~38K students, ~1.8K faculty
  • Strengths in science, technology, agriculture, & engineering
  • 12 subject-oriented Libraries + units
  • University press is a unit of the Libraries (only 11% of US presses report within Libraries)
  • Directors of Office of Copyright, Finance, and the University Press
  • Assoc Dean for Digital Programs and Information Access
  • Assoc Dean for Planning & Administration

[Diagram: scholarly communication outputs. Labels: published research, unpublished research, published data/datasets, secondary/tertiary resources; categories marked traditional and non-traditional]
  • Analyzed data/datasets: Analyzed data might need to be reviewed prior to publication, or in case of questions after publication. It is increasingly linked as supplementary data by publishers.
  • Processed data/datasets: Quite often data must be scrubbed/anonymized, or processed to format prior to analysis; some disciplines share this data widely within their communities (e.g., astronomy, physics, etc.).
  • Raw data/datasets: Some raw data are shared readily (e.g., genetics), but also quite often are discarded, depending on discipline.
Modified from: Brandt, D.S. Scholarly Communication (in To Stand the Test of Time: Long-Term Stewardship of Digital Data Sets in Science and Engineering: Final Report of Workshop New Collaborative Relationships: Academic Libraries in the Digital Data Universe. ARL, Washington, DC, September 2006.)

PUL response to data deluge
Investigating research data needs and building relationships with faculty, in order to design, build, and assess prototype infrastructure, tools and services to handle digital data.
This approach recognizes the discipline-specific nature of faculty needs, though there is a tension between this and the practical requirements of building a sustainable suite of services/digital infrastructure.

Our organization to achieve this vision
  • Faculty Liaison: subject librarians

  • Publishing: e-Pubs & Press
  • Data Management: D2C2
  • Rights Management: University Copyright Office
  • disciplinary faculty

1. Investigating Research Data Needs
  • Strategy 1: Embedding data scientists in research projects; D2C2 provides this expert consultancy.
  • Strategy 2: Creating tools to structure conversations about data; Data Curation Profiles help liaison librarians structure their conversations.
[Diagram labels: DCP, D2C2, librarians, researchers]

2. Solving Problems and Developing Prototype Tools, Systems, Services
Research lifecycle stages: Study Concept & Design; Data Collection; Data Processing; Data Access & Dissemination; Analysis; Research Outcomes
  • Ingest, Preservation and Access for Water Quality Datasets in an Institutional Repository
  • Developing a Data Management and Curation Workflow for Camp Calcium
  • Developing a Content Organization Framework for Regenstrief Center Healthcare Delivery Hub
  • Enabling end-to-end geospatial data modeling workflows via INPort: The Isotope Networks Portal
  • Leveraging Relational Information in the HUBs using Linked Data
  • Investigate and Implement Persistence for HUB Resources
  • DataCite (founding member)
  • Integrating Spatial Educational Experiences (ISEE) into Crop, Soil, and Environmental Science Curricula
  • Prototype publications linked to data through e-Pubs and Purdue University Press
  • INTEROP: Developing Community-based DRought Information Network Protocols and Tools for Multi-disciplinary Regional Scale Applications
Adapted from: e-Science and the Life Cycle Model of Research, http://datalib.library.ualberta.ca/~humphrey/lifecycle-science060308.doc

Data Curation Profiles

Profiling Data
  • Research Data Lifecycle (what's the story of the data from the producer's perspective)
  • Data Management / Storage
  • Disposition of the Data
  • Data Dissemination and Sharing
  • Data Preservation and Repositories
  • Roles for Libraries, Librarians, and Publishers
  • Sample Profile link

Disposition of the Data
  • Willingness / Motivations to share: feelings, reservations, and willingness towards sharing
  • Access control: need to restrict or control access to/from others
  • Target data for sharing: the stage in the lifecycle at which the data should be shared
  • Value of the data: real or potential value, from their perspective
  • Embargo (and reasons why/why not)

What data curators can learn
  • Advancing university-based cyberinfrastructure is dependent on our understanding of how to support data practices and needs
  • Sharing is at the heart of success: collecting, storing, and making use of data can only come after the means for sharing are in place
  • We cannot collect and curate all data, particularly in a way that facilitates effective re-use
  • We will need to work with researchers to develop selection and appraisal guidelines, and data services
From: M. Cragin. (2009) Data Sharing, Small Science, and Institutional Repositories. UK e-Science All Hands Meeting: Oxford, UK

Data Curation Proliferation
  • DCP
  • 12 workshops
  • dataconservancy.org

What publishers can learn
  • Researchers want to disseminate outputs, but these range in scope, format, and use
  • They are generally willing to share data with others, but not without certain restrictions, or benefits for themselves
  • They hold on to their data but do not do much to curate it; what is most easily or willingly shared is not always the data that has the most re-use value

Purdue UP lesson learned 1
Researchers want to disseminate outputs, but these range in scope, format, and use
  • Print books and subscription-based journals, PUP's traditional focus, are not enough
  • PUP / Libraries need to offer a range of different channels to fit different needs
  • PUP / Libraries need a venue to experiment with hybrid or new models

A Continuum of Scholarly Content in the IR
(with thanks to J.G. Bankier, Berkeley Electronic Press)
[Diagram: content types plotted by Source of scholarship (Faculty, Student, Admin, Unaffiliated) against Scholarly Impact of Content (Low to High), ranging from primary research output to non-research output. Red stars = Purdue UP? Blue stars = Purdue e-Pubs?]
Content types shown: Book, Pre Print, Post Print, Datasets, Faculty Journal, Faculty Conference, Research Finding, Research Reports, Committee Meetings, Newsletter, Dissertation, Masters Thesis, Graduate Journal, Honor Papers, Undergrad Conference, Undergrad Journal, Admin Report, Alumni Magazine, Historical Collection, Commencement address, Symposium, Society Journal, Policy Report

Purdue UP lesson learned 2
Researchers are willing to share data with others, but not without certain restrictions/benefits
  • PUP provides a layer of editorial services for credentialing that can incentivize data sharing
  • PUP needs to make it easy to link to and cite data in publications (DataCite so important!)
  • PUP / Libraries need to be nuanced in their Open Access messages (OA is not always the right strategy)

[Slide with example screenshots]
  • Read the full text of the book on your portable device
  • Follow in-text URLs to supplementary data
  • View spreadsheets on-site or download them from your personal computer

Purdue UP lesson learned 3
What is most easily/willingly shared is not always the data that has the most re-use value
  • Move away from producing data supplements for publications to producing supplementary publications to drive re-use of data
  • Take advantage of being inside the tent to have deeper conversations with scholars about what is the most important data for reuse

Next Steps
  • Spreading the use of DCPs so that we can get a more complete picture of faculty behavior variations around data
  • More clearly defining library-based publishing services, and building relevant skills and tools in Libraries and Press
  • Communicating to faculty the full range of library services they have access to, and changing their old views of what Purdue Libraries and Purdue UP do

Thank you!

D. Scott Brandt [email protected]
Charles Watkinson [email protected]
