Historical data analysis

June 21st, 2009

Check out this week’s Fifth Wave cartoon:

http://www.gocomics.com/thefifthwave/2009/06/21

Conference webcasts and presentations online!

June 9th, 2009

IASSIST/IFDO 2009


05 June 2009 13:19
A week has passed since IASSIST 2009. I hope most of you have made it safely back home by now – and are ready to refresh your memories by watching the conference webcasts and viewing the presentations. Webcasts of all three plenaries and of Thursday and Friday’s concurrent sessions in the Small Auditorium are now available. We didn’t have cameras available during the Wednesday sessions, so there are no videos of those presentations, sorry! But most of the presentations are already online – a few are still missing, either because we didn’t have them or because we are waiting for an updated version. Please send in any missing presentations, or email me if there are mistakes that should be corrected!

Tuomas J. Alaterä
Information Network Specialist
tuomas.alatera@uta.fi
Finnish Social Science Data Archive (FSD) http://www.fsd.uta.fi
FI-33014 University of Tampere

ICPSR Summer Program course on DDI

June 9th, 2009

As you may have heard during last week’s wonderful IASSIST meeting in Tampere, there are still spaces available in the ICPSR Summer Program course on DDI, to be held at Cornell University in Ithaca, New York, on July 13-16. The course is titled “Documenting Data Using DDI 3.0: Supporting Research, Collection Management, and Access,” and instructors are Wendy Thomas (Minnesota Population Center, University of Minnesota) and Arofan Gregory (Open Data Foundation).

The DDI 3.0 metadata structure is capturing the attention of major archives, data producers, and funding agencies as a standard that will aid in the collection, preservation, and dissemination of social science data throughout its life cycle. This course will show you how DDI can be used to support the research process, making the data you collect easier to manage, share, and analyze.

This special off-site course will take advantage of Ithaca’s unique location in the heart of the Finger Lakes region of New York State. Activities will include visits to the internationally-renowned Moosewood Restaurant, hiking opportunities to see some of the best gorges and waterfalls in the area, and a day-long post-conference visit to the Finger Lakes Wine Festival!

Who should attend: Anyone managing numeric social science data collections or a social science research project involving the collection of numeric data

Location: Ithaca, NY

Dates: July 13-16, 2009

Co-sponsors: Cornell Institute for Social and Economic Research; Mann Library

For more on the course and registration, see: http://www.icpsr.umich.edu/cocoon/sumprog/course/0109.xml

Mary Vardigan
Assistant Director, Inter-university Consortium for Political and Social Research (ICPSR)

Special IQ: Moving Research Data Into and Out of Institutional Repositories

May 12th, 2009

The IASSIST Quarterly IQ Vol. 31 issue 3&4 is now available on the web:

http://iassistdata.org/publications/iq/iqvol31.html

This issue will only be available on the web. There will be no printed version mailed out to the membership.

This double issue is the work of its authors, whose articles are introduced below, and the result is an integrated double issue of high quality. Special thanks are also due to the editors of the issue. Gretchen Gano, the guest editor who wrote the notes below, is the Assistant Curator Librarian for Public Administration & Government Information and Coordinator, Data Service Studio at New York University Libraries. She collaborated on this issue from the start with former IASSIST president Ann Green. Together with the authors, they have made a great issue.

Enjoy

Karsten Boye Rasmussen, IQ editor, associate professor, kbr@sam.sdu.dk,
Marketing & Management, SDU, University of Southern Denmark +45 6550 2115

Guest Editor’s Notes:

The 2008 IASSIST Conference, “Technology of Data: Collection, Communication, Access and Preservation” included a session entitled “Moving Research Data Into and Out of Institutional Repositories” from which several papers emerged. In “Interoperability Between Institutional and Data Repositories: a Pilot Project at MIT”, Katherine McNeill describes a pilot project to enhance study discovery between two repository systems housed in the same institution, DSpace and the Institute for Quantitative Social Science Dataverse Network, by enabling the harvesting and replication of metadata and content across the two systems. In a related project across the pond, Libby Bishop scales this discussion in her description of cross-institutional collection sharing between the University of Leeds and the UK Data Archive in the Timescapes project. Bishop asserts that coordination among multiple agents is likely to be challenging under any circumstances. Challenges magnify when the trajectories of different life cycles, for research projects and for data sharing, are considered. Robin Rice echoes these sentiments in her article on the DISC-UK DataShare Project, a collaboration between the Universities of Edinburgh, Oxford and Southampton and the London School of Economics. Rice provides visual evidence in a compelling diagram of the data sharing continuum based on storage, discovery, and preservation conditions of the digital research materials at each level along the scale, from the lowly thumb drive to the officious national archive. We see plainly that as one moves up the continuum, more and more human effort and intervention is required to craft the discovery, access, analytic and preservation environment. In other words, data curators matter.

Two other papers tackle these challenges by emphasizing the needs of data producers. Luis Martinez-Uribe introduces the University of Oxford’s Scoping Digital Repository Services for Research Data Management project and the findings of a requirements-gathering exercise. While the study results reveal researchers’ needs and workflows, Martinez-Uribe asserts that the study process itself made an impact on the participants. Study participants reflected on and, as a result, fine-tuned how they work with data and why they create these materials in the first place, and were able to articulate reasons for managing these resources the way they do. Similarly, Research Data & Environmental Sciences Librarian Gail Steinhart writes about the development of DataStaR, a Data Staging Repository hosted by Cornell University’s Albert R. Mann Library. The project developed as a “managed workspace” where researchers contribute datasets they are still actively using, in direct response to questions that have to do with sharing in the active research environment rather than an archival one.

While the authors in this issue describe projects going on in many different places and settings, taken together, these articles address common themes. All address the challenge of scaling data exchange between systems and then between institutions. This raises the perennial question of standards: by what mechanisms will we set them, and how well will we be able to follow them and still accommodate local needs? The importance of aligning repository services with researcher needs is another common thread. Data managers must ask, “How will the active researcher benefit from curation efforts?” The answer may be that the benefit is more than finding or accessing a particular resource (yep, I have downloaded the whole thing and all the bits are there), but instead being able to examine this resource in many ways (okay, let’s run frequencies, now I want to see it on a map, and let’s include some other variables). This is a rich reuse experience, creating a real digital “laboratory.”

Finally, each contributor notes the expanding role of the data manager. In its own way, each project described here moves data managers upstream, pre-publication, into the place where research is actively happening. Though all of the articles focus on technological choices and architectures to support research data curation, it is striking to realize that each of these choices emerges from old-fashioned personal, social, and organizational relationships. What we can strive for as data and information managers is to work together as fellow researchers and to be ever curious about how these partnerships and the sharing of information back and forth can be enhanced by thoughtful information and technology design. Some call this the digital plumbing, but I like to think of it as e-gilding.

Gretchen Gano, New York University Libraries

Google Launches Data Visualization Service

May 5th, 2009

Several weeks ago, Google contacted me at BLS to let us know they were using some of our data in a launch of a new service in data visualization.  Their plan is to make as much data available as possible with as rich a tool set as they can provide.  To see an example, enter the phrase “US Unemployment Rate” in the Google search box.  The top link sends you to a page that allows you to superimpose historical graphs of unemployment rates down to any county in the US.  There is a link at the bottom of the page for “Information for Publishers” for people interested in letting Google know about their own data.

Google’s inquiry took us by surprise, as we have no working relationship with them.  Our main concern was whether their launch would impose an undue burden on our web servers, but we decided it would not.  Every time we release a new estimate for the Unemployment Rate or the Consumer Price Index we experience a sharp spike in web activity on our site.  So we were prepared.  The actual launch did not contain any surprises.

I spoke with some people at Google about their plans.  They seem to have a good appreciation of statistical data and metadata.  They even know about the SDMX standard for describing time series data.  It turns out that the program manager for this work at Google is Ola Rosling, the son of Hans Rosling, who is the person behind the Gapminder project for data visualization.

The world seems to be getting smaller every day.

- Dan Gillman

US Bureau of Labor Statistics

New IQ!

March 26th, 2009

The IASSIST Quarterly (IQ Vol. 31 issue 2 – 2007) is now available on the web:

http://iassistdata.org/publications/iq/iqvol31.html

This issue will be printed and mailed to the membership. From next issue IASSIST will be saving trees and only publish the IQ on the web. We hope you agree with our decision. Thanks.

Some of you are now getting ready for the IASSIST conference in Tampere, Finland, on May 26-29. If you are giving a presentation, remember that you could write it up as an article for the IQ.

In this issue we have three papers from people working at the US Federal Reserve Board. Viewed from posterity, it might look as if we at the IQ were clairvoyant and in 2007 foresaw the global role for the FRB in the financial crisis in the last quarter of 2008. The secret is first of all the fact that volume 31-2 is the second issue of the 2007 volume but is somewhat delayed, and we are writing in November 2008. Secondly, the articles from the Federal Reserve Board carry opinions that “are of the authors and not the Federal Reserve Board”. As an author in the IQ you are supported in expressing your opinions and not necessarily those of your employer. Thirdly, these three articles are not about the financial crisis, but hopefully some of the initiatives that are described in them will help us in the current situation.

Linda F. Powell and Andrew Boettcher from the Board of Governors of the Federal Reserve System (Washington, D.C.) are involved in the collection, editing, storage, and dissemination of Commercial Bank Reports of Income and Condition, and the use of the Extensible Business Reporting Language (XBRL) format for that purpose. Their article is called “Modernizing Financial Data Collection with XBRL”. XBRL can be thought of as a set of accounting standards coupled with information technology standards that simplifies the exchange of data. What was earlier accomplished through manual collection is now done with XBRL for the Call Report, a regulator-specified quarterly report containing over 2,000 variables that about 7,700 banks are required to file. The article addresses three challenges: 1) multiple collection and storage sites, 2) difficulties for the industry in implementing changes to the data collection requirements, and 3) improvements to data quality. The new collection model is centralized: the data are submitted to the Central Data Repository. The article states that “Financial theory suggests that more frequent, reliable, and readable financial statement reports will result in a healthier marketplace”. Since the presentation of the article at the IASSIST 2008 conference in May we have experienced a financial crisis. Let us hope for further refinements in this area, as XBRL is being used by government regulators worldwide.

At the IASSIST 2008 conference Andrew Boettcher of the FRB presented a metadata repository called the Data And News CataloguE (DANCE). The article “Data and Knowledge Management at the Federal Reserve Board” chronicles the role of DANCE in the organization and its transformation into a knowledge management solution. When research projects were always intra-departmental, the departments tended to silo the data, documentation, and expertise. Now the number of multi-department research projects is rising and more linking is needed. The DANCE development staff therefore focused on four concepts: description, access rights, contact information, and data location. One issue was that all datasets needed to have standardized documentation. Each dataset has a unique set of security requirements dictating usage rights, how access is granted (request form), and publication rights. The article also addresses the searching of datasets and the additional feature of allowing user-generated content, supported by wiki pages for each dataset.
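For readers who like to see such things spelled out, the four concepts could be pictured as fields on a catalogue record. The following is only a sketch under assumptions: the class name `CatalogueEntry`, its field names, the `search` helper and the sample values are all hypothetical illustrations, and nothing here is taken from the actual DANCE system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CatalogueEntry:
    """Hypothetical record for one dataset in a metadata catalogue."""
    name: str
    description: str        # what the dataset contains
    access_rights: str      # who may use it and how access is requested
    contact: str            # whom to ask about the dataset
    location: str           # where the data are stored
    wiki_notes: List[str] = field(default_factory=list)  # user-generated content


def search(entries: List[CatalogueEntry], term: str) -> List[CatalogueEntry]:
    """Return entries whose name or description mentions the term (case-insensitive)."""
    needle = term.lower()
    return [
        e for e in entries
        if needle in e.name.lower() or needle in e.description.lower()
    ]


# Made-up sample entries, for illustration only.
entries = [
    CatalogueEntry("Bank reports", "Quarterly income and condition reports",
                   "request form", "data-desk@example.org", "/data/bank_reports"),
    CatalogueEntry("Exchange rates", "Daily foreign exchange rates",
                   "open", "fx-desk@example.org", "/data/fx"),
]
matches = search(entries, "exchange")
```

Keeping the same four fields on every record is one simple way to get the standardized documentation mentioned above, while the free-form notes list leaves room for wiki-style contributions.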

As the electronic information environment is shifting, the presentation of information from the Federal Reserve Board is becoming far less important. San Cannon is Chief at the Economic Information Management at the FRB; she presented “Snippets of Data at a Glance: Using RSS to Deliver Statistics” at the United Nations Economic Commission for Europe’s Dissemination and Communication Work Session in Geneva (Switzerland) in May 2008. An early version was also presented at the IASSIST conference in 2007. Instant access to information on a variety of devices meant that few would wait until the FRB had information posted on a website; the response, in collaboration with other central banks, was to create RSS-CB, a specification for central bank data. This was also a response to how FRB content was being “harvested” or accessed by automated processes, including “screen scraping” software that was used to pull the latest exchange rate or commercial paper rate from an HTML table. Instead, an alternative format was developed that serves human readers as well as machines. The article gives detailed coding examples of the RSS, with content such as the exchange rate for the US dollar and the Mexican peso. Many international institutions are now producing RSS-CB feeds, and many are meeting in a Central Bank Online Communications group that is collaborating on a version 1.2.
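To illustrate why a feed is friendlier to machines than a scraped HTML table, here is a minimal sketch in Python. It assumes a plain RSS 2.0 feed in which each item carries a currency pair in its title and the rate in its description; that layout, the `read_rates` helper and the sample value are made up for illustration and are not the RSS-CB schema, which defines its own elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified feed. The rate shown is a placeholder, not real data.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example exchange rates</title>
    <item>
      <title>USD/MXN</title>
      <description>13.25</description>
    </item>
  </channel>
</rss>"""


def read_rates(feed_xml: str) -> dict:
    """Collect {currency pair: rate} from every item in the feed."""
    root = ET.fromstring(feed_xml)
    rates = {}
    for item in root.iter("item"):
        pair = item.findtext("title")
        value = item.findtext("description")
        if pair and value:
            rates[pair] = float(value)
    return rates


rates = read_rates(SAMPLE_FEED)
```

Because the reader looks up named elements rather than positions in a page layout, a redesign of the website would not break it, which is the kind of fragility that screen scraping suffers from.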

The last article in the IQ 31-2 is authored by Lynn Woolfrey at the University of Cape Town. Her article outlines “The Establishment of the African Association of Statistical Data Archivists (AASDA)”. The introduction explains: “AASDA represents practitioners in survey data curation in Africa and was established to facilitate co-operation among them with regard to the development and use of best practices in the preservation and sharing of survey microdata in the region. This Association was established with the assistance of international organisations promoting optimal management of survey data. These included the International Household Survey Network (IHSN) and the International Association for Social Science Information Service and Technology (IASSIST).” We are naturally happy that IASSIST was found to be of help here, and the article shows how IHSN was aware of the need and could take action to improve survey data production and utilization in developing countries. The focus on establishing a community of practice for sharing African data was realized when the AASDA held its inaugural meeting in April 2008.

Remember to take a look at the website http://iassistdata.org and the IASSIST blog – the IASSIST Communiqué – at http://iassistblog.org.

Articles for the IASSIST Quarterly are very welcome. Articles can be papers from IASSIST conferences, from other conferences, from local presentations, discussion input, etc. Contact the editor via e-mail: kbr AT sam.sdu.dk.

Best regards, Karsten

Karsten Boye Rasmussen, editor of the IASSIST Quarterly, kbr AT sam.sdu.dk
Marketing & Management, SDU, University of Southern Denmark

IASSIST 2009 Tweets!

March 3rd, 2009

We’ve created a Twitter feed for conference info, updates and impressions. See it at http://twitter.com/IASSIST2009 or follow the link on the program page.

Before the conference, it will be mostly logistic and planning information.  At the conference, IASSISTers will be tweeting about the conference itself:  comments, suggestions, updates, and other twitter-friendly information.
We’ve got a few volunteer tweeters but more are always welcome.

If you are interested in tweeting from the conference, contact Tuomas Alatera, who will set you up with a HootSuite account so you can help us Twitter.  If there are any questions about content or other nontechnical issues, contact San Cannon. (Don’t have our email addresses? Send a note to iassist2009@congreszon.fi and it will find us!)

Let the Tweeting begin!

Registration for the IASSIST 2009 is open

February 24th, 2009

Tervetuloa Tampereelle,
Welcome to Tampere!

Registration for the 2009 IASSIST conference is now officially open. On the conference web site there is more information on registration, accommodations and excursions.

The Early Bird rate is valid until March 24, 2009. Please note that a deposit of one night for accommodation should be paid together with the registration fees. Cancellations (conference attendance, hotels, tours) on or before April 24 will receive a refund minus the administrative fees. Once again, further details are available on the registration page.

Check also the conference programme and programmes for posters and workshops for more details on what just may turn out to be the best conference ever!

http://www.fsd.uta.fi/iassist2009/index.html

If you have any questions or special requirements, especially concerning the registration process or payments, please contact our conference secretariat at Congreszon Ltd directly (iassist2009@congreszon.fi), before confirming your registration.

RSS for conference website updates available. If in doubt, do not hesitate to contact us.

On behalf of Local Arrangements 2009,

-Tuomas

Tuomas J. Alaterä
Finnish Social Science Data Archive (FSD)

Data Walkabout

February 23rd, 2009

A series of posts on the DataShare blog describes the interest and action in Australian university libraries around new forms of support for data management, drawn from an IASSIST member’s study tour of New Zealand and Australian institutions in January 2009.
Robin Rice
Project Manager
DISC-UK DataShare project

New Data Management and Sharing Guidance

February 4th, 2009

The UK Data Archive would like to announce the release of its new suite of web pages providing guidance on data management and sharing. The pages provide data creators, data managers and data curators with best practice strategies and methods for creating, preparing and storing shareable datasets. Advice has been divided into a number of key areas or modules providing detailed information on each topic. These are:

· Sharing data – why and how?
· Consent, confidentiality and ethics
· Copyright
· Data documentation and metadata
· Data formats and software
· Data storage, back-up, and security

In addition to these web pages, bespoke advice and training are provided by UKDA staff throughout the lifespan of a research project – from the initial award application, through the data creation phase to the end of the active research phase and to the final storage and dissemination of data. A comprehensive frequently asked questions web page is also available.

See: http://www.data-archive.ac.uk/sharing/

UKDA is currently undergoing a major rebrand of its promotional materials and web site. The data management and sharing web pages are the first section to reflect this new look. Feedback from users of the UKDA web site would be very welcome.

Regards
Sharon Jack
Senior Web Resources Officer
UK Data Archive – a service provider for the Economic and Social Data Service (ESDS)