Data Services

Clientele

Data Services has a campus-wide mandate to serve teaching and research requirements for data. In practice, owing to limited resources, we serve mainly the Faculties of Arts and Commerce. There are, however, regular and occasional users in many other faculties, particularly Agricultural Sciences, Education and the Health Sciences.

Current areas of collecting

The Data Services collection covers Canadian and U.S. socio-economic and financial time-series, international economic indicators, and micro- and macrodata of relevance to Commerce, Economics, Political Science and Sociology. Special emphasis is placed on all Canadian Census data and all Canadian Institute of Public Opinion (CIPO) Gallup polls, as well as other Canadian survey series such as the General Social Survey, the Labour Market Activity Surveys and the Surveys of Family Expenditure. GIS is a new and expanding area of coverage.

Research and publishing characteristics

For Data Library purposes, machine readable data files (MRDFs, data files, or data sets) are defined to include numeric, full text, image and similar data in digital form. Numeric files may contain micro- or macrodata, and may be rectangular (flat), time-series, cross-sectional or longitudinal files. Bibliographic or catalogue-type files are not included, nor is computer software of any kind. The Data Library acquires primarily numeric files.

Increasing numbers of MRDFs are being produced in many subject areas and in a variety of physical formats. At universities, more and more quantitative analysis is being performed in business schools and in social science, science and medical faculties. Full text analysis is becoming increasingly popular in the humanities, and image data are used in many disciplines.

MRDFs are frequently difficult to identify and locate since they are not usually distributed through the conventional commercial channels. They are often created at academic institutions for local research purposes or by government departments as policy planning tools, and are subsequently made available for sale to other interested researchers.

Data files are distributed on media such as magnetic tape, personal computer disks, or CD-ROM. Distribution of files over the Internet is, however, growing rapidly and seems set to become the primary means of dissemination.

Data may be distributed in ‘raw’ format (no value-added component) or in ‘packaged’ format (complete with retrieval software). Raw data are most commonly distributed on magnetic tape or via the Internet and tend to be produced by universities and government departments, whereas packaged data come on floppy disk or CD-ROM and are often marketed by the private sector. The Data Library acquires raw data files only. Packaged files are acquired by the appropriate subject division in the Library.

In general, the Data Library acquires files that can be made available over the campus network and to everyone at the University. CD-ROM and PC-based databases are thus generally excluded. Increasing numbers of numeric files are available for direct searching on the Internet. Through its Gopher server (accessible through ViewUBC), the Data Library maintains direct connections to a number of computing sites housing such files.

Codebooks: Every data file requires a codebook for its interpretation. The codebook contains details about the structure and contents of the file and relevant background materials. It is impossible to use a raw data file without access to the codebook. For this reason two copies of all printed codebooks are acquired: one is anchored, the other circulates. The anchored set of codebooks is housed in Data Services; circulating copies are soon to be moved into the Koerner stacks.

In the case of computer-readable codebook files, the Data Library generally produces one printed copy (anchored). The codebook file is made available for users to access online or print their own copies.
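
To illustrate why a raw data file cannot be used without its codebook, the following is a minimal sketch in Python, assuming a fixed-width record layout. The variable names, column positions and value labels are hypothetical and are not taken from any file in the collection; a real codebook would supply them.

```python
# Hypothetical codebook excerpt: variable name -> (start column, end column, value labels).
# Columns are 1-based and inclusive, as printed codebooks commonly list them.
# All names, positions and labels here are illustrative assumptions.
CODEBOOK = {
    "province": (1, 2, {"10": "Newfoundland", "59": "British Columbia"}),
    "age": (3, 4, None),  # no value labels: the raw digits are the value
    "sex": (5, 5, {"1": "Male", "2": "Female"}),
}


def decode_record(line, codebook=CODEBOOK):
    """Slice one fixed-width record into named fields using the codebook."""
    record = {}
    for name, (start, end, labels) in codebook.items():
        raw = line[start - 1:end]  # convert 1-based inclusive columns to a Python slice
        # Replace a code with its label when the codebook defines one;
        # otherwise keep the raw characters.
        record[name] = labels.get(raw, raw) if labels else raw
    return record


# Without the codebook, this record is just an uninterpretable string of digits.
raw_line = "59342"
print(decode_record(raw_line))
```

The record itself carries no variable names or labels; everything needed to interpret it lives in the codebook, which is why the two are acquired together.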

Form

Machine readable data files.

Languages

Primarily English (applicable to codebooks only).

Geographic origin

All significant Canadian data files (especially microdata) for which there is demonstrated demand (Canadian being defined as ‘having Canadian content’). U.S. and international data files acquired selectively, as funds permit.

Collections in other UBC Libraries / Areas of overlap

David Lam Library: CD-ROM products containing financial time-series.

Koerner Library: Canadian census and other government data. Books on survey design, data collection and analysis, and statistical analysis software.