Internal links to the definition list (indicated with an *) provide more
in-depth explanations of the terms used in this paper.
Introduction:
In recent years, Geographic Information System (GIS*) technology has become more universal. The
appearance of desktop, icon-based products has meant that the general
population will be more likely to use geospatial
data* and GIS technology (Strasser, p. 278). But even as the software
becomes easier to apply, GIS users are faced with the greater obstacle of
obtaining the digital formats of the spatial data they require to perform
their work (Strasser, p. 281).
The goal of my Digital Library Associate (DLA) project is to encourage the use of geospatial data by facilitating access to certain digital datasets at the University of Michigan Map Library. Specifically, I created a simple interface* for using a GIS to convert Michigan land use data* into a GIS-ready* format. The program I created, called the Michigan Land Use Data Automation and Dissemination Project (MLDADP), displays a menu interface from which Map Library patrons can customize data selection. (For a visual example of the menu interface, see interface images.) MLDADP serves as a model for two levels of application: enhancing general access to geospatial data at the University of Michigan, and promoting public access (for non-GIS experts) to this information on a larger scale, through the Internet.
Finding published material which discusses "access to digital spatial
data" is very difficult. The amount of digital information available on
the Internet is constantly expanding, and at a rapid pace. As Bergen
notes (p. 306), the World Wide Web's interactive capabilities make it an
appropriate forum for data distribution. Consequently, the medium of the
Web is the most useful repository for networking geospatial data and for
obtaining information about this process.
Examples of Web sites providing access to geospatial data.
There are both individual and collaborative efforts to provide networked
access to geospatial data. Individually, some academic institutions,
government agencies, and private corporations all provide some level of
GIS or mapping service. New sites appear rapidly, and because they offer
such varying levels of access to geospatial data, it is almost impossible
to generalize their services. Provided below are a few examples of the
variety of sites that provide access to geospatial data. For a more
complete look at the kinds of geospatial data available on the Internet,
please refer to Starting
the Hunt: Guide To On-line And Mostly Free U.S. Geospatial and Attribute
Data. This excellent resource for locating geospatial information on
the Web is organized by data subject, physical region, and many other
categories. Although it is not searchable and it does not provide metadata*, it is easier to use than most other
catalog sites.
Many sites, including those below, offer a wide array of geospatial data
services, from cataloging and searching geospatial metadata, to providing
map-creation services, or allowing users to download GIS-ready datasets.
Pseudo-GIS services, like the mapping service from CIESIN, allow users to download an
image, but not the data that generates the coverage. These applications, although needed, do
not provide access to GIS-ready datasets. Mapping and metadata sites,
therefore, will not be addressed in this paper. Focusing instead on sites
that provide access to GIS-ready data, I have found that they usually
suffer from one of the following failings:
What role is the government playing?
The National Spatial Data
Infrastructure (NSDI), mandated by the Clinton-Gore administration, is
leading a collaborative effort to collect and provide access to geospatial data. According to its Web site, the
NSDI "is conceived to be an umbrella of policies, standards, and
procedures under which organizations and technologies interact to foster
more efficient use, management, and production of geospatial data." Under
an Executive Order of the President, beginning January 1995, all
government agencies are required to document and make their digital data
available to the public (Nebert 1).
The Federal Geographic Data Committee
(FGDC) is providing leadership in the development of the NSDI, working
together with government agencies, nonprofit organizations, and the
private sector. Three key components of the FGDC's work are to:
What is MIRIS data?
The Michigan Resource Information System (MIRIS) is an effort to create a
"statewide computerized database of information pertinent to land
utilization, management, and resource protection activities" (DNR, p.
1). This effort was initially known as the Michigan Resource Inventory
Program established under the Michigan Resource Inventory Act, 1979 PA
204.
MIRIS data includes a considerable amount of information about land and
water resources in Michigan. The information is organized in a variety of
formats. The University of Michigan has had access to the print versions
of the base maps and landcover features since the late 1980s. Base maps
were digitized primarily from U.S. Geological Survey (USGS) 7.5-minute
quadrangles and contain the following features: state/federal highways,
county roads, local streets, vehicular trails, railroads, airports/landing
fields, lakes, perennial/intermittent streams and drains,
county/township/city boundaries, and U.S. Public Land Survey section
corners, lines and numbers. Land Cover maps describe land uses in seven
major categories: urban, agricultural, nonforested, forested, water,
wetlands, and barren. Each of these classifications has two to twenty
subdivisions, with the greatest number of categories belonging to the
urban classification.
The MIRIS data was compiled by numerous federal, state, and local agencies
and therefore the scale and accuracy are not consistent for all the coverages*. MIRIS and other mapping agencies
periodically review and correct inconsistencies as new editions of the
data are released. The University of Michigan has permission to use and
distribute the MIRIS data, and now participates in the IMAGIN datasharing
program (Improving Michigan Access to Geographic Information Networks).
Land use files for the state of Michigan are stored by township, a smaller
division of counties developed nationally in the 18th century. Washtenaw
County, for example, is made up of twenty townships (DNR).
In the midst of the conversion stages, I realized that what users need is
not only enhanced access to digital geospatial data, provided in part by
current interactive Web sites, but the ability to customize the needed
data for their own purposes. As discussed above, several government and
private sector web sites attempt to provide online access to geospatial
data, but do so without giving the user the ability to customize his or
her request. This static method of disseminating data may lead to users
having to download much more information than they need. Perhaps a user
only needs a fourth of the coverage provided by the Census Bureau; once
obtained, polygonal coverages* are not easy to
manipulate. One cannot just cut out the unwanted sections.
To experiment with user customization, the focus of my project switched
from trying to provide Web access to MIRIS data to adding user
customization options to the conversion and transfer process. I wanted to
create an expert system which would limit the number of steps users have
to perform to get to the data they need, and to make the complicated ARC/INFO* processes transparent. I wrote a script
creating a menu-driven GUI* through which users
are able to select which townships they want converted to ARC format. I
then composed another series of scripts automating the conversion process
so that desktop GIS users need not learn ARC/INFO (a more difficult-to-
use, command-line application) in order to convert the necessary data to
the correct projection. The data are then readily moved into a simpler
GIS package, ArcView, where data can be manipulated and images moved to
the Web.
The successful implementation of this project will undoubtedly lead to the
development of other projects which make use of the knowledge gained, both
on a local and national level. For ideas of further development, see
Future Plans, below.
The implementation of MLDADP occurred in a series of stages. From the
initiation of this project to its completion, my purpose and product
changed dramatically. My original intent was to convert the MIRIS data to
ARC format, and then provide Web access to this data. My focus later
changed to customizing the process of data retrieval by allowing users
to choose which townships they wanted converted. To complete this goal, I
developed a menu interface and composed scripts to automate the
complicated process of conversion. In essence, I created an expert system
which allows the user flexibility to request what she needs while making
invisible the complex ARC/INFO processes behind the conversions. (See Interface Images.)
Initial Vision
At the outset, I had planned to simply convert the MIRIS data for
Washtenaw County from IGDS to ARC format, store all this data in a table,
and allow users to access it from the World Wide Web. I experimented with
storing tables on a web page and downloading them from different
platforms.
Karl Longstreth, head Map Librarian at the University of Michigan, and
technical supervisor of my project, quickly showed me that the
conversion to Arc format was only the first of many steps necessary to
make the data usable. Longstreth suggested I contact John Fay, GIS lab
manager at the School of Natural Resources and the Environment (SNRE), who
had successfully converted all of the state MIRIS data from IGDS to ARC
format. Fay shared an AML* script he had
written to perform the conversions. This script later became the template
for all the other scripts I composed.
Once I had gotten Fay's script to perform the IGDS to ARC conversions for
Washtenaw County, Longstreth and I began experimenting with and
formalizing the steps necessary to complete the process. Through several
weeks of trial and error, we found a series of steps which led to a
completed, usable coverage of Washtenaw County. These steps included (ARC
commands given in parentheses):
Changing the focus.
Once I had formalized the steps necessary to convert all of the Washtenaw
County data, I was ready to move on to the rest of the state. My focus
changed, however, based on a visit from a library user. A patron came
into the Map Library and wanted to use MIRIS data for four adjacent
townships in Washtenaw County. Once all the county townships have been
merged into a single coverage, it becomes very difficult to isolate the
needed townships. This is because land use is represented by polygons,
which cross over political boundaries such as township lines. After the
township lines have been eliminated, separating townships would result in
pulling apart polygons (imagine pulling puzzle pieces apart), rather than
dividing the coverages along neat, square lines. At that stage, we were
only able to offer the data in an all-or-nothing package, as land use for
the entire county.
In order to provide the needed data I retraced my steps, this time for the
four townships rather than the entire county. This sparked the idea that
I should generate a script that would automate the entire process,
converting the data from IGDS to Arc format, joining
relevant coverages, dissolving borders, changing coordinates to decimal
degrees, and rebuilding topology. This way, patrons could indicate (through
a web or Arc interface) which county or township data they needed, and the
ARC/INFO conversions would be processed automatically.
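The ordering of these stages can be sketched in a few lines of Python. This is only an illustration of the sequence named above, not the AML scripts themselves: the function name plan_conversion and the township labels are hypothetical, the stages are plain text descriptions rather than ARC/INFO commands, and the sketch assumes that the join step is needed only when more than one township is requested.

```python
def plan_conversion(townships):
    """Return the ordered stages for converting the requested townships.

    Illustrative only: each stage is a description, not an ARC/INFO call.
    """
    if not townships:
        raise ValueError("select at least one township")

    # Every requested township is converted on its own first.
    plan = [f"convert {name} from IGDS to ARC format" for name in townships]

    # Assumption: joining is only meaningful for two or more coverages.
    if len(townships) > 1:
        plan.append("join " + ", ".join(townships) + " into one coverage")

    # The remaining stages run once, on the single resulting coverage.
    plan.append("dissolve township borders")
    plan.append("change coordinates to decimal degrees")
    plan.append("rebuild topology")
    return plan


if __name__ == "__main__":
    for stage in plan_conversion(["township_a", "township_b"]):
        print(stage)
```

Because the plan is built from whatever list the patron supplies, the same routine serves a single township, four adjacent townships, or a whole county.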
The benefits of this decision were instantly obvious. Users need
flexibility; my criticisms of current web sites focused on the problem
that users had no option to customize the data they receive. What I
realized from looking at current Web access to geospatial data, and
through my own work on this project, was that in order to obtain a high
level of customization, one cannot offer pre-defined datasets. Instead,
the access system must serve as a front-end to a GIS that customizes data
on the fly.
Another benefit of MLDADP is that it solves the problem of inefficient
storage. The coverage for the entire county is a very large file (12Mb).
An expert GIS user would not have to oversee the conversion of data for
the entire state. Neither would large amounts of converted and
underutilized data be taking up server space. Instead, townships and
counties are processed as needed.
Writing the scripts.
ARC/INFO has its own programming language, Arc Macro Language (AML), which
allows users to write programs that automate routines. Using Fay's
initial script as a template, and working through the AML manual on my
own, I was able to construct a series of scripts that automated the
conversion process described in the Methodology section, above.
In the current incarnation of this project, the user has to interface with
ARC/INFO only once, to initiate the program. Once it is running, the user
manipulates the generated menu to customize his or her data needs. The
necessary townships are processed by ARC/INFO (according to the
methodology described above) and the converted coverage is stored in
networked space, accessible by the patron from a desktop GIS program on a
remote computer.
Initially, I saw two obstacles to constructing a working script. When I
first ran Fay's script to convert the townships of Washtenaw County, I had
to enter all the township file names into a text document. I told the
script to use this file so that it would know which townships to process.
Giving users control over which townships to process meant that the list
of file names would change for every user. The first obstacle I faced was
how user input could determine and change the list of needed townships and
how that file could be read from within the script.
My second concern was how to write a script for the MAPJOIN command. The
question here is how to read in a list of townships to be appended when
the system doesn't know how long the list is or if the coverages are even
adjacent. I later discovered non-adjacent townships would not upset the
MAPJOIN process. Apparently, non-neighboring townships can still be
combined into a single coverage.
The trial and error process is a necessary part of composing scripts, since
many AML commands are not extensively documented. Through this
experimentation process, I discovered that there was a single solution to
these two concerns. I created a dummy file, which opens with the script's
initiation. Selecting townships from the menu results in that particular
township's raw file name being written to the dummy file. Ending the
selection process closes the dummy file. The file is then opened, read to
retrieve township file names, and closed each time a sub-script runs. At
the end of the entire process, the dummy file is deleted so that a fresh
copy can be used the next time the script is run. (For a more detailed
explanation of how these routines operate, see the AML scripts written to perform these tasks.)
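The dummy-file technique can be illustrated with a short Python sketch. It is not a translation of the AML scripts: the function names, the file name selected_townships.txt, and the two simulated sub-scripts are all hypothetical. It shows why one file answers both concerns: the menu writes one raw file name per line, and each later step reads the file to completion, so no step needs to know in advance how many townships were chosen.

```python
import os
import tempfile


def start_selection(path):
    """Create the selection file and leave it open for writing."""
    return open(path, "w")


def select_township(handle, raw_file_name):
    """Record one menu choice as a line in the open selection file."""
    handle.write(raw_file_name + "\n")


def read_selection(path):
    """Open, read, and close the file; each sub-script calls this once."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def run_session(choices):
    """Simulate one session and return what two sub-scripts would read."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "selected_townships.txt")

    handle = start_selection(path)
    for name in choices:
        select_township(handle, name)
    handle.close()  # ending the selection closes the file

    for_conversion = read_selection(path)  # first sub-script
    for_join = read_selection(path)        # second sub-script

    os.remove(path)  # delete so the next run starts with a fresh file
    os.rmdir(directory)
    return for_conversion, for_join
```

Calling run_session with a list of raw file names returns the same list twice, once for each simulated sub-script, and leaves no file behind.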
Ideas for future development of MLDADP include:
Findings include:
Future studies for MLDADP include testing the efficacy of this model on a
larger scale. Its interface should be tested and expanded, within the
campus setting, and in the realm of the Net.
Ideas for future studies can be organized under three main topics:
The completion of MLDADP is well-timed, since all federal agencies are now
required to offer information over the Internet (Nebert 1). This project
will remind those agencies working with geospatial information that the
need for GIS-ready data is very real and should be encouraged. Because
network access to federal government data will eliminate human assistance,
these sites must offer clear and simple interfaces to their information.
The work I have done on MLDADP is not narrow in its scope. It has
provided a model technical solution for encouraging patron use of the
digital geospatial data at the Map Library. It has created a seed project
from which the University may initiate several new methods of providing
access to digital data. More importantly, however, MLDADP will serve to
remind other data providers to refocus on providing access -- to
keep their users' needs at the forefront and offer customization features
whenever possible. Their users will be happier, and their data will be
better utilized.
Finally, MLDADP has bridged the GIS technology gap between the GIS experts
or academicians and the community leaders or public citizens whose use of
GIS might lead them to better decision-making practices. By simplifying
access to GIS-ready data, this project gives desktop GIS users access to
the same data available to more technologically-privileged GIS users.
Putting GIS power in the hands of the public will offer many benefits: it
will encourage the market for GIS products and data, and it will give
community leaders better decision-making tools.
Project Background:
Review of the literature.
The following examples illustrate the seeming inability to combine a simple
interface and customized access to geospatial information.
The amount of geospatial information available through the Web is
daunting, and so are the technical solutions to making it available.
However, in the midst of putting up their information, it appears as if
some providers have forgotten that user interests should come first. As
GIS applications become more prevalent, data providers need a formula for
providing users with both a simple interface and a method of customizing
data retrieval.
The NSDI provides access to three types of geospatial information:
catalogs of metadata, maps, and digital spatial datasets. According to
Nebert, most users of these Internet resources will download map images
rather than digital datasets, "because [the users] lack the GIS/mapping
tools to render their own maps from raw data" (Nebert 3). Nebert's
assumption is unfounded, and detrimental to the development of methods for
accessing raw digital geospatial data on the Web. Within academia and the
private sector, there is a growing clientele for the raw data used in GIS
applications, and the current state of access does not meet this
clientele's needs.
Definition of the Problem:
There are many problems associated with finding and accessing digital
spatial data, some of which are listed below:
Although a lack of technological solutions may be the greatest inhibitor
of network access to geospatial data in its raw form, this problem will
continue to be placed on the back burner as long as people assume that the
need for GIS-ready data is small. As Nebert notes, conventional wisdom
suggests that geospatial data users do not have access to GIS or mapping
tools, and, therefore, need access to maps and other images they are not
able to create on their own. Until the deficiency of raw, GIS-ready data
is understood, the development of geospatial collections will be slow and
inconsistent.
Aim of Study: How MLDADP will help solve these problems
The purpose of this project was to develop a strategy to encourage the use
and transfer of spatial data. As a case study for increasing access to
geospatial data, I initially intended to provide Web access to MIRIS land use data for Washtenaw County which the
University Map Library has permission to disseminate. The Map Library
received this data in IGDS format. This format,
however, is not universally accepted, and rendered the data unusable for
mapping and analysis with the applications available at the University.
In order to be useful, the MIRIS data required complicated conversions to
change its format and projection*.
Project Benefits:
This project benefits the University community on several levels.
MLDADP benefits are not limited to the University community. Simplifying
access to GIS data puts this technology in the hands of the community
leaders and public citizens, where before it was restricted to the realm
of the researcher or GIS expert. Increasing and simplifying access to this
data will promote GIS use by allowing community leaders to put GIS
technology to work for the community at large.
Methodology:
Overview
Future Plans for MLDADP:
The creation of this automated system has made explicit some of the
questions and issues involved in providing access to digital spatial data.
Yet, its development is incomplete. Writing the scripts and creating the
menu interface was not a static intellectual process; I continue to
develop ideas for improving this project. Some of these ideas would
serve, initially, only the project itself, making MIRIS data more
accessible and offering the converted data in a greater variety of
finished products. Other future plans for the project contribute to the
collaborative efforts of the Federal Geographic Data Committee by
participating in the National Clearinghouse and helping formalize national
metadata standards.
Findings/Results:
Through the conversion and refining of the Washtenaw County MIRIS data, I
discovered the technical difficulties involved in the conversion process,
described above. As each obstacle was overcome, I was able to develop
routines (scripts) to more efficiently serve patrons' needs. This project
has been extremely useful because it serves current patron requests for
land use data, it provides a model for future development of digital data
at the Map Library, and it offers some intellectual criticism of the
current state of digital access.
Next Steps and Future Studies
Conclusions
This project is important to encouraging access to GIS data. My
investigation into how private sector and government sites are providing
network access to geospatial data has shown that there is much room for
improvement. Data providers cannot assume GIS applications are too
complex for the general Internet user, because doing so will only
discourage their use. Instead, GIS proponents must realize the importance
of making quality data accessible over the Net. In order to promote our own
work, we must encourage widespread use of the Internet as a means of
collecting and transferring geospatial data. Before the Internet can be
an efficient means of accessing geospatial data, technological
developments must occur on two fronts: making GIS applications easier to
use, and making the data necessary to these applications available.
Furthermore, these improvements must happen synchronously, since one
encourages the other.
Appendices:
Last Update 12/17/96
Document URL: http://www.iit.edu/~atkins/DLA/
Copyright 1996, Alison Atkins