ABSTRACT: The relationship between geography and cancer incidence and treatment is a critical area of health outcomes research. Geographical information systems (GIS) are software packages designed to store and analyze data related to geographic locations. Although more commonly associated with the social sciences and urban planning, the use of GIS software in medical research has been increasing. Moreover, since the 1999 establishment of the Geographical Informational Systems Special Interest Group (GISSIG) at the National Cancer Institute, oncology has been at the forefront of GIS-related health research. In this review, we discuss the potential applications and limitations of GIS software in oncology research. Our aims are to help clinicians and policy makers interpret studies generated using GIS, and to help clinical investigators implement GIS in future research.
The relationship between geography and cancer incidence and treatment is a critical area of health outcomes research, and geographical information systems (GIS) are increasingly used as tools for research in this area. GIS software programs can describe the geographic distribution of oncology care. GIS can evaluate the supply of treatment resources within a given area relative to cancer prevalence and, more importantly, monitor for geographic variations in cancer outcomes and highlight potential disparities in cancer care. For these reasons, GIS is becoming increasingly relevant in policy-oriented research focused on optimizing limited oncology resources within large underserved areas. This review describes the development and technical capabilities of GIS, potential applications of GIS in cancer research, and the limitations of such work.
Development and Capabilities of GIS Software
The first GIS was developed in the 1960s by Dr. Roger Tomlinson of the Canadian Department of Forestry and Rural Development for surveying and development in rural parts of Canada. The original program, known as the "Canada Geographic Information System" (CGIS), eventually grew to encompass datasets that spanned the entire country and became a useful tool in resource planning and management. By the 1970s, universities and government organizations around the world had developed alternative GIS programs, and GIS-based research emerged as an independent multidisciplinary field. In 1982, as personal computer use began to increase, the Environmental Systems Research Institute (ESRI) developed the first commercially available GIS package, known as ARC/INFO. The advent of commercially available GIS packages drastically increased the use of GIS worldwide. Users began to create open-use, publicly editable map data. The influx of map data into the public domain has accelerated in recent years with the advent of new GIS technology and has allowed GIS to permeate many research fields.
The functional capabilities of GIS software combine modern cartography with database management. GIS programs traditionally comprise at least three functional components. First, GIS software permits users to input data that corresponds to a geographic location. Second, GIS software enables users to create maps that visually display integrated georegistered data. Third, GIS has database capability that allows users to store and manipulate entered data and maps.
Although GIS is available commercially in a variety of software packages, our discussion of the technical aspects of GIS software will be limited to ESRI's ArcGIS. ArcGIS is the GIS software most widely used in health services research; it is used by more than 300,000 organizations worldwide, including most federal agencies, the health departments of all 50 U.S. states, and over 24,000 state and local governments. Data is stored in ArcGIS using shapefile packages. Shapefile packages are storage formats that house geographic location and associated attribute data. For example, a standard shapefile package could contain cancer incidence data organized by county within the United States.
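To make the pairing of geographic location and attribute data concrete, the following is a minimal conceptual sketch in Python. It does not read or write the actual shapefile format, and it does not use any ArcGIS interface; the class name CountyFeature, the helper bounding_box, the coordinates, and the attribute values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CountyFeature:
    """One hypothetical record: a geometry plus its attribute row."""
    # Polygon boundary as (longitude, latitude) pairs; the last point repeats the first.
    geometry: list
    # Attribute data attached to the geometry, keyed by column name.
    attributes: dict = field(default_factory=dict)


# Invented county: a simple rectangle with made-up attribute values.
county = CountyFeature(
    geometry=[(-100.0, 40.0), (-99.0, 40.0), (-99.0, 41.0),
              (-100.0, 41.0), (-100.0, 40.0)],
    attributes={"name": "Example County", "new_cases": 120, "population": 30_000},
)


def bounding_box(feature):
    """Return (min_lon, min_lat, max_lon, max_lat) for a feature's geometry."""
    lons = [lon for lon, _lat in feature.geometry]
    lats = [lat for _lon, lat in feature.geometry]
    return min(lons), min(lats), max(lons), max(lats)


box = bounding_box(county)
```

The point of the sketch is only that each geographic unit carries both a shape and a row of attributes, so that a query on either one (where a county lies, or how many cases it reported) can be answered from the same record.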
Regardless of GIS software type, the functional capabilities of GIS software can be used effectively in many areas of health services research. In addition to the ability to store and display regional data, several other functionalities of GIS software are worth noting. GIS software allows users to create their own maps and geographic units that can be tailored to describe healthcare patterns more accurately. Examples of this method in practice are the maps developed by the Dartmouth Institute for Health Policy and Clinical Practice describing Hospital Referral Regions and Hospital Service Areas. GIS software also allows users to aggregate data between different geographic units of analysis. For example, users can combine county-level cancer case counts and populations to estimate the cancer incidence of an entire state.
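The county-to-state aggregation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure of any particular GIS package: the function names and the county figures are hypothetical, and the rates are crude (not age-adjusted). The sketch sums cases and populations before dividing, because averaging county rates directly would give small counties the same weight as large ones.

```python
from collections import defaultdict

# Hypothetical county records: (state, county, new_cases, population).
# The figures are invented for illustration; they are not surveillance data.
counties = [
    ("State A", "County 1", 50, 10_000),
    ("State A", "County 2", 150, 40_000),
    ("State B", "County 3", 30, 15_000),
]


def rate_per_100k(cases, population):
    """Crude incidence rate per 100,000 population."""
    return cases * 100_000 / population


def aggregate_by_state(records):
    """Sum cases and population within each state, then compute the state rate."""
    totals = defaultdict(lambda: [0, 0])  # state -> [cases, population]
    for state, _county, cases, population in records:
        totals[state][0] += cases
        totals[state][1] += population
    return {state: rate_per_100k(c, p) for state, (c, p) in totals.items()}


state_rates = aggregate_by_state(counties)

# For comparison: the unweighted mean of the two county rates in State A.
state_a_rates = [rate_per_100k(c, p) for s, _n, c, p in counties if s == "State A"]
naive_mean_a = sum(state_a_rates) / len(state_a_rates)
```

With these invented figures, the aggregated rate for State A is 400 per 100,000, whereas the unweighted mean of its two county rates (500 and 375) is 437.5, which shows why the unit of aggregation matters.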
Finally, GIS software allows for the quantitative analysis of geographic patterns through "spatial analysis." Prior to the advent of GIS software, mapping in medical research was mainly a qualitative examination of data. Spatial analysis allows users to identify statistically significant geographic relationships. GIS can employ spatial autocorrelation to find statistically significant geographic clustering of a variable. For example, a user could employ spatial autocorrelation to test whether cancer incidence is significantly clustered among neighboring counties. Additionally, GIS can calculate geographically weighted regressions (GWR) to evaluate spatial heterogeneity in the relationships between independent and dependent variables. GIS can also employ spatial interpolation to estimate the value of a variable within a region from the values of that variable in surrounding regions.
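As one illustration of spatial autocorrelation, the sketch below computes Moran's I, a widely used global autocorrelation statistic, together with a simple permutation test. It is a minimal example under stated assumptions and does not reproduce the implementation of ArcGIS or any other package: the six "counties" are hypothetical and arranged in a single row, neighbors are defined by binary adjacency, and the incidence values are invented. Values of I near +1 indicate that similar values cluster together, and values near -1 indicate that neighboring values tend to differ.

```python
import random


def morans_i(values, weights):
    """Global Moran's I for values given an n x n spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    denom = sum(d * d for d in dev)
    return (n / s0) * cross / denom


def permutation_p_value(values, weights, n_perm=999, seed=0):
    """One-sided pseudo p-value: share of random relabelings with I >= observed."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)


def chain_weights(n):
    """Binary adjacency for n hypothetical counties laid out in a row."""
    w = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = 1
        w[i + 1][i] = 1
    return w


weights = chain_weights(6)
clustered = [10, 12, 11, 30, 32, 31]    # invented rates: low block, then high block
alternating = [10, 30, 10, 30, 10, 30]  # invented rates: neighbors always differ

i_clustered, p_clustered = permutation_p_value(clustered, weights)
i_alternating = morans_i(alternating, weights)
```

In this toy layout the clustered series gives a Moran's I of about 0.61 and the alternating series gives -1. With only six units the permutation test has little power, so in practice such tests are run on far more geographic units and with a weights matrix chosen to reflect the study question.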