ISSN 1055-1425
November 2008
This work was performed as part of the California PATH Program of the
University of California, in cooperation with the State of California Business,
Transportation, and Housing Agency, Department of Transportation, and the
United States Department of Transportation, Federal Highway Administration.
The contents of this report reflect the views of the authors who are responsible
for the facts and the accuracy of the data presented herein. The contents do not
necessarily reflect the official views or policies of the State of California. This
report does not constitute a standard, specification, or regulation.
Final Report for Task Order 5315
CALIFORNIA PATH PROGRAM
INSTITUTE OF TRANSPORTATION STUDIES
UNIVERSITY OF CALIFORNIA, BERKELEY
A Statewide Optimal Resource Allocation
Tool Using Geographic Information Systems,
Spatial Analysis, and Regression Methods
UCB-ITS-PRR-2008-27
California PATH Research Report
Konstadinos G. Goulias, Thomas F. Golob,
Seo Youn Yoon
CALIFORNIA PARTNERS FOR ADVANCED TRANSIT AND HIGHWAYS
A Statewide Optimal Resource Allocation Tool Using
Geographic Information Systems, Spatial Analysis,
and Regression Methods
FINAL REPORT
Konstadinos G. Goulias
Department of Geography & GeoTrans Laboratory
University of California Santa Barbara
Santa Barbara CA 93106
805-284-1597
Goulias@geog.ucsb.edu
Thomas F. Golob
Institute of Transportation Studies
University of California Irvine
Seo Youn Yoon
Department of Geography & GeoTrans Laboratory
University of California, Santa Barbara
Project: PATH Task Orders 5110 & 6110
A GIS-based Tool for Forecasting the Travel Demands of Demographic Groups within
California – An Optimal Resource Allocation Tool
October 2008
Santa Barbara, CA
FINAL REPORT – PATH TASK ORDERS 5110 & 6110 - October 2008
1
K. Goulias, T. Golob, and S. Y. Yoon
Table of Contents
Executive Summary
1. Introduction
2. Background
3. Optimality Assessment
4. Inequality Assessment
5. Microanalysis (Person Based) Analysis
6. Microanalysis Using Regression Models
6.1 Adults Who Do Not Drive
6.1.1 Census Tract Model
6.1.2 Comparison with Block Group Model
6.2 Transit Usage by Any Household Member
6.2.1 Census Tract Model
6.2.2 Comparison with Block Group Model
6.3 Transit Usage by an Adult Driver in the Household
6.3.1 Census Tract Model
6.3.2 Comparison with Block Group Model
6.4 Nonmotorized Travel by Any Household Member
6.4.1 Census Tract Model
6.4.2 Comparison with Block Group Model
6.5 Nonmotorized Travel - by an Adult Driver in the Household
6.5.1 Census Tract Model
6.5.2 Comparison with Block Group Model
6.6 High Occupancy Vehicle (HOV) Demand (Driving with Anyone as a Passenger)
6.6.1 Census Tract Model
6.6.2 Comparison with Block Group Model
6.7 Adult Driver as a Passenger in an HOV
6.7.1 Census Tract Model
6.7.2 Comparison with Block Group Model
6.8 Adult HOV Passenger Travel Time
6.8.1 Census Tract Model
6.8.2 Comparison with Block Group Model
6.9 Solo Driving Demand - Household Solo Driving
6.9.1 Census Tract Model
6.9.2 Comparison with Block Group Model
6.10 Adult Solo Driving Time
6.10.1 Census Tract Model
6.10.2 Comparison with Block Group Model
7. Models Combining Sociodemographics and Spatial Variables from Tracts and Block Groups
7.1 Nonmotorized Travel by any Household Member
7.2 Nonmotorized Travel by an Adult Driver in the Household
7.3 Adult HOV Passenger Travel Time
8. Summary and Conclusions
9. Next Steps
References
Executive Summary
The overall objective of this project is to develop an optimal resource allocation tool for the
entire state of California using Geographic Information Systems and widely available data
sources. As this tool evolves it will be used to make investment decisions in transportation
infrastructure while accounting for the spatial and social distribution of their impacts. Tools of this
type do not exist due to lack of suitable planning support tools, lack of efforts in assembling data
and information from a variety of sources, and lack of coordination in assembling the data.
Suitable planning support tools can be created with analytical experimentation to identify the
best methods, and the first steps are taken in this project. Assembly of widely available data is
also demonstrated in this project. Coordination of fragmented jurisdictions remains an elusive
task that is left outside the project. When this project began we confronted some of these issues
and embarked on a path of feasibility demonstration in the form of a pilot project that gave us
very encouraging results. Although the pilot aims mainly at demonstrating technical
feasibility, substantive conclusions and findings are also extracted from each analytical step.
In this project we have two parallel analytical tracks: a statewide macroanalysis
(called the zonal based approach herein) and an individual and household based microanalysis
(called the person based approach herein). In the statewide macroanalysis we study efficiency
and equity in resource allocation. Resources are understood here as infrastructure availability and
access to activity participation offered by the combined effect of transportation infrastructure and
land use, measured by indicators of accessibility. Stochastic frontiers are used to study efficiency
and a particular type of inequality measurement called the Theil fractal inequality index is used
to study equity in the macroanalysis. The outcomes of this analysis are maps identifying places in
California that enjoy higher levels of service when compared to the entire state and places which
succeeded in allocating resources in a relatively better way than others. In the individual
microanalysis we use the accessibility indicators from the macroanalysis and expand them by
defining a new set of indicators at a second level of spatial (dis)aggregation. Then we use them
as explanatory factors of travel behavior with focus on the use of different travel modes (e.g.,
driving alone, use of public transportation and so forth). As expected, infrastructure availability
and accessibility to activity opportunities have a significant and substantive effect on the use of
different modes. Many resource allocation decisions, then, will impact behavior, which in turn
influences the optimality and equity conditions. This implies that decisions about where and
when to allocate resources in public and private transportation need to account for changes in
behavior in a dynamic fashion, using scenarios of accessibility provision and assessing their
impact by studying activity and travel behavior changes.
There are four distinct work tasks that we describe in this report. First, we assembled
statewide spatial US census data at two nested levels of geographic subdivision, the tract
level and the block group level, and merged them with a highway network of the same vintage
(year 2000). Each subdivision is considered as a center and around each center we create travel
time and travel distance buffers. Within each buffer we compute the number of persons working
in each industry (retail, education, health, manufacturing, and all other activities) to represent the
spatial opportunities to participate in activities available to the residents of each virtual center.
We also count the number of facility kilometers to represent the supply of infrastructure.
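As a rough illustration of the buffer computation described above, the following Python sketch counts jobs in distance rings around one center. It is not the report's GIS procedure: it uses straight-line (haversine) distance rather than network distance or travel time, and the function names, field names, and sample zones are hypothetical. The ring limits follow the distance bands the report uses later (within 5 km, 5 to 10 km, and 10 to 50 km).

```python
import math

# Distance bands in km (lower bound inclusive, upper bound exclusive).
RINGS_KM = [(0.0, 5.0), (5.0, 10.0), (10.0, 50.0)]

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; a stand-in for network distance."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def ring_counts(origin, zones, field="retail_jobs", rings=RINGS_KM):
    """Sum `field` over zones whose centroid falls in each ring around `origin`."""
    totals = [0 for _ in rings]
    for z in zones:
        d = haversine_km(origin["lat"], origin["lon"], z["lat"], z["lon"])
        for i, (lo, hi) in enumerate(rings):
            if lo <= d < hi:
                totals[i] += z[field]
                break  # a zone belongs to at most one ring
    return totals

# Hypothetical centroids north of the origin at roughly 1, 7, 22, and 111 km.
SAMPLE_ORIGIN = {"lat": 34.0, "lon": -119.0}
SAMPLE_ZONES = [
    {"lat": 34.01, "lon": -119.0, "retail_jobs": 100},
    {"lat": 34.06, "lon": -119.0, "retail_jobs": 200},
    {"lat": 34.20, "lon": -119.0, "retail_jobs": 300},
    {"lat": 35.00, "lon": -119.0, "retail_jobs": 400},
]

print(ring_counts(SAMPLE_ORIGIN, SAMPLE_ZONES))
```

The same loop could accumulate facility kilometers per ring if each zone record carried a roadway length field; replacing the distance function with a shortest-path travel time would bring the sketch closer to the travel time buffers.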
Second, we use the data from the first task to study the ability of each area to provide
services to its residents and then we compare all these areas and rank them based on stochastic
frontiers, which is a regression method. We named this method the efficiency measurement
because it allows us to link infrastructure provision (as the input) to the accessibility offered (as the
output). Stochastic frontier analysis captures and depicts the complex set of relationships among
highways and accessibility, showing that providing more roadways is not always better for access
to opportunities. This happens because of competition for space and/or because the spatial
distribution of activity opportunities does not follow these roadways but obeys other spatial
distribution rules. The regression results also show that the role of roadways depends not only on the
measurement indicator used but also on the presence of other surrounding roadways. Overall,
however, the presence of primary roadways has a strong positive impact on access. For core
access the secondary roadways seem to have a much higher impact and merit attention for
investment. Efficiency in the transformation of roadways to access depends on the residents of
each tract and on the measurement of access (outer ring vs. middle ring).
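The idea of ranking areas against a frontier can be illustrated with a much simpler stand-in. The sketch below uses corrected ordinary least squares (COLS), a deterministic frontier, and not the stochastic frontier estimator used in this report: it fits a log-log regression of one output on one input, shifts the fitted line up to the best-performing unit, and scores every unit relative to that line. The function name, the single-input form, and the sample numbers are assumptions for illustration only.

```python
import math

def cols_efficiency(inputs, outputs):
    """Corrected OLS frontier: log(output) regressed on log(input).

    Returns (slope, frontier_intercept, efficiencies), where each
    efficiency lies in (0, 1] and the best unit scores exactly 1.
    """
    x = [math.log(v) for v in inputs]
    y = [math.log(v) for v in outputs]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    shift = max(resid)  # move the line up until it touches the best unit
    eff = [math.exp(r - shift) for r in resid]
    return slope, intercept + shift, eff

# Hypothetical tracts: roadway kilometers (input) and reachable jobs (output).
road_km = [1.0, 2.0, 4.0, 8.0]
jobs = [2.0, 2.8, 2.0, 5.6]
slope, frontier_intercept, eff = cols_efficiency(road_km, jobs)
print([round(e, 2) for e in eff])
```

A stochastic frontier differs by splitting the residual into symmetric noise and a one-sided inefficiency term estimated by maximum likelihood, so the scores above should be read only as an intuition for how units are ranked against a best-practice line.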
Third, we demonstrate a method that identifies specific locations in the entire state where
resource allocation has succeeded in maximizing benefits to the public. Using a derived factor of
accessibility for the population residing in each block group, we computed an index for the entire
state that measures the disparities in accessibility across block groups relative to
their population. This same index can thus constitute a first tool for policy makers who consider
equality as a criterion of allocation of infrastructure investment. Then we implement a fractal
(an index based on the nested spatial structure of counties, tracts, and block groups) inequality
index (called the Theil index) that gives us a better understanding of the spatial distribution of
inequality across different geographical scales. This index gives information about the
disparities in accessibility between counties as well as within the counties themselves. The Theil
index we implemented here is a tool that is easy to understand thanks to its intuitive
definition, easy to implement since it relies on data that are largely available, and able to give
instructive information about the structure of inequality in providing access to residents. It
shows which locations in California fail to be equitable and require their residents to travel
excessively to pursue the same amount of activities as residents of other locations where
travelling enables better time allocation.
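The between-county and within-county reading of the index can be sketched with the standard Theil T decomposition for grouped data. This section does not spell out the exact formula used in the report, so the form below is an assumption: each unit contributes its share of total accessibility times the log of that share divided by its population share. The function names and the sample data are hypothetical.

```python
import math
from collections import defaultdict

def theil(access, pop):
    """Theil T index of accessibility shares relative to population shares."""
    total_a, total_p = sum(access), sum(pop)
    return sum(
        (a / total_a) * math.log((a / total_a) / (p / total_p))
        for a, p in zip(access, pop)
        if a > 0  # a zero share contributes nothing in the limit
    )

def theil_decomposition(units):
    """Split Theil T into between-group and within-group parts.

    `units` is a list of (group, accessibility, population) tuples,
    e.g. block groups labeled by county.
    """
    groups = defaultdict(list)
    for g, a, p in units:
        groups[g].append((a, p))
    total_a = sum(a for _, a, _ in units)
    total_p = sum(p for _, _, p in units)
    between = 0.0
    within = 0.0
    for members in groups.values():
        a_g = sum(a for a, _ in members)
        p_g = sum(p for _, p in members)
        if a_g > 0:
            share = a_g / total_a
            between += share * math.log(share / (p_g / total_p))
            within += share * theil([a for a, _ in members], [p for _, p in members])
    return between, within

# Hypothetical block groups in two counties with equal populations.
units = [("A", 10, 100), ("A", 30, 100), ("B", 20, 100), ("B", 20, 100)]
between, within = theil_decomposition(units)
print(between, within)
```

In this sample both counties hold the same share of accessibility as of population, so all measured inequality sits inside county A; the two parts always add up to the index computed over all units at once.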
Fourth, the wealth of spatial indicators developed using information from census
tracts, census block groups, and the extensive roadway network in California was used as
explanatory variables in regression models of travel behavior. Each set of these accessibility-capturing
variables affects different travel behaviors in different ways. Household density, retail
employee density, and road infrastructure provided meaningful explanation of the variety in
travel behaviors we observe, capturing the impact of different dimensions of accessibility such as
characteristics of the residential area, availability of activity opportunities, and connectivity through
road infrastructure. From the model estimation experiments a variety of findings emerge. From
the comparisons between the census tract models and the block group models, we see that the
variables describing a behavioral aspect can show different levels and patterns of impact on
travel behaviors when they are measured using different areal unit sizes. To be more specific,
household density measured in census tracts explained the behavior analyzed here better than
household density measured using block groups. From the comparisons, we see that census
tracts, covering a larger area around a residence, capture the density impact in more informative
ways. However, this cannot be the golden rule for every travel behavior indicator. We need to
think about the implications that a specific areal unit has on each travel behavior indicator, test
its ability to explain behavior, and decide to use the one that is the most informative. In addition,
spatial variables involving shortest paths in computation showed a better ability to discern the
impacts of each spatial segment, and also clearer impact patterns for each variable set, when they
are computed using smaller unit areas than when they are computed using larger unit areas.
Smaller unit areas provide closer approximation of the variables and those variables seem to be
less susceptible to measurement error than variables computed using larger geographical units.
However, the trade-off between obtaining closely approximated explanatory variables and the
computational effort required for smaller areal units has to be considered when we decide which
areal unit we want to use. In fact, the improvement in the goodness of fit of some regression
models was marginal or even totally absent. Moreover, the two aggregation levels used here
have their own inherent advantages and disadvantages. Consequently, we also demonstrate
building models using spatial variables from both geographic levels with some clear benefits in
explanation and goodness of fit. Overall, however, land use density and supply of roadways are
strong and significant explanatory sets of variables and they provide a good candidate for linking
land use to travel behavior in policy impact assessments. In terms of efficiency and inequality,
the regression models show that even when investments are made to improve efficiency and/or
reduce inequality they will impact different behaviors in different ways and their overall impact may not
necessarily benefit individuals because different impacts in different facets of behavior may
counteract each other. The total effect on the overall daily travel patterns of individuals and
groups of individuals exceeds the scope of this project. The only tractable existing method to
track these impacts is microsimulation (computer-based synthetic generation of activity and
travel patterns of individuals), which is gaining popularity among practitioners.
We believe this project was an immense success as a feasibility pilot study. Existing data
sources can be “mined” to extract general useful indications about efficiency and inequality. The
same data sources can also be used to gain informative insights about travel behavior and to
begin unraveling the complex relationships between infrastructure investments and behavior.
Because of pragmatic considerations in the design of the tool presented here, many limitations do not
allow this tool to be used immediately as a planning support system for statewide policy and
decision making. Early during the project design phase we discovered there was no
comprehensive clearinghouse of statewide information about transportation projects that tracks
them from their inception to the final implementation and impact assessment. Assembly of data
from a variety of sources to build a database of all the transportation projects and their impacts
would have exceeded the scope and time budget of this project. For this reason we approximated
infrastructure supply using an inventory of highways in an existing network database. Similarly,
we did not account for public transportation facility and network supply. Moreover, we
use as highway speed the reported speed limit for each network link, which we know does not
represent prevailing speeds, which vary throughout the day, across days of the week, and with many other
seasonal rhythms. These considerations point to one of the next steps, which is to create a
project that, on the one hand, builds a data warehouse of public and private investments and
related projects and, on the other hand, develops a statewide multimodal network that is updated
yearly with additions and added documentation about the quality of the infrastructure
components represented by the network. Technology to accomplish both steps exists but
institutional support is not readily available at this time.
The entire analysis was done using data from the year 2000. The data are from products
such as the Census Transportation Planning Package and a roadway network of vintage 2000. The
household behavior data span a few months in 2000 and 2001. As a result all the analytical
findings are for that period and may not be extendable to other times. This analysis should be
expanded to include other years. Opportunities for new data are multiplying due to the American
Community Survey, which in 2010 will most likely release its 5-year estimates for areas with a
population of less than 20,000, including census tracts and block groups. This may provide an
unprecedented opportunity to study the evolution of accessibility in our state and identify the
places and their social and demographic groups that benefited the most by pinpointing
geographic areas that increased or decreased residents’ accessibility. Comparisons between the
years 2000 and 2010 will reveal changes over time and identify areas in California that benefited
the most and areas that benefited the least. If the project information warehousing activity
mentioned above is accomplished, we could also distinguish between successful and
unsuccessful projects using the tools and ideas in our project.
In the third major area of next steps we can expand the microanalysis to a more
comprehensive treatment of travel behavior. This includes activity participation and interactions
among household members, trip consolidation in the form of tours, and also the more traditional
analysis of trip making. In addition to offering a more detailed picture of the impact that
infrastructure and density of opportunities have on travel behavior, this next step also has the
potential to improve the statewide transportation model maintained by Caltrans. This last area of
analysis is also a fruitful research direction in developing a next generation of integrated land use
and transportation models. This is an active area of graduate student and faculty research
in the University of California Transportation Center (www.uctc.net).
The tasks in this report involved researchers from the University of California Santa
Barbara (UCSB) and University of California at Irvine (UCI). The overall project principal
investigator is Kostas Goulias at UCSB. At Irvine, Tom Golob with assistance from James
Marca extracted a travel behavior database from the California travel survey of 2000 and
estimated the first round of travel behavior equations utilizing US Census tract level accessibility
indicators. At UCSB Val Noronha and Bryan Krause converted network and US Census data
into usable variables at the tract level, computed a first set of accessibility indicators, and
developed maps in GIS. During the first part of the project and based on this work a variety of
issues were identified, and solutions were sketched by Kostas Goulias, presented in a series of
presentations, and finalized in the second part of the project. A second set of accessibility
indicators based on the US census block group data was then computed by Seo Youn Yoon and
Kostas Goulias at UCSB, who also estimated a new set of travel behavior models. They also
estimated the stochastic frontier models used in efficiency measurement. Emmanuel Kemmel,
Seo Youn Yoon, and Kostas Goulias also developed the Theil index computations.
The first two sections of this report provide a brief presentation of the study background
and design. The third section provides a summary of efficiency measurement and computations
using US census tract level data and a detailed road network as well as stochastic frontiers. In
the fourth section we show the inequality assessment using US census block group data and the
Theil computations. This is followed by the fifth section that shows the distribution of the past
allocation of road infrastructure across a variety of sociodemographic segments. The sixth
section is dedicated to a variety of model estimation experiments to show the impact of provision
of infrastructure and accessibility on travel behavior. This is followed by a seventh section that
demonstrates the use of spatial variables calculated at two different but nested geographic levels
and the benefit of using them jointly. In the last two sections we provide a brief summary and an
outline of three recommended next steps.
1. Introduction
Optimal allocation of resources for infrastructure facilities is a critical issue in planning for
development but it is also a critical consideration for the everyday life of travelers. In addition
to optimal allocation, equally important is the distribution of benefits in terms of
infrastructure facilities (stock) and related quality of service, understood here as the ability to reach
desired destinations within an acceptable amount of time (service). Different regions of
California have received over the years different levels of investment for private or public
transportation. The residents of each of these regions are also “investing” time to travel from one
location to another. These are inputs to a production system that has many outputs including
local gross product (e.g., regional gross product) and time allocated by the residents to activities
(e.g., time for paid work, time dedicated to leisure and so forth). Depending on local
circumstances each region is more or less efficient in maximizing the use of these stock and
service resources. Tools exist to judge how efficiently systems work but they focus on economic
efficiency and they do not incorporate a comprehensive measure of transportation stock and
service offered. Here, we emphasize social efficiency and bring measures of accessibility into the
arsenal of resource management and resource allocation to show the degree of efficiency
exhibited by different regions in enabling their residents to minimize personal costs and maximize
personal benefits. The research findings presented in this report come from a two-component
research program as mentioned in the preface above.
The state of California is divided into geographical areas and each is treated as a
production unit with its inputs represented by the different types of infrastructure (e.g., lane
miles of roadways classified in a finite number of types). The outputs are indicators of the
service offered to the unit’s residents in terms of the amount of activities the residents of each
geographical area can reach. Figure 1 provides a summary of the schema used in this project.
Figure 1: This project's schema
[Diagram with INPUT and OUTPUT columns. Elements shown: stock of facilities (highways by type); activity opportunities surrounding each zone, measured by persons in occupations in rings and considering distance and travel time; human capital (persons and households, household composition, car ownership); activity and travel behavior; residents.]
2. Background
Typical studies of transportation investment and economic development are discussed in
Berechman (1994), Buffington et al. (1992), Perera (1990), Seskin (1990), and Weisbrod and
Beckwith (1992). There are also regional studies addressing the impact of transportation
infrastructure on local regional economic development. Assessment of these investments is
based on the Gross Domestic (Regional) Product or private output as in Allen et al. (1988) and
Wilson et al. (1985), benefit-cost ratios and/or differences as in Buffington et al. (1992) and
Weisbrod and Beckwith (1992), property values as in Palmquist (1982), and new business
creation or location as in Hummon et al. (1986). Analytical methods in these studies include: a)
assessment of the effects of transportation infrastructure investments that compare and contrast
the effects of investments among different regions; and b) identification of the important factors
that influence and enable economic development. The study here belongs to the first group of
analytical methods. Identification of the impacts from transportation infrastructure investment is
particularly important when resources are scarce. From the perspective of decision makers, need
assessment and accurate measurement of this need allow effective budgeting and financing of
projects. They also allow for informed decisions while evaluating individual projects, balanced
distribution of resources, and increased efficiency. Considerable research exists in the analysis
of investment and optimal allocation of resources. Transportation improvements influence
economic development, productivity, and social welfare. “Pure” economic development impacts
are usually regional in nature and result from improved access to labor pools or to larger
markets. When considering the economic development of different regions of a country,
investment in transportation infrastructure as well as in the overall infrastructure system may
play a significant role in removing regional economic disparities. Within the same country and
under the same development policies, a significant role for transportation implies that regions with
better transportation infrastructure will have better access to the locations of materials and
markets, making them more productive, competitive, and hence more successful than regions with
inferior transportation accessibility. Better accessibility and mobility also play a significant role
in the human resource development of a region. For a review and an application using Data
Envelopment Analysis, see Alam et al. (2004); for an example of longitudinal analysis, see Alam et al.
(2005); for a Stochastic Data Envelopment Analysis, see Alam et al. (2008); and for project-by-project
economic assessment, see Gkritza et al. (2008).
One could make similar arguments when considering the time expenditures of individuals
and households to paid and unpaid work as well as free time with family and friends. However,
transportation investment from a “social efficiency” viewpoint is absent from transportation
practice. This is mainly due to the lack of tools capable of assessing the role of transportation
investment in the efficient allocation of time by the residents of each locality. The tool we aim
to develop with the analysis presented here identifies specific locations in the state where resource
allocation has succeeded in maximizing benefits to the public. In addition, we aim to develop
maps that show which locations in a state fail to be optimal and require their residents to travel
excessively to pursue the same amount of activities as other residents of different localities.
More specifically in this report, we answer four key questions:
Using largely available data, can we develop a small number of variables to describe access
to activity opportunities for California residents?
Are more roadways improving access to these activity opportunities?
Are these roles different for different types of highways and how?
Can we identify roadways that are prime candidates for investment?
In this analysis the state of California is divided into 7049 zones using the US Census 2000 tracts. The Census tract (the unit of analysis here) is selected as a first-order geographical subdivision to make the analysis tractable at the state level and to provide sufficient detail to be meaningful (we will repeat this analysis with a smaller geographic unit and revisit this aspect in the conclusions). We assess each tract in terms of its ability to produce benefits for its residents. Figure 2 provides a schematic representation of the study and Table 1 contains a selection of characteristics of the unit of analysis.
Figure 2: Computation Schema of the Study
Envisioning each tract as a production unit and developing a production function for each tract, we measure access to opportunities, treat these measures as outputs, and correlate them with the presence of roadways within and surrounding the tract. Access to opportunities for activity participation (e.g., leisure) and services (e.g., health) is the benefit (and output) from each tract that we will assess. Using Geographic Information Systems we compute for each tract the amount of activity opportunities reachable within 5 km, 5 to 10 km, and 10 to 50 km. We repeat the same for 20 minutes and 20 to 40 minutes of travel time, computed using information about speed limits on the roadway network at hand. These measures are computed by developing an origin-destination network whose origins and destinations are tract centers (population-weighted virtual centroids in each tract). Using the same origin-destination network we also count the number of highways within 5 km, 10 km, and 50 km network distance from each centroid.
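The buffer computation described above can be illustrated with a small sketch. It is not the GIS procedure used in the study: it assumes a toy network stored as an adjacency list with link lengths in km, hypothetical centroid names and job counts, and it counts the origin's own jobs in the core ring. The function names `shortest_distances` and `jobs_by_ring` are introduced here for illustration only.

```python
import heapq

def shortest_distances(graph, origin):
    """Dijkstra shortest-path distances (km) from origin to every reachable node."""
    dist = {origin: 0.0}
    heap = [(0.0, origin)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, length in graph.get(node, []):
            nd = d + length
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

def jobs_by_ring(graph, jobs, origin, breaks=(5.0, 10.0, 50.0)):
    """Sum jobs at centroids in each ring: [0, 5], (5, 10], (10, 50] km."""
    dist = shortest_distances(graph, origin)
    totals = [0] * len(breaks)
    for node, d in dist.items():
        lower = 0.0
        for k, upper in enumerate(breaks):
            # The origin itself (distance 0) is assigned to the first ring (assumption).
            if (lower < d <= upper) or (k == 0 and d == 0.0):
                totals[k] += jobs.get(node, 0)
                break
            lower = upper
    return totals

# Hypothetical toy network: node -> list of (neighbor, link length in km).
graph = {
    "A": [("B", 4.0), ("C", 8.0)],
    "B": [("A", 4.0), ("C", 3.0), ("D", 20.0)],
    "C": [("A", 8.0), ("B", 3.0)],
    "D": [("B", 20.0)],
}
jobs = {"A": 100, "B": 250, "C": 400, "D": 900}  # hypothetical job counts
ring_totals = jobs_by_ring(graph, jobs, "A")      # [core, middle ring, outer ring]
```

Replacing link lengths with travel times (length divided by speed limit) and the breaks with 20 and 40 minutes would give the travel-time version of the same sums.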
[Figure 2 flowchart steps: assemble data for the 7049 tracts of California from US Census 2000 and the network of roadways; compute buffers at 5, 10, and 50 km and at 20 and 40 minutes using shortest paths; sum the number of jobs and the number of lane km within each buffer; develop optimality functions and perform the assessment.]
Table 1 A selection of Census-tract characteristics

Variable                                                            Mean   Std. Dev.    Maximum*
Tract Square Km                                                     59.0       453.7     20486.8
Tract Population                                                  4805.2      2143.1     36146.0
Tract Households                                                  1631.8       763.0      8528.0
Within a 5 Km Buffer from Tract Centroid
  Workers in Retail (retail)                                      5031.1      6937.8     54745.0
  Workers in Health (health)                                      2644.0      3524.4     26478.0
  Workers in Services but not in Health or Retail (services)     28024.4     44497.0    373127.0
  Workers in Manufacturing (manufacturing)                        3391.0      5547.7     59059.0
  Workers in All Other Occupations (other)                        5753.4      6805.7     50287.0
  Primary limited access roadways (primary lim)                    284.1       448.6      3244.8
  Primary without limited access roadways (primary nolim)           77.9       140.6       958.5
  Secondary and connecting roadways (secondary)                   1867.8      2711.3     17711.4
  Rural, local and neighborhood roadways (local)                  8549.4     11256.1     71318.1
  Special roadways (special)                                       342.1       591.3      4612.7
  All Other types of roadways (other)                              778.6      1618.7     10511.1
* The minimum is zero for all variables and tracts
Enjoyment of access is also a function of the tract residents’ ability to take advantage of the opportunities offered to them. We attempt to capture this by including social and demographic characteristics of the resident population available in the Census tract databases. Transportation investment is often directed to facilities, and the vast majority of this investment is allocated to roadways. An indicator of transportation supply (the input in the context of production functions) is the amount of roadways (lane kilometres). Roadways, however, serve different purposes and offer different functions to their users depending on their type (e.g., limited access freeways/motorways, secondary roads connecting limited access roadways, local roads).
Using Geographic Information Systems, we can identify and count the number of kilometres of each roadway type in each tract. Roadways, however, form a complex network and the tracts are interconnected. For this reason, we perform a task similar to the one for activity opportunities and count the roadways by type in a series of concentric rings of 5 km, 5 to 10 km, and 10 to 50 km. We call these rings the buffers. We repeat the same operation for travel time using 20 minutes and 40 minutes of travel time. The types of roadways we count are: primary highways with limited access (primary lim herein), primary roadways without limited access (primary nolim herein), secondary and connecting roadways (secondary herein), local and rural roads (local herein), roads with special characteristics (special herein), and all other roadways (other herein). On the one hand, we have as input a detailed accounting of roadways representing all past investment in highways for each origin (tract centroid). On the other hand, we consider as output the number of workers a resident departing from a centroid can reach. The types of workers reachable within each of the buffers are classified into: retail, health, services, manufacturing, and all other. These counts are the indicators capturing access to opportunities to participate in activities and enjoy services.
3. Optimality Assessment
The literature on optimal assessment of decision making units is largely populated by Data Envelopment Analysis methods (a review on a related topic can be found in Alam et al. 2004, 2005, and 2008) and Stochastic Frontiers (Greene, 1980). Considering the possible measurement errors in the data used, the presence of outliers, and spatial correlation, we opt for stochastic frontiers, which can handle some of these potentially undermining issues. However, an additional step is required in our analysis before estimating stochastic frontier production functions. The output, the number of workers that a resident departing from a centroid can reach, is described by 25 indicators (number of workers in retail, health, services, manufacturing, and other within 5 km, within the ring of 5 to 10 km, within the ring of 10 to 50 km, within 20 minutes of travel time, and within the ring of 20 to 40 minutes of travel time). To reduce the data to a few variables we use factor analysis with the principal components method, extraction based on correlations, and varimax rotation. This yields three components explaining 93% of the variation in the output variables used here. Each component captures a different aspect of access to opportunities surrounding each centroid, and the three components are derived in such a way as to be uncorrelated. Table 2 provides a summary of the component scores (high scores indicate high correlation between the output variable and the component extracted). The first component represents access to opportunities in the outermost ring, between the radii of 10 km and 50 km, and also within the ring defined by 20 and 40 minutes of travel time; for this reason it is named outer ring access in this study. One variable, the number of workers in manufacturing within 20 minutes of travel time, is more correlated with the first component than with the second, reflecting the predominant location of manufacturing on the outskirts of cities and closer to high speed roadways. The second component represents access to opportunities in the second ring; it is most correlated with variables defined in the ring between a radius of 5 km and a radius of 10 km (named middle ring access herein) and with the variables defined within 20 minutes of travel time. Access to the opportunities closest to the centroid is represented by the third component (named core access herein), which is most correlated with the remaining variables. For each California tract we compute each of the three components (corresponding to three concentric regions around each centroid: core, middle ring, outer ring) using the scores of Table 2 and the value of each variable used to extract them. These three components replace the 25 variables and are used as the dependent variables in the stochastic frontier analysis.
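As a hedged illustration of the last step, combining the scores of Table 2 with the value of each variable, the sketch below assumes that a tract's component value is a weighted sum of standardized output variables. The two weights are the core access scores reported in Table 2 for retail and services within 5 km; the data values, the variable names, and the function names are hypothetical, and the statistical package used in the study may apply a different scoring formula.

```python
from statistics import mean, pstdev

def standardize(column):
    """Convert raw values to z-scores (zero mean, unit population std. dev.)."""
    m, s = mean(column), pstdev(column)
    return [(x - m) / s for x in column]

def component_scores(data, weights):
    """Weighted sum of standardized variables, one value per tract.

    data:    {variable name: [value for each tract]}
    weights: {variable name: score coefficient for one component}
    """
    z = {name: standardize(values) for name, values in data.items()}
    n_tracts = len(next(iter(data.values())))
    return [sum(weights[name] * z[name][i] for name in data) for i in range(n_tracts)]

# Hypothetical worker counts within 5 km for three tracts.
data = {
    "retail_0_5km": [1000.0, 3000.0, 5000.0],
    "services_0_5km": [2000.0, 2000.0, 8000.0],
}
# Core access scores taken from Table 2, used here as illustrative weights.
core_weights = {"retail_0_5km": 0.942, "services_0_5km": 0.955}

scores = component_scores(data, core_weights)
```

Because every variable is standardized first, the resulting component values sum to zero across tracts, and a tract with more reachable workers in both categories receives a higher core access value.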
Table 2 The three principal components extracted from 25 output variables and their scores

                                                        Outer Ring Middle Ring        Core
Variable                                                    Access      Access      Access
Number of Workers in Retail (20 to 40 min)                   0.945       0.276       0.139
Number of Workers in Services (20 to 40 min)                 0.941       0.250       0.128
Number of Workers in Other (20 to 40 min)                    0.941       0.275       0.150
Number of Workers in Manufacturing (20 to 40 min)            0.939       0.245       0.130
Number of Workers in Health (20 to 40 min)                   0.936       0.287       0.140
Number of Workers in Retail (10 to 50 km)                    0.927       0.330       0.159
Number of Workers in Manufacturing (10 to 50 km)             0.926       0.311       0.129
Number of Workers in Other (10 to 50 km)                     0.925       0.329       0.157
Number of Workers in Services (10 to 50 km)                  0.924       0.326       0.163
Number of Workers in Health (10 to 50 km)                    0.919       0.343       0.169
Number of Workers in Manufacturing (0 to 20 min)             0.665       0.625       0.265
Number of Workers in Services (5 to 10 km)                   0.234       0.878       0.296
Number of Workers in Retail (5 to 10 km)                     0.322       0.868       0.275
Number of Workers in Other (5 to 10 km)                      0.380       0.841       0.289
Number of Workers in Health (5 to 10 km)                     0.267       0.817       0.350
Number of Workers in Manufacturing (5 to 10 km)              0.438       0.766       0.220
Number of Workers in Services (0 to 20 min)                  0.504       0.703       0.430
Number of Workers in Health (0 to 20 min)                    0.532       0.688       0.421
Number of Workers in Retail (0 to 20 min)                    0.585       0.680       0.389
Number of Workers in Other (0 to 20 min)                     0.605       0.672       0.345
Number of Workers in Services (0 to 5 km)                    0.071       0.198       0.955
Number of Workers in Retail (0 to 5 km)                      0.139       0.226       0.942
Number of Workers in Other (0 to 5 km)                       0.190       0.325       0.871
Number of Workers in Health (0 to 5 km)                      0.075       0.308       0.839
Number of Workers in Manufacturing (0 to 5 km)               0.289       0.354       0.699
Stochastic frontiers were developed for models of production. A production function gives the ideal (maximum) amount a unit can produce for a given set of inputs. In empirical settings observed outputs differ from this ideal for two reasons: unknown random factors and measurement error (v) that are specific to each observed unit, and productive inefficiency (u) that also varies with each observed unit. To examine the relationship between output
variables (access to opportunities) and input variables (highways), we create a regression model whose dependent variable (y) is the indicator of output and whose independent variables (x) are the highway lane kilometers. The model we use here takes the following form:
$$y_i = \alpha + x_i'\beta + v_i - u_i$$
Index i represents each tract, i = 1, …, 7049.
We estimate three regression equations, one for each of the three components of Table 2 (core access, middle ring access, outer ring access). In each equation y is the logarithm of the component value for each tract. The xs are the lane kilometers of highways of each type in each geographic subdivision. The vector β contains the regression coefficients we seek. Variable v is the usual random error term capturing measurement error, and variable u is a positive valued offset between observed access and the ideal maximum possible given the combination of roadway inputs for each tract. The random error term v is assumed to be normally distributed with zero mean and constant variance across observations. The random positive valued term u is specified as a function of other explanatory variables. In the terminology of production functions the values u_i are the measures of inefficiency of each tract i in transforming lane kilometers of roadways into access to opportunities. Computing exp(-u_i) we obtain a measure of tract specific efficiency.
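To make the roles of v, u, and exp(-u) concrete, the following simulation sketch generates data from a frontier of the form y = α + xβ + v − u with a single input. It is an illustration only and makes several assumptions that are not part of the study: the coefficient values are invented, u is drawn from a half-normal distribution rather than specified as a function of explanatory variables, and no estimation is performed (the models in this report are estimated with LIMDEP).

```python
import math
import random

random.seed(42)

alpha, beta = 1.0, 0.5        # hypothetical frontier coefficients
sigma_v, sigma_u = 0.2, 0.6   # hypothetical scale of noise and of inefficiency

rows = []
for _ in range(1000):
    x = random.uniform(0.0, 5.0)           # toy input, e.g. log lane kilometers
    v = random.gauss(0.0, sigma_v)         # symmetric random error
    u = abs(random.gauss(0.0, sigma_u))    # one-sided inefficiency (half-normal assumption)
    y = alpha + beta * x + v - u           # observed log output
    rows.append((x, y, v, u))

# Tract specific efficiency: 1.0 means the unit is on its frontier.
efficiency = [math.exp(-u) for _, _, _, u in rows]

# The two variance summaries that frontier software commonly reports.
lam = sigma_u / sigma_v                       # ratio of inefficiency scale to noise scale
sigma = math.sqrt(sigma_u**2 + sigma_v**2)    # combined scale of the composed error
```

In this sketch every observed y lies at or below its own stochastic frontier α + xβ + v, and the efficiency values fall in the interval (0, 1].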
Estimation of the three models presented here is carried out using LIMDEP (Greene, 2002). Table 3 shows the regression coefficients associated with each input variable (number of lane kilometers of each roadway type in the core, the middle ring, and the outer ring). The correlation between the y variable and its predicted values using the estimated model coefficients is 0.895 for the outer ring, 0.731 for the middle ring, and 0.744 for the core, indicating a good fit between the data and the production functions derived here.
Table 3 Stochastic Frontier Regression Coefficients

                                                Outer Ring         Middle Ring                Core
Variable                                  Coeff.   t ratio    Coeff.   t ratio    Coeff.   t ratio
Constant                                  -0.413     -3.13     0.857     13.80     1.685     17.89
Log(primary lim in core)                  -0.094     -1.71     0.203     11.17     0.443     13.01
[Log(primary lim in core)]^2              -0.053     -2.29     0.070      8.48     0.135      8.95
Log(primary nolim in core)                 0.016      0.23    -0.181     -8.11     0.477     10.25
[Log(primary nolim in core)]^2             0.001      0.05    -0.039     -5.26     0.137      9.17
Log(secondary in core)                     0.035      0.94    -0.195    -13.96     0.748     25.71
[Log(secondary in core)]^2                -0.072     -5.71    -0.011     -2.07     0.172     19.97
Log(local in core)                        -0.101     -3.75     0.091      8.28    -0.160     -7.86
[Log(local in core)]^2                     0.021      2.89     0.020      6.55    -0.100    -20.05
Log(special in core)                       0.068      1.21    -0.190    -10.05    -0.145     -4.59
[Log(special in core)]^2                   0.045      2.02    -0.050     -5.91    -0.103     -6.92
Log(other in core)                        -0.004     -0.22    -0.010     -1.53    -0.058     -5.36
[Log(other in core)]^2                    -0.003     -0.47    -0.001     -0.59    -0.024     -6.66
Log(primary lim in middle ring)            0.098      2.33    -0.020     -1.07    -0.115     -3.70
[Log(primary lim in middle ring)]^2        0.077      5.83    -0.036     -6.60    -0.055     -6.56
Log(primary nolim in middle ring)          0.048      3.44     0.039      9.50    -0.082     -9.54
[Log(primary nolim in middle ring)]^2      0.028      5.13     0.003      1.69    -0.047    -13.76
Log(secondary in middle ring)             -0.155     -3.18     0.146      6.08    -0.249     -6.01
[Log(secondary in middle ring)]^2         -0.065     -6.13     0.044      9.09    -0.062     -7.06
Log(local in middle ring)                  0.025      0.63    -0.014     -0.69     0.059      1.69
[Log(local in middle ring)]^2              0.015      2.14    -0.020     -6.01     0.058     10.11
Log(special in middle ring)               -0.083     -1.78     0.085      4.37    -0.009     -0.25
[Log(special in middle ring)]^2           -0.071     -5.22     0.061     11.10     0.012      1.28
Log(other in middle ring)                  0.034      1.76    -0.005     -0.69     0.042      3.13
[Log(other in middle ring)]^2              0.021      4.80     0.006      3.93     0.023      8.69
Log(primary lim in outer ring)             0.077      1.47    -0.012     -0.36     0.002      0.03
[Log(primary lim in outer ring)]^2        -0.051     -2.56     0.025      2.71    -0.003     -0.20
Log(primary nolim in outer ring)          -0.071     -2.18     0.045      3.16     0.041      1.70
[Log(primary nolim in outer ring)]^2       0.007      0.75    -0.018     -3.85     0.000     -0.06
Log(secondary in outer ring)              -0.041     -0.66     0.006      0.17    -0.008     -0.13
[Log(secondary in outer ring)]^2           0.030      2.27    -0.019     -2.65     0.010      0.82
Log(local in outer ring)                   0.066      1.40    -0.062     -1.73     0.006      0.12
[Log(local in outer ring)]^2               0.007      0.90     0.003      0.62    -0.010     -1.33
Log(special in outer ring)                -0.090     -1.80     0.058      1.97     0.009      0.19
[Log(special in outer ring)]^2             0.093      6.18    -0.019     -2.72     0.006      0.51
Log(other in outer ring)                   0.012      0.47     0.005      0.42     0.018      0.82
[Log(other in outer ring)]^2              -0.025     -5.63     0.002      0.74    -0.008     -2.18
Constant for u                            -0.718     -8.06   -17.693    -14.36    -0.144     -3.18
Household density                         -0.578    -69.66     1.059     10.34
Tract perimeter (km)                                                              -1.375    -22.05
σ_u / σ_v                                  3.797     28.05    13.069     17.89     2.612     45.34
σ = sqrt(σ_u^2 + σ_v^2)                    0.680    150.31     1.359     17.65     0.468     77.43
The signs, sizes, and significance of the regression coefficients show how the presence and amount of different types of roadways affect the ability of each geographical tract to provide access to opportunities. A negative sign associated with roadways in the same region (core, middle ring, outer ring) as the dependent variable is more likely to indicate competition for space with businesses and establishments providing services. A positive coefficient is more likely to indicate a clustering of establishments around those roadway types.
Positive coefficients associated with variables in regions different from that of the dependent variable indicate a supportive relationship with access. For example, access to the outer ring may be achieved by driving over local roads in the core, secondary roads in the middle ring, and again local roads in the outer ring. Different establishments, however, may be reached by different combinations of roadways. As a result we obtain a variety of significance levels, signs, and sizes of coefficients that may not all correspond to intuition.
As expected, access to the outer ring is influenced by roadway quantity in the core, the middle ring, and the outer ring. However, lower speed facilities in the core (local and secondary roadways) seem to have a stronger influence than the higher speed ones (primary roadways). The middle ring primary roadways have a strong positive impact on access in the outer ring. These two indications reflect the routes leading to the outer ring, which has a high presence of opportunities. However, if there are many primary roadways in the outer ring they compete for space with the establishments where opportunities are located, and this is reflected in a few negative coefficients associated with roadways in the outer ring (primary nolim and secondary). Access to the middle ring is even more heavily influenced by the amount and type of roads in the core (positively by high speed roadways and negatively by lower speed roadways).
Core access is not influenced by roadways in the outer ring, i.e., a driver does not need to go into the outer ring to reach places within the 5 km radius around a tract centroid, and this is reflected in the lack of significance of most of the outer ring variables. In contrast, primary roadways in the middle ring seem to decrease access to the core significantly. This reflects the spatial organization of California’s roadway network and the spatial distribution of activity opportunities adjacent to the network's roadways. Unfortunately, all this is also masked by the use of summary indicators (i.e., the principal components) as dependent variables, each of which contains variables from all three regions (i.e., core, middle ring, and outer ring).
When the aim is to improve access to opportunities around the core, however, provision of primary and secondary roadways appears to be a worthwhile investment. When we examine the other two components, which are heavily influenced by variables that include travel time, the picture is not as clear and may point to the need to improve travel times on local and secondary roadways in regions that lead to the middle and outer rings.
The bottom portion of Table 3 contains the estimates for the variables influencing inefficiency. Exp(-u_i) is a measure of technical efficiency: it is the ratio of achieved access to the maximum possible access for the given inputs. The outer ring and middle ring efficiencies (and their opposites, the inefficiencies) differ significantly among tracts of different household densities (households per square kilometer). The core efficiency is a function of the perimeter of the tract, indicating a possible problem with the use of the tract as the unit of analysis. In a series of other specifications not shown here we also find that multi-car (> 4) households live in tracts with lower efficiency, presumably because they are able to combat lack of access with automobility. Other variables considered, such as the number of households by household size, did not exhibit a clear trend. The median efficiency indicators are fairly high at 84%, 92%, and 81% for the outer ring, middle ring, and core respectively. The tenth percentiles are 72%, 83%, and 62% for the outer ring, middle ring, and core respectively, indicating fairly good efficiency for a system that evolved without a major plan targeting high efficiency. However, considering the large size of many tracts, access to opportunities may be quite different among the residents within these tracts (see also the inequality section below).
The final examination we perform for these computed efficiencies is to map them for the entire state. Figure 3 shows the three efficiency indicators for Los Angeles, California, using the deciles (10% percentile increments) as cutoff points. The first quadrant shows the total lane kilometers of roadways in Los Angeles. Each efficiency estimate captures a different aspect of access to locations, and the maps show that providing more lane kilometers does not by itself make a geographical area more accessible under any of the three efficiency measures.
These same efficiency estimates were also computed for the entire state. Figure 4a shows the core efficiency map at 10% percentile increments. Figure 4b shows the middle ring efficiency and Figure 4c shows the outer ring efficiency.
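The classification of tracts into 10% percentile classes for mapping can be sketched as follows. This is a minimal illustration with invented efficiency values, not the mapping procedure of the GIS software used for the figures; the function name `decile_class` is hypothetical.

```python
import bisect
import statistics

# Hypothetical efficiency values exp(-u_i) for 100 tracts (toy data, not from the report).
efficiency = [i / 100 for i in range(1, 101)]

# Nine cut points that split the values into ten classes of roughly equal size.
cuts = statistics.quantiles(efficiency, n=10)

def decile_class(value, cut_points):
    """Return 0 for the lowest tenth of tracts up to 9 for the highest tenth."""
    return bisect.bisect_right(cut_points, value)

# One class label per tract, ready to be joined to tract polygons for mapping.
classes = [decile_class(e, cuts) for e in efficiency]
```

Each class then receives one color on the map, so that the lowest class highlights the least efficient tenth of tracts.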
Figure 3: Maps of lane kilometers and efficiency measures in Los Angeles, California
[Figure 3 panels: total lane kilometers within core; core efficiency; middle ring efficiency; outer ring efficiency.]
Figure 4a Core Efficiency Estimates
Figure 4b Middle Ring Efficiency Estimates
Figure 4c Outer Ring Efficiency Estimates
4. Inequality Assessment
This section presents a method to highlight the mismatch between the distribution of the population and the allocation of roads and activity access in California. The tool we aim to develop with the analysis presented here identifies specific locations in the state where resource allocation has succeeded in offering a uniform spatial spread of benefits to the public. In addition, we aim to develop maps that show which locations in a country (a state in our study) fail to be equitable, requiring their residents to travel excessively to pursue the same amount of activities as residents of other localities. In this section, we answer a few key questions:
Using largely available data, can we develop a small number of variables to describe access
to activity opportunities for California residents?
Is it possible to capture the structure of inequality in accessibility through a multi-scale
analysis?
Can we identify areas that are prime candidates for investment?
To answer these questions the state of California is divided into 22,133 zones using the US Census 2000 block groups. The Census block group (the unit of analysis here) is selected as a first-order geographical subdivision to make the analysis tractable at the state level and to provide sufficient detail to be meaningful. We assess each block group in terms of its ability to produce benefits for its residents and compare each block group with the other block groups within a census tract. We will repeat the same comparison using tracts within counties, and counties within the state. Figure 5 provides a schematic representation of the study.
Figure 5: Computation Schema of the Inequality Study
[Figure 5 flowchart steps: assemble data for the 22,133 block groups of California from US Census 2000 and a 2000 vintage network of roadways; compute buffers at 5, 10, and 50 km and at 20, 40, and 60 minutes using shortest paths; sum the number of jobs and the number of lane km within each buffer; develop summary indicators of accessibility; create Theil indicators for block groups, tracts, and counties.]
Table 4 contains a selection of characteristics of the unit of analysis. Access to opportunities for activity participation (e.g., leisure) and services (e.g., health) is the benefit (and output) from each block group that we will assess. As indicators of the opportunities available in a block group, we used numbers of workers classified according to the North American Industry Classification System (NAICS). The original NAICS classification of fourteen types of industries was aggregated into five types (retail, health, services, manufacturing, and all other), considering the types of activity related to these industries in which people can participate.
Using Geographic Information Systems (Network Analyst in ArcGIS 9.1), we identified the areas reachable within 20 minutes, 40 minutes, and 60 minutes of travel time using information about speed limits on the roadway network at hand. The network data we used for the analyses contain information about road type, segment length, speed limit, turning restrictions, and one-way streets, enabling a somewhat realistic modeling of the travel environment. The reachable areas are identified by developing two sets of shortest path networks for the origin-destination matrix of the block group centroids, using travel time and travel distance respectively as travel cost, and querying the block groups by these travel costs. Combining the reachable areas with the numbers of workers in each block group, accessibility to activity participation was calculated as an enumeration of the workers of each industry within each reachable area.
Table 4 A selection of block group characteristics

Variable                                                            Mean   Std. Dev.    Maximum*
Block Group Square Km                                              18.51      179.59    12219.12
Block group Population                                            1530.3     1008.48       36146
Block group Households
Within a 20 min travel time buffer from block group Centroid
  Workers in Retail (retail)                                    56324.49    48926.91      202513
  Workers in Health (health)                                    96664.34    89718.16      389816
  Workers in Services but not in Health or Retail (services)    23812.89    23757.93       87798
  Workers in Manufacturing (manufacturing)                      80640.04    88937.65      339848
  Workers in All Other Occupations (other)                      75843.44    68947.56      270979
  Primary limited access roadways (primary lim)                   266.53      206.05      885.86
  Primary without limited access roadways (primary nolim)           78.4       82.01      552.42
  Secondary and connecting roadways (secondary)                   650.52      425.51     2333.31
  Rural, local and neighborhood roadways (local)                 2561.13     1782.39    12545.59
  Special roadways (special)                                        23.2       39.44       483.4
  All Other types of roadways (other)                             223.78      275.34     1984.31
* The minimum is zero for all variables and tracts
As was done at the tract level, transportation supply is represented by the amount of roadways (lane kilometers) by type (e.g., limited access freeways/motorways, secondary roads connecting limited access roadways, local roads), but this time measured at the level of a US census block group. Using Geographic Information Systems, we can identify and count the number of kilometers of each roadway type in each block group. Roadways, however, form a complex network interconnecting the block groups, and through the roadway network the block groups provide activity opportunities to others and also receive benefits from others. For this reason, we perform a task similar to the one for activity opportunities and sum the length of roadway segments by type in a series of concentric areas that are accessible in 20 minutes, 40 minutes, and 60 minutes of travel time, to quantify the roadways that are available from an origin, considered here as a virtual center of the block group (named centroid). We call these areas the buffers (as in the previous section). The types of roadways we count are: primary highways with limited access (primary lim herein), primary roadways without limited access (primary nolim herein), secondary and connecting roadways (secondary herein),
local and rural roads (local herein), roads with special characteristics (special herein), and all other roadways (other herein).
On the one hand, we have as input a detailed accounting of roadways representing all past investment in highways for each origin and the number of workers a resident departing from a centroid can reach. These counts are the indicators capturing access to opportunities to participate in activities and enjoy services. On the other hand, the main beneficiaries of transportation policies are the persons residing in an origin block group. One objective in transportation is to maximize accessibility for most persons. However, some segments of the population receive lower benefits than others. Inequality assessments are then needed to make comparisons.
The assessment of inequality is very often limited to a few disadvantaged population segments (Blumenberg, 2008 - http://www.opportunitycars.com/articles/documents/20051205_Blumenberg.pdf) and does not encompass an entire state or country. In contrast, inequality is a very popular subject in other fields (Krugman and Venables, 1995, Schneider et al., 2002, Ghose, 2004). Considering the strong spatial correlation among accessibility indicators (due to the connectivity of the highway network and the agglomeration of businesses), we opt for an index of inequality that has a "fractal" nature (i.e., is decomposable geographically) and that can handle multiple output variables.
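The computation schema of this section refers to Theil indicators for block groups, tracts, and counties. As a hedged illustration of what geographic decomposability means, the sketch below uses the standard unweighted Theil T index and its exact split into a within-group and a between-group part. The variant and the weighting actually used in this study are not specified in this section, so the formula choice, the function names, and the toy values are assumptions.

```python
import math

def theil(values):
    """Theil T index of a list of strictly positive values (0 means perfect equality)."""
    mu = sum(values) / len(values)
    return sum((x / mu) * math.log(x / mu) for x in values) / len(values)

def theil_decomposition(groups):
    """Split the overall Theil T index into within-group and between-group parts.

    groups: {group name: [positive value for each member]}, e.g. the accessibility
    per capita of the block groups (members) inside each tract (group).
    """
    all_values = [x for members in groups.values() for x in members]
    grand_total = sum(all_values)
    mu = grand_total / len(all_values)
    within = between = 0.0
    for members in groups.values():
        share = sum(members) / grand_total    # group's share of the total
        mu_g = sum(members) / len(members)    # group mean
        within += share * theil(members)
        between += share * math.log(mu_g / mu)
    return theil(all_values), within, between

# Hypothetical accessibility values for block groups in two tracts.
groups = {"tract_A": [1.0, 2.0, 3.0], "tract_B": [4.0, 8.0]}
total, within, between = theil_decomposition(groups)
```

The total equals the within part plus the between part, and the same split can be applied again one level up (tracts within counties, counties within the state), which is the sense in which such an index is decomposable across geographic scales.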
The output, the number of workers that a resident departing from a centroid can reach, is described by 25 indicators: number of workers in retail, health, services, manufacturing, and other employment within 5 km, within 10 km, within 50 km, within 20 minutes of travel time, within 40 minutes of travel time, and within 60 minutes of travel time. To reduce the data to a few variables we use factor analysis with the principal components method and extraction based on correlations, in the same manner as for the tracts in the efficiency analysis. During a first stage using all the variables, this method produced a few variables that were only marginally informative (as expected, given the strong relationship among the 25 variables considered here), and they were eliminated from further analysis. The reduced set of variables considered in this analysis produced only one factor, which captures 90.03% of the variation in the variables used here. Table 5 provides a summary of the component scores (high scores indicate high correlation between the output variable and the component extracted). For each California block group we compute this “accessibility” factor (a_i, i = 1, …, 22,133). Figure 6 shows the ratio
a_i/n_i, where n_i is the number of resident persons in each block group. These figures show the disparities that exist in the accessibility provided at each block group. The figures, however, do not reflect the relationship of accessibilities between block groups and do not provide an indicator that compares them directly with the overall accessibility of the state and its spatial structure.
Table 5 The factor created using a reduced set of the 25 output variables and their scores
Variable                                                                          Loading for accessibility factor
NUMBER OF WORKERS IN MANUFACTURING INDUSTRY (WITHIN 20 MINUTE BUFFER)             0.8669
NUMBER OF WORKERS IN RETAIL INDUSTRY (WITHIN 20 MINUTE BUFFER)                    0.9233
NUMBER OF WORKERS IN EDUCATION/HEALTH SERVICE INDUSTRY (WITHIN 20 MINUTE BUFFER)  0.8957
NUMBER OF WORKERS IN OTHER INDUSTRY (WITHIN 20 MINUTE BUFFER)                     0.8595
NUMBER OF WORKERS IN MANUFACTURING INDUSTRY (WITHIN 40 MINUTE BUFFER)             0.9675
NUMBER OF WORKERS IN RETAIL INDUSTRY (WITHIN 40 MINUTE BUFFER)                    0.9881
NUMBER OF WORKERS IN EDUCATION/HEALTH SERVICE INDUSTRY (WITHIN 40 MINUTE BUFFER)  0.9828
NUMBER OF WORKERS IN PUBLIC ADMINISTRATION INDUSTRY (WITHIN 40 MINUTE BUFFER)     0.9538
NUMBER OF WORKERS IN OTHER INDUSTRY (WITHIN 40 MINUTE BUFFER)                     0.9757
NUMBER OF WORKERS IN MANUFACTURING INDUSTRY (WITHIN 60 MINUTE BUFFER)             0.9640
NUMBER OF WORKERS IN RETAIL INDUSTRY (WITHIN 60 MINUTE BUFFER)                    0.9719
NUMBER OF WORKERS IN EDUCATION/HEALTH SERVICE INDUSTRY (WITHIN 60 MINUTE BUFFER)  0.9700
NUMBER OF WORKERS IN PUBLIC ADMINISTRATION INDUSTRY (WITHIN 60 MINUTE BUFFER)     0.9490
NUMBER OF WORKERS IN OTHER INDUSTRY (WITHIN 60 MINUTE BUFFER)                     0.9704
PRIMARY HIGHWAY WITH LIMITED ACCESS (WITHIN 20 MINUTE BUFFER)                     0.9313
LOCAL, NEIGHBORHOOD, and RURAL ROAD (WITHIN 20 MINUTE BUFFER)                     0.9000
PRIMARY HIGHWAY WITH LIMITED ACCESS (WITHIN 20 MINUTE BUFFER)                     0.9852
LOCAL, NEIGHBORHOOD, and RURAL ROAD (WITHIN 20 MINUTE BUFFER)                     0.9798
PRIMARY HIGHWAY WITH LIMITED ACCESS (WITHIN 40 MINUTE BUFFER)                     0.9570
SECONDARY and CONNECTING ROAD (WITHIN 40 MINUTE BUFFER)                           0.9478
LOCAL, NEIGHBORHOOD, and RURAL ROAD (WITHIN 40 MINUTE BUFFER)                     0.9439
SECONDARY and CONNECTING ROAD (WITHIN 40 MINUTE BUFFER)                           0.9760
Figure 6 Accessibility and Theil Maps
[Two maps. Left: accessibility per capita in each block group. Right: Theil index contribution by
each block group.]
Under ideal data availability we would like to identify every resident of California,
compute an accessibility index associated with each resident, and then perform a comparative
analysis to assess who enjoys higher accessibility and who does not. Although this is not an
impossible task with today's modeling and simulation capabilities, it violates one of the initial
requirements of this study, which is to use widely available data to explore new techniques. In
addition, the accessibility of one location is related to the accessibilities of its neighbors. We
start with block group subdivisions and compute an indicator that accounts for the distribution of
accessibility. We then consider increasingly larger geographical areas to illustrate the use of
the Theil index.
The following equation shows the Theil index computed using the block group data in
California.

$$T = \sum_{i=1}^{22{,}133} \frac{a_i}{A} \log\left( \frac{a_i / A}{n_i / N} \right)$$

where $A$ is the sum of factor values for the entire state of California ($A = \sum_k a_k$) and $N$
is the population of the entire state of California ($N = \sum_k n_k$). For each block group $i$,
we call the ratios $a_i / A$ and $n_i / N$ the accessibility share and the population share,
respectively.
An important advantage of this index over other measures of inequality is its
composition. Each component of the sum in the equation above is a weighted log ratio of the
accessibility over the resident population in the block group. Each component in the Theil index
is therefore a weighted measure of the mismatch between the block group's accessibility share and
its population share. Our interest thus focuses on each term of the sum, which we name the
contribution of the block group to the Theil index, or Theil contribution.
The right hand side of Figure 6 displays these Theil contributions for each block group.
This map is more instructive than the one on the left hand side because the block groups are
compared to each other, which makes it possible to identify the status of each area relative to
the entire state and to detect possible mismatches. The block groups colored in yellow are those
that make little or no contribution to the Theil index. This means that they enjoy accessibility
to roads and activity opportunities in the right proportion with respect to their population. The
green colored areas, on the other hand, have an accessibility share higher than their population
share, which gives them an excess advantage. This "over accessibility" comes at the expense of the
red colored areas, for which the accessibility share is smaller than the population share.
Consequently, the inhabitants of those block groups may have to spend more travel time to
accomplish the same amount of everyday activities than residents of advantaged areas. As far as
infrastructure investment is concerned, a public policy aiming at a homogeneous development of
the state of California should consider the red colored areas as prime candidates for roadway
connectivity funding (of course, other factors are usually taken into account in allocating
resources). Figure 6 shows that major metropolitan areas such as Los Angeles are particularly
advantaged in accessibility, but it seems that this over-accessibility was built at the expense of
the block groups that compose their outskirts and of those situated in the central part of the
State. It should be noted, however, that travel time here is computed based on the speed limit of
roadways and therefore does not account for congestion. As a result, this "advantage" of the urban
core is somewhat exaggerated in this analysis.
A fractal version of Theil's index enables assessment of inequality across larger regions
as well as within larger regions, to account for highway and land use connectivity. This is indeed
the main characteristic that made us prefer the Theil index to the other indexes developed in
the economics literature. It is decomposable through different levels (e.g., geographical scales)
and considers, for each scale unit, a between-unit component and an intra-unit component. In
this way we can also account for heterogeneity within a larger area. As already mentioned
above, a better way to measure inequality would be to consider each resident, but since, like most
analysts, we are dealing with groups, we have to study how inequality emerges between and
inside these groups. Moreover, this fractal approach gives us a deeper understanding of the
spatial structure of inequality across the different levels we study.
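Before moving to the multi-level case, the single-level index and its per-unit Theil contributions can be sketched as follows. This is an illustrative sketch only: the function name and the three block groups are hypothetical, and it assumes strictly positive accessibility and population values so that the logarithm is defined.

```python
import math


def theil_contributions(accessibility, population):
    """Return each unit's term (a_i/A) * log((a_i/A) / (n_i/N))."""
    total_a = sum(accessibility)
    total_n = sum(population)
    contributions = []
    for a, n in zip(accessibility, population):
        accessibility_share = a / total_a
        population_share = n / total_n
        contributions.append(
            accessibility_share * math.log(accessibility_share / population_share))
    return contributions


# Three hypothetical block groups with equal population and unequal accessibility.
a = [60.0, 30.0, 10.0]
n = [1000, 1000, 1000]
parts = theil_contributions(a, n)
theil = sum(parts)   # the index is the sum of the contributions

# A unit whose accessibility share exceeds its population share contributes
# positively (the "green" case); the opposite case contributes negatively ("red").
```

When every unit's accessibility share equals its population share, every contribution is zero and so is the index.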
In our case study, the different geographical units we consider are the following: the
County, the Tract and the Block group. Figure 7 displays the tree structure of the recursive
calculation here. The state is composed of 58 counties. Each county contains tracts and each
tract contains block groups. The general definition of the fractal Theil index is the following
(Conceicao and Ferreira, 2000):

$$T_r = \sum_{i=1}^{\text{Branches}} \frac{a_i}{A} \log\left( \frac{a_i / A}{n_i / N} \right) + \sum_{i=1}^{\text{Branches}} \frac{a_i}{A} \, T_{(r,i)}$$

where $a_i$ is the accessibility of branch $i$ of the root $r$, $A$ the total accessibility of the
root $r$, $n_i$ the population of branch $i$, $N$ the total population of the root, and
$T_{(r,i)}$ the Theil index of branch $i$.
Applied to our case study, the formula becomes (with $NC$ the number of counties):

$$T_{CA} = \sum_{i=1}^{NC} \frac{a_i}{A_{CA}} \log\left( \frac{a_i / A_{CA}}{n_i / N_{CA}} \right) + \sum_{i=1}^{NC} \frac{a_i}{A_{CA}} \, T_{\text{County } i}$$

$$T_{\text{County } i} = \sum_{\text{Tract } j \text{ of County } i} \frac{a_j}{A_{\text{County } i}} \log\left( \frac{a_j / A_{\text{County } i}}{n_j / N_{\text{County } i}} \right) + \sum_{\text{Tract } j \text{ of County } i} \frac{a_j}{A_{\text{County } i}} \, T_{\text{Tract } j}$$

$$T_{\text{Tract } j} = \sum_{\text{Blockgroup } k \text{ of Tract } j} \frac{a_k}{A_{\text{Tract } j}} \log\left( \frac{a_k / A_{\text{Tract } j}}{n_k / N_{\text{Tract } j}} \right)$$

Figure 7: Tree structure used for the computation of the fractal Theil index.
[Tree diagram. Root: California (decomposition of the state by each county). Second level:
County 1, County 2, ..., County N (decomposition of each county by each tract within the county).
Third level: Tract 1, Tract 2, ..., Tract M (decomposition of each tract by block group within the
tract). Leaves: Bg 1, Bg 2, Bg 3.]
Bringing all these components into one equation leads to the following.

$$T_{CA} = \underbrace{\sum_{i=1}^{NC} \frac{a_i}{A_{CA}} \log\left( \frac{a_i / A_{CA}}{n_i / N_{CA}} \right)}_{\text{between counties contribution}} + \underbrace{\sum_{i=1}^{NC} \frac{a_i}{A_{CA}} \left[ \underbrace{\sum_{\text{Tract } j \text{ of County } i} \frac{a_j}{A_{\text{County } i}} \log\left( \frac{a_j / A_{\text{County } i}}{n_j / N_{\text{County } i}} \right)}_{\text{between tracts contribution}} + \underbrace{\sum_{\text{Tract } j \text{ of County } i} \frac{a_j}{A_{\text{County } i}} \sum_{\text{Blockgroup } k \text{ of Tract } j} \frac{a_k}{A_{\text{Tract } j}} \log\left( \frac{a_k / A_{\text{Tract } j}}{n_k / N_{\text{Tract } j}} \right)}_{\text{intra-tract contribution}} \right]}_{\text{intra county contribution}}$$

Figures 8, 9 and 10 display the results from this equation. Figure 10 is a statewide
summary that displays two kinds of information. The first is the contribution of the County to the
Theil index, i.e., the measure of the mismatch between its accessibility share and its
population share relative to the other Counties. The second is an "intra County" contribution,
which is the County's own Theil index and measures the inequality that exists between
and inside its own tracts. Consequently, this map allows us to see not only how advantaged or
disadvantaged a County is relative to the others but also whether its resources have been
allocated equally or unequally. This is the main advantage of the Theil index: it reveals
the structure of inequality and its distribution across geographic levels, and can thus
serve as a decision making tool for public policy. Indeed, this map enables a policy maker to
identify both the areas that most need transport infrastructure for an egalitarian
development of the State and the regions that have allocated their investments to projects that
support a homogeneous development of their own territory. The map helps decide whether
statewide equality should be emphasized, with investments made accordingly, or whether
combating local inequality is more important, with investments made in a more local
and focused way.
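The recursive structure of this decomposition can be sketched in a few lines. The sketch below is illustrative only: the nested dictionary, the county and tract names, and the accessibility and population figures are hypothetical. It checks that the between-unit and intra-unit terms, accumulated level by level, add up to the Theil index computed directly on the block groups.

```python
import math


def theil(units):
    """Theil index for a list of (accessibility, population) pairs."""
    total_a = sum(a for a, _ in units)
    total_n = sum(n for _, n in units)
    return sum((a / total_a) * math.log((a / total_a) / (n / total_n))
               for a, n in units)


def totals(units):
    """Total accessibility and population of a list of (a, n) pairs."""
    return sum(a for a, _ in units), sum(n for _, n in units)


def fractal_theil(node):
    """Return (Theil index, (A, N)) of a node of the tree.

    A node is either a list of (a, n) block groups or a dict of child nodes.
    """
    if isinstance(node, list):
        return theil(node), totals(node)
    results = [fractal_theil(child) for child in node.values()]
    branch_totals = [branch for _, branch in results]
    total_a = sum(a for a, _ in branch_totals)
    between = theil(branch_totals)                       # between-branch term
    within = sum((a / total_a) * t for t, (a, _) in results)  # weighted branch indexes
    return between + within, totals(branch_totals)


def leaves(node):
    """Flatten the tree into the list of its block groups."""
    if isinstance(node, list):
        return list(node)
    flat = []
    for child in node.values():
        flat.extend(leaves(child))
    return flat


# Hypothetical state: counties -> tracts -> block groups as (accessibility, population).
state = {
    "County 1": {"Tract 1": [(40.0, 900), (25.0, 1100)], "Tract 2": [(10.0, 1000)]},
    "County 2": {"Tract 1": [(5.0, 800), (8.0, 1200)],
                 "Tract 2": [(12.0, 1500), (3.0, 700)]},
}

t_recursive, (state_a, state_n) = fractal_theil(state)
t_direct = theil(leaves(state))
```

The two values agree up to floating point error, which is the decomposability property that motivates the choice of the Theil index.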
Figure 8 Decomposition of the Theil contribution of each California County
[Chart "Decomposition of the Theil contributions for each County". Horizontal axis: County.
Vertical axis: from -0.05 to 0.45. Series: between county contribution, intra county contribution.]
Figure 9 Decomposition of the Theil contribution of each California County
without Los Angeles (37) and Orange (59)
[Chart "Decomposition of the Theil contribution for each County". Horizontal axis: County.
Vertical axis: from -0.03 to 0.015. Series: between, intra.]
Figure 10 Map of the decomposition of the Theil contribution of each County
In California the most evident phenomenon is the dominance of Los Angeles County and
Orange County in terms of both "over accessibility" (with, for Los Angeles, a contribution almost
45 times larger than that of the third most advantaged County) and intra-county inequality (an
intra inequality index about ten times larger than that of the third most inhomogeneous County).
This illustrates a property of the Theil index: its sensitivity to distributional impacts and
disparities among the groups considered, and in particular to "wealth" transfers from the
disadvantaged to the advantaged. The measure of the mismatch is indeed amplified by the
accessibility share weight (Conceicao and Ferreira, 2000, pages 12 and 13). There is also a scale
effect: the larger a County is, the more likely it is to have internal heterogeneity.
Among the other Counties, another trend is worth noting. Counties that show the greatest
lack of accessibility are also those with the highest intra-county inequality. This points
to the need for a more detailed study to identify those disadvantaged counties that did not
benefit from large scale infrastructure investment of the kind that would have allowed them to
develop a coherent policy for a homogeneous development of their territory. The findings here
suggest some sort of negative feedback: the less investment a County receives, the more likely it
is to suffer from territorial disparities.
5. Microanalysis (Person Based) Analysis
In the development of the microanalysis in this project, we have identified relationships between
travel, household sociodemographic characteristics, spatial accessibility, and road infrastructure.
When considered separately, sociodemographic characteristics, spatial accessibility, and road
infrastructure all influence travel behavior. Dense urban areas make walking trips more feasible;
extensive networks of freeways and arterials encourage vehicular trips; large households make
more trips per day than small households, and so on. However, in the real world, all of these
variables interact simultaneously. Households consider the costs and benefits of different
locations and feasible travel modes in light of their circumstances, and choose residential
locations accordingly. Indeed, one could argue that households are not merely reacting to their
circumstances, but rather are actively trying to improve their lot in any way they can.
Adjustment strategies include moving residence, changing jobs, choosing different travel
destinations, bundling individual single-occupancy vehicle (SOV) trips into high-occupancy
vehicle (HOV) trips, and so on. One cannot merely consider the influence of spatial
infrastructure characteristics in isolation, and this motivates the development of regression
analyses that attempt to take multiple factors into account.
One source of information about individuals and their households is the California
Statewide Travel Survey, conducted over several months in the years 2000 and 2001. It provides
an excellent starting point for disentangling the relationships between space, infrastructure, and
sociodemographics. The survey sample, consisting of more than 17,000 households, is a quota
sample by county and planning region, rather than a representative sample of California
proportional to the population of each county. Each trip destination has been geocoded, usually
to the nearest intersection, but sometimes to the approximate census tract centroid. The location
(geocodes and census tract) of almost every household can also be determined from the survey
data. To these data we added the spatial accessibility variables and roadway infrastructure
variables by census tract and block group that were computed in the efficiency and inequality
analysis discussed in previous sections of this report. The relatively even distribution of the
sample across all California counties ensures that the data represent a wide variety of spatial
environments.
The need to account (control) for sociodemographics when assessing relationships
between travel behavior and spatial factors is revealed when we investigate the residential
location patterns of sociodemographic groups. Through a series of statistical tests, we
determined that eight categorical sociodemographic variables were paramount in explaining
travel behavior. These variables, with their categories and distributions by percent of the
sample, are listed in Table 6. Each of the five polychotomous variables among them was strongly
related to our spatial variables through residential location.
Table 6 Sociodemographic Variables Used in the Models
Variable                     %   Variable                     %   Variable                        %
Annual Household income          Average age of heads             Highest education of head
<$10,000                   4.3   18-25                      5.8   not high school               9.1
$10,000-$24,999           14.2   25.5-35                   14.1   high school graduate         24.5
$25,000-$34,999           13.2   35.5-45                   20.1   Some college                 23.7
$35,000-$49,999           13.9   45.5-55                   22.7   associates degree             7.4
$50,000-$74,999           19.9   55.5-65                   15.5   bachelors degree             20.9
$75,000-$99,999           10.9   65.5-75                   11.8   graduate degree              13.4
$100,000-$149,999          7.4   75.5+                      7.5   Unknown                       1.1
$150,000+                  3.4   Unknown                    2.5   Whether any children < 6
unknown                   12.8   Ethnicity of heads               Yes                           7.5
Household size                   White                     75.5   No                           89.4
1                         26.4   Hispanic                  10.2   Whether any children 6-12
2                         40.8   Black                      2.3   Yes                           9.3
3                         14.4   Asian/Pacific Islander     1.9   No                           85.6
4                         11.2   White & Hispanic           3.1   Whether any children 13-17
5                          4.7   White & Asian              1.3   Yes                           9.0
6 or more                  2.5   Other or unknown           5.8   No                            2.9
The residential location patterns of the various demographic groups can be seen by graphing the
category means of four key spatial variables for each of our five polychotomous
sociodemographic variables, as shown in Figures 11 through 15. These four spatial variables
are: (1) housing density, in terms of households per square kilometer; (2) regional retail
accessibility, in terms of total retail workers within 10 to 50 kilometers; (3) local road
infrastructure, in terms of total kilometers of local, neighborhood, and rural roads within 10
kilometers; and (4) regional non-freeway primary road infrastructure, in terms of total kilometers
of primary roads without limited access within 10 to 50 kilometers. Each of these four spatial
variables is standardized (zero mean and standard deviation of one) to allow plotting on a single
scale.
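The construction behind Figures 11 through 15 can be sketched as follows: standardize a spatial variable over all households, then average the standardized values within each sociodemographic category. The function names, category labels, and density values below are hypothetical and serve only to illustrate the calculation.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit standard deviation."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def category_means(categories, z_values):
    """Mean of the standardized variable within each category."""
    groups = defaultdict(list)
    for category, z in zip(categories, z_values):
        groups[category].append(z)
    return {category: mean(zs) for category, zs in groups.items()}


# Hypothetical households: income class and tract housing density (households per sq km).
income_class = ["low", "low", "middle", "middle", "high", "high"]
housing_density = [900.0, 700.0, 300.0, 350.0, 400.0, 500.0]

z = standardize(housing_density)
means = category_means(income_class, z)
# means["low"] is positive here: these households live in denser than average tracts.
```

Because every spatial variable is put on the same standardized scale, the category means of different variables can be plotted against one axis.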
Figure 11 Some Spatial Variable Means by Categories of Household Income
[Chart. Horizontal axis: household annual income, from <$10,000 to $150,000+. Vertical axis:
standardized variable category mean. Series: households per sq km in tract; retail workers within
10-50 km; local roads within 10 km; primary roads w/o limited access within 10-50 km.]
Figure 12 Some Spatial Variable Means by Categories of Household Size
[Chart. Horizontal axis: household size, from 1 to 6 or more. Vertical axis: standardized variable
category mean. Same four series as in Figure 11.]
Figure 13 Some Spatial Variable Means by Average Age of Household Heads
[Chart. Horizontal axis: average age of household heads, from 18-25 to 75.5+. Vertical axis:
standardized variable category mean. Same four series as in Figure 11.]
Figure 14 Spatial Variable Means by Ethnicity of Household Heads
[Chart. Horizontal axis: ethnicity of household head(s): white, hispanic, black, asian/Pacific
islander, white & hispanic, white & asian, other or unknown. Vertical axis: standardized variable
category mean. Same four series as in Figure 11.]
Figure 15 Spatial Variable Means by Education of Household Heads
[Chart. Horizontal axis: highest education of household head, from not high school grad to
graduate degree. Vertical axis: standardized variable category mean. Series: households per sq km
in tract; retail workers within 10-50 km; local roads within 10 km; primary roads w/o limited
access within 10-50 km.]
The income dimension of residential location of households is most strongly related to
regional retail accessibility (Figure 11). With the exception of the lowest income groups, higher
income households tend to be located in areas surrounded by the highest retail activity. Higher
income households (households with incomes of $100,000 or more in 2000 dollars) are also
located in regions with the highest levels of primary road infrastructure. With respect to local
road infrastructure, households in the highest and lowest income classes tend to be located in
areas with the greatest density of local roads; there is no statistically significant difference
(p = .01) between local road densities among the six middle income classes. Finally, the only
statistically significant relationship between income and housing density is that the lowest
income households reside in denser census tracts; otherwise income is not a factor in housing
density.
With regard to household size, all spatial effects involve single-person households as
distinguished from multi-person households. For each of the four spatial variables graphed in
Figure 12, there is no statistically significant difference among multi-person households of
different sizes. Unlike income, household size is most strongly related to household density at
the tract level.
The patterns of residential location as a function of age of the household head(s) revolve around
decreases in density by age group up to the 45.5-55 category, after which there are no
statistically significant effects (Figure 13). The strongest relationship is that between age and
housing density, followed by a moderately strong relationship between age and the density of local
road infrastructure.
Black and Asian households tend to locate in the highest density areas, in terms of all four
spatial measures (Figure 14). In terms of one of these variables, local road infrastructure, Black
households reside in areas that are even denser than those of Asian households. Excluding Black
and Asian households, there are still statistically significant differences between the other
ethnic groups. Hispanic and mixed White and Asian households live in areas that are denser than
those inhabited by White and mixed White and Hispanic households.
The residential location patterns of households according to education of household head, which
is also a proxy for occupation, are shown in Figure 15. The strongest relationship is for regional
primary road infrastructure. Less educated households tend to reside in areas with the greatest
regional coverage of surface arterial primary roads. Households with associate degrees reside in
areas with the lowest regional coverage of surface arterials, and the same low level for those
with associate degrees holds for the other dimensions, especially regional retail accessibility.
There is a similar, but less pronounced, pattern for local roads. Finally, households in the
highest education segments reside in areas with high residential density, compared with households
in the middle education segments.
The presence of young children does not appear to be a major factor in residential location, as
there are no significant relationships between the indicator variable for young children and any
of our four key spatial variables. Households with children aged six through twelve tend to be
located in areas with lower housing density. There are no statistical relationships with retail
accessibility or local or regional primary arterial road coverage. Households with children aged
thirteen through seventeen tend to be located in areas with lower housing density, with lower
regional retail accessibility, and in areas with lower coverage of roads.
This discussion points to the need for further analysis that accounts for these sociodemographics
and goes at least one step further in the spatial unit within which these households reside to
study the impact of all this on travel. We do the remaining analysis with regression models that
account for multiple influences on travel behavior.
6. Microanalysis Using Regression Models
In each of the models that follow, three blocks of variables are tested: (1) the same set of
sociodemographic variables, (2) residential and activity site density variables, and (3) any road
infrastructure variables found to be significant in explaining the dependent travel behavior
variable after controlling for the first two sets of variables. Spatial variables were derived
using buffer areas around the population centroid of a census tract (e.g., retail employees
within 10 km of the centroid). Several such measures were developed, using both time and
distance to define the boundary of the buffer. Based on preliminary data exploration, only the 10
km and 50 km buffer variables were found to have a substantial effect and are used here. Some
shorter time buffers could have been used and would have produced similar results, but the 10 km
and 50 km distances were found to be more effective in capturing the influence of infrastructure
provision and access to activity opportunities. The shortest distance buffer zone indicators are
tested both in direct and difference (ring) format.
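The direct and difference (ring) formats of a buffer variable can be sketched as follows. This is a simplified illustration: it uses straight-line great-circle distance from a centroid, which is an assumption made here for brevity, and the centroid, employment sites, and worker counts are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def buffer_total(origin, sites, radius_km):
    """Sum the workers at all sites within radius_km of the origin."""
    return sum(workers for lat, lon, workers in sites
               if haversine_km(origin[0], origin[1], lat, lon) <= radius_km)


# Hypothetical tract centroid and retail sites given as (latitude, longitude, workers).
centroid = (34.42, -119.70)
sites = [(34.43, -119.71, 120), (34.60, -119.90, 300), (35.50, -120.50, 500)]

within_10 = buffer_total(centroid, sites, 10.0)   # direct format, 10 km buffer
within_50 = buffer_total(centroid, sites, 50.0)   # direct format, 50 km buffer
ring_10_50 = within_50 - within_10                # difference (ring) format, 10-50 km
```

The ring format separates regional opportunities from local ones, so that the 10 km and the 10-50 km variables do not count the same workers twice.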
Modeling the contribution of spatial accessibility and infrastructure density was further
complicated by the presence of spikes at zero and long positive tails. For example, some rural
census tracts in California are extremely large, with a very small population concentrated in a
small portion of the tract. These need to be modeled together with census tracts that have some
of the highest densities of roadway infrastructure in the nation. To overcome this distributional
heterogeneity, spatial variables were converted to a scale in which the population was ranked
into ten groups of equal frequency (deciles). This relieves the estimation bias caused by outlying
observations and by restriction to the positive domain with spikes at zero. It also facilitates
estimation in which the spatial variables can contribute nonlinear and even non-ordinal effects.
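A minimal sketch of such a rank-based conversion is given below. The handling of ties is an assumption made for illustration (all tied values, such as a spike at zero, receive the same code), and with heavy ties the resulting groups are only approximately equal in frequency; the function name and the sample values are hypothetical.

```python
def decile_codes(values):
    """Assign each value a code from 0 to 9 by rank, keeping tied values together."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    codes = [0] * n
    start = 0
    while start < n:
        # Find the block of observations tied with the one at position start.
        end = start
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        # The midpoint rank of the tie block decides the decile of the whole block.
        mid_rank = (start + end) / 2
        code = min(int(10 * mid_rank / n), 9)
        for k in range(start, end + 1):
            codes[order[k]] = code
        start = end + 1
    return codes


# Hypothetical spatial variable with a spike at zero and a long positive tail.
values = [0] * 6 + [1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 5000]
codes = decile_codes(values)
# The outlying value 5000 falls in the top group like any other large value,
# so it no longer dominates the estimation.
```

Entering the resulting codes as categorical variables is what allows nonlinear and non-ordinal effects: each decile receives its own coefficient.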
We present omnibus tests of each set of variables, but the variable coefficients are shown
only for the final complete model. These coefficients are displayed as odds ratios; the raw
coefficient can be computed as the natural logarithm of the odds ratio. To aid in interpretation,
only coefficients that are statistically significant at the p = .05 level are listed. All
variables are categorical, and the continuous spatial variables are discretized into ten
categories of equal frequency (deciles).
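The relation between an odds ratio and the raw logit coefficient can be sketched as follows. The odds ratio of 5.910 is the value reported in Table 8 for households with income under $10,000; the base probability of 0.10 and the function names are hypothetical and used only to show how an odds ratio acts on a probability.

```python
import math


def coefficient_from_odds_ratio(odds_ratio):
    """Raw logit coefficient corresponding to a reported odds ratio."""
    return math.log(odds_ratio)


def shifted_probability(base_probability, odds_ratio):
    """Probability obtained after multiplying the base odds by the odds ratio."""
    odds = base_probability / (1.0 - base_probability) * odds_ratio
    return odds / (1.0 + odds)


beta = coefficient_from_odds_ratio(5.910)   # positive: raises the odds
p = shifted_probability(0.10, 5.910)        # hypothetical base probability of 0.10

# Odds ratios below one correspond to negative coefficients, and an odds
# ratio of exactly one corresponds to a coefficient of zero.
beta_high_income = coefficient_from_odds_ratio(0.250)
```

Odds ratios multiply, so the combined effect of two categories on the odds is the product of their odds ratios, which corresponds to the sum of their raw coefficients.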
In the following sections we present the results of six sets of models aimed at assessing
the influence of the spatial environment on travel demand in California. The first model
identifies which households contain adults (persons eighteen and older) who are non-drivers, as
this special group is an important component of passenger, transit, and nonmotorized demand.
The second set of models deals with public transport (transit) demand, and we estimate separate
models for transit demand by any household member and by adult drivers only. Similar sets of
models are then estimated for nonmotorized travel and for high occupancy vehicle (HOV) travel.
The latter set also contains a model of the HOV travel time. The final set of models is for solo
driving, with one model for household solo driving demand and one model for solo driving
distance.
We also analyze the impact of the level of spatial aggregation (e.g., tract level measurement
versus block group measurement) on the power of the models. The same procedure of variable
computation was conducted using block groups, which are smaller than census tracts, and the
same six sets of models were built using the block group variables. A potentially deleterious
impact may arise from the modifiable areal unit problem (MAUP; Openshaw and Alvanides,
1999). MAUP is one of the important issues that should be considered when using GIS.
Artificial boundaries imposed on continuous geographical phenomena, such as accessibility,
result in the generation of artificial spatial patterns, and the spatial patterns generated at
different levels of spatial aggregation differ from each other. We analyze the existence and the
impact of MAUP in the six sets of travel behavior models and show how spatial variables at
different aggregation levels can be used in the models to mitigate this artificial spatial
resolution, considering the impact of unit area sizes. In the models that follow we show
estimation results using census tract accessibility variables and sociodemographics, and
estimation results using combinations of block group level variables.
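The scale effect behind MAUP can be illustrated with a deliberately simple, hypothetical example: the same eight cells of activity counts along a corridor, aggregated first into small zones (the analogue of block groups) and then into large zones (the analogue of tracts). The cell values and the function name are invented for illustration.

```python
def aggregate(cells, zone_size):
    """Mean density of consecutive zones, each made of zone_size cells."""
    return [sum(cells[i:i + zone_size]) / zone_size
            for i in range(0, len(cells), zone_size)]


# Hypothetical activity counts in eight equally sized cells along a corridor.
cells = [0, 8, 1, 1, 6, 0, 0, 0]

fine = aggregate(cells, 2)     # smaller zones: [4.0, 1.0, 3.0, 0.0]
coarse = aggregate(cells, 4)   # larger zones:  [2.5, 1.5]

# Range of measured densities at each aggregation level.
fine_spread = max(fine) - min(fine)
coarse_spread = max(coarse) - min(coarse)
```

The underlying activity is identical in both cases, yet the smaller zones show a contrast of 4.0 between the densest and the emptiest zone while the larger zones show a contrast of only 1.0. A model estimated on one zoning system can therefore attribute a different strength to the same spatial variable than a model estimated on the other.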
6.1 Adults Who Do Not Drive
A substantial portion of household travel behavior that does not involve driving is due to the
constrained choices of non-driving adults. California has a reputation for being an
automobile-oriented state. While this reputation may be somewhat unfair, it is indeed the case
that most of the state was developed over the past 50 years, and so it was designed and developed
with the automobile in mind. This contrasts sharply with older east coast or European cities. One
might
presume therefore that households with non-driving adults might choose to locate in denser
areas, where walking, bicycling and public transport are well supported. We test this hypothesis
in the model presented in this section. This model uses 16,949 observations, or 99.5% of the
sample with complete data. In this sample, 11.6% of households have non-driving adults.
6.1.1 Census Tract Model
The contributions of the three variable sets in explaining which households have non-driving
adults are captured in the omnibus statistical tests listed in Table 7. All eight of the
sociodemographic variables were important, but only one spatial variable, housing density, was
found to be significant in describing the residential location of these households, controlling
for their socioeconomic characteristics. The contribution of the road infrastructure variables was
not significantly different from zero.
Table 7 Binary Logit Model of Presence in Household of Non-driving Adult
                      Contribution of set                Cumulative model
Variable set          Chi-square   Degrees of freedom    Chi-square   Degrees of freedom   Nagelkerke R2
Sociodemographic      2678.10      35                    2678.10      35                   .285
Spatial density       51.60        9                     2729.70      44                   .290
Road infrastructure   (not significant)
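The way the entries of Table 7 relate to each other can be sketched as follows. The chi-square contributions of successive blocks add up to the cumulative model chi-square, and the Nagelkerke R2 can be approximated from the model chi-square, the number of observations (16,949), and the share of households with non-driving adults (11.6%). The sketch assumes an unweighted sample and an intercept-only null model, which the text does not state explicitly; the function name is hypothetical.

```python
import math


def nagelkerke_r2(model_chi_square, n_obs, share_positive):
    """Nagelkerke R2 from a likelihood ratio chi-square against an intercept-only null."""
    # Log-likelihood per observation of the intercept-only model.
    ll0_per_obs = (share_positive * math.log(share_positive)
                   + (1.0 - share_positive) * math.log(1.0 - share_positive))
    cox_snell = 1.0 - math.exp(-model_chi_square / n_obs)
    max_cox_snell = 1.0 - math.exp(2.0 * ll0_per_obs)
    return cox_snell / max_cox_snell


n_obs = 16949          # observations used in the model
share = 0.116          # share of households with non-driving adults

stage1 = 2678.10               # sociodemographic block
stage2 = stage1 + 51.60        # adding the spatial density block
df_stage2 = 35 + 9             # degrees of freedom accumulate in the same way

r2_stage1 = nagelkerke_r2(stage1, n_obs, share)
r2_stage2 = nagelkerke_r2(stage2, n_obs, share)
```

Under these assumptions the two values come out close to the .285 and .290 reported in the table, which shows how little the spatial density block adds once sociodemographics are controlled for.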
The statistically significant influences of the sociodemographic variables are listed in Table 8.
Income and household size display monotonic effects, and age highlights the expected elderly
outcome. Non-driving adults are more likely to be found in Hispanic and Black households, and
in households in the lowest education groups. Finally, households with children present are less
likely to have non-driving adults, regardless of the ages of the children.
Table 8 Logit Model of Presence of Non- driving Adult – Sociodemographic
Independent variable Significance Odds ratio
Income ( base = unknown) 0.00
<$ 10,000 0.00 5.910
$ 10,000-$ 24,999 0.00 3.093
$ 25,000-$ 34,999 0.00 1.510
$ 35,000-$ 49,999
$ 50,000-$ 74,999 0.00 0.653
$ 75,000-$ 99,999 0.00 0.491
$ 100,000-$ 149,999 0.00 0.366
$ 150,000+ 0.00 0.250
household size ( base = 6 or more) 0.00
1 0.00 0.073
2 0.00 0.217
3
4 0.00 1.706
5 0.00 4.200
Average age of heads ( base = unknown) 0.00
18- 25 0.00 0.702
25.5- 35 0.00 0.691
35.5- 45 0.00 0.625
45.5- 55
55.5- 65
65.5- 75 0.03 1.189
75.5+ 0.00 2.998
Ethnicity ( base = unknown) 0.00
White 0.00 0.620
Hispanic 0.00 1.723
Black 0.00 1.482
Asian/ Pacific Islander 0.04 0.694
White & Hispanic 0.02 0.726
White & Asian
Education ( base = unknown) 0.00
not high school graduate 0.00 1.862
high school graduate 0.00 1.269
Some college
associates degree 0.01 0.766
bachelors degree 0.00 0.713
graduate degree 0.00 0.687
presence of children 0- 5 yrs. Old 0.00 0.262
presence of children 6- 12 yrs. Old 0.00 0.326
presence of children 13- 17 yrs. Old 0.00 0.369
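As a reading aid for the odds ratios in Table 8, the following sketch converts an odds ratio into a logit coefficient and shows how it shifts a probability. It uses the 11.6% sample share as a purely illustrative starting probability; that share is an assumption for the example and is not the probability of the model's base category. The function name is hypothetical.

```python
import math

def shift_probability(p_base, odds_ratio):
    """Apply an odds ratio to a starting probability and return the new probability."""
    odds = p_base / (1 - p_base) * odds_ratio
    return odds / (1 + odds)

# Odds ratio for household income below $10,000 (Table 8)
odds_ratio = 5.910
coefficient = math.log(odds_ratio)              # logit coefficient implied by the odds ratio
p_shifted = shift_probability(0.116, odds_ratio)
print(round(coefficient, 3), round(p_shifted, 3))
```

The example also shows why an odds ratio of about 5.9 does not mean a probability 5.9 times as large: the multiplication happens on the odds scale.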
As shown in Table 9, households with non-driving adults are less likely to be located in low
density residential areas (e.g., the lowest quartile of residential density), and more likely to be
located in the very highest density areas. There is no statistically significant relationship
between accessibility or road infrastructure and the likelihood of the presence of non-driving
adults. In other words, households with non-driving adults are most likely not choosing where
they live to accommodate the travel behavior of their non-driving members. Consequently, in
the travel behavior models that follow, the accessibility and infrastructure effects are not
attributable to the contribution to travel behavior of non-driving adults.
Table 9 Logit Model of Presence of Non- driving Adult – Spatial Density ( Tract)
Ind. variable ( all bases = 50th % tile) Significance Odds ratio
tract household density 0.00
< 10 % tile 0.03 0.828
10th % tile 0.00 0.752
20th % tile ( 0.06) ( 0.849)
30th % tile
40th % tile
60th % tile
70th % tile
80th % tile
90th % tile 0.00 1.608
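The spatial variables in Table 9 enter the model as percentile categories with the 50th percentile group as the base. A minimal sketch of one way such a coding could be built is shown below; the cut-point method, the label scheme, and the sample values are assumptions for illustration and are not taken from the report.

```python
import bisect
import statistics

# Category labels in the style of Table 9; the 50th percentile group is the base.
LABELS = ["<10", "10th", "20th", "30th", "40th",
          "50th (base)", "60th", "70th", "80th", "90th"]

def decile_labels(values):
    """Assign each value to a decile category using nine empirical cut points."""
    cuts = statistics.quantiles(values, n=10)
    return [LABELS[bisect.bisect_right(cuts, v)] for v in values]

# Hypothetical tract household densities (not data from the report)
densities = list(range(1, 101))
labels = decile_labels(densities)
print(labels[4], labels[54], labels[94])
```

Each non-base category would then become a 0/1 indicator in the logit model, which is why each variable set in Table 7 has 9 degrees of freedom.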
6.1.2 Comparison with Block Group Model
In Table 10, the contributions of sociodemographics, spatial variables measured at the tract level,
and spatial variables measured at the block group level are compared in terms of their
contribution to goodness of fit. The impact of the sociodemographic variables on the presence of
non-driving adults in households is almost identical in the census tract model and the block group
model. Because measuring the number of retail employees within a certain travel
distance involves shortest travel distance, while the calculation of household density does not, using
a smaller spatial unit has different implications for the two kinds of variables. A smaller spatial unit
means that a smaller “neighborhood” around each home location is considered in the household
density computation, but it also means a closer approximation for measurements involving shortest
travel distance. To see the influence of using smaller spatial units on each variable set, the
contributions of household density and retail employee density are given separately in Table 10.
Only household density has a significant impact on non-driving adults, and it contributes more to
the model when it is measured using the larger spatial units, census tracts in this case.
Table 11 gives additional estimation details for spatial density.
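Table 10 Binomial Logit Models of Presence in Household of Non-driving Adult
| Model | Variable set | Contribution of set: Chi-square | Contribution of set: Degrees of freedom | Cumulative model: Chi-square | Cumulative model: Degrees of freedom | Cumulative model: Nagelkerke R2 |
|---|---|---|---|---|---|---|
| Census tract | Sociodemographic | 2678.10 | 35 | 2678.10 | 35 | .285 |
| Census tract | Spatial density | 51.60 | 9 | 2729.70 | 44 | .290 |
| Census tract | Household density | 51.60 | 9 | | | |
| Census tract | Retail employee | - | - | | | |
| Census tract | Road infrastructure | (not significant) | | | | |
| Block group | Sociodemographic | 2678.69 | 35 | 2678.69 | 35 | .285 |
| Block group | Spatial density | 37.54 | 9 | 2716.22 | 44 | .289 |
| Block group | Household density | 37.54 | 9 | | | |
| Block group | Retail employee | - | - | | | |
| Block group | Road infrastructure | (not significant) | | | | |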
Table 11 Logit Models of Presence of Non-driving Adult – Spatial Density
| Ind. variable (all bases = 50th %tile) | Census tract: Significance | Census tract: Odds ratio | Block group: Significance | Block group: Odds ratio |
|---|---|---|---|---|
| household density | 0.00 | | 0.00 | |
| <10 %tile | 0.03 | 0.828 | 0.03 | 0.835 |
| 10th %tile | 0.00 | 0.752 | (0.06) | (0.850) |
| 20th %tile | (0.06) | (0.849) | 0.02 | 0.816 |
| 30th %tile | | | | |
| 40th %tile | | | | |
| 60th %tile | | | | |
| 70th %tile | | | | |
| 80th %tile | | | | |
| 90th %tile | 0.00 | 1.608 | 0.00 | 1.435 |
6.2 Transit Usage by Any Household Member
Transit usage is defined as taking any local transit mode, including bus, rail, and light rail, but
not including long-distance bus trips. School bus trips are also included as household public
transport trips. Of the 16,750 households with complete data (98.3% of the sample), 8.1% had a
household member who made at least one trip by public transport (transit); the highest
concentration of these households was in the San Francisco Bay Area, where 14.4% of
households in this sample were transit users.
6.2.1 Census Tract Model
Compared to the previous model for households with non-driving adults, socioeconomic factors
are less effective in explaining which households are transit users, but there are three significant
spatial factors, and one road infrastructure variable is important, as shown in Table 12.
Table 12 Logit Model of Any Household Transit Use and Spatial Density at Tract Level
| Variable set | Contribution of set: Chi-square | Contribution of set: Degrees of freedom | Cumulative model: Chi-square | Cumulative model: Degrees of freedom | Cumulative model: Nagelkerke R2 |
|---|---|---|---|---|---|
| Sociodemographic | 1633.28 | 35 | 1633.28 | 35 | .216 |
| Spatial density | 177.90 | 27 | 1811.19 | 62 | .238 |
| Road infrastructure | 81.57 | 9 | 1892.76 | 71 | .248 |
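The contribution of each variable set in Table 12 is a likelihood-ratio chi-square with the listed degrees of freedom, so its significance can be checked directly. The sketch below computes the chi-square upper-tail probability with the Python standard library only, using the exact recurrence for the regularized incomplete gamma function at integer and half-integer arguments. The function name is hypothetical, and the report does not say which software produced its tests.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (x > 0, integer df >= 1).

    Uses Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1), starting from
    Q(1/2, h) = erfc(sqrt(h)) for odd df or Q(1, h) = exp(-h) for even df.
    """
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    while a < df / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1))
        a += 1.0
    return q

# Contributions from Table 12: (chi-square, degrees of freedom)
contributions = {"Spatial density": (177.90, 27), "Road infrastructure": (81.57, 9)}
for name, (chi2, df) in contributions.items():
    print(name, chi2_sf(chi2, df))
```

Both sets clear conventional significance thresholds by a wide margin, which is consistent with their being reported as significant in the table.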
The estimated effects of the sociodemographic variables are reported in Table 13. Transit usage
is a decreasing function of income (where statistically insignificant categories are shown to
complete the picture), and an increasing function of household size. Transit usage is generally a
decreasing function of age of the household head(s), but usage is greatest for the second
youngest group, and lowest for the second oldest group. Transit services for the elderly probably
increase the likelihood of transit usage for households with the oldest household heads.
Education is not an effective predictor of transit usage, and only one ethnicity category is
important: Black households have about 1.6 times the odds of using transit. Regarding children,
households with only young children are less likely to use transit, while those with older children
are more likely to use transit.
Table 13 Logit Model of Any Household Transit Use – Sociodemographic Variables
Independent variable Significance Odds ratio
Income ( base = unknown) 0.00
<$ 10,000 0.00 2.175
$ 10,000-$ 24,999 0.00 1.381
$ 25,000-$ 34,999 0.03 1.198
$ 35,000-$ 49,999
$ 50,000-$ 74,999 0.00 0.810
$ 75,000-$ 99,999 ( 0.29) ( 0.909)
$ 100,000-$ 149,999 ( 0.26) ( 0.886)
$ 150,000+ 0.00 0.379
household size ( base = 6 or more) 0.00
1 0.00 0.376
2 0.00 0.491
3
4 0.00 1.416
5 0.00 1.737
Average age of heads ( base = unknown) 0.00
18- 25 0.05 1.267
25.5- 35 0.00 1.488
35.5- 45 0.00 1.255
45.5- 55
55.5- 65
65.5- 75 0.00 0.554
75.5+ 0.03 0.691
Ethnicity ( base = unknown) 0.09
White
Hispanic
Black 0.00 1.618
Asian/ Pacific Islander
White & Hispanic
White & Asian
Education ( base = unknown) 0.41
not high school graduate
high school graduate
some college
associates degree
bachelors degree
graduate degree
presence of children 0- 5 yrs. old 0.01 0.775
presence of children 6- 12 yrs. Old 0.00 2.363
presence of children 13- 17 yrs. old 0.00 3.001
Spatially, as expected, transit-using households are concentrated in the densest 10% of residential areas,
and also in the least dense 20% of areas, as shown in Table 14. But excluding areas in the highest 10% of
housing density, households located in areas above median density are less likely to use transit. Census
tracts with low density housing tend to be located in rural counties. While the presence of school age
children in the household coupled with the inclusion of school bus trips as public transit trips may account
for some of this effect, this result underscores the importance of rural public transport.
Table 14 Logit Model of Any Household Transit Use – Spatial Density
Ind. Variable ( all bases = 50th % tile) Significance Odds ratio
tract household density 0.00
< 10 % tile ( 0.20) ( 1.215)
10th % tile 0.04 1.262
20th % tile
30th % tile
40th % tile
60th % tile 0.01 0.758
70th % tile ( 0.43) ( 0.919)
80th % tile 0.00 0.725
90th % tile 0.00 1.677
retail employees within 10 km 0.00
< 10 % tile
10th % tile
20th % tile 0.02 0.755
30th % tile
40th % tile
60th % tile
70th % tile
80th % tile
90th % tile 0.00 1.795
retail employees within 10 to 50km 0.00
< 10 % tile ( 0.19) ( 0.764)
10th % tile 0.00 0.594
20th % tile 0.01 0.664
30th % tile
40th % tile
60th % tile 0.02 1.381
70th % tile
80th % tile
90th % tile 0.00 2.140
Accessibility to retail services, particularly accessibility at the regional level (10 to 50 km), indicates
lower transit usage for households located in low accessibility areas, and high transit usage for households
located in the highest 10% of retail accessibility. This effect undoubtedly captures the urban core
phenomenon. The influence of road infrastructure is complex, as shown in Table 15. Controlling for
sociodemographic factors and spatial density, households that live in areas in the lower quartile of
regional primary surface road coverage (primary roads without limited access within 10 to 50 km of
network distance) exhibit the highest transit usage, together with households in the 80th percentile.
However, households above the 90th percentile have very low transit usage. Once again, the importance
of rural public transport is picked up by the road infrastructure variable, even when controlling for
housing and retail density. In tracts with both low housing density and lower levels of road infrastructure,
the likelihood of transit usage is unusually high.
Table 15 Logit Model of Any Household Transit Use – Road Infrastructure
Variable ( Bases = 50th % tile) Significance Odds ratio
primary roads w/ o limited access within 10 to 50 km 0.00
< 10 % tile 0.02 1.678
10th % tile 0.02 1.466
20th % tile 0.04 1.401
30th % tile
40th % tile
60th % tile
70th % tile
80th % tile 0.01 1.483
90th % tile 0.00 0.364
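To illustrate the statement that transit usage is unusually high in tracts with both low housing density and low road coverage, the sketch below combines two odds ratios from Tables 14 and 15. It assumes the logit model contains main effects only, so that effects multiply on the odds scale; the report does not say whether interaction terms were estimated, and the variable names are hypothetical.

```python
import math

# Odds ratios relative to the 50th percentile base category
odds_ratios = {
    "tract household density, 10th %tile (Table 14)": 1.262,
    "primary roads w/o limited access, 10 to 50 km, <10 %tile (Table 15)": 1.678,
}

# With main effects only, log-odds add, so odds ratios multiply.
combined_odds_ratio = math.prod(odds_ratios.values())
combined_log_odds = sum(math.log(v) for v in odds_ratios.values())
print(round(combined_odds_ratio, 3))
```

Under that assumption, a household in both categories would have roughly twice the odds of transit use of an otherwise identical household in the two base categories.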
6.2.2 Comparison with Block Group Model
As shown in Table 16, household density contributes slightly more to the model when it is
measured at the census tract level, and the other spatial variable sets (retail employee density
and road infrastructure) contribute more to the model when they are measured based on block
groups. In particular, road infrastructure in the block group model contributes almost twice as
much chi-square as in the census tract model (158.66 versus 81.57).
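Table 16 Logit Models of Any Household Transit Use
| Model | Variable set | Contribution of set: Chi-square | Contribution of set: Degrees of freedom | Cumulative model: Chi-square | Cumulative model: Degrees of freedom | Cumulative model: Nagelkerke R2 |
|---|---|---|---|---|---|---|
| Census tract | Sociodemographic | 1633.28 | 35 | 1633.28 | 35 | .216 |
| Census tract | Spatial density | 177.90 | 27 | 1811.19 | 62 | .238 |
| Census tract | Household density | 125.45 | 9 | | | |
| Census tract | Retail employee | 52.45 | 18 | | | |
| Census tract | Road infrastructure | 81.57 | 9 | 1892.76 | 71 | .248 |
| Block group | Sociodemographic | 1633.58 | 35 | 1633.58 | 35 | .216 |
| Block group | Spatial density | 180.37 | 27 | 1813.95 | 62 | .238 |
| Block group | Household density | 106.50 | 9 | | | |
| Block group | Retail employee | 73.87 | 18 | | | |
| Block group | Road infrastructure | 158.66 | 9 | 1972.60 | 71 | .258 |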
The spatial density variables show similar patterns of impact on household transit usage in the
block group analysis. However, in the block group model, the concentration of transit usage
in the highest density area is stronger and the concentration in the 10th percentile of household
density is not captured. The highest percentile of the block group retail employee density had
a higher impact in both buffers (0 to 10 km and 10 to 50 km). This may be a typical consequence of
the modifiable areal unit problem (MAUP). First, different sizes of unit area produce different
statistics, household density in this case, and they reveal different patterns of influence. These
patterns can have different impacts in the models, as the variable sets do in the logit models of
household transit use (Table 17).
Second, different levels of spatial aggregation lead to different levels of approximation of the
explanatory variables. From the comparison between the two models of household transit use, it
appears that a better approximation of an explanatory variable, obtained by going one level of
disaggregation down (from tract to block group), improves the contribution of the independent
variables to explaining variation in the dependent variable.
The influence pattern of road infrastructure in the block group model is similar to that in
the census tract model. However, in addition to primary roads without limited access within 10 to
50 km, which was the only road infrastructure variable set significant in the census tract model of
household transit usage, the local roads variables were found to be significant in the block group
model (Table 18). The block group model also picks up the importance of rural public
transportation, and the likelihood of transit usage is low for households in areas in the
highest 10% of road network coverage.
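The first MAUP effect described above, that the same households yield different density values under different zone sizes, can be shown with a toy example. The household positions, the one-dimensional corridor, and the zone widths below are all hypothetical and are not data from the report.

```python
from collections import Counter

# Hypothetical household positions (km along a 1-D corridor)
households = [0.2, 0.4, 0.5, 0.7, 0.9, 1.1, 3.6, 7.8]

def zone_densities(positions, zone_width, corridor_length=8.0):
    """Households per km in each equal-width zone along the corridor."""
    n_zones = int(corridor_length / zone_width)
    counts = Counter(min(int(x // zone_width), n_zones - 1) for x in positions)
    return [counts.get(zone, 0) / zone_width for zone in range(n_zones)]

coarse = zone_densities(households, zone_width=4.0)  # larger, tract-like zones
fine = zone_densities(households, zone_width=1.0)    # smaller, block-group-like zones
print(coarse)
print(fine)
```

The household at 0.2 km sits in a zone with density 1.75 under the coarse zoning but 5.0 under the fine zoning, so its percentile category in a model could change with the choice of unit even though nothing about the household has changed.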
Table 17 Logit Models of Any Household Transit Use – Spatial Density
| Ind. variable (all bases = 50th %tile) | Census tract: Significance | Census tract: Odds ratio | Block group: Significance | Block group: Odds ratio |
|---|---|---|---|---|
| household density | 0.00 | | 0.00 | |
| <10 %tile | (0.20) | (1.215) | | |
| 10th %tile | 0.04 | 1.262 | (0.16) | (1.157) |
| 20th %tile | | | | |
| 30th %tile | | | | |
| 40th %tile | | | | |
| 60th %tile | 0.01 | 0.758 | | |
| 70th %tile | (0.43) | (0.919) | 0.01 | 0.756 |
| 80th %tile | 0.00 | 0.725 | 0.05 | 0.810 |
| 90th %tile | 0.00 | 1.677 | 0.00 | 1.627 |
| retail employees within 10 km | 0.00 | | 0.00 | |
| <10 %tile | | | | |
| 10th %tile | | | | |
| 20th %tile | 0.02 | 0.755 | (0.08) | (0.806) |
| 30th %tile | | | (0.09) | (0.821) |
| 40th %tile | | | | |
| 60th %tile | | | 0.02 | 0.752 |
| 70th %tile | | | | |
| 80th %tile | | | | |
| 90th %tile | 0.00 | 1.795 | 0.00 | 2.218 |
| retail employees within 10 to 50 km | 0.00 | | 0.00 | |
| <10 %tile | (0.19) | (0.764) | | |
| 10th %tile | 0.00 | 0.594 | 0.01 | 0.700 |
| 20th %tile | 0.01 | 0.664 | 0.00 | 0.668 |
| 30th %tile | | | | |
| 40th %tile | | | | |
| 60th %tile | 0.02 | 1.381 | | |
| 70th %tile | | | | |
| 80th %tile | | | 0.02 | 1.389 |
| 90th %tile | 0.00 | 2.140 | 0.00 | 3.294 |
Table 18 Logit Models of Any Household Transit Use – Road Infrastructure
| Ind. variable (all bases = 50th %tile) | Census tract: Significance | Census tract: Odds ratio | Block group: Significance | Block group: Odds ratio |
|---|---|---|---|---|
| primary roads w/o limited access within 10 to 50 km | 0.00 | | 0.00 | |
| <10 %tile | 0.02 | 1.678 | 0.05 | 1.249 |
| 10th %tile | 0.02 | 1.466 | | |
| 20th %tile | 0.04 | 1.401 | | |
| 30th %tile | | | | |
| 40th %tile | | | | |
| 60th %tile | | | (0.09) | (1.180) |
| 70th %tile | | | 0.00 | 1.314 |
| 80th %tile | 0.01 | 1.483 | | |
| 90th %tile | 0.00 | 0.364 | 0.00 | 0.505 |
| Local roads within 10 km | | | 0.03 | |
| <10 %tile | | | | |
| 10th %tile | | | | |
| 20th %tile | | | | |
| 30th %tile | | | | |
| 40th %tile | | | | |
| 60th %tile | | | | |
| 70th %tile | | | | |
| 80th %tile | | | | |
| 90th %tile | | | 0.02 | 0.681 |
| Local roads within 10 to 50 km | | | 0.00 | |
| <10 %tile | | | 0.03 | 1.389 |
| 10th %tile | | | | |
| 20th %tile | | | | |
| 30th %tile | | | | |
| 40th %tile | | | | |
| 60th %tile | | | | |
| 70th %tile | | | | |
| 80th %tile | | | | |
| 90th %tile | | | 0.00 | 0.400 |
6.3 Transit Usage by an Adult Driver in the Household
Models of transit trips made by any household member can be difficult to interpret, as children
and non-driving adults may be skewing the results for some households but not others. The next
model describes transit usage by adult drivers, defined as those adults who were either recorded as
having a driver's license, or else were observed to have driven at least once. Only 2.7% of the
14,160 households with adult drivers and complete data have an adult driver who makes at least
one transit trip.
6.3.1 Census Tract Model
As expected, it is much more difficult to predict which households these are, based on
sociodemographic factors, as seen by comparing the goodness-of-fit log-likelihood-ratio model
Chi-square statistics and the pseudo-R2 indices in Table 19 with those in Table 12. However,
spatial density is relatively more important in the case of adult drivers, and the same road
infrastructure variable is also significant.
Table 19 Logit Model of Household Transit Use by Adult Drivers
| Variable set | Contribution of set: Chi-square | Contribution of set: Degrees of freedom | Cumulative model: Chi-square | Cumulative model: Degrees of freedom | Cumulative model: Nagelkerke R2 |
|---|---|---|---|---|---|
| Sociodemographic | 216.32 | 35 | 216.32 | 35 | .068 |
| Spatial density | 282.90 | 27 | 499.22 | 62 | .155 |
| Road infrastructure | 64.76 | 9 | 563.98 | 71 | .175 |
The sociodemographic predictors of transit usage by adult drivers are shown in Table 20.
Such usage is concentrated in low income households, larger households, households in the
middle age groups (35 to 55), Black households, and more highly educated households. This
latter effect probably captures central business district employment. Households less likely to
have adult driver transit usage are high and middle income households, small households,
households with heads in the 65-75 year range, less educated households, and households with
children.
Tables 21 and 22 show that the effects of rural public transport (tracts with low density
housing and road infrastructure) disappear when the focus is restricted to adult drivers. Still,
controlling for sociodemographic factors, households that live in areas with the highest
residential and retail density are the heaviest transit users. The phenomenon of relatively low
transit usage among households in the 90th percentile of regional primary surface road coverage still
prevails, as seen in Table 22. Households in the 90th percentile of regional primary arterial
coverage are concentrated in Orange, Los Angeles, San Mateo, and Alameda Counties, but
there are also such households located in San Bernardino, Santa Clara, Riverside, and Ventura
Counties. An abundance of primary arterials appears to correlate with fewer household transit
trips in these areas.
6.3.2 Comparison with Block Group Model
As shown in Table 23, the influence of using smaller unit areas is visible in this comparison,
too. Household density contributes more to the model when it is measured at the census tract
level, and the other spatial variables contribute more to the model when they are measured at the
block group level. Table 24 shows that the likelihood of transit usage by adult drivers is relatively
low among households in the 90th percentile of primary and local roads coverage, as in the
census tract model. However, the local roads variable set in the block group model still shows the
effect of rural public transport usage by adult drivers, and the 70th percentile of primary road
infrastructure has a positive impact in the block group model, which could not be seen in the census
tract model. Table 25 shows that the likelihood of transit usage by adult drivers is
highest for households in the 90th percentile of spatial density, as it was in the census tract
model. High transit usage in the 40th percentile of household density is marginally significant
in the block group model, which was not found in the census tract model. The impact of the
highest decile of retail employee density is higher and also clearer in the block group model.
Table 20 Logit Model of Household Transit Use by Adult Drivers – Sociodemographic
Independent variable Significance Odds ratio
Income ( base = unknown) 0.00
<$ 10,000 0.00 3.144
$ 10,000-$ 24,999
$ 25,000-$ 34,999
$ 35,000-$ 49,999
$ 50,000-$ 74,999 0.00 0.614
$ 75,000-$ 99,999
$ 100,000-$ 149,999
$ 150,000+ 0.00 0.454
household size ( base = 6 or more) 0.00
1 0.00 0.413
2 0.00 0.446
3 0.00 0.640
4 0.02 1.379
5 0.00 1.879
Average age of heads ( base = unknown) 0.01
18- 25
25.5- 35
35.5- 45 0.01 1.418
45.5- 55 0.00 1.419
55.5- 65
65.5- 75 0.00 0.487
75.5+
Ethnicity ( base = unknown) 0.03
White
Hispanic
Black 0.01 1.749
Asian/ Pacific Islander
White & Hispanic
White & Asian
Education ( base = unknown) 0.00
not high school graduate 0.03 0.610
high school graduate
some college
associates degree
bachelors degree 0.01 1.465
graduate degree 0.00 1.867
presence of children 0- 5 yrs. old 0.00 0.382
presence of children 6- 12 yrs. Old 0.00 0.425
presence of children 13- 17 yrs. old 0.01 0.589
Table 21 Logit Model of Household Transit Use by Adult Drivers – Spatial Density
Ind. variable ( all bases = 50th % tile) Significance Odds ratio
tract household density 0.00
< 10 % tile
10th % tile
20th % tile
30th % tile
40th % tile
60th % tile
70th % tile
80th % tile
90th % tile 0.00 2.335
retail employees within 10 km 0.00
< 10 % tile
10th % tile
20th % tile
30th % tile
40th % tile
60th % tile
70th % tile
80th % tile 0.02 1.622
90th % tile 0.00 2.148
retail employees within 10 to 50 km 0.06
< 10 % tile
10th % tile
20th % tile 0.03 0.499
30th % tile
40th % tile
60th % tile
70th % tile
80th % tile
90th % tile 0.00 2.644
Table 22 Logit Model of Household Transit Use by Adult Drivers – Infrastructure
Variable ( base = 50th % tile) Significance Odds ratio
primary roads w/ o limited access within 10 to 50 km 0.00
< 10 % tile
10th % tile
20th % tile
30th % tile
40th % tile
60th % tile 0.02 0.522
70th % tile
80th % tile
90th % tile 0.01 0.418
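Table 23 Logit Models of Household Transit Use by Adult Drivers
| Model | Variable set | Contribution of set: Chi-square | Contribution of set: Degrees of freedom | Cumulative model: Chi-square | Cumulative model: Degrees of freedom | Cumulative model: Nagelkerke R2 |
|---|---|---|---|---|---|---|
| Census tract | Sociodemographic | 216.32 | 35 | 216.32 | 35 | .068 |
| Census tract | Spatial density | 282.90 | 27 | 499.22 | 62 | .155 |
| Census tract | Household density | 205.39 | 9 | | | |
| Census tract | Retail employee | 77.51 | 18 | | | |
| Census tract | Road infrastructure | 64.76 | 9 | 563.98 | 71 | .175 |
| Block group | Sociodemographic | 216.34 | 35 | 216.34 | 35 | .068 |
| Block group | Spatial density | 297.52 | 27 | 513.86 | 62 | .159 |
| Block group | Household density | 180.12 | 9 | | | |
| Block group | Retail employee | 117.40 | 18 | | | |
| Block group | Road infrastructure | 116.93 | 18 | 630.76 | 80 | .195 |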
Table 24 Logit Models of Household Transit Use by Adult Drivers – Infrastructure
| Ind. variable (all bases = 50th %tile) | Census tract: Significance | Census tract: Odds ratio | Block group: Significance | Block group: Odds ratio |
|---|---|---|---|---|
| primary roads w/o limited access within 10 to 50 km | 0.00 | | 0.00 | |
| <10 %tile | | | | |
| 10th %tile | | | (0.07) | (0.701) |
| 20th %tile | | | | |
| 30th %tile | | | | |
| 40th %tile | | | | |
| 60th %tile | 0.02 | 0.522 | (0.10) | (1.317) |
| 70th %tile | | | 0.00 | 1.905 |
| 80th %tile | | | | |
| 90th %tile | 0.01 | 0.418 | 0.00 | 0.367 |
| Local roads within 10 to 50 km | | | 0.01 | |
| <10 %tile | | | 0.01 | 2.272 |
| 10th %tile | | | | |
| 20th %tile | | | | |
| 30th %tile | | | | |
| 40th %tile | | | | |
| 60th %tile | | | | |
| 70th %tile | | | | |
| 80th %tile | | | | |
| 90th %tile | | | 0.00 | 0.295 |
Table 25 Logit Models of Household Transit Use by Adult Drivers – Spatial Density
| Ind. variable (all bases = 50th %tile) | Census tract: Significance | Census tract: Odds ratio | Block group: Significance | Block group: Odds ratio |
|---|---|---|---|---|
| household density | 0.00 | | 0.00 | |
| <10 %tile | | | | |
| 10th %tile | | | | |
| 20th %tile | | | 0.02 | 0.514 |
| 30th %tile | | | | |
| 40th %tile | | | 0.05 | 1.397 |
| 60th %tile | | | | |
| 70th %tile | | | | |
| 80th %tile | | | | |
| 90th %tile | 0.00 | 2.335 | 0.00 | 1.981 |
| retail employees within 10 km | 0.00 | | 0.00 | |
| <10 %tile | | | | |
| 10th %tile | | | | |
| 20th %tile | | | 0.01 | 0.443 |
| 30th %tile | | | 0.02 | 0.533 |
| 40th %tile | | | | |
| 60th %tile | | | | |
| 70th %tile | | | | |
| 80th %tile | 0.02 | 1.622 | (0.08) | (1.395) |
| 90th %tile | 0.00 | 2.148 | 0.00 | 3.061 |
| retail employees within 10 to 50 km | 0.06 | | 0.00 | |
| <10 %tile | | | | |
| 10th %tile | | | | |
| 20th %tile | 0.03 | 0.499 | (0.09) | (0.649) |
| 30th %tile | | | 0.04 | 0.593 |
| 40th %tile | | | | |
| 60th %tile | | | (0.07) | (0.641) |
| 70th %tile | | | | |
| 80th %tile | | | 0.00 | 2.400 |
| 90th %tile | 0.00 | 2.644 | 0.00 | 5.484 |
6.4 Nonmotorized Travel by Any Household Member
Of our 16,750 households with complete data (98.3% of the sample), 14.2% had a household
member who made at least one trip walking or by bicycle. As in the case of transit, the highest
concentration of these households was in the San Francisco Bay Area, where 25.9% of the
households in this survey recorded a nonmotorized trip segment, followed by Santa Barbara
County, with 19.2% of households.
6.4.1 Census Tract Model
Compared to transit-using households, it is more difficult to explain households that generate
nonmotorized travel (Table 26). However, spatial factors are relatively more important in
nonmotorized travel demand.
Table 26 Logit Model of Any Household Nonmotorized Travel
| Variable set | Contribution of set: Chi-square | Contribution of set: Degrees of freedom | Cumulative model: Chi-square | Cumulative model: Degrees of freedom | Cumulative model: Nagelkerke R2 |
|---|---|---|---|---|---|
| Sociodemographic | 1065.65 | 35 | 1065.65 | 35 | .116 |
| Spatial density | 373.08 | 27 | 1438.73 | 62 | .147 |
| Road infrastructure | 104.31 | 18 | 1543.04 | 80 | .158 |
The sociodemographic predictors of household nonmotorized travel are listed in Table 27. As
expected, the presence of children older than 6 increases the likelihood of a household making a
nonmotorized trip, while the presence of very young children decreases that likelihood. Lower
income and the youngest households are more likely to make nonmotorized trips, but so are the
most highly educated households. With regard to influences of the built environment on
nonmotorized travel (Tables 28 and 29), the “rural” effect is somewhat different for
nonmotorized trips than for public transport trips. Here low housing density produces a lower
propensity for nonmotorized trips, confirming that extreme distances among activities inhibit the
use of slower modes. It is possible that for some households rural transit trips are taking the place
of rural nonmotorized trips. In terms of road infrastructure, Table 29 shows that the lower
percentiles have a much higher propensity for nonmotorized trips, as is the case for transit. Higher
levels of road infrastructure correspond to lower levels of nonmotorized trips. Both of these
effects are perhaps related to using nonmotorized trips as a form of recreation, as it is pleasant to
walk or bike in less developed, low traffic areas, while it is both unpleasant and dangerous to
walk or bike in highly developed, high traffic areas.
6.4.2 Comparison with Block Group Model
The contribution of household density is larger in the census tract model, and the contributions of
the other variable sets are larger in the block group model for household nonmotorized travel, too
( Table 30 and Table 31). As shown in Table 32, the block group models also show that low
household and retail employee density produces a lower propensity for nonmotorized trips, but
the impact of retail employee d
| Title | A statewide optimal resource allocation tool using geographic information systems, spatial analysis, and regression methods |
| Subject | TE228.A1 P36 no. 2008-27; Resource allocation--California.; Geographic information systems--California.; Spatial analysis (Statistics); Regression analysis. |
| Description | Performed in cooperation with the California Dept. of Transportation and the Federal Highway Administration.; "November 2008."; Includes bibliographical references (p. 115-116). |
| Creator | Goulias, Konstadinos G. |
| Publisher | California PATH Program, Institute of Transportation Studies, University of California at Berkeley |
| Contributors | Golob, Thomas F.; Yoon, Seo Youn.; California. Dept. of Transportation.; University of California, Berkeley. Institute of Transportation Studies.; Partners for Advanced Transit and Highways (Calif.) |
| Type | Text |
| Language | eng |
| Relation | Also available online.; http://www.path.berkeley.edu/PATH/Publications/PDF/PRR/2008/PRR-2008-27.pdf; http://worldcat.org/oclc/302029790/viewonline |
| Title-Alternative | Statewide optimal resource allocation tool using GIS, spatial analysis, and regression methods |
| Date-Issued | [2008] |
| Format-Extent | 116 p. : ill., charts, maps ; 28 cm. |
| Relation-Is Part Of | California PATH research report, UCB-ITS-PRR-2008-27; PATH research report ; UCB-ITS-PRR-2008-27. |
| Transcript | ISSN 1055- 1425 November 2008 This work was performed as part of the California PATH Program of the University of California, in cooperation with the State of California Business, Transportation, and Housing Agency, Department of Transportation, and the United States Department of Transportation, Federal Highway Administration. The contents of this report reflect the views of the authors who are responsible for the facts and the accuracy of the data presented herein. The contents do not necessarily reflect the official views or policies of the State of California. This report does not constitute a standard, specification, or regulation. Final Report for Task Order 5315 CALIFORNIA PATH PROGRAM INSTITUTE OF TRANSPORTATION STUDIES UNIVERSITY OF CALIFORNIA, BERKELEY A Statewide Optimal Resource Allocation Tool Using Geographic Information Systems, Spatial Analysis, and Regression Methods UCB- ITS- PRR- 2008- 27 California PATH Research Report Konstadinos G. Goulias, Thomas F. Golob, Seo Youn Yoon CALIFORNIA PARTNERS FOR ADVANCED TRANSIT AND HIGHWAYS A Statewide Optimal Resource Allocation Tool Using Geographic Information Systems, Spatial Analysis, and Regression Methods FINAL REPORT Konstadinos G. Goulias Department of Geography & GeoTrans Laboratory University of California Santa Barbara Santa Barbara CA 93106 805- 284- 1597 Goulias@ geog. ucsb. edu Thomas F. Golob Institute of Transportation Studies University of California Irvine Seo Youn Yoon Department of Geography & GeoTrans Laboratory University of California, Santa Barbara Project: PATH Task Orders 5110 & 6110 A GIS- based Tool for Forecasting the Travel Demands of Demographic Groups within California – An Optimal Resource Allocation Tool October 2008 Santa Barbara, CA FINAL REPORT – PATH TASK ORDERS 5110 & 6110 - October 2008 1 K. Goulias, T. Golob, and S. Y. 
Yoon

Table of Contents

Executive Summary
1. Introduction
2. Background
3. Optimality Assessment
4. Inequality Assessment
5. Microanalysis (Person Based) Analysis
6. Microanalysis Using Regression Models
  6.1 Adults Who Do Not Drive
    6.1.1 Census Tract Model
    6.1.2 Comparison with Block Group Model
  6.2 Transit Usage by Any Household Member
    6.2.1 Census Tract Model
    6.2.2 Comparison with Block Group Model
  6.3 Transit Usage by an Adult Driver in the Household
    6.3.1 Census Tract Model
    6.3.2 Comparison with Block Group Model
  6.4 Nonmotorized Travel by Any Household Member
    6.4.1 Census Tract Model
    6.4.2 Comparison with Block Group Model
  6.5 Nonmotorized Travel - by an Adult Driver in the Household
    6.5.1 Census Tract Model
    6.5.2 Comparison with Block Group Model
  6.6 High Occupancy Vehicle (HOV) Demand (Driving with Anyone as a Passenger)
    6.6.1 Census Tract Model
    6.6.2 Comparison with Block Group Model
  6.7 Adult Driver as a Passenger in an HOV
    6.7.1 Census Tract Model
    6.7.2 Comparison with Block Group Model
  6.8 Adult HOV Passenger Travel Time
    6.8.1 Census Tract Model
    6.8.2 Comparison with Block Group Model
  6.9 Solo Driving Demand - Household Solo Driving
    6.9.1 Census Tract Model
    6.9.2 Comparison with Block Group Model
  6.10 Adult Solo Driving Time
    6.10.1 Census Tract Model
    6.10.2 Comparison with Block Group Model
7. Models Combining Sociodemographics and Spatial Variables from Tracts and Block Groups
  7.1 Nonmotorized Travel by any Household Member
  7.2 Nonmotorized Travel by an Adult Driver in the Household
  7.3 Adult HOV Passenger Travel Time
8. Summary and Conclusions
9. Next Steps
References

Executive Summary

The overall objective of this project is to develop an optimal resource allocation tool for the entire state of California using Geographic Information Systems and widely available data sources. As this tool evolves it will be used to make investment decisions in transportation infrastructure while accounting for the spatial and social distribution of their impacts. Tools of this type do not exist due to lack of suitable planning support tools, lack of efforts in assembling data and information from a variety of sources, and lack of coordination in assembling the data. Suitable planning support tools can be created with analytical experimentation to identify the best methods, and the first steps are taken in this project. Assembly of widely available data is also demonstrated in this project. Coordination of fragmented jurisdictions remains an elusive task that is left outside the project. When this project began we confronted some of these issues and embarked on a path of feasibility demonstration in the form of a pilot project that gave us very encouraging results. In spite of this pilot nature aiming at demonstration of technical feasibility, substantive conclusions and findings are also extracted from each analytical step.
In this project we have two parallel analytical tracks: a statewide macroanalysis (called the zonal based approach herein) and an individual and household based microanalysis (called the person based approach herein). In the statewide macroanalysis we study efficiency and equity in resource allocation. Resources are intended as infrastructure availability and access to activity participation offered by the combined effect of transportation infrastructure and land use, measured by indicators of accessibility. Stochastic frontiers are used to study efficiency, and a particular type of inequality measurement called the Theil fractal inequality index is used to study equity in the macroanalysis. The outcomes of this analysis are maps identifying places in California that enjoy higher levels of service when compared to the entire state and places that succeeded in allocating resources in a relatively better way than others.

In the individual microanalysis we use the accessibility indicators from the macroanalysis and expand them by defining a new set of indicators at a second level of spatial (dis)aggregation. Then we use them as explanatory factors of travel behavior with a focus on the use of different travel modes (e.g., driving alone, use of public transportation, and so forth). As expected, infrastructure availability and accessibility to activity opportunities have a significant and substantive effect on the use of different modes. Many resource allocation decisions, then, will impact behavior, which in turn influences the optimality and equity conditions. This implies that decisions about where and when to allocate resources in public and private transportation need to account for changes in behavior in a dynamic fashion, using scenarios of accessibility provision and assessing their impact by studying activity and travel behavior changes.
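The Theil-type inequality measurement mentioned above lends itself to a short illustration. The following is a minimal sketch of a generic population-weighted Theil index with a between-group and within-group decomposition. It is not necessarily the exact formulation used in this project; the function name, its arguments, and the toy numbers are hypothetical, and the sketch assumes a strictly positive accessibility quantity for every zone.

```python
import math
from collections import defaultdict


def theil_decomposition(access, population, group):
    """Return (total, between, within) Theil T values.

    access:     strictly positive accessibility total per zone (assumed input)
    population: residents per zone
    group:      label of the larger area (e.g., county) containing each zone
    """
    total_access = float(sum(access))
    total_pop = float(sum(population))
    # Share of overall accessibility and of overall population held by each zone.
    s = [a / total_access for a in access]
    p = [n / total_pop for n in population]

    # Aggregate the same shares to the group level.
    group_s = defaultdict(float)
    group_p = defaultdict(float)
    for s_i, p_i, g in zip(s, p, group):
        group_s[g] += s_i
        group_p[g] += p_i

    # Overall index: accessibility shares compared with population shares.
    total = sum(s_i * math.log(s_i / p_i) for s_i, p_i in zip(s, p))
    # Between-group term: inequality among the groups themselves.
    between = sum(group_s[g] * math.log(group_s[g] / group_p[g]) for g in group_s)
    # Within-group term: inequality among zones relative to their own group.
    within = sum(
        s_i * math.log((s_i / group_s[g]) / (p_i / group_p[g]))
        for s_i, p_i, g in zip(s, p, group)
    )
    return total, between, within


# Toy example: four zones in two hypothetical counties "A" and "B".
total, between, within = theil_decomposition(
    access=[10.0, 30.0, 20.0, 40.0],
    population=[100, 100, 100, 100],
    group=["A", "A", "B", "B"],
)
```

In this sketch the between-group and within-group terms add up to the overall index, which is the property that allows inequality to be read at more than one nested geographic scale. An index of zero corresponds to accessibility distributed exactly in proportion to population.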
There are four distinct work tasks that we describe in this report.

First, we assembled statewide spatial US census data at two levels of nested geographic subdivisions, the tract level and the block group level, and merged them with a highway network of the same vintage (year 2000). Each subdivision is considered as a center and around each center we create travel time and travel distance buffers. Within each buffer we compute the number of persons working in each industry (retail, education, health, manufacturing, and all other activities) to represent the spatial opportunities to participate in activities available to the residents of each virtual center. We also count the number of facility kilometers to represent the supply of infrastructure.

Second, we use the data from the first task to study the ability of each area to provide services to its residents, and then we compare all these areas and rank them based on stochastic frontiers, which is a regression method. We named this method the efficiency measurement because it allows us to link infrastructure provision (as the input) to the accessibility offered (as the output). Stochastic frontier analysis captures and depicts the complex set of relationships among highways and accessibility, showing that providing more roadways is not always better for access to opportunities. This happens either because of competition for space and/or because the spatial distribution of activity opportunities does not follow these roadways but obeys other spatial distribution rules. The regression results also show that the role of roadways depends not only on the measurement indicator used but also on the presence of other surrounding roadways. Overall, however, the presence of primary roadways has a strong positive impact on access. For core access the secondary roadways seem to have a much higher impact and merit attention for investment.
Efficiency in the transformation of roadways to access depends on the residents of each tract and on the measurement of access (outer ring vs. middle ring).

Third, we demonstrate a method that identifies specific locations in the entire state where resource allocation has succeeded in maximizing benefits to the public. Using a derived factor of accessibility for the population residing in each block group, an index for the entire state was computed that measures the disparities in accessibility featured by the block groups in regard to their population. This same index can thus constitute a first tool for policy makers who consider equality as a criterion for the allocation of infrastructure investment. Then we implement a fractal inequality index (an index based on the nested spatial structure of counties, tracts, and block groups), called the Theil index, that gives us a better understanding of the spatial distribution of inequality throughout different geographical scales. This index gives information about the disparities in accessibility between counties as well as inside the counties themselves. The Theil index we implemented here constitutes a tool that is easy to understand thanks to its intuitive definition, easy to implement since it relies on data that are largely available, and able to give instructive information about the structure of inequality in providing access to residents. It shows which locations in California fail to be equitable and require their residents to travel excessively to pursue the same amount of activities as residents of other locations where travelling enables better time allocation.

Fourth, the wealth of spatial indicators developed using information from census tracts, census block groups, and the extensive roadway network in California was used as explanatory variables in regression models of travel behavior.
Each set of these accessibility capturing variables affects different travel behaviors in different ways. Household density, retail employee density, and road infrastructure provided meaningful explanation of the variety in travel behaviors we observe, capturing the impact of different dimensions of accessibility such as characteristics of the residential area, availability of activity opportunity, and connectivity through road infrastructure.

From the model estimation experiments a variety of findings emerge. From the comparisons between the census tract models and the block group models, we see that the variables describing a behavioral aspect can show different levels and patterns of impact on travel behaviors when they are measured using different areal unit sizes. To be more specific, household density measured in census tracts explained the behavior analyzed here better than household density measured using block groups. From the comparisons, we see that census tracts, covering a larger area around a residence, capture the density impact in more informative ways. However, this cannot be the golden rule for every travel behavior indicator. We need to think about the implications that a specific areal unit has for each travel behavior indicator, test its ability to explain behavior, and decide to use the one that is the most informative. In addition, spatial variables involving shortest paths in computation showed a better ability to discern the impacts of each spatial segment, and also clearer impact patterns of each variable set, when they are computed using smaller unit areas than when they are computed using larger unit areas. Smaller unit areas provide closer approximation of the variables and those variables seem to be
However, the trade-off between obtaining closely approximated explanatory variables and the computational effort required for smaller areal units has to be considered when we decide which areal unit we want to use. In fact, the improvement in the goodness of fit of some regression models was marginal or even totally absent. Moreover, the two aggregation levels used here have their own inherent advantages and disadvantages. Consequently, we also demonstrate building models using spatial variables from both geographic levels, with some clear benefits in explanation and goodness of fit. Overall, however, land use density and supply of roadways are strong and significant explanatory sets of variables and they provide a good candidate for linking land use to travel behavior in policy impact assessments.

In terms of efficiency and inequality, the regression models show that even when investments are made to improve efficiency and/or inequality, they will impact different behaviors in different ways, and their overall impact may not necessarily benefit individuals because different impacts in different facets of behavior may counteract each other. The total effect on the overall daily travel patterns of individuals and groups of individuals exceeds the scope of this project. The only tractable existing method to track these impacts is microsimulation (computer-based synthetic generation of activity and travel patterns of individuals), which is gaining popularity among practitioners.

We believe this project was an immense success as a feasibility pilot study. Existing data sources can be “mined” to extract general useful indications about efficiency and inequality. The same data sources can also be used to gain informative insights about travel behavior and to begin unraveling the complex relationships between infrastructure investments and behavior.
Due to pragmatic considerations in the design of the tool presented here, many limitations do not allow this tool to be used immediately as a planning support system for statewide policy and decision making. Early during the project design phase we discovered there was no comprehensive clearinghouse of statewide information about transportation projects that tracks them from their inception to the final implementation and impact assessment. Assembly of data from a variety of sources to build a database of all the transportation projects and their impacts would have exceeded the scope and time budget of this project. For this reason we approximated infrastructure supply using an inventory of highways in an existing network database. Similarly, we neglected accounting for public transportation facility and network supply. Moreover, we use as highway speed the reported speed limit for each network link, which we know does not represent prevailing speeds; these vary throughout the day, across days of the week, and with many other seasonal rhythms. These considerations point to one of the next steps, which is to create a project that, on the one hand, builds a data warehouse of public and private investments and related projects and, on the other hand, develops a statewide multimodal network that is updated yearly with additions and added documentation about the quality of the infrastructure components represented by the network. Technology to accomplish both steps exists but institutional support is not readily available at this time.

The entire analysis was done using data from the year 2000. The data are from products such as the Census Transportation Planning Package and a roadway network of vintage 2000. The household behavior data span a few months in 2000 and 2001. As a result all the analytical findings are for that period and may not be extendable to other times.
This analysis should be expanded to include other years. Opportunities for new data are multiplying due to the American Community Survey, which in 2010 will most likely release its 5-year estimates for areas with a population of less than 20,000, including census tracts and block groups. This may provide an unprecedented opportunity to study the evolution of accessibility in our state and identify the places and their social and demographic groups that benefited the most by pinpointing geographic areas that increased or decreased residents’ accessibility. Comparisons between the years 2000 and 2010 will reveal changes over time and identify areas in California that benefited the most and areas that benefited the least. If the project information warehousing activity mentioned above is accomplished, we could also distinguish between successful and unsuccessful projects using the tools and ideas in our project.

In the third major area of next steps we can expand the microanalysis to a more comprehensive treatment of travel behavior. This includes activity participation and interactions among household members, trip consolidation in the form of tours, and also the more traditional analysis of trip making. In addition to offering a more detailed picture of the impact that infrastructure and density of opportunities have on travel behavior, this next step also has the potential to improve the statewide transportation model maintained by Caltrans. This last area of analysis is also a fruitful research direction in developing a next generation of land use transportation integrated models. This is an active area of graduate student and faculty research in the University of California Transportation Center (www.uctc.net).

The tasks in this report involved researchers from the University of California Santa Barbara (UCSB) and University of California at Irvine (UCI).
The overall project principal investigator is Kostas Goulias at UCSB. At Irvine, Tom Golob, with assistance from James Marca, extracted a travel behavior database from the California travel survey of 2000 and estimated the first round of travel behavior equations utilizing US Census tract level accessibility indicators. At UCSB, Val Noronha and Bryan Krause converted network and US Census data into usable variables at the tract level, computed a first set of accessibility indicators, and developed maps in GIS. During the first part of the project and based on this work, a variety of issues were identified, and solutions were sketched by Kostas Goulias, presented at a series of presentations, and finalized in the second part of the project. A second set of accessibility indicators based on the US census block group data was then computed by Seo Youn Yoon and Kostas Goulias at UCSB, who also estimated a new set of travel behavior models. They also estimated the stochastic frontier models used in efficiency measurement. Emmanuel Kemmel, Seo Youn Yoon, and Kostas Goulias also developed the Theil index computations.

The first two sections of this report provide a brief presentation of the study background and design. The third section provides a summary of efficiency measurement and computations using US census tract level data and a detailed road network as well as stochastic frontiers. In the fourth section we show the inequality assessment using US census block group data and the Theil computations. This is followed by the fifth section, which shows the distribution of past allocation of road infrastructure across a variety of sociodemographic segments. The sixth section is dedicated to a variety of model estimation experiments to show the impact of provision of infrastructure and accessibility on travel behavior.
This is followed by a seventh section that demonstrates the use of spatial variables calculated at two different but nested geographic levels and the benefit of using them jointly. In the last two sections we provide a brief summary and an outline of three recommended next steps.

1. Introduction

Optimal allocation of resources for infrastructure facilities is a critical issue in planning for development but it is also a critical consideration for the everyday life of travelers. In addition to optimal allocation, equally important is the distribution of benefits in terms of infrastructure facilities (stock) and related quality of service, intended here as the ability to reach desired destinations within an acceptable amount of time (service). Different regions of California have received over the years different levels of investment for private or public transportation. The residents of each of these regions are also “investing” time to travel from one location to another. These are inputs to a production system that has many outputs including local gross product (e.g., regional gross product) and time allocated by the residents to activities (e.g., time for paid work, time dedicated to leisure, and so forth). Depending on local circumstances each region is more or less efficient in maximizing the use of these stock and service resources. Tools exist to judge how efficiently systems work but they focus on economic efficiency and they do not incorporate a comprehensive measure of transportation stock and service offered. Here, we emphasize social efficiency and bring measures of accessibility into the arsenal of resource management and resource allocation to show the degree of efficiency exhibited by different regions in enabling their residents to minimize personal costs and maximize personal benefits.
The research findings presented in this report contain a two-component research program as mentioned in the preface above. The state of California is divided into geographical areas and each is treated as a production unit with its inputs represented by the different types of infrastructure (e.g., lane miles of roadways classified in a finite number of types). The outputs are indicators of the service offered to the unit’s residents in terms of the amount of activities the residents of each geographical area can reach. Figure 1 provides a summary of the schema used in this project.

Figure 1 This project's schema (diagram labels: INPUT; OUTPUT; Stock of facilities - highways by type; Activity opportunities surrounding each zone - opportunities measured by persons in occupations in rings, considering distance and travel time; Human capital - persons and households, household composition, car ownership; Activity and travel behavior; Residents)

2. Background

Typical studies of transportation investment and economic development are discussed in Berechman (1994), Buffington et al. (1992), Perera (1990), Seskin (1990), and Weisbrod and Beckwith (1992). There are also regional studies addressing the impact of transportation infrastructure on local regional economic development. Assessment of these investments is based on the Gross Domestic (Regional) Product or private output as in Allen et al. (1988) and Wilson et al. (1985), benefit-cost ratios and/or differences as in Buffington et al. (1992) and Weisbrod and Beckwith (1992), property values as in Palmquist (1982), and new business creation or location as in Hummon et al. (1986).
Analytical methods in these studies include: a) assessment of the effects of transportation infrastructure investments that compare and contrast the effects of investments among different regions; and b) identification of the important factors that influence and enable economic development. The study here belongs to the first group of analytical methods.

Identification of the impacts from transportation infrastructure investment is particularly important when resources are scarce. From the perspective of decision makers, need assessment and accurate measurement of this need allow effective budgeting and financing of projects. They also allow for informed decisions while evaluating individual projects, balanced distribution of resources, and increased efficiency. Considerable research exists in the analysis of investment and optimal allocation of resources. Transportation improvements influence economic development, productivity, and social welfare. “Pure” economic development impacts are usually regional in nature and result from improved access to labor pools or to larger markets. While considering the economic development of different regions of a country, investment in transportation infrastructure as well as in the overall infrastructure system may play a significant role in removing regional economic disparities. Within the same country and under the same development policies, a significant role for transportation implies that regions with better transportation infrastructure will have better access to the locations of materials and markets, making them more productive, competitive, and hence more successful than regions with inferior transportation accessibility. Better accessibility and mobility also play a significant role in the human resource development of a region. For a review and an application using Data Envelopment Analysis, see Alam et al. (2004); for an example of longitudinal analysis, see Alam et al. (2005); for a Stochastic Data Envelopment Analysis, see Alam et al. (2008); and for project by project economic assessment, see Gkritza et al. (2008).

One could make similar arguments when considering the time expenditures of individuals and households on paid and unpaid work as well as free time with family and friends. However, transportation investment from a “social efficiency” viewpoint is absent from transportation practice. This is mainly due to the lack of tools capable of assessing the role of transportation investment in the efficient allocation of time by the residents of each locality. The tool we aim to develop with the analysis presented here identifies specific locations in the state where resource allocation has succeeded in maximizing benefits to the public. In addition, we aim to develop maps that show which locations in a state fail to be optimal and require their residents to travel excessively to pursue the same amount of activities as other residents of different localities. More specifically in this report, we answer four key questions: a) Using largely available data, can we develop a small number of variables to describe access to activity opportunities for California residents? b) Are more roadways improving access to these activity opportunities? c) Are these roles different for different types of highways and how? d) Can we identify roadways that are prime candidates for investment?

In this analysis the state of California is divided into 7049 zones using the US Census 2000 tracts. The Census tract (unit of analysis here) is selected as a first order geographical subdivision to make the analysis tractable at the state level and to provide sufficient detail to be meaningful (we will repeat this analysis with a smaller geographic unit and revisit this aspect in the conclusions). We assess each tract in terms of its ability to produce benefits for its residents.
Figure 2 provides a schematic representation of the study and Table 1 contains a selection of unit of analysis characteristics.

Figure 2: Computation Schema of the Study (diagram boxes: Assemble data for the 7049 tracts of California from US Census 2000 and Network of Roadways; Compute buffers at 5, 10, and 50 km and 20 and 40 minutes using shortest path; Sum the number of jobs within each buffer; Sum the number of lane km within each buffer; Develop optimality functions and perform assessment)

Envisioning each tract as a production unit and developing for each tract a production function, we measure access to opportunities, treat them as outputs, and correlate them to the presence of roadways within and surrounding the tract. Access to opportunities for activity participation (e.g., leisure) and services (e.g., health) is the benefit (and output) from each tract that we will assess. Using Geographic Information Systems we compute for each tract the amount of activity opportunities reachable within 5 km, 5 to 10 km, and 10 to 50 km. We repeat the same for 20 minutes and 20 to 40 minutes travel time computed using information about speed limits on the roadway network at hand. Computation of these measures is accomplished by developing an origin-destination network with the origins and destinations as centers (population weighted virtual centroids in each tract). Using the same origin-destination network we also count the number of highways within 5 km, 10 km, and 50 km network distance from each centroid.

Table 1 A selection of Census-tract characteristics

Variable                                                      Mean       Std. Dev.   Maximum*
Tract Square Km                                               59.0       453.7       20486.8
Tract Population                                              4805.2     2143.1      36146.0
Tract Households                                              1631.8     763.0       8528.0
Within a 5 Km Buffer from Tract Centroid
Workers in Retail (retail)                                    5031.1     6937.8      54745.0
Workers in Health (health)                                    2644.0     3524.4      26478.0
Workers in Services but not in Health or Retail (services)    28024.4    44497.0     373127.0
Workers in Manufacturing (manufacturing)                      3391.0     5547.7      59059.0
Workers in All Other Occupations (other)                      5753.4     6805.7      50287.0
Primary limited access roadways (primary lim)                 284.1      448.6       3244.8
Primary without limited access roadways (primary nolim)       77.9       140.6       958.5
Secondary and connecting roadways (secondary)                 1867.8     2711.3      17711.4
Rural, local and neighborhood roadways (local)                8549.4     11256.1     71318.1
Special roadways (special)                                    342.1      591.3       4612.7
All Other types of roadways (other)                           778.6      1618.7      10511.1
* The minimum is zero for all variables and tracts

Enjoyment of access is also a function of the tract residents’ ability to take advantage of opportunities offered to them. We attempt to capture this by including social and demographic characteristics of the resident population available in the Census tract databases.

Transportation investment is often directed to facilities and the striking majority of this investment is allocated to roadways. An indicator of transportation supply (the input in the context of production functions) is the amount of roadways (lane kilometres). Roadways, however, serve different purposes and offer different functions to the users depending on their type (e.g., limited access freeways/motorways, secondary roads connecting limited access roadways, local roads). Using Geographic Information Systems, we can identify and count the number of kilometres of each roadway in each tract. Roadways, however, form a complex network and the tracts are interconnected.
For this reason, we perform a similar task as for activity opportunities and measure the amount of roadways by type in a series of concentric rings of 0 to 5 km, 5 to 10 km, and 10 to 50 km. We name these rings the buffers. We repeat the same operation for travel time using 20 minutes and 40 minutes of travel time. The types of roadways we count are: primary highways with limited access (primary lim herein), primary roadways without limited access (primary nolim herein), secondary and connecting roadways (secondary herein), local and rural roads (local herein), roads with special characteristics (special herein), and all other roadways (other herein).

On the one hand, we have as input a detailed accounting of roadways representing all past investment on highways for each origin (tract centroid). On the other hand, we consider as output the number of workers a resident departing from a centroid can reach. The types of workers that are reachable within each of the buffers are classified into: retail, health, services, manufacturing, and all other. These counts are the indicators capturing access to opportunities to participate in activities and enjoy services.

3. Optimality Assessment

The literature on optimal assessment of decision making units is largely populated by Data Envelopment Analysis methods (a review on a related topic can be found in Alam et al. 2004, 2005, and 2008) and Stochastic Frontiers (Greene, 1980). Considering the possible measurement errors in the data used, the presence of outliers, and spatial correlation, we opt for stochastic frontiers, which can handle some of these potentially undermining issues. However, an additional step is required in our analysis before estimating stochastic frontier production functions.
The output, that is, the number of workers that a resident departing from a centroid can reach, is described by 25 indicators (number of workers in retail, health, services, manufacturing, and other within 5 km, within the ring of 5 to 10 km, within the ring of 10 to 50 km, within 20 minutes of travel time, and within the ring of 20 to 40 minutes of travel time). To reduce the data into a few variables we use factor analysis with the principal components method, extraction based on correlations, and varimax rotation. This yields three components explaining 93% of the variation in the output variables used here. Each component captures a different aspect of access to opportunities surrounding each centroid, and the three components are derived in such a way as to be uncorrelated. Table 2 provides a summary of the component scores (high scores indicate high correlation between the output variable and the component extracted).

The first component represents access to opportunities in the outermost ring, between the radii of 10 km and 50 km, and also within the ring defined by 20 and 40 minutes of travel time; for this reason it is named outer ring access in this study. One variable, the number of workers in manufacturing within 20 minutes of travel time, is more correlated with the first component than with the second, reflecting the predominant location of manufacturing in the outskirts of cities and closer to high speed roadways. The second component represents access to opportunities in the second ring. It is most correlated with variables defined in the ring between a radius of 5 km and a radius of 10 km (named middle ring access herein) and with variables defined within 20 minutes of travel time. Access to opportunities that are the closest to the centroid is represented by the third component (named core access herein), which is most correlated with the remaining variables.
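The data reduction step described above can be sketched in a few lines of Python. This is only an illustration on synthetic data, assuming NumPy is available. The function names pca_loadings and varimax are hypothetical, and the rotation shown is a commonly used SVD-based varimax iteration, which is not necessarily the algorithm implemented in the software used for this study.

```python
import numpy as np

def pca_loadings(X, n_components):
    """Principal components of the correlation matrix of X (rows = tracts)."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)            # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]   # keep the largest ones
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    explained_share = eigvals[order].sum() / eigvals.sum()
    return loadings, explained_share

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix (SVD-based iteration)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        previous = criterion
        rotated = loadings @ rotation
        column_scale = np.diag(np.diag(rotated.T @ rotated))
        target = loadings.T @ (rotated ** 3 - (1.0 / p) * rotated @ column_scale)
        u, s, vh = np.linalg.svd(target)
        rotation = u @ vh
        criterion = s.sum()
        if previous != 0.0 and criterion / previous < 1.0 + tol:
            break
    return loadings @ rotation

# Synthetic stand-in for the output indicators: 3 latent "rings", 3 indicators each.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
X = np.hstack([latent[:, [j]] + 0.3 * rng.normal(size=(500, 3)) for j in range(3)])

loadings, explained_share = pca_loadings(X, n_components=3)
rotated_loadings = varimax(loadings)
```

Because the rotation is orthogonal, the share of variance explained by the retained components is unchanged; only the way it is distributed across components changes, which is what makes the rotated components easier to label as core, middle ring, and outer ring.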
For each California tract we compute each of the three components (corresponding to three concentric regions around each centroid: core, middle ring, outer ring) using the scores of Table 2 and the value of each variable used to extract them. These three components replace the 25 variables and are used as the dependent variables in stochastic frontier analysis.

Table 2: The three principal components extracted from 25 output variables and their scores

                                                      Outer Ring  Middle Ring     Core
Variable                                                  Access       Access   Access
Number of Workers in Retail (20 to 40 min)                 0.945        0.276    0.139
Number of Workers in Services (20 to 40 min)               0.941        0.250    0.128
Number of Workers in Other (20 to 40 min)                  0.941        0.275    0.150
Number of Workers in Manufacturing (20 to 40 min)          0.939        0.245    0.130
Number of Workers in Health (20 to 40 min)                 0.936        0.287    0.140
Number of Workers in Retail (10 to 50 km)                  0.927        0.330    0.159
Number of Workers in Manufacturing (10 to 50 km)           0.926        0.311    0.129
Number of Workers in Other (10 to 50 km)                   0.925        0.329    0.157
Number of Workers in Services (10 to 50 km)                0.924        0.326    0.163
Number of Workers in Health (10 to 50 km)                  0.919        0.343    0.169
Number of Workers in Manufacturing (0 to 20 min)           0.665        0.625    0.265
Number of Workers in Services (5 to 10 km)                 0.234        0.878    0.296
Number of Workers in Retail (5 to 10 km)                   0.322        0.868    0.275
Number of Workers in Other (5 to 10 km)                    0.380        0.841    0.289
Number of Workers in Health (5 to 10 km)                   0.267        0.817    0.350
Number of Workers in Manufacturing (5 to 10 km)            0.438        0.766    0.220
Number of Workers in Services (0 to 20 min)                0.504        0.703    0.430
Number of Workers in Health (0 to 20 min)                  0.532        0.688    0.421
Number of Workers in Retail (0 to 20 min)                  0.585        0.680    0.389
Number of Workers in Other (0 to 20 min)                   0.605        0.672    0.345
Number of Workers in Services (0 to 5 km)                  0.071        0.198    0.955
Number of Workers in Retail (0 to 5 km)                    0.139        0.226    0.942
Number of Workers in Other (0 to 5 km)                     0.190        0.325    0.871
Number of Workers in Health (0 to 5 km)                    0.075        0.308    0.839
Number of Workers in Manufacturing (0 to 5 km)             0.289        0.354    0.699

Stochastic frontiers were developed for models of production. A production function gives the ideal (maximum) amount a unit can produce for a given set of inputs. In empirical settings observed outputs fall short of this ideal for two reasons: unknown random factors and measurement error (v) that are specific to each observed unit, and productive inefficiency (u) that also varies with each observed unit. To examine the relationship between output variables (access to opportunities) and input variables (highways), a regression model is created in which the dependent variable (y) is the indicator of output and the independent variables (x) are the highway lane kilometers. The model we use here takes the following form:

$$ y_i = \alpha + x_i'\beta + v_i - u_i $$

Index i represents each tract, i = 1, ..., 7049. We estimate three regression equations, one for each of the three components of Table 2 (core access, middle ring access, outer ring access). In each equation y is the logarithm of the component values for each tract. The x variables are the amounts of highways of each type in each geographic subdivision. The vector β contains the regression coefficients we seek. Variable v is the usual random error term capturing measurement error, and variable u is a positive valued offset between observed access and the ideal maximum possible given the input combination of roadways within each tract. The random error term v is assumed to be normally distributed with zero mean and constant variance across observations. The random positive valued term u is specified as a function of other explanatory variables.
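To make the composed error v - u concrete, the following sketch simulates it and evaluates a frontier log-likelihood. It is an illustration only and rests on a simplifying assumption that is not the specification of this report: u is taken to be half-normal with constant variance (the classical normal/half-normal frontier), whereas the models here let u depend on tract characteristics. All function names are hypothetical. The parameters lam (σu/σv) and sigma (the square root of σu² + σv²) correspond to the two summary quantities reported at the bottom of Table 3.

```python
import math
import random

def norm_logpdf(z):
    """Log density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def norm_logcdf(z):
    """Log cumulative distribution of the standard normal distribution."""
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))

def frontier_loglik(residuals, sigma, lam):
    """Normal/half-normal frontier log-likelihood for e_i = y_i - alpha - x_i'beta.

    sigma = sqrt(sigma_u^2 + sigma_v^2), lam = sigma_u / sigma_v.
    """
    total = 0.0
    for e in residuals:
        total += (math.log(2.0) - math.log(sigma)
                  + norm_logpdf(e / sigma)
                  + norm_logcdf(-e * lam / sigma))
    return total

def simulate_composed_error(n, sigma_u, sigma_v, seed=0):
    """Draw noise v ~ N(0, sigma_v^2) and inefficiency u = |N(0, sigma_u^2)|."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, sigma_v) for _ in range(n)]
    u = [abs(rng.gauss(0.0, sigma_u)) for _ in range(n)]
    return v, u

def efficiency(u):
    """Unit-specific efficiency exp(-u), a value in (0, 1]."""
    return [math.exp(-ui) for ui in u]

v, u = simulate_composed_error(2000, sigma_u=0.5, sigma_v=0.2, seed=1)
residuals = [vi - ui for vi, ui in zip(v, u)]   # e_i = v_i - u_i
eff = efficiency(u)
sigma_true = math.sqrt(0.5 ** 2 + 0.2 ** 2)
loglik_frontier = frontier_loglik(residuals, sigma_true, lam=0.5 / 0.2)
loglik_symmetric = frontier_loglik(residuals, sigma_true, lam=0.0)
```

With lam set to zero the likelihood collapses to that of a symmetric normal error, so comparing the two log-likelihood values gives a rough sense of how much the one-sided inefficiency term contributes to explaining the residuals.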
In the terminology of production functions the values u_i are the measures of inefficiency for each tract i in transforming lane kilometers of roadways into access to opportunities. Computing exp(-u_i) gives a measure of tract-specific efficiency. Estimation of the three models presented here is carried out using LIMDEP (Greene, 2002). Table 3 shows the regression coefficients associated with each input variable (lane kilometers of each roadway type in the core, the middle ring, and the outer ring). The correlation between the y variable and its predicted values using the estimated model coefficients is 0.895 for the outer ring, 0.731 for the middle ring, and 0.744 for the core, indicating a good fit between the data and the production functions derived here.

Table 3: Stochastic Frontier Regression Coefficients

                                             Outer Ring         Middle Ring            Core
Variable                                  Coeff.  t ratio    Coeff.  t ratio    Coeff.  t ratio
Constant                                  -0.413    -3.13     0.857    13.80     1.685    17.89
Log(primary lim in core)                  -0.094    -1.71     0.203    11.17     0.443    13.01
[Log(primary lim in core)]^2              -0.053    -2.29     0.070     8.48     0.135     8.95
Log(primary nolim in core)                 0.016     0.23    -0.181    -8.11     0.477    10.25
[Log(primary nolim in core)]^2             0.001     0.05    -0.039    -5.26     0.137     9.17
Log(secondary in core)                     0.035     0.94    -0.195   -13.96     0.748    25.71
[Log(secondary in core)]^2                -0.072    -5.71    -0.011    -2.07     0.172    19.97
Log(local in core)                        -0.101    -3.75     0.091     8.28    -0.160    -7.86
[Log(local in core)]^2                     0.021     2.89     0.020     6.55    -0.100   -20.05
Log(special in core)                       0.068     1.21    -0.190   -10.05    -0.145    -4.59
[Log(special in core)]^2                   0.045     2.02    -0.050    -5.91    -0.103    -6.92
Log(other in core)                        -0.004    -0.22    -0.010    -1.53    -0.058    -5.36
[Log(other in core)]^2                    -0.003    -0.47    -0.001    -0.59    -0.024    -6.66
Log(primary lim in middle ring)            0.098     2.33    -0.020    -1.07    -0.115    -3.70
[Log(primary lim in middle ring)]^2        0.077     5.83    -0.036    -6.60    -0.055    -6.56
Log(primary nolim in middle ring)          0.048     3.44     0.039     9.50    -0.082    -9.54
[Log(primary nolim in middle ring)]^2      0.028     5.13     0.003     1.69    -0.047   -13.76
Log(secondary in middle ring)             -0.155    -3.18     0.146     6.08    -0.249    -6.01
[Log(secondary in middle ring)]^2         -0.065    -6.13     0.044     9.09    -0.062    -7.06
Log(local in middle ring)                  0.025     0.63    -0.014    -0.69     0.059     1.69
[Log(local in middle ring)]^2              0.015     2.14    -0.020    -6.01     0.058    10.11
Log(special in middle ring)               -0.083    -1.78     0.085     4.37    -0.009    -0.25
[Log(special in middle ring)]^2           -0.071    -5.22     0.061    11.10     0.012     1.28
Log(other in middle ring)                  0.034     1.76    -0.005    -0.69     0.042     3.13
[Log(other in middle ring)]^2              0.021     4.80     0.006     3.93     0.023     8.69
Log(primary lim in outer ring)             0.077     1.47    -0.012    -0.36     0.002     0.03
[Log(primary lim in outer ring)]^2        -0.051    -2.56     0.025     2.71    -0.003    -0.20
Log(primary nolim in outer ring)          -0.071    -2.18     0.045     3.16     0.041     1.70
[Log(primary nolim in outer ring)]^2       0.007     0.75    -0.018    -3.85     0.000    -0.06
Log(secondary in outer ring)              -0.041    -0.66     0.006     0.17    -0.008    -0.13
[Log(secondary in outer ring)]^2           0.030     2.27    -0.019    -2.65     0.010     0.82
Log(local in outer ring)                   0.066     1.40    -0.062    -1.73     0.006     0.12
[Log(local in outer ring)]^2               0.007     0.90     0.003     0.62    -0.010    -1.33
Log(special in outer ring)                -0.090    -1.80     0.058     1.97     0.009     0.19
[Log(special in outer ring)]^2             0.093     6.18    -0.019    -2.72     0.006     0.51
Log(other in outer ring)                   0.012     0.47     0.005     0.42     0.018     0.82
[Log(other in outer ring)]^2              -0.025    -5.63     0.002     0.74    -0.008    -2.18
Constant for u                            -0.718    -8.06   -17.693   -14.36    -0.144    -3.18
Household density                         -0.578   -69.66     1.059    10.34
Tract perimeter (km)                                                            -1.375   -22.05
σu/σv                                      3.797    28.05    13.069    17.89     2.612    45.34
σ = sqrt(σu^2 + σv^2)                      0.680   150.31     1.359    17.65     0.468    77.43

The signs, sizes, and significance of the regression coefficients show how the presence and amount of different types of roadways affect the ability of each geographical tract to provide access to opportunities. A negative sign associated with roadways in the same region (core, middle ring, outer ring) as the dependent variable is more likely to indicate competition for space with businesses and establishments providing services. A positive coefficient is more likely to indicate a clustering of establishments around those roadway types. Positive coefficients associated with variables in regions other than that of the dependent variable indicate a supportive relationship with access. For example, access to the outer ring may be achieved by driving over local roads in the core, secondary roads in the middle ring, and again local roads in the outer ring. Different establishments, however, may be reached by different combinations of roadways. As a result we obtain a variety of significance levels, signs, and sizes of coefficients that may not all correspond to intuition.
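One way to read a pair of coefficients in Table 3 is to combine the linear and squared terms into a single marginal effect. The short derivation below is a sketch under two assumptions that the report does not state explicitly: the rows with a trailing 2 are the squared log terms, and each roadway variable enters the equation only through its own log and squared log. Writing b_1 and b_2 for the two coefficients of a roadway variable x:

$$ \frac{\partial y}{\partial \log x} = b_1 + 2\, b_2 \log x $$

For primary limited access roadways in the core, in the core access equation, this gives

$$ \frac{\partial y}{\partial \log x} = 0.443 + 2(0.135)\log x = 0.443 + 0.270 \log x $$

so the estimated effect on the log of core access grows with the amount of such roadways already present, whereas for a pair with a negative squared coefficient (for example local roadways in the core, -0.160 and -0.100) the effect becomes more negative as the amount increases.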
As expected, access to the outer ring is influenced by roadway quantity in the core, the middle ring, and the outer ring. However, lower speed facilities in the core (local and secondary roadways) seem to have a stronger influence than the higher speed ones (primary roadways). The middle ring primary roadways have a strong positive impact on access in the outer ring. These two indications are a reflection of the routes leading to the outer ring with a high presence of opportunities. However, if there are many primary roadways in the outer ring they compete for space with the establishments where opportunities locate, and this is reflected in a few negative coefficients associated with roadways in the outer ring (primary nolim and secondary).

Access to the middle ring is even more heavily influenced by the amount and type of roads in the core (positively by high speed roadways and negatively by lower speed roadways). The core access is not influenced by roadways in the outer ring, i.e., a driver does not need to go into the outer ring when reaching places within the 5 km radius around a tract centroid, and this is reflected in the lack of significance for most of the outer ring variables. In contrast, primary roadways in the middle ring seem to decrease access to the core in a significant way. This is a reflection of the spatial organization of California's roadway network and the spatial distribution of activity opportunities adjacent to the network's roadways. Unfortunately, all this is also masked by the use of the summary indicators (i.e., the principal components) as dependent variables, since each contains variables from all three regions (i.e., core, middle ring, and outer ring).

When aiming at improving access to opportunities around the core, however, provision of primary and secondary roadways appears to be a worthwhile investment.
When we examine the other two components, which are heavily influenced by variables that include travel time, the picture is not as clear and may point to the need for improving travel times on local and secondary roadways in regions that lead to the middle and outer rings.

The bottom portion of Table 3 contains the estimates for variables influencing inefficiency. Exp(-u_i) is a measure of technical efficiency: it is the ratio of achieved access to the maximum possible access for the given inputs. The outer ring and middle ring efficiencies (and their opposite, inefficiencies) are significantly different among tracts of different household densities (households per square kilometer). The core efficiency is a function of the perimeter of the tract, indicating a possible problem with the use of the tract as a unit of analysis. In a series of other specifications not shown here we also find that multi-car (> 4) households live in tracts with lower efficiency, presumably because they are able to combat lack of access with automobility. Other variables considered, such as number of households by household size, did not exhibit a clear trend.

The median efficiency indicators are fairly high at 84%, 92%, and 81% for the outer ring, middle ring, and core respectively. The tenth percentiles are 72%, 83%, and 62% for the outer ring, middle ring, and core respectively, indicating fairly good efficiency for a system that evolved without a major plan targeting high efficiency. However, considering the large size of many tracts, access to opportunities may be quite different among the residents within these tracts (see also the inequality section below).

The final examination we perform for these computed efficiencies is to map them for the entire state. Figure 3 shows the three efficiency indicators for Los Angeles, California, using the deciles (10% percentile increments) as cutoff points.
The first quadrant shows the Los Angeles total lane kilometers of roadways. Each efficiency estimate captures a different aspect of access to locations, and the maps show that providing more lane kilometers does not by itself make a geographical area more accessible on any of the three efficiency measures. These same efficiency estimates were also computed for the entire state. Figure 4a shows the core efficiency map at 10% percentile increments. Figure 4b shows the middle ring efficiency and Figure 4c shows the outer ring efficiency.

[Figure 3: Maps of lane kilometers and efficiency measures in Los Angeles, California. Panels: total lane kilometers within core; core efficiency; middle ring efficiency; outer ring efficiency.]

[Figure 4a: Core Efficiency Estimates]

[Figure 4b: Middle Ring Efficiency Estimates]

[Figure 4c: Outer Ring Efficiency Estimates]

4. Inequality Assessment

This section presents a method to highlight the mismatch that exists between the distribution of the population and the allocation of roads and activity access in California. The tool we aim to develop with the analysis presented here identifies specific locations in the state where resource allocation has succeeded in offering a uniform spatial spread of benefits to the public.
In addition, we aim to develop maps that show which locations in a country (a state in our study) fail to be equitable, requiring their residents to travel excessively to pursue the same amount of activities as residents of other localities. In this section, we answer a few key questions: Using widely available data, can we develop a small number of variables to describe access to activity opportunities for California residents? Is it possible to capture the structure of inequality in accessibility through a multi-scale analysis? Can we identify areas that are prime candidates for investment?

To answer these questions the state of California is divided into 22,133 zones using the US Census 2000 block groups. The Census block group (the unit of analysis here) is selected as a first order geographical subdivision to make the analysis tractable at the state level and to provide sufficient detail to be meaningful. We assess each block group in terms of its ability to produce benefits for its residents and compare each block group with other block groups within a census tract. We repeat the same comparison using tracts within counties, and counties within the state. Figure 5 provides a schematic representation of the study and Table 4 contains a selection of unit of analysis characteristics.

[Figure 5: Computation Schema of the Inequality Study. Steps shown: assemble data for the 22,133 block groups of California from US Census 2000 and a 2000 vintage network of roadways; compute buffers at 5, 10, and 50 km and 20, 40, and 60 minutes using shortest path; sum the number of jobs within each buffer; sum the number of lane km within each buffer; develop summary indicators of accessibility; create Theil indicators for block groups, tracts, and counties.]

Access to opportunities for activity participation (e.g., leisure) and services (e.g., health) is the benefit (and output) from each block group that we will assess. As indicators of available opportunities in a block group, we use numbers of workers classified according to the North American Industry Classification System (NAICS). The original NAICS classification of fourteen types of industries was aggregated into five types (retail, health, services, manufacturing, and all other), considering the types of activity in which people can participate in relation to each industry. Using Geographic Information Systems (Network Analyst in ArcGIS 9.1), we identified the areas reachable within 20 minutes, 40 minutes, and 60 minutes of travel time using information about speed limits on the roadway network at hand. The network data we used for the analyses contain information about road type, segment length, speed limit, turning restrictions, and one-way streets, enabling a somewhat realistic modeling of the travel environment. Identification of the reachable areas is accomplished by developing two sets of shortest path networks for the origin-destination matrix of the block group centroids, using travel time and travel distance as travel cost respectively, and querying the block groups by the travel costs. Combining the reachable areas with the numbers of workers in each block group, accessibility to activity participation was calculated as the enumeration of workers of each industry within each reachable area.

Table 4: A selection of block group characteristics

Variable                                                            Mean  Std. Dev.   Maximum*
Block Group Square Km                                              18.51     179.59   12219.12
Block group Population                                            1530.3    1008.48      36146
Block group Households
Within a 20 min travel time buffer from block group Centroid:
  Workers in Retail (retail)                                    56324.49   48926.91     202513
  Workers in Health (health)                                    96664.34   89718.16     389816
  Workers in Services but not in Health or Retail (services)    23812.89   23757.93      87798
  Workers in Manufacturing (manufacturing)                      80640.04   88937.65     339848
  Workers in All Other Occupations (other)                      75843.44   68947.56     270979
  Primary limited access roadways (primary lim)                   266.53     206.05     885.86
  Primary without limited access roadways (primary nolim)           78.4      82.01     552.42
  Secondary and connecting roadways (secondary)                   650.52     425.51    2333.31
  Rural, local and neighborhood roadways (local)                 2561.13    1782.39   12545.59
  Special roadways (special)                                        23.2      39.44      483.4
  All Other types of roadways (other)                             223.78     275.34    1984.31
* The minimum is zero for all variables and tracts

In a similar way as was done for the tract level, transportation supply is represented by the amount of roadways (lane kilometers) by type (e.g., limited access freeways/motorways, secondary roads connecting limited access roadways, local roads), but this time measured at the level of a US census block group. Using Geographic Information Systems, we can identify and measure the kilometers of each roadway type in each block group. Roadways, however, form a complex network interconnecting the block groups, and through the roadway network the block groups provide activity opportunities to others and also get benefits from others. For this reason, we perform a similar task as for activity opportunities and sum the length of roadway segments by type in a series of concentric areas that are accessible in 20 minutes, 40 minutes, and 60 minutes of travel time, to quantify roadways that are available from an origin that is considered here as a virtual center of the block group (named centroid).
We name these areas the buffers (similarly to the process followed in the previous section). The types of roadways we count are: primary highways with limited access (primary lim herein), primary roadways without limited access (primary nolim herein), secondary and connecting roadways (secondary herein), local and rural roads (local herein), roads with special characteristics (special herein), and all other roadways (other herein).

On the one hand, we have as input a detailed accounting of roadways representing all past investment on highways for each origin, together with the number of workers a resident departing from a centroid can reach. These counts are the indicators capturing access to opportunities to participate in activities and enjoy services. On the other hand, the main beneficiaries of transportation policies are the persons residing in an origin block group. One objective in transportation is to maximize accessibility for most persons. However, some segments of the population receive lower benefits than others, and inequality assessments are needed to make comparisons. The assessment of inequality is very often limited to a few disadvantaged population segments (Blumenberg, 2008 - http://www.opportunitycars.com/articles/documents/20051205_Blumenberg.pdf) and does not encompass an entire state or country. In contrast, inequality is a very popular subject in other fields (Krugman and Venables, 1995, Schneider et al., 2002, Ghose, 2004). Considering the strong spatial correlation among accessibility indicators (due to the connectivity of the highway network and the agglomeration of businesses), we opt for an index of inequality that has a "fractal" nature (i.e., decomposable geographically) and that can handle multiple output variables.
The output, that is, the number of workers that a resident departing from a centroid can reach, is described by 25 indicators that are: number of workers in retail, health, services, manufacturing, and other employment within 5 km, within 10 km, within 50 km, within 20 minutes of travel time, within 40 minutes of travel time, and within 60 minutes of travel time. To reduce the data into a few variables we use factor analysis with the principal components method and extraction based on correlations, in the same manner as for the tracts in the efficiency analysis. During a first stage using all the variables, this method produced a few variables that were only marginally informative (as expected due to the strong relationship among the 25 variables considered here) and they were eliminated from further analysis. The reduced set of variables considered in this analysis produced only one factor, which captures 90.03% of the variation in the variables used here. Table 5 provides a summary of the component scores (high scores indicate high correlation between the output variable and the component extracted). For each California block group we compute this "accessibility" factor (a_i, i = 1, ..., 22,133). Figure 6 shows the ratio a_i/n_i, with n_i the resident persons in each block group. These figures show the disparities that exist in providing accessibility at each block group. The figures, however, do not reflect the relationship of accessibilities between block groups and do not provide an indicator that compares them directly with the overall accessibility of the state and its spatial structure.

Table 5: The factor created using a reduced set of the 25 output variables and their scores

Variable                                                                              Loading for accessibility factor
NUMBER OF WORKERS IN MANUFACTURING INDUSTRY (WITHIN 20 MINUTE BUFFER)                     0.8669
NUMBER OF WORKERS IN RETAIL INDUSTRY (WITHIN 20 MINUTE BUFFER)                            0.9233
NUMBER OF WORKERS IN EDUCATION/HEALTH SERVICE INDUSTRY (WITHIN 20 MINUTE BUFFER)          0.8957
NUMBER OF WORKERS IN OTHER INDUSTRY (WITHIN 20 MINUTE BUFFER)                             0.8595
NUMBER OF WORKERS IN MANUFACTURING INDUSTRY (WITHIN 40 MINUTE BUFFER)                     0.9675
NUMBER OF WORKERS IN RETAIL INDUSTRY (WITHIN 40 MINUTE BUFFER)                            0.9881
NUMBER OF WORKERS IN EDUCATION/HEALTH SERVICE INDUSTRY (WITHIN 40 MINUTE BUFFER)          0.9828
NUMBER OF WORKERS IN PUBLIC ADMINISTRATION INDUSTRY (WITHIN 40 MINUTE BUFFER)             0.9538
NUMBER OF WORKERS IN OTHER INDUSTRY (WITHIN 40 MINUTE BUFFER)                             0.9757
NUMBER OF WORKERS IN MANUFACTURING INDUSTRY (WITHIN 60 MINUTE BUFFER)                     0.9640
NUMBER OF WORKERS IN RETAIL INDUSTRY (WITHIN 60 MINUTE BUFFER)                            0.9719
NUMBER OF WORKERS IN EDUCATION/HEALTH SERVICE INDUSTRY (WITHIN 60 MINUTE BUFFER)          0.9700
NUMBER OF WORKERS IN PUBLIC ADMINISTRATION INDUSTRY (WITHIN 60 MINUTE BUFFER)             0.9490
NUMBER OF WORKERS IN OTHER INDUSTRY (WITHIN 60 MINUTE BUFFER)                             0.9704
PRIMARY HIGHWAY WITH LIMITED ACCESS (WITHIN 20 MINUTE BUFFER)                             0.9313
LOCAL, NEIGHBORHOOD, and RURAL ROAD (WITHIN 20 MINUTE BUFFER)                             0.9000
PRIMARY HIGHWAY WITH LIMITED ACCESS (WITHIN 20 MINUTE BUFFER)                             0.9852
LOCAL, NEIGHBORHOOD, and RURAL ROAD (WITHIN 20 MINUTE BUFFER)                             0.9798
PRIMARY HIGHWAY WITH LIMITED ACCESS (WITHIN 40 MINUTE BUFFER)                             0.9570
SECONDARY and CONNECTING ROAD (WITHIN 40 MINUTE BUFFER)                                   0.9478
LOCAL, NEIGHBORHOOD, and RURAL ROAD (WITHIN 40 MINUTE BUFFER)                             0.9439
SECONDARY and CONNECTING ROAD (WITHIN 40 MINUTE BUFFER)                                   0.9760
[Figure 6: Accessibility and Theil Maps. Left: accessibility per capita in each block group. Right: Theil index contribution by each block group.]

Under ideal data availability we would like to identify every resident of California, compute an accessibility index associated with each resident, and then perform a comparative analysis to assess who enjoys higher accessibility and who does not. Although this is not an impossible task with today's modeling and simulation capabilities, it violates one of the initial requirements of this study, namely using widely available data to explore new techniques. In addition, the accessibility of one location is related to the accessibilities of its neighbors. We start with block group subdivisions and compute an indicator that accounts for the distribution of accessibility. We then consider increasingly larger geographical areas to illustrate the use of the Theil index. The following equation shows the Theil index computed using the block group data in California.

$$ T = \sum_{i=1}^{22{,}133} \frac{a_i}{A} \log\left( \frac{a_i / A}{n_i / N} \right) $$

Here A is the sum of factor values for the entire state of California (A = Σ a_k) and N is the population of the entire state of California (N = Σ n_k). For each block group i, we name the ratios a_i/A and n_i/N the accessibility share and the population share respectively. An important advantage of this index over other measures of inequality is its composition. Each component of the sum in the equation above is a weighted log ratio of the accessibility over the resident population in the block group. Each component in the Theil index is then a weighted measure of the mismatch between a block group's accessibility share and its population share. Thus, our interest will focus on each term of the sum, which we name the contribution of the block group to the Theil index, or Theil contribution. The right hand side of Figure 6 displays these Theil contributions for each block group.

This map is more instructive than the one on the left hand side because the block groups are compared to each other, which makes it possible to identify the status of each area relative to the entire state in terms of possible mismatches. The block groups colored in yellow are those that make little or no contribution to the Theil index. This means that they enjoy accessibility to roads and activity opportunities in proportion to their population. On the other hand, the green colored areas are those that have an accessibility share higher than their population share, offering excess advantage. This "over accessibility" is to the detriment of the red colored areas, for which the accessibility share is smaller than the population share. Consequently the inhabitants of those block groups may have to spend more travel time to accomplish the same amount of everyday activities than their counterparts who live in advantaged areas. As far as infrastructure investment is concerned, a public policy aiming at a homogeneous development of the state of California should consider the red colored areas as prime candidates for roadway connectivity funding allocation (of course other factors are usually taken into account in allocating resources).

Figure 6 shows that major metropolitan areas such as Los Angeles are particularly advantaged in accessibility, but it seems that this over-accessibility was built to the detriment of the block groups that compose their outskirts and the ones that are situated in the central part of the State. It should be noted, however, that travel time here is computed based on the speed limit of roadways and therefore does not account for congestion. As a result this "advantage" of the urban core is somewhat exaggerated in this analysis.
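The Theil contributions defined in this section can be computed directly from two lists, one of accessibility values and one of populations. The sketch below is illustrative only: the function name theil_contributions is hypothetical, the numbers are invented, and it assumes that every accessibility value a_i and population n_i is strictly positive so that the logarithm is defined.

```python
import math

def theil_contributions(accessibility, population):
    """Return the per-unit terms (a_i/A) * log((a_i/A) / (n_i/N)).

    Assumes all accessibility values and populations are strictly positive.
    """
    total_access = sum(accessibility)      # A
    total_population = sum(population)     # N
    contributions = []
    for a, n in zip(accessibility, population):
        access_share = a / total_access
        population_share = n / total_population
        contributions.append(access_share * math.log(access_share / population_share))
    return contributions

# Two block groups of equal population but unequal accessibility (invented values).
unequal = theil_contributions([30.0, 10.0], [100, 100])
theil_unequal = sum(unequal)

# Accessibility exactly proportional to population: no mismatch anywhere.
proportional = theil_contributions([10.0, 20.0, 30.0], [100, 200, 300])
theil_proportional = sum(proportional)
```

In the first example the advantaged block group has a positive contribution and the disadvantaged one a negative contribution, which corresponds to the green and red areas of the map, while the index itself (the sum) is positive. In the second example every contribution is zero, the situation of the yellow areas.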
A fractal version of Theil's index enables assessment of inequality across larger regions as well as within larger regions to account for highway and land use connectivity. This is indeed the main characteristic that made us prefer the Theil index to the other indexes developed in the economics literature. It is decomposable across different levels (e.g., geographical scales) and yields, for each unit at each scale, a between-unit component and an intra-unit component. In this way we can also account for heterogeneity within a larger area. As already mentioned above, a better way to measure inequality would be to consider each resident, but since, like most analysts, we are dealing with groups, we have to study how inequality emerges between and inside these groups. Moreover, this fractal approach gives us a deeper understanding of the spatial structure of inequality across the different levels we study. In our case study, the geographical units we consider are the County, the Tract, and the Block group. Figure 7 displays the tree structure of the recursive calculation. The state is composed of 58 counties. Each county contains tracts and each tract contains block groups. The general definition of the fractal Theil index is the following (Conceicao and Ferreira, 2000).

$$T_r = \sum_{i=1}^{\text{Branches}} \frac{a_i}{A} \log\left( \frac{a_i / A}{n_i / N} \right) + \sum_{i=1}^{\text{Branches}} \frac{a_i}{A} \, T_{(r,i)}$$

where $a_i$ is the accessibility of branch $i$ of the root $r$, $A$ the total accessibility of the root $r$, $n_i$ the population of branch $i$, $N$ the total population of the root, and $T_{(r,i)}$ the Theil index of branch $i$.
Applied to our case study, the formula becomes:

Decomposition of the state by each county (NC is the number of counties):

$$T_{CA} = \sum_{i=1}^{NC} \frac{a_i}{A_{CA}} \log\left( \frac{a_i / A_{CA}}{n_i / N_{CA}} \right) + \sum_{i=1}^{NC} \frac{a_i}{A_{CA}} \, T_{\text{County } i}$$

Decomposition of each county by each tract within the county:

$$T_{\text{County } i} = \sum_{\text{Tract } j \text{ of County } i} \frac{a_j}{A_{\text{County } i}} \log\left( \frac{a_j / A_{\text{County } i}}{n_j / N_{\text{County } i}} \right) + \sum_{\text{Tract } j \text{ of County } i} \frac{a_j}{A_{\text{County } i}} \, T_{\text{Tract } j}$$

Decomposition of each tract by block group within the tract:

$$T_{\text{Tract } j} = \sum_{\text{Blockgroup } k \text{ of Tract } j} \frac{a_k}{A_{\text{Tract } j}} \log\left( \frac{a_k / A_{\text{Tract } j}}{n_k / N_{\text{Tract } j}} \right)$$

Figure 7: Tree structure used for the computation of the fractal Theil index (California at the root; County 1, County 2, ..., County N below it; Tract 1, Tract 2, ..., Tract M within each county; Bg 1, Bg 2, Bg 3, and so on within each tract).

Bringing all these components into one equation leads to the following.

$$T_{CA} = \underbrace{\sum_{i=1}^{NC} \frac{a_i}{A_{CA}} \log\left( \frac{a_i / A_{CA}}{n_i / N_{CA}} \right)}_{\text{between-counties contribution}} + \underbrace{\sum_{i=1}^{NC} \frac{a_i}{A_{CA}} \left[ \underbrace{\sum_{\text{Tract } j} \frac{a_j}{A_{\text{County } i}} \log\left( \frac{a_j / A_{\text{County } i}}{n_j / N_{\text{County } i}} \right)}_{\text{between-tracts contribution}} + \underbrace{\sum_{\text{Tract } j} \frac{a_j}{A_{\text{County } i}} \sum_{\text{Blockgroup } k} \frac{a_k}{A_{\text{Tract } j}} \log\left( \frac{a_k / A_{\text{Tract } j}}{n_k / N_{\text{Tract } j}} \right)}_{\text{intra-tract contribution}} \right]}_{\text{intra-county contribution}}$$

Figures 8, 9 and 10 display the results from this equation. Figure 10 is a statewide summary that displays two kinds of information. The first is the contribution of the County to the Theil index, i.e., the measure of the mismatch between its accessibility share and its population share relative to the other Counties. The second is an "intra County" contribution, which is the County's own Theil index and measures the inequality that exists between and inside its own tracts. Consequently, this map allows us to see not only how advantaged or disadvantaged a County is relative to the others but also whether its resources have been equally or unequally allocated, which is the main advantage of the Theil index. It makes it possible to understand the structure of inequality and its distribution across different geographic levels, and can thus serve as a decision making tool for public policy. Indeed, this map enables a policy maker to identify both the areas that most need transport infrastructure for an egalitarian development of the State and the regions that have allocated their investments to projects that support a homogeneous development of their own territory. The map helps to decide whether statewide equality should be emphasized, with investments made accordingly, or whether combating local inequality is more important, with investments made in a more local and focused way.

Figure 8 Decomposition of the Theil contribution of each California County (bar chart by County; series: between county contribution, intra county contribution)

Figure 9 Decomposition of the Theil contribution of each California County without Los Angeles (37) and Orange (59) (bar chart by County; series: between, intra)

Figure 10 Map of the decomposition of the Theil contribution of each County
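The recursive county, tract, and block group decomposition can be checked numerically with a small sketch. The tree below is hypothetical (two counties, three tracts, five block groups) and the helper names `totals`, `leaves`, `theil_flat`, and `theil_nested` are illustrative only. Each leaf is an (accessibility, population) pair with positive population. Under these assumptions, the between and within parts at the root should add up to the flat Theil index computed directly over all block groups.

```python
import math

def totals(node):
    """Total (accessibility, population) of a leaf tuple or a nested dict."""
    if isinstance(node, tuple):
        return node
    parts = [totals(child) for child in node.values()]
    return (sum(a for a, _ in parts), sum(n for _, n in parts))

def leaves(node):
    """Flatten the tree into a list of (accessibility, population) leaves."""
    if isinstance(node, tuple):
        return [node]
    return [leaf for child in node.values() for leaf in leaves(child)]

def theil_flat(units):
    """Theil index computed directly over a list of (a, n) units."""
    A = sum(a for a, _ in units)
    N = sum(n for _, n in units)
    return sum((a / A) * math.log((a / A) / (n / N)) for a, n in units if a > 0)

def theil_nested(node):
    """Return (between, within) components for the branches of `node`."""
    if isinstance(node, tuple):          # a block group has no branches
        return 0.0, 0.0
    A, N = totals(node)
    between = within = 0.0
    for child in node.values():
        a, n = totals(child)
        if a > 0:
            between += (a / A) * math.log((a / A) / (n / N))
            child_between, child_within = theil_nested(child)
            within += (a / A) * (child_between + child_within)
    return between, within

# Hypothetical state: county -> tract -> block group -> (accessibility, population)
state = {
    "County 1": {
        "Tract 1": {"Bg 1": (40.0, 100), "Bg 2": (10.0, 150)},
        "Tract 2": {"Bg 1": (20.0, 250)},
    },
    "County 2": {
        "Tract 1": {"Bg 1": (5.0, 200), "Bg 2": (25.0, 300)},
    },
}

between, within = theil_nested(state)
flat = theil_flat(leaves(state))
print(f"between counties: {between:.4f}, intra county: {within:.4f}")
print(f"sum: {between + within:.4f}, flat block group index: {flat:.4f}")
```

The agreement between the nested sum and the flat index is the decomposability property that motivates the choice of the Theil index in the text.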
In California the most evident phenomenon is the dominance of Los Angeles County and Orange County in terms of both "over accessibility" (with, for LA, a contribution almost 45 times larger than that of the third most advantaged County) and intra inequality (an intra inequality index about ten times larger than that of the third most inhomogeneous County). This illustrates a property of the Theil index: its sensitivity to distributional impacts and disparities among the groups considered, and in particular to "wealth" transfers from the disadvantaged to the advantaged. The measure of the mismatch is indeed amplified by the accessibility share weight (Conceicao and Ferreira, 2000, pages 12 and 13). Of course there is also a scale effect in all this. The larger a County is, the more likely it is to have internal heterogeneity. Among the other Counties, there is another trend worth noting. Counties that show the greatest lack of accessibility are also those with the highest intra-county inequality. This points to the need for a more detailed study to identify those disadvantaged counties that did not benefit from large scale infrastructure investment of the kind that would have allowed them to develop a coherent policy for a homogeneous development of their territory. The findings here show some sort of negative feedback; the less investment a County receives, the more likely it is to suffer from territorial disparities.

5. Microanalysis (Person Based) Analysis

In the development of the microanalysis in this project, we have identified relationships between travel, household sociodemographic characteristics, spatial accessibility, and road infrastructure. When considered separately, sociodemographic characteristics, spatial accessibility, and road infrastructure all influence travel behavior.
Dense urban areas make walking trips more feasible; extensive networks of freeways and arterials encourage vehicular trips; large households make more trips per day than small households, and so on. However, in the real world, all of these variables interact simultaneously. Households consider the costs and benefits of different locations and feasible travel modes in light of their circumstances, and choose residential locations accordingly. Indeed, one could argue that households are not merely reacting to their circumstances, but rather are actively trying to improve their lot in any way they can. Adjustment strategies include moving residence, changing jobs, choosing different travel destinations, bundling individual single-occupancy vehicle (SOV) trips into high-occupancy vehicle (HOV) trips, and so on. One cannot merely consider the influence of spatial infrastructure characteristics in isolation, and this motivates the development of regression analyses that attempt to take multiple factors into account. One source of information about individuals and their households is the California Statewide Travel Survey, conducted over several months in the years 2000 and 2001. It provides an excellent starting point for disentangling the relationships between space, infrastructure, and sociodemographics. The survey sample, consisting of more than 17,000 households, is a quota sample by county and planning region, rather than a representative sample of California proportional to the population of each county. Each trip destination has been geocoded, usually to the nearest intersection, but sometimes to the approximate census tract centroid. The location (geocodes and census tract) of almost every household can also be determined from the survey data.
To these data we added the spatial accessibility and roadway infrastructure variables, by census tract and block group, that were computed in the efficiency and inequality analysis discussed in previous sections of this report. The relatively even distribution of the sample across all California counties ensures that the data represent a wide variety of spatial environments. The need to account (control) for sociodemographics when assessing relationships between travel behavior and spatial factors is revealed when we investigate the residential location patterns of sociodemographic groups. Through a series of statistical tests, we determined that eight categorical sociodemographic variables were paramount in explaining travel behavior. These variables, with their categories and distributions by percent of the sample, are listed in Table 6. Each of the five polychotomous variables among them was strongly related to our spatial variables through residential location.
Table 6 Sociodemographic Variables Used in the Models

| Variable | Category | % |
|---|---|---|
| Annual household income | <$10,000 | 4.3 |
| | $10,000-$24,999 | 14.2 |
| | $25,000-$34,999 | 13.2 |
| | $35,000-$49,999 | 13.9 |
| | $50,000-$74,999 | 19.9 |
| | $75,000-$99,999 | 10.9 |
| | $100,000-$149,999 | 7.4 |
| | $150,000+ | 3.4 |
| | Unknown | 12.8 |
| Household size | 1 | 26.4 |
| | 2 | 40.8 |
| | 3 | 14.4 |
| | 4 | 11.2 |
| | 5 | 4.7 |
| | 6 or more | 2.5 |
| Average age of heads | 18-25 | 5.8 |
| | 25.5-35 | 14.1 |
| | 35.5-45 | 20.1 |
| | 45.5-55 | 22.7 |
| | 55.5-65 | 15.5 |
| | 65.5-75 | 11.8 |
| | 75.5+ | 7.5 |
| | Unknown | 2.5 |
| Ethnicity of heads | White | 75.5 |
| | Hispanic | 10.2 |
| | Black | 2.3 |
| | Asian/Pacific Islander | 1.9 |
| | White & Hispanic | 3.1 |
| | White & Asian | 1.3 |
| | Other or unknown | 5.8 |
| Highest education of head | Not high school | 9.1 |
| | High school graduate | 24.5 |
| | Some college | 23.7 |
| | Associates degree | 7.4 |
| | Bachelors degree | 20.9 |
| | Graduate degree | 13.4 |
| | Unknown | 1.1 |
| Whether any children < 6 | Yes | 7.5 |
| | No | 89.4 |
| Whether any children 6-12 | Yes | 9.3 |
| | No | 85.6 |
| Whether any children 13-17 | Yes | 9.0 |
| | No | 2.9 |

The residential location patterns for various demographic groups can be seen by graphing the category means of four key spatial variables for each of our five polychotomous sociodemographic variables, as shown in Figures 11 through 15. These four spatial variables are: (1) housing density, in terms of households per square kilometer, (2) regional retail accessibility, in terms of total retail workers within 10 to 50 kilometers, (3) local road infrastructure, in terms of total kilometers of local, neighborhood, and rural roads within 10 kilometers, and (4) regional non-freeway primary road infrastructure, in terms of total kilometers of primary roads without limited access within 10 to 50 kilometers. Each of these four spatial variables is standardized (zero mean and standard deviation of one) to allow plotting on a single scale.
Figure 11 Some Spatial Variable Means by Categories of Household Income (x-axis: household annual income; y-axis: standardized variable category mean; series: households per sq km in tract, retail workers within 10-50 km, local roads within 10 km, primary roads w/o limited access within 10-50 km)

Figure 12 Some Spatial Variable Means by Categories of Household Size (x-axis: household size; same y-axis and series as Figure 11)

Figure 13 Some Spatial Variable Means by Average Age of Household Heads (x-axis: average age of household heads; same y-axis and series as Figure 11)

Figure 14 Spatial Variable Means by Ethnicity of Household Heads (x-axis: ethnicity of household head(s); same y-axis and series as Figure 11)

Figure 15 Spatial Variable Means by Education of Household Heads (x-axis: highest education of household head; same y-axis and series as Figure 11)

The income dimension of residential location of households is most strongly related to regional retail accessibility (Figure 11).
With the exception of the lowest income groups, higher income households tend to be located in areas surrounded by the highest retail activity. Higher income households (households with incomes of $100,000 or more in 2000 dollars) are also located in regions with the highest levels of primary road infrastructure. With respect to local road infrastructure, households in the highest and lowest income classes tend to be located in areas with the greatest density of local roads; there is no statistically significant difference (at the p = .01 level) between local road densities among the six middle income classes. Finally, the only statistically significant relationship between income and housing density is that the lowest income households reside in denser census tracts; otherwise income is not a factor in housing density. With regard to household size, all spatial effects involve single-person households as distinguished from multi-person households. For each of the four spatial variables graphed in Figure 12, there is no statistically significant difference among multi-person households of different sizes. In contrast to income, household size is most strongly related to household density at the tract level. The patterns of residential location as a function of age of the household head(s) revolve around decreases in density by age group up until the 45.5-55 category, after which there are no statistically significant effects (Figure 13).
The strongest relationship is that between age and housing density, followed by a moderately strong relationship between age and the density of local road infrastructure. Black and Asian households tend to locate in the highest density areas, in terms of all four spatial measures (Figure 14). In terms of one of these variables, local road infrastructure, Black households reside in areas that are even denser than those of Asian households. Excluding Black and Asian households, there are still statistically significant differences among the other ethnic groups. Hispanic and mixed White and Asian households live in areas that are denser than those inhabited by White and mixed White and Hispanic households. The residential location patterns of households according to education of household head, which is also a proxy for occupation, are shown in Figure 15. The strongest relationship is for regional primary road infrastructure. Less educated households tend to reside in areas with the greatest regional coverage of surface arterial primary roads. Households with associate degrees reside in areas with the lowest regional coverage of surface arterials, and the same low density for those with associate degrees holds for the other dimensions, especially regional retail accessibility. There is a similar but less pronounced pattern for local roads. Finally, households in the highest education segments reside in areas with high residential density, compared with households in the middle education segments. The presence of young children does not appear to be a major factor in residential location, as there are no significant relationships between the indicator variable for young children and any of our four key spatial variables. Households with children aged six through twelve tend to be located in areas with lower housing density. There are no statistical relationships with retail accessibility or local or regional primary arterial road coverage.
Households with children aged six through twelve tend to be located in areas with lower housing density, with lower regional retail accessibility, and in areas with lower coverage of roads. This discussion points to the need for further analysis that accounts for these sociodemographics and goes at least one step further in the spatial unit within which these households reside to study the impact of all this on travel. We do the remaining analysis with regression models that account for multiple influences on travel behavior.

6. Microanalysis Using Regression Models

In each of the models that follow, three blocks of variables are tested: (1) the same set of sociodemographic variables, (2) residential and activity site density variables, and (3) any road infrastructure variables found to be significant in explaining the dependent travel behavior variable after controlling for the first two sets of variables. Spatial variables were derived using buffer areas around the population centroid of a census tract (e.g., retail employees within 10 km of a census tract). Several such measures were developed, using both time and distance to define the boundary of the buffer. Based on preliminary data exploration, only the 10 km and 50 km buffer variables were found to have a substantial effect and are used here. Some shorter time buffers could have been used and would have produced similar results, but the 10 km and 50 km distances were found to be more effective in capturing the influence of infrastructure provision and access to activity opportunities. The shortest distance buffer zone indicators are tested both in direct and difference (ring) format. Modeling the contribution of spatial accessibility and infrastructure density was further complicated by the presence of spikes at zero and long positive tails.
For example, some rural census tracts in California are extremely large with a very small population concentrated in a small portion of the tract. These need to be modeled together with census tracts that have some of the highest densities of roadway infrastructure in the nation. To overcome this distributional heterogeneity, spatial variables were converted to a scale in which the population was ranked into ten groups of equal frequency (deciles). This relieves the estimation bias caused by outlying observations and by the restriction to the positive domain with spikes at zero. It also facilitates estimation in which the spatial variables can contribute nonlinear and even non-ordinal effects. We present omnibus tests of each set of variables, but the variable coefficients are shown only for the final complete model. These coefficients are displayed as odds ratios; the raw coefficient can be computed as the natural logarithm of the odds ratio. To aid in interpretation, only statistically significant (at the p = .05 level) coefficients are listed. All variables are categorical, and the continuous spatial variables are discretized into ten equal categories (deciles). In the following sections we present the results of six sets of models aimed at assessing the influence of the spatial environment on travel demand in California. The first model identifies which households contain adults (persons eighteen and older) who are non-drivers, as this special group is an important component of passenger, transit, and nonmotorized demand. The second set of models deals with public transport (transit) demand, and we estimate separate models for transit demand by any household member and by adult drivers only. Similar sets of models are then estimated for nonmotorized travel and for high occupancy vehicle (HOV) travel. The latter set also contains a model of the HOV travel time.
The final set of models is for solo driving, with one model for household solo driving demand and one model for solo driving distance. We also analyze the impact of the level of spatial aggregation (e.g., tract level measurement versus block group measurement) on the power of the models. The same procedure of variable computation was conducted using block groups, which are smaller than census tracts, and the same six sets of models were built using the block group variables. A potentially deleterious impact may arise from the modifiable areal unit problem (MAUP; Openshaw and Alvanides, 1999). MAUP is one of the important issues that should be considered when using GIS. Artificial boundaries imposed on continuous geographical phenomena, such as accessibility, result in the generation of artificial spatial patterns, and the spatial patterns generated at different levels of spatial aggregation differ from each other. We analyze the existence and the impact of MAUP in the six sets of travel behavior models and show how spatial variables at different aggregation levels can be used in the models to mitigate this artificial spatial resolution, considering the impact of unit area sizes. In the models that follow we show estimation results using census tract accessibility variables and sociodemographics and estimation results using combinations of block group level variables.

6.1 Adults Who Do Not Drive

A substantial portion of household travel behavior that does not involve driving is due to the constrained choices of non-driving adults. California has a reputation for being an automobile-oriented state. While this reputation may be somewhat unfair, it is indeed the case that most of the state was developed over the past 50 years, and so it was designed and developed with the automobile in mind. This contrasts sharply with older east coast or European cities. One might presume therefore that households with non-driving adults might choose to locate in denser areas, where walking, bicycling and public transport are well supported. We test this hypothesis in the model presented in this section. This model uses 16,949 observations with complete data, or 99.5% of the sample. In this sample, 11.6% of households have non-driving adults.

6.1.1 Census Tract Model

The contributions of the three variable sets in explaining which households have non-driving adults are captured in the omnibus statistical tests listed in Table 7. All eight of the sociodemographic variables were important, but only one spatial variable, housing density, was found to be significant in describing the residential location of these households, controlling for their socioeconomic characteristics. The contribution of road infrastructure was not significantly different from zero.

Table 7 Binary Logit Model of Presence in Household of Non-driving Adult

| Variable set | Contribution of set: Chi-square | Contribution of set: Degrees of freedom | Cumulative model: Chi-square | Cumulative model: Degrees of freedom | Cumulative model: Nagelkerke R2 |
|---|---|---|---|---|---|
| Sociodemographic | 2678.10 | 35 | 2678.10 | 35 | .285 |
| Spatial density | 51.60 | 9 | 2729.70 | 44 | .290 |
| Road infrastructure | (not significant) | | | | |

The statistically significant influences of the sociodemographic variables are listed in Table 8. Income and household size display monotonic effects, and age highlights the expected elderly outcome. Non-driving adults are more likely to be found in Hispanic and Black households, and in households in the lowest education groups. Finally, households with children present are less likely to have non-driving adults, regardless of the ages of the children.
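The Nagelkerke R2 values reported in Table 7 can be approximately reproduced from the cumulative chi-square statistics. The sketch below is a consistency check and not the authors' procedure. It assumes that each cumulative chi-square is a likelihood-ratio statistic against an intercept-only model, and that the 11.6% share of households with non-driving adults is exact (it is rounded in the text). The function names are illustrative.

```python
import math

def null_loglik(n, share):
    """Log-likelihood of an intercept-only binary logit with the given share of ones."""
    return n * (share * math.log(share) + (1 - share) * math.log(1 - share))

def nagelkerke_r2(chi_square, n, ll_null):
    """Nagelkerke R2 from a likelihood-ratio chi-square and the null log-likelihood."""
    cox_snell = 1.0 - math.exp(-chi_square / n)        # Cox and Snell R2
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)  # its upper bound
    return cox_snell / max_cox_snell

n = 16949       # observations reported for this model
share = 0.116   # households with non-driving adults (rounded in the text)
ll0 = null_loglik(n, share)

r2_socio = nagelkerke_r2(2678.10, n, ll0)  # sociodemographic block only
r2_full = nagelkerke_r2(2729.70, n, ll0)   # plus spatial density
print(f"sociodemographic: {r2_socio:.3f}")
print(f"with spatial density: {r2_full:.3f}")
```

Under these assumptions the results agree with the tabulated .285 and .290 to about three decimals, which illustrates how little the spatial density block adds once sociodemographics are controlled for.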
Table 8 Logit Model of Presence of Non-driving Adult – Sociodemographic

| Independent variable | Significance | Odds ratio |
|---|---|---|
| Income (base = unknown) | 0.00 | |
| <$10,000 | 0.00 | 5.910 |
| $10,000-$24,999 | 0.00 | 3.093 |
| $25,000-$34,999 | 0.00 | 1.510 |
| $35,000-$49,999 | | |
| $50,000-$74,999 | 0.00 | 0.653 |
| $75,000-$99,999 | 0.00 | 0.491 |
| $100,000-$149,999 | 0.00 | 0.366 |
| $150,000+ | 0.00 | 0.250 |
| Household size (base = 6 or more) | 0.00 | |
| 1 | 0.00 | 0.073 |
| 2 | 0.00 | 0.217 |
| 3 | | |
| 4 | 0.00 | 1.706 |
| 5 | 0.00 | 4.200 |
| Average age of heads (base = unknown) | 0.00 | |
| 18-25 | 0.00 | 0.702 |
| 25.5-35 | 0.00 | 0.691 |
| 35.5-45 | 0.00 | 0.625 |
| 45.5-55 | | |
| 55.5-65 | | |
| 65.5-75 | 0.03 | 1.189 |
| 75.5+ | 0.00 | 2.998 |
| Ethnicity (base = unknown) | 0.00 | |
| White | 0.00 | 0.620 |
| Hispanic | 0.00 | 1.723 |
| Black | 0.00 | 1.482 |
| Asian/Pacific Islander | 0.04 | 0.694 |
| White & Hispanic | 0.02 | 0.726 |
| White & Asian | | |
| Education (base = unknown) | 0.00 | |
| Not high school graduate | 0.00 | 1.862 |
| High school graduate | 0.00 | 1.269 |
| Some college | | |
| Associates degree | 0.01 | 0.766 |
| Bachelors degree | 0.00 | 0.713 |
| Graduate degree | 0.00 | 0.687 |
| Presence of children 0-5 yrs. old | 0.00 | 0.262 |
| Presence of children 6-12 yrs. old | 0.00 | 0.326 |
| Presence of children 13-17 yrs. old | 0.00 | 0.369 |

As shown in Table 9, households with non-driving adults are less likely to be located in low density residential areas (e.g., the lowest quartile of residential density), and more likely to be located in the very highest density areas. There is no statistically significant relationship between accessibility or road infrastructure and the likelihood of the presence of non-driving adults. In other words, households with non-driving adults are most likely not choosing where they live to accommodate the travel behavior of their non-driving members. Consequently, in the travel behavior models that follow, the accessibility and infrastructure effects are not attributable to the contribution to travel behavior of non-driving adults.
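As a reading aid for the odds ratios in Table 8, the sketch below converts an odds ratio back to a raw logit coefficient (its natural logarithm, as stated earlier in this section) and shows how an odds ratio shifts a probability. The base probability of 0.116 used here is only an illustrative assumption borrowed from the overall sample share; the report does not give the probability for the base category, so the printed probabilities are not results from the report. The function names are hypothetical.

```python
import math

def raw_coefficient(odds_ratio):
    """Raw logit coefficient corresponding to a reported odds ratio."""
    return math.log(odds_ratio)

def shifted_probability(base_prob, odds_ratio):
    """Probability after multiplying the base odds by the odds ratio."""
    odds = base_prob / (1.0 - base_prob) * odds_ratio
    return odds / (1.0 + odds)

base_prob = 0.116  # illustrative assumption, not a reported base-category value

# Odds ratios taken from Table 8 (income categories).
for label, odds_ratio in [("<$10,000", 5.910), ("$150,000+", 0.250)]:
    beta = raw_coefficient(odds_ratio)
    prob = shifted_probability(base_prob, odds_ratio)
    print(f"{label}: coefficient {beta:+.3f}, illustrative probability {prob:.3f}")
```

An odds ratio above one raises the probability relative to the base category and an odds ratio below one lowers it, but the size of the change in probability depends on the base probability chosen.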
Table 9 Logit Model of Presence of Non-driving Adult – Spatial Density (Tract)

| Ind. variable (all bases = 50th %tile) | Significance | Odds ratio |
|---|---|---|
| Tract household density | 0.00 | |
| < 10 %tile | 0.03 | 0.828 |
| 10th %tile | 0.00 | 0.752 |
| 20th %tile | (0.06) | (0.849) |
| 30th %tile | | |
| 40th %tile | | |
| 60th %tile | | |
| 70th %tile | | |
| 80th %tile | | |
| 90th %tile | 0.00 | 1.608 |

6.1.2 Comparison with Block Group Model

In Table 10, the contributions of sociodemographics, spatial variables measured at the tract level, and spatial variables measured at the block group level are compared in terms of their contribution to goodness of fit. The impact of the sociodemographic variables on the presence of a non-driving adult in the household is almost identical in the census tract model and the block group model. Because the measurement of retail employees within a certain travel distance involves shortest distance, and the calculation of household density does not, using a smaller spatial unit has different model implications. When we use a smaller spatial unit, we consider a smaller "neighborhood" around home locations for the household density computation, but we obtain a closer approximation for measurements involving shortest travel distance. To see the influence of using smaller spatial units on each variable set, the contributions of household density and retail employee density are given separately in Table 10. Only household density has a significant impact on non-driving adults, and it contributes more to the model when it is measured using the larger spatial unit areas, census tracts in this case. Additional estimation details are also offered by Table 11 for spatial density.
Table 10 Binomial Logit Models of Presence in Household of Non-driving Adult

| Model | Variable set | Contribution of set: Chi-square | Contribution of set: Degrees of freedom | Cumulative model: Chi-square | Cumulative model: Degrees of freedom | Cumulative model: Nagelkerke R2 |
|---|---|---|---|---|---|---|
| Census tract | Sociodemographic | 2678.10 | 35 | 2678.10 | 35 | .285 |
| | Spatial density | 51.60 | 9 | 2729.70 | 44 | .290 |
| | Household density | 51.60 | 9 | | | |
| | Retail employee | - | - | | | |
| | Road infrastructure | (not significant) | | | | |
| Block group | Sociodemographic | 2678.69 | 35 | 2678.69 | 35 | .285 |
| | Spatial density | 37.54 | 9 | 2716.22 | 44 | .289 |
| | Household density | 37.54 | 9 | | | |
| | Retail employee | - | - | | | |
| | Road infrastructure | (not significant) | | | | |

Table 11 Logit Models of Presence of Non-driving Adult – Spatial Density

| Ind. variable (all bases = 50th %tile) | Census tract: Significance | Census tract: Odds ratio | Block group: Significance | Block group: Odds ratio |
|---|---|---|---|---|
| Household density | 0.00 | | 0.00 | |
| < 10 %tile | 0.03 | 0.828 | 0.03 | 0.835 |
| 10th %tile | 0.00 | 0.752 | (0.06) | (0.850) |
| 20th %tile | (0.06) | (0.849) | 0.02 | 0.816 |
| 30th %tile | | | | |
| 40th %tile | | | | |
| 60th %tile | | | | |
| 70th %tile | | | | |
| 80th %tile | | | | |
| 90th %tile | 0.00 | 1.608 | 0.00 | 1.435 |

6.2 Transit Usage by Any Household Member

Transit usage is defined as taking any local transit mode, including bus, rail, and light rail, but not including long distance bus trips. School bus trips are also included as household public transport trips. Of the 16,750 households with complete data (98.3% of the sample), 8.1% had a household member who made at least one trip by public transport (transit); the highest concentration of these households is in the San Francisco Bay Area, where 14.4% of households in this sample were transit users.

6.2.1 Census Tract Model

Compared to the previous model for households with non-driving adults, socioeconomic factors are less effective in explaining which households are transit users, but there are three significant spatial factors, and one road infrastructure variable is important, as shown in Table 12.
Table 12 Logit Model of Any Household Transit Use and Spatial Density at Tract Level

| Variable set | Contribution of set: Chi-square | Contribution of set: Degrees of freedom | Cumulative model: Chi-square | Cumulative model: Degrees of freedom | Cumulative model: Nagelkerke R2 |
|---|---|---|---|---|---|
| Sociodemographic | 1633.28 | 35 | 1633.28 | 35 | .216 |
| Spatial density | 177.90 | 27 | 1811.19 | 62 | .238 |
| Road infrastructure | 81.57 | 9 | 1892.76 | 71 | .248 |

The estimated effects of the sociodemographic variables are reported in Table 13. Transit usage is a decreasing function of income (where statistically insignificant categories are shown to complete the picture), and an increasing function of household size. Transit usage is generally a decreasing function of age of the household head(s), but usage is greatest for the second youngest group, and lowest for the second oldest group. Transit services for the elderly probably increase the likelihood of transit usage for households with the oldest household heads. Education is not an effective predictor of transit usage, and only one ethnicity category is important: Black households have about 1.6 times the odds of using transit. Regarding children, households with only young children are less likely to use transit, while those with older children are more likely to use transit.
Yoon Table 13 Logit Model of Any Household Transit Use – Sociodemographic Variables Independent variable Significance Odds ratio Income ( base = unknown) 0.00 <$ 10,000 0.00 2.175 $ 10,000-$ 24,999 0.00 1.381 $ 25,000-$ 34,999 0.03 1.198 $ 35,000-$ 49,999 $ 50,000-$ 74,999 0.00 0.810 $ 75,000-$ 99,999 ( 0.29) ( 0.909) $ 100,000-$ 149,999 ( 0.26) ( 0.886) $ 150,000+ 0.00 0.379 household size ( base = 6 or more) 0.00 1 0.00 0.376 2 0.00 0.491 3 4 0.00 1.416 5 0.00 1.737 Average age of heads ( base = unknown) 0.00 18- 25 0.05 1.267 25.5- 35 0.00 1.488 35.5- 45 0.00 1.255 45.5- 55 55.5- 65 65.5- 75 0.00 0.554 75.5+ 0.03 0.691 Ethnicity ( base = unknown) 0.09 White Hispanic Black 0.00 1.618 Asian/ Pacific Islander White & Hispanic White & Asian Education ( base = unknown) 0.41 not high school graduate high school graduate some college associates degree bachelors degree graduate degree presence of children 0- 5 yrs. old 0.01 0.775 presence of children 6- 12 yrs. Old 0.00 2.363 presence of children 13- 17 yrs. old 0.00 3.001 Spatially, as expected, transit- using households are concentrated in the densest 10% of residential areas, and also in the least dense 20% of areas, as shown in Table 14. But excluding areas in the highest 10% of housing density, households located in areas above median density are less likely to use transit. Census tracts with low density housing tend to be located in rural counties. While the presence of school age FINAL REPORT – PATH TASK ORDERS 5110 & 6110 - October 2008 53 K. Goulias, T. Golob, and S. Y. Yoon children in the household coupled with the inclusion of school bus trips as public transit trips may account for some of this effect, this result underscores the importance of rural public transport. Table 14 Logit Model of Any Household Transit Use – Spatial Density Ind. 
Variable ( all bases = 50th % tile) Significance Odds ratio tract household density 0.00 < 10 % tile ( 0.20) ( 1.215) 10th % tile 0.04 1.262 20th % tile 30th % tile 40th % tile 60th % tile 0.01 0.758 70th % tile ( 0.43) ( 0.919) 80th % tile 0.00 0.725 90th % tile 0.00 1.677 retail employees within 10 km 0.00 < 10 % tile 10th % tile 20th % tile 0.02 0.755 30th % tile 40th % tile 60th % tile 70th % tile 80th % tile 90th % tile 0.00 1.795 retail employees within 10 to 50km 0.00 < 10 % tile ( 0.19) ( 0.764) 10th % tile 0.00 0.594 20th % tile 0.01 0.664 30th % tile 40th % tile 60th % tile 0.02 1.381 70th % tile 80th % tile 90th % tile 0.00 2.140 Accessibility to retail services, particularly accessibility at the regional level ( 10 to 50 km), indicates lower transit usage for households located in low accessibility areas, and high transit usage for households located in the highest 10% of retail accessibility. This effect undoubtedly captures the urban core phenomenon. The influence of road infrastructure is complex, as shown in Table 15. Controlling for sociodemographic factors and spatial density, households that live in areas in the lower quartile of regional primary surface road coverage ( primary roads without limited access within 10 to 50 km of network distance) exhibit the highest transit usage, together with households in the 80th percentile. However, households above the 90th percentile have very low transit usage. Once again, the importance FINAL REPORT – PATH TASK ORDERS 5110 & 6110 - October 2008 54 K. Goulias, T. Golob, and S. Y. Yoon of rural public transport is picked up by the road infrastructure variable, even when controlling for housing and retail density. In tracts with both low housing density and lower levels of road infrastructure, the likelihood of transit usage is unusually high. 
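The odds ratios in Tables 13 and 14 multiply odds, not probabilities, so their effect on the share of transit-using households depends on the starting point. The following minimal sketch illustrates the conversion. It uses the 8.1% sample share as a purely hypothetical baseline, which is not the model's actual reference household, and the function name shift_probability is an illustrative choice.

```python
def shift_probability(baseline_prob, odds_ratio):
    """Apply an odds ratio to a baseline probability and return the new probability."""
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Hypothetical baseline: the 8.1% sample share of transit-using households.
baseline = 0.081

# Odds ratios copied from Table 13 (black households) and
# Table 14 (90th %tile of retail employees within 10 to 50 km).
for label, odds_ratio in [("black households", 1.618),
                          ("90th %tile regional retail", 2.140)]:
    p = shift_probability(baseline, odds_ratio)
    print(f"{label}: {baseline:.3f} -> {p:.3f}")
```

With this baseline, an odds ratio of 1.618 moves the probability from 0.081 to about 0.125, a probability ratio near 1.5 rather than 1.6. The gap between odds ratios and probability ratios widens as the baseline share grows.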
Table 15 Logit Model of Any Household Transit Use – Road Infrastructure

Variable (bases = 50th %tile)                           Significance  Odds ratio
primary roads w/o limited access within 10 to 50 km     0.00
  < 10 %tile                                            0.02          1.678
  10th %tile                                            0.02          1.466
  20th %tile                                            0.04          1.401
  30th %tile
  40th %tile
  60th %tile
  70th %tile
  80th %tile                                            0.01          1.483
  90th %tile                                            0.00          0.364

6.2.2 Comparison with Block Group Model

As shown in Table 16, household density contributes slightly more to the model when it is measured at the census tract level, and the other spatial variable sets (retail employee density and road infrastructure) contribute more to the model when they are measured based on block groups. In particular, road infrastructure contributes almost twice as much chi-square in the block group model as in the census tract model.

Table 16 Logit Models of Any Household Transit Use

                                      Contribution of set             Cumulative model
Model         Variable set            Chi-square  Degrees of freedom  Chi-square  Degrees of freedom  Nagelkerke R2
Census tract  Sociodemographic        1633.28     35                  1633.28     35                  .216
              Spatial density         177.90      27                  1811.19     62                  .238
                Household density     125.45      9
                Retail employee       52.45       18
              Road infrastructure     81.57       9                   1892.76     71                  .248
Block group   Sociodemographic        1633.58     35                  1633.58     35                  .216
              Spatial density         180.37      27                  1813.95     62                  .238
                Household density     106.50      9
                Retail employee       73.87       18
              Road infrastructure     158.66      9                   1972.60     71                  .258

The spatial density variables show similar impact patterns on household transit usage in the block group analysis. However, in the block group model, the concentration of transit usage in the highest density area is stronger and the concentration in the 10th percentile of household density is not captured. The highest percentile of block group retail employee density had a higher impact in both buffers (0 to 10 km and 10 to 50 km). This may be a typical effect of the modifiable areal unit problem (MAUP).
First, different sizes of unit area produce different statistics, household density in this case, and they reveal different patterns of influence. These patterns can affect the models differently, as the variable sets do in the logit models of household transit use (Table 17). Second, different levels of spatial aggregation lead to different levels of approximation of the explanatory variables. From the comparison between the two models of household transit use, it appears that a better approximation of an explanatory variable, obtained by going one level of disaggregation down (from tract to block group), improves the contribution of the independent variables in explaining variation in the dependent variable.

The influence pattern of road infrastructure in the block group model is similar to that in the census tract model. However, in addition to primary roads without limited access within 10 to 50 km, which was the only significant road infrastructure variable set in the census tract model of household transit usage, local roads variables were found to be significant in the block group model (Table 18). In the block group model, the importance of rural public transportation is also picked up, and the likelihood of transit usage is low for households in the areas with the highest 10% of road network coverage.

Table 17 Logit Models of Any Household Transit Use – Spatial Density

Ind. variable (all bases = 50th %tile)    Census tract                Block group
                                          Significance  Odds ratio    Significance  Odds ratio
household density                         0.00                        0.00
  < 10 %tile                              (0.20)        (1.215)
  10th %tile                              0.04          1.262         (0.16)        (1.157)
  20th %tile
  30th %tile
  40th %tile
  60th %tile                              0.01          0.758
  70th %tile                              (0.43)        (0.919)       0.01          0.756
  80th %tile                              0.00          0.725         0.05          0.810
  90th %tile                              0.00          1.677         0.00          1.627
retail employees within 10 km             0.00                        0.00
  < 10 %tile
  10th %tile
  20th %tile                              0.02          0.755         (0.08)        (0.806)
  30th %tile                                                          (0.09)        (0.821)
  40th %tile
  60th %tile                                                          0.02          0.752
  70th %tile
  80th %tile
  90th %tile                              0.00          1.795         0.00          2.218
retail employees within 10 to 50 km       0.00                        0.00
  < 10 %tile                              (0.19)        (0.764)
  10th %tile                              0.00          0.594         0.01          0.700
  20th %tile                              0.01          0.664         0.00          0.668
  30th %tile
  40th %tile
  60th %tile                              0.02          1.381
  70th %tile
  80th %tile                                                          0.02          1.389
  90th %tile                              0.00          2.140         0.00          3.294

Table 18 Logit Models of Any Household Transit Use – Road Infrastructure

Ind. variable (all bases = 50th %tile)                  Census tract                Block group
                                                        Significance  Odds ratio    Significance  Odds ratio
primary roads w/o limited access within 10 to 50 km     0.00                        0.00
  < 10 %tile                                            0.02          1.678         0.05          1.249
  10th %tile                                            0.02          1.466
  20th %tile                                            0.04          1.401
  30th %tile
  40th %tile
  60th %tile                                                                        (0.09)        (1.180)
  70th %tile                                                                        0.00          1.314
  80th %tile                                            0.01          1.483
  90th %tile                                            0.00          0.364         0.00          0.505
Local roads within 10 km                                                            0.03
  < 10 %tile
  10th %tile
  20th %tile
  30th %tile
  40th %tile
  60th %tile
  70th %tile
  80th %tile
  90th %tile                                                                        0.02          0.681
Local roads within 10 to 50 km                                                      0.00
  < 10 %tile                                                                        0.03          1.389
  10th %tile
  20th %tile
  30th %tile
  40th %tile
  60th %tile
  70th %tile
  80th %tile
  90th %tile                                                                        0.00          0.400

6.3 Transit Usage by an Adult Driver in the Household

Results for transit trips made by any household member can be difficult to interpret, because children and non-driving adults may skew the results for some households but not others. The next model describes transit usage by adult drivers, defined as adults who were either recorded as having a driver's license or were observed to have driven at least once. Only 2.7% of the 14,160 households with adult drivers and complete data have an adult driver who made at least one transit trip.

6.3.1 Census Tract Model

As expected, it is much more difficult to predict which households these are based on sociodemographic factors, as seen by comparing the goodness-of-fit log-likelihood-ratio model chi-square statistics and the pseudo-R2 indices in Table 19. However, spatial density is relatively more important in the case of adult drivers, and the same road infrastructure variable is also significant.

Table 19 Logit Model of Household Transit Use by Adult Drivers

                        Contribution of set             Cumulative model
Variable set            Chi-square  Degrees of freedom  Chi-square  Degrees of freedom  Nagelkerke R2
Sociodemographic        216.32      35                  216.32      35                  .068
Spatial density         282.90      27                  499.22      62                  .155
Road infrastructure     64.76       9                   563.98      71                  .175

The sociodemographic predictors of transit usage by adult drivers are shown in Table 20. Such usage is concentrated in low income households, larger households, households in the middle age groups (35 to 55), black households, and more highly educated households. This latter effect probably captures central business district employment. Households less likely to have adult driver transit usage are high and middle income households, small households, households with heads in the 65-75 year range, lower educated households, and households with children.
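The pseudo-R2 indices cited for Table 19 and the earlier tables can be reproduced approximately from the reported model chi-square, the sample size, and the share of households with the outcome. The sketch below applies the standard Nagelkerke formula under two assumptions not stated in the report: the null model is intercept-only, and the rounded shares quoted in the text (8.1% for any household transit use, 2.7% for adult drivers) are close to the shares in the estimation samples. The function and argument names are illustrative.

```python
import math


def nagelkerke_r2(model_chi2, n, event_share):
    """Nagelkerke pseudo-R2 from a likelihood-ratio model chi-square.

    model_chi2  -- 2 * (LL_model - LL_null), the cumulative model chi-square
    n           -- number of households in the estimation sample
    event_share -- share of households with the outcome (fixes the null model)
    """
    # Per-household log-likelihood of the intercept-only (null) model.
    null_ll_per_case = (event_share * math.log(event_share)
                        + (1.0 - event_share) * math.log(1.0 - event_share))
    cox_snell = 1.0 - math.exp(-model_chi2 / n)
    max_cox_snell = 1.0 - math.exp(2.0 * null_ll_per_case)
    return cox_snell / max_cox_snell


# Any household transit use (Table 12): 16,750 households, 8.1% transit users.
for chi2 in (1633.28, 1811.19, 1892.76):
    print(f"any-member model, chi-square {chi2}: R2 = {nagelkerke_r2(chi2, 16750, 0.081):.3f}")

# Adult drivers (Table 19): 14,160 households, 2.7% with adult-driver transit use.
for chi2 in (216.32, 499.22, 563.98):
    print(f"adult-driver model, chi-square {chi2}: R2 = {nagelkerke_r2(chi2, 14160, 0.027):.3f}")
```

For the any-member model this lands on the reported values of .216, .238, and .248 to within rounding. The adult-driver values come out close to, but not exactly at, the Table 19 figures, which is plausible given that the 2.7% share is itself rounded.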
Tables 21 and 22 show that the effects of rural public transport (tracts with low density housing and road infrastructure) disappear when the focus is restricted to adult drivers. Still, controlling for sociodemographic factors, households that live in areas with the highest residential and retail density are the heaviest transit users. The phenomenon of relatively low transit usage by households in the 90th percentile of regional primary surface road coverage still prevails, as seen in Table 22. Households in the 90th percentile of regional primary arterial coverage are concentrated in Orange, Los Angeles, San Mateo, and Alameda Counties, but there are also such households located in San Bernardino, Santa Clara, Riverside, and Ventura Counties. An abundance of primary arterials appears to correlate with fewer household transit trips in these areas.

6.3.2 Comparison with Block Group Model

As shown in Table 23, the influence of using smaller unit areas is visible in this comparison, too. Household density contributes more to the model when it is measured at the census tract level, and the other spatial variables contribute more to the model when they are measured at the block group level. Table 24 shows that the likelihood of transit usage by adult drivers is relatively low among households in the 90th percentile of primary and local roads coverage, as in the census tract model. However, the local roads variable set in the block group model still shows the effect of rural public transport usage by adult drivers, and the 70th percentile of primary road infrastructure had a positive impact in the block group model, which could not be seen in the census tract model. Table 25 shows that the likelihood of transit usage by adult drivers was highest for households in the 90th percentile of spatial density, as it was in the census tract model.
High transit usage in the 40th percentile of household density was marginally significant, which was not found in the census tract model. The impact of the highest deciles of retail employee density was higher and also clearer in the block group model.

Table 20 Logit Model of Household Transit Use by Adult Drivers – Sociodemographic

Independent variable                      Significance  Odds ratio
Income (base = unknown)                   0.00
  < $10,000                               0.00          3.144
  $10,000-$24,999
  $25,000-$34,999
  $35,000-$49,999
  $50,000-$74,999                         0.00          0.614
  $75,000-$99,999
  $100,000-$149,999
  $150,000+                               0.00          0.454
Household size (base = 6 or more)         0.00
  1                                       0.00          0.413
  2                                       0.00          0.446
  3                                       0.00          0.640
  4                                       0.02          1.379
  5                                       0.00          1.879
Average age of heads (base = unknown)     0.01
  18-25
  25.5-35
  35.5-45                                 0.01          1.418
  45.5-55                                 0.00          1.419
  55.5-65
  65.5-75                                 0.00          0.487
  75.5+
Ethnicity (base = unknown)                0.03
  White
  Hispanic
  Black                                   0.01          1.749
  Asian/Pacific Islander
  White & Hispanic
  White & Asian
Education (base = unknown)                0.00
  not high school graduate                0.03          0.610
  high school graduate
  some college
  associates degree
  bachelors degree                        0.01          1.465
  graduate degree                         0.00          1.867
Presence of children 0-5 yrs. old         0.00          0.382
Presence of children 6-12 yrs. old        0.00          0.425
Presence of children 13-17 yrs. old       0.01          0.589

Table 21 Logit Model of Household Transit Use by Adult Drivers – Spatial Density

Ind. variable (all bases = 50th %tile)    Significance  Odds ratio
tract household density                   0.00
  < 10 %tile
  10th %tile
  20th %tile
  30th %tile
  40th %tile
  60th %tile
  70th %tile
  80th %tile
  90th %tile                              0.00          2.335
retail employees within 10 km             0.00
  < 10 %tile
  10th %tile
  20th %tile
  30th %tile
  40th %tile
  60th %tile
  70th %tile
  80th %tile                              0.02          1.622
  90th %tile                              0.00          2.148
retail employees within 10 to 50 km       0.06
  < 10 %tile
  10th %tile
  20th %tile                              0.03          0.499
  30th %tile
  40th %tile
  60th %tile
  70th %tile
  80th %tile
  90th %tile                              0.00          2.644

Table 22 Logit Model of Household Transit Use by Adult Drivers – Infrastructure

Variable (base = 50th %tile)                            Significance  Odds ratio
primary roads w/o limited access within 10 to 50 km     0.00
  < 10 %tile
  10th %tile
  20th %tile
  30th %tile
  40th %tile
  60th %tile                                            0.02          0.522
  70th %tile
  80th %tile
  90th %tile                                            0.01          0.418

Table 23 Logit Models of Household Transit Use by Adult Drivers

                                      Contribution of set             Cumulative model
Model         Variable set            Chi-square  Degrees of freedom  Chi-square  Degrees of freedom  Nagelkerke R2
Census tract  Sociodemographic        216.32      35                  216.32      35                  .068
              Spatial density         282.90      27                  499.22      62                  .155
                Household density     205.39      9
                Retail employee       77.51       18
              Road infrastructure     64.76       9                   563.98      71                  .175
Block group   Sociodemographic        216.34      35                  216.34      35                  .068
              Spatial density         297.52      27                  513.86      62                  .159
                Household density     180.12      9
                Retail employee       117.40      18
              Road infrastructure     116.93      18                  630.76      80                  .195

Table 24 Logit Models of Household Transit Use by Adult Drivers – Infrastructure

Ind. variable (all bases = 50th %tile)                  Census tract                Block group
                                                        Significance  Odds ratio    Significance  Odds ratio
primary roads w/o limited access within 10 to 50 km     0.00                        0.00
  < 10 %tile
  10th %tile                                                                        (0.07)        (0.701)
  20th %tile
  30th %tile
  40th %tile
  60th %tile                                            0.02          0.522         (0.10)        (1.317)
  70th %tile                                                                        0.00          1.905
  80th %tile
  90th %tile                                            0.01          0.418         0.00          0.367
Local roads within 10 to 50 km                                                      0.01
  < 10 %tile                                                                        0.01          2.272
  10th %tile
  20th %tile
  30th %tile
  40th %tile
  60th %tile
  70th %tile
  80th %tile
  90th %tile                                                                        0.00          0.295

Table 25 Logit Models of Household Transit Use by Adult Drivers – Spatial Density

Ind. variable (all bases = 50th %tile)    Census tract                Block group
                                          Significance  Odds ratio    Significance  Odds ratio
household density                         0.00                        0.00
  < 10 %tile
  10th %tile
  20th %tile                                                          0.02          0.514
  30th %tile
  40th %tile                                                          0.05          1.397
  60th %tile
  70th %tile
  80th %tile
  90th %tile                              0.00          2.335         0.00          1.981
retail employees within 10 km             0.00                        0.00
  < 10 %tile
  10th %tile
  20th %tile                                                          0.01          0.443
  30th %tile                                                          0.02          0.533
  40th %tile
  60th %tile
  70th %tile
  80th %tile                              0.02          1.622         (0.08)        (1.395)
  90th %tile                              0.00          2.148         0.00          3.061
retail employees within 10 to 50 km       0.06                        0.00
  < 10 %tile
  10th %tile
  20th %tile                              0.03          0.499         (0.09)        (0.649)
  30th %tile                                                          0.04          0.593
  40th %tile
  60th %tile                                                          (0.07)        (0.641)
  70th %tile
  80th %tile                                                          0.00          2.400
  90th %tile                              0.00          2.644         0.00          5.484

6.4 Nonmotorized Travel by Any Household Member

Of our 16,750 households with complete data (98.3% of the sample), 14.2% had a household member who made at least one trip walking or by bicycle.
As in the case of transit, the highest concentration of these households was in the San Francisco Bay Area, where 25.9% of the households in this survey recorded a nonmotorized trip segment, followed by Santa Barbara County, with 19.2% of households.

6.4.1 Census Tract Model

Compared to transit-using households, it is more difficult to explain which households generate nonmotorized travel (Table 26). However, spatial factors are relatively more important in nonmotorized travel demand.

Table 26 Logit Model of Any Household Nonmotorized Travel

                        Contribution of set             Cumulative model
Variable set            Chi-square  Degrees of freedom  Chi-square  Degrees of freedom  Nagelkerke R2
Sociodemographic        1065.65     35                  1065.65     35                  .116
Spatial density         373.08      27                  1438.73     62                  .147
Road infrastructure     104.31      18                  1543.04     80                  .158

The sociodemographic predictors of household nonmotorized travel are listed in Table 27. As expected, the presence of children older than 6 increases the likelihood of a household making a nonmotorized trip, while the presence of very young children decreases that likelihood. Lower income households and the youngest households are more likely to make nonmotorized trips, but so are the most highly educated households.

With regard to influences of the built environment on nonmotorized travel (Tables 28 and 29), the "rural" effect is somewhat different for nonmotorized trips than for public transport trips. Here low housing density produces a lower propensity for nonmotorized trips, confirming that extreme distances among activities inhibit the use of slower modes. It is possible that for some households rural transit trips are taking the place of rural nonmotorized trips. In terms of road infrastructure, Table 29 shows that the lower percentiles have a much higher propensity for nonmotorized trips, as is the case for transit. Higher levels of road infrastructure correspond to lower levels of nonmotorized trips.
Both of these effects are perhaps related to using nonmotorized trips as a form of recreation, as it is pleasant to walk or bike in less developed, low traffic areas, while it is both unpleasant and dangerous to walk or bike in highly developed, high traffic areas.

6.4.2 Comparison with Block Group Model

The contribution of household density is larger in the census tract model, and the contributions of the other variable sets are larger in the block group model for household nonmotorized travel, too (Table 30 and Table 31). As shown in Table 32, the block group models also show that low household and retail employee density produces a lower propensity for nonmotorized trips, but the impact of retail employee d