Data Overview
The UrbanSim zone-level model template operates at the user-provided zonal level of geography. The core tables in this model template are:
Zones:
Capacities (optional but recommended)
Each of these core datasets can be constructed as part of a consulting contract between UrbanSim and the zone-level UrbanCanvas Modeler subscriber. Subscribers are encouraged to prepare their spatial and tabular regional zonal data in advance.
The zone-level model simulates the addition and/or movement of individual household and job records in the region and the construction and demolition of built space. The households table contains one record for each household in the region that links to a persons table that contains a record for each person in a household. Employment is represented with the jobs table which contains one record for each job in the region. Residential built space is represented with the residential units table which contains one record for each residential unit in the region. Non-residential space is represented by attributes (one column per space type) in the zones table. Each table has a unique identifier field, a zone ID field, as well as fields representing agent characteristics.
Each new agent introduced into the simulation is allocated to a particular zone in the region using multinomial logit-based location choice models estimated from local data. The zones table, which contains one record for each zone in the region, represents the base geography of the zone template. When allocating agents to zones, rent or price is often a key explanatory variable, and price models can be estimated using rent or price datasets that the user has uploaded.
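As a rough illustration of how a multinomial logit model turns zone utilities into choice probabilities, consider the minimal sketch below. It is not UrbanSim's implementation: the function name `zone_choice_probabilities`, the zone attributes, and the coefficients are all hypothetical values chosen for the example.

```python
import math

def zone_choice_probabilities(utilities):
    """Return multinomial logit probabilities P(z) = exp(V_z) / sum_k exp(V_k)."""
    # Subtracting the maximum utility avoids overflow and leaves the result unchanged.
    v_max = max(utilities.values())
    exp_v = {zone_id: math.exp(v - v_max) for zone_id, v in utilities.items()}
    total = sum(exp_v.values())
    return {zone_id: e / total for zone_id, e in exp_v.items()}

# Hypothetical zones with a price attribute and an accessibility attribute.
zones = {
    101: {"price": 2.0, "access": 1.0},
    102: {"price": 1.0, "access": 0.5},
    103: {"price": 1.5, "access": 1.5},
}
# Hypothetical coefficients: a higher price lowers utility, better access raises it.
BETA_PRICE, BETA_ACCESS = -1.0, 0.8

utilities = {
    zone_id: BETA_PRICE * attrs["price"] + BETA_ACCESS * attrs["access"]
    for zone_id, attrs in zones.items()
}
probabilities = zone_choice_probabilities(utilities)
```

In an estimated model the coefficients would come from fitting to local data rather than being set by hand.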
The primary data schema difference between the zone-level model and the parcel-level model is that the tables representing agents (e.g. households and jobs) in the zone model are associated with zones rather than individual buildings on parcels. In a typical parcel-level model, space is represented by buildings and parcels, and each agent has a building_id. In a zone-level model, space is represented by zones (and attributes of the zones table), and each agent has a zone_id.
The zone and block-level models operate in a similar fashion and have similar data schema requirements. The main difference between zone and block models is the requirement of zone model users to provide the zonal base geography and the synthetic agent datasets when public data are unavailable.
Because the simulation operates at the zone level, model output can be summarized at the provided zone level or at any coarser geography (e.g., for users in the United States: Census tracts, counties, etc.).
UrbanCanvas: Core Base Year Data
Both the zonal geometry and building types table are required to initialize the UrbanCanvas user interface for new subscribers.
Zonal geometry
UrbanCanvas Modeler requires a zipped (.zip) shapefile of regional zonal geometries with the same zone identifier that is used in the simulation zone attributes table. This shapefile will be used as the base data geometry in the UrbanCanvas user interface and database. See Table 1 for the data schema. The zone identifier (zone_id) must be a unique integer. The zone_id is used to link data created and uploaded into UrbanCanvas (e.g. development projects, constraints, etc.) with the core UrbanSim simulation data. This identifier will be displayed in the user interface for reference in the map and tables. Upon the creation of your UrbanCanvas user account, you must contact us here and supply your initial UrbanCanvas zonal geometries. After the geometries have passed UrbanSim’s quality and topology validation, UrbanSim will populate your UrbanCanvas account with the supplied geometries.
Note
If you are currently working on developing your own zonal geometries, consider hexagonal or square grid polygons of uniform area as a possible method of representing units of land. These polygons can also be split to match jurisdictional or coarse summary geography boundaries so that they cleanly nest within coarser geographic units.
Discontinuous multi-part zonal geometries are discouraged. Zonal geometry should support a clean accounting of land area when summarizing results at the zonal level or at coarser summary geographies. This means that zonal geometries should not be multi-part, overlapping, or stacked, should not contain holes, and should not represent slivers. Zones that represent land known to be undevelopable can and should be included as long as they have attributes that clearly identify their land use type and mark them as undevelopable. These zones can provide useful information during model specification and for reference in the user interface. Where zones are large and overlap multiple summary geography boundaries, consider slicing them along those boundaries so that zones nest cleanly within coarser geographies. Clipping zonal boundaries can also be considered if substantial land area is underwater, such as coastal zones that extend far out into a bay or ocean. The simulation is interested in the land area associated with zones, not the water area, so unless the ocean is represented in an undevelopable layer, the underwater portion of zones should be clipped away to avoid the model perceiving excess capacity. In summary, zones with geometry problems can lead to undesired simulation artifacts or negatively affect the behavior of imputation and visualization operations.
Note
After your zonal geometry has been initialized in UrbanCanvas Modeler, you can change the geometries at any time, provided you do not have any existing development projects, constraints, or adjustments in your account. If these data exist in your account and you would like to change the zonal geometry and/or the building types table set in the initialization collection, please contact us here for assistance.
Column Name | Data Type | Required | Description |
---|---|---|---|
zone_id | Integer | Yes | Unique zone identifier, same as that used |
Building types
UrbanCanvas Modeler requires a table of regional building types that exist in the simulation. This table is also used to initialize the building type categories available when creating new development projects and constraints from within the UrbanCanvas Modeler user interface. The building types table contains the typology of building types to use in the model. See Table 2 for details on the data schema. To denote a building type as mixed use, set both is_residential and is_non_residential to True. Optionally, to track development projects separately for a building type that is neither residential nor non-residential, such as hotel and resort units, set is_other to True; note, however, that building types of the other type are not used in the model. Upon the creation of your UrbanCanvas user account, you must contact us here and supply your initial UrbanCanvas building types table. After the building types table has passed UrbanSim’s quality assessment, UrbanSim will populate your UrbanCanvas account with the supplied building types table.
Note
After your supplied building types table has been initialized in UrbanCanvas Modeler, you can change the table at any time, provided you do not have any existing development projects, constraints, or adjustments in your account. If these data exist in your account and you would like to change the zone geometry and/or the building types table set in the initialization collection, please contact us here for assistance.
Column Name | Data Type | Required | Description |
---|---|---|---|
building_type_id | Integer | Yes | Unique identifier for the building type. |
name | String | Yes | Name of the building type. Name will be displayed in the |
is_residential | Boolean | Yes | True if building type can have a residential |
is_non_residential | Boolean | Yes | True if building type can have a |
description | String | No | Description of the building type |
is_other | Boolean | No | True if building type is not a residential or a |
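The way the three boolean flags combine can be sketched as follows. This is an illustrative helper, not part of UrbanCanvas: the function name `classify_building_type`, the label strings, and the sample rows are assumptions made for the example.

```python
def classify_building_type(row):
    """Label one building types row from its boolean flags."""
    residential = row["is_residential"]
    non_residential = row["is_non_residential"]
    if residential and non_residential:
        return "mixed use"
    if residential:
        return "residential"
    if non_residential:
        return "non-residential"
    # is_other is an optional column, so default to False when it is absent.
    if row.get("is_other", False):
        return "other"
    return "unclassified"

# Hypothetical rows; IDs and names are illustrative only.
building_types = [
    {"building_type_id": 1, "name": "Single family",
     "is_residential": True, "is_non_residential": False},
    {"building_type_id": 2, "name": "Office",
     "is_residential": False, "is_non_residential": True},
    {"building_type_id": 3, "name": "Mixed use",
     "is_residential": True, "is_non_residential": True},
    {"building_type_id": 4, "name": "Hotel",
     "is_residential": False, "is_non_residential": False, "is_other": True},
]

labels = {row["building_type_id"]: classify_building_type(row) for row in building_types}

# building_type_id must be unique across the table.
ids = [row["building_type_id"] for row in building_types]
ids_are_unique = len(ids) == len(set(ids))
```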
Simulation: Core Base Year Data
Zones
Zonal attributes
The zonal attribute table is the core geographic table in the zone model, and it contains one record for each zone. Each agent in the simulation is associated with a particular zone. The zones table is to the zone model what the parcel table is to the parcel model: a representation of land, a spatial look-up for summary geographies, the link to network data for accessibility queries, and the link to development constraint information. The zone_ids in the zonal table must be unique integers. An unlimited number of optional columns may be added to this table; however, the model will by default interpret the columns listed as optional.
See Table 3 for the data schema. Note in particular the residential_unit_capacity and employment_capacity fields. Users should provide capacity data using one of these options.
Type | Units | Reference link |
---|---|---|
Development capacities | Job and residential unit capacity | |
Development constraints | Job and residential unit capacity | |
Regional zoning | Max floor-area ratio (FAR) and dwelling units per acre or hectare | |
Development capacities can be directly represented as job and residential unit development capacities and uploaded to your base data collection when following the zone capacity data schema. This is the most direct method of providing base year capacities and can either be applied as a stand-alone table of capacities (recommended) or can be appended as columns to the existing zonal attribute table, but cannot be represented as both. Development capacities can also be represented by regional development constraint data using employment and residential unit capacities. Data can be bulk uploaded upon request here. In addition, development capacities can be represented by regional zoning data that follows the zone model zoning data schema. Once uploaded, these data can be converted to development constraints using the zoning to constraint conversion tool and will be represented as max development capacities. This is designed for use in scenario development but can be used to update base year capacities.
Rent and/or price fields are not required to be populated on the zonal attribute table; instead, it is suggested that residential rents and prices be uploaded as separate tables. Contact us here for common sources of proprietary rental and price data and consult the rent and price data uploaders available in the user interface.
Column Name | Data Type | Required | Description |
---|---|---|---|
zone_id | Integer | Yes | Unique zonal identifier. Must correspond to the zonal geometry zone_id. |
square_meters_land | Float | Yes | Zonal land area in square meters (excludes water). |
residential_unit_capacity | Integer | No | Total residential unit capacity of zone |
employment_capacity | Integer | No | Total employment capacity of zone as implied |
location_id | Integer | No | If using an optional non-residential |
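Before uploading, it can help to run a few basic checks on the zonal attribute table. The sketch below is a hypothetical pre-upload check, not an UrbanCanvas tool: the function name `validate_zones` and the error messages are assumptions. It tests that zone_id values are unique integers, that land area is non-negative, and that capacities are not supplied both as zone columns and as a stand-alone capacity table.

```python
CAPACITY_COLUMNS = ("residential_unit_capacity", "employment_capacity")

def validate_zones(zones, capacities=None):
    """Return a list of problems found in a list of zone records (dicts)."""
    errors = []
    ids = [zone["zone_id"] for zone in zones]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if any(not isinstance(i, int) or isinstance(i, bool) for i in ids):
        errors.append("zone_id must be an integer")
    if len(set(ids)) != len(ids):
        errors.append("zone_id must be unique")
    if any(zone["square_meters_land"] < 0 for zone in zones):
        errors.append("square_meters_land must be non-negative")
    # Capacities may live on the zones table or in a stand-alone table, not both.
    on_zones = any(col in zone for zone in zones for col in CAPACITY_COLUMNS)
    if capacities is not None and on_zones:
        errors.append("capacities supplied both as zone columns and as a stand-alone table")
    return errors

# Hypothetical sample data.
sample_zones = [
    {"zone_id": 1, "square_meters_land": 1000.0},
    {"zone_id": 2, "square_meters_land": 500.0},
]
sample_capacities = [
    {"zone_id": 1, "residential_unit_capacity": 40, "employment_capacity": 100},
    {"zone_id": 2, "residential_unit_capacity": 10, "employment_capacity": 0},
]
clean_result = validate_zones(sample_zones, sample_capacities)
```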
Built Space
Residential built space is represented with the residential units table which contains one record for each residential unit in the region. Non-residential space in the default zone model is represented as a construct of zoned zonal employment capacities supplied by users. However, non-residential space can be explicitly represented with an optional non-residential space table. See the non-residential space section for more information on how non-residential space is derived.
Residential Units
The residential units table contains disaggregate residential unit data for mixed-use and/or residential buildings in the region, with each row pertaining to one residential unit. Each unit must be tied to a specific zone in the zones table using zone_ids. Optionally, you may also specify unit_ids in the households table to tie households to residential units; however, this is discouraged for use with the default zone model. See Table 5 for the residential units table data schema.
Rent and/or price fields are not required to be populated on the residential units table; instead, it is suggested that residential rents and prices be uploaded as separate tables. Contact us here for common sources of proprietary rental and price data and consult the rent and price data uploaders available in the user interface. An unlimited number of optional columns may be added to this table; however, the model will by default interpret the columns listed as optional below.
Column Name | Data Type | Required | Description |
---|---|---|---|
unit_id | Integer | Yes | Unique unit record identifier |
zone_id | Integer | Yes | Unique zonal identifier that corresponds to the |
building_type_id | Integer | Yes | ID of the building’s structure type that |
year_built | Integer | No | Year of construction. |
tenure | Integer | No | Typically tenure is coded as (1 : own, 2 : rent), however |
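The links between the residential units table and the zones and building types tables can be checked with a short script. The sketch below is illustrative only: the function name `check_residential_units` is hypothetical, and it assumes that every unit's building_type_id should refer to a residential or mixed-use building type.

```python
from collections import Counter

def check_residential_units(units, zone_ids, residential_type_ids):
    """Return (unit_id, problem) pairs for records that break a link or key rule."""
    problems = []
    seen_unit_ids = set()
    for unit in units:
        unit_id = unit["unit_id"]
        if unit_id in seen_unit_ids:
            problems.append((unit_id, "duplicate unit_id"))
        seen_unit_ids.add(unit_id)
        if unit["zone_id"] not in zone_ids:
            problems.append((unit_id, "unknown zone_id"))
        # Assumption: units may only use residential or mixed-use building types.
        if unit["building_type_id"] not in residential_type_ids:
            problems.append((unit_id, "building type is not residential or mixed use"))
    return problems

# Hypothetical sample data.
zone_ids = {1, 2}
residential_type_ids = {1, 3}
units = [
    {"unit_id": 10, "zone_id": 1, "building_type_id": 1},
    {"unit_id": 11, "zone_id": 1, "building_type_id": 3},
    {"unit_id": 12, "zone_id": 9, "building_type_id": 1},
    {"unit_id": 13, "zone_id": 2, "building_type_id": 2},
]
problems = check_residential_units(units, zone_ids, residential_type_ids)

# One record per unit means that unit counts by zone are a simple tally.
units_per_zone = Counter(u["zone_id"] for u in units if u["zone_id"] in zone_ids)
```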
Non-residential Space (Optional)
Non-residential space in the default zone model is represented as a construct of zone employment capacities, typically zoned employment capacities. These capacities can be adjusted with development projects and constraints. However, non-residential space can be represented directly with multiple building types by using a non-residential space table derived from your preferred source of non-residential space data, such as, in the United States, the county assessor or data vendors such as CoStar. To learn more, contact us here. The non-residential space table gives quantities of non-residential area by zone, with one row per zone and one column for each non-residential building type. Column names must end in non-residential building type IDs that correspond to the building_type_ids in the building types table. See Table 5 below for the fields in the non-residential space table. An unlimited number of optional columns may be added to this table; however, the model will by default interpret the columns listed as optional below.
Column Name | Data Type | Required | Description |
---|---|---|---|
zone_id | Integer | Yes | Unique zonal identifier that corresponds to the |
mean_year_built_btype_[BUILDING_TYPE_ID] | Integer | Yes | Mean year of construction for non-residential |
non_res_sqm_btype_[BUILDING_TYPE_ID] | Integer | Yes | Total sqm of non-residential |
Note
If you are using a non-residential space table, you must also have an area per job table. Area units and column names can be represented as either square meters (sqm) or square feet (sqft).
Area per job (required only if using Non-Residential Space)
The area per job table contains the non-residential square meters or square footage each job will occupy in each building type, or optionally in each combination of building type and subarea as denoted with the use of a location_id that corresponds to the location_id in the zone table. This information is used in the calculation of both the number of job spaces of each building type in each zone and non-residential vacancy rates. This table must have a building_type_id that corresponds to the IDs in the building types table. The area per job table is optional and only required when using the optional non-residential space table.
Column Name | Data Type | Required | Description |
---|---|---|---|
building_type_id | Integer | Yes | ID of non-residential building |
area_per_job | Float | Yes | The unit of space required per job in square |
location_id | Integer | No | ID of geography that corresponds to a location ID |
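To illustrate how the non-residential space and area per job tables could combine into job spaces and vacancy rates, here is a minimal sketch. It is not UrbanSim's calculation: the function names are hypothetical, rounding job spaces down to whole jobs is an assumption, and the optional location_id dimension is ignored for brevity.

```python
import math
import re

# Matches columns such as non_res_sqm_btype_4 and captures the building type ID.
BTYPE_COLUMN = re.compile(r"^non_res_sqm_btype_(\d+)$")

def job_spaces_by_zone(non_res_space, area_per_job):
    """Return {zone_id: {building_type_id: job_spaces}}."""
    result = {}
    for row in non_res_space:
        spaces = {}
        for column, area in row.items():
            match = BTYPE_COLUMN.match(column)
            if match:
                btype = int(match.group(1))
                # Assumption: partial job spaces are rounded down.
                spaces[btype] = math.floor(area / area_per_job[btype])
        result[row["zone_id"]] = spaces
    return result

def vacancy_rate(job_spaces, jobs):
    """Share of job spaces that are unoccupied (0.0 when there is no space)."""
    return 1 - jobs / job_spaces if job_spaces else 0.0

# Hypothetical sample data in square meters.
non_res_space = [{"zone_id": 1, "non_res_sqm_btype_4": 1000, "non_res_sqm_btype_5": 500}]
area_per_job = {4: 25.0, 5: 50.0}

spaces = job_spaces_by_zone(non_res_space, area_per_job)
office_vacancy = vacancy_rate(spaces[1][4], jobs=30)
```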
Agents¶
Households
Households
The household table contains disaggregate household data for the region, with each row pertaining to one synthesized household. This can be generated by a population synthesizer such as Synthpop [10]. During simulation, new households will be introduced into this table to match household control totals provided by the user. The zone_id of households in simulation will be populated by UrbanSim’s Household Location Choice Model. If the region has conducted a travel survey and this survey contains information on the recent-mover status of households, it can be used in model estimation to supplement the full households table. In the absence of observed recent movers from a travel survey, it is recommended that the population be synthesized with recent-mover status as a control variable. Optionally, a persons table can be used as well. See Table 7 for the household table data schema. An unlimited number of optional columns may be added to this table; however, the model will by default interpret the columns listed as optional below. Any of the household characteristics in the table below can be utilized in model adjustments and household control totals.
Column Name | Data Type | Required | Description |
---|---|---|---|
household_id | Integer | Yes | Unique household record identifier |
zone_id | Integer | Yes | Unique zonal identifier that corresponds to the |
persons | Integer | Yes | Number of persons in the household (For US |
income | Integer | Yes | Annual household income in 2013 dollars (For |
tenure | Integer | Yes | Typically tenure is coded as (1 : own, 2 : rent), however |
unit_id | Integer | No | Unit ID of the residential unit the household is |
serialno | Integer or String | No | For US data only, Public Use Microdata Sample (PUMS) serial |
cars | Integer | No | Number of vehicles in the household (For US |
race_of_head | Integer | No | Race code of head of household (For |
age_of_head | Integer | No | Age of head of household (For US data, see [3]) |
workers | Integer | No | Number of workers (employed persons) in the |
children | Integer | No | Number of persons under age 18 in household |
recent_mover | Boolean | No | 1 if household moved within last 5 |
Persons
The persons table contains disaggregate persons data for the region, with each row pertaining to one synthesized person. The persons table should have a household_id attribute corresponding to the household table that indicates which household each person is in. The UrbanSim transition model has the ability to keep the households and persons tables in sync as it seeks to match values in the annual household control totals table. See Table 7 for the persons table data schema. An unlimited number of optional columns may be added to this table; however, the model will by default interpret the columns listed as optional below.
Column Name | Data Type | Required | Description |
---|---|---|---|
person_id | Integer | Yes | Unique person record identifier |
household_id | Integer | Yes | Unique household record identifier. Must |
member_id | Integer | No | (For US data, PUMS person number, see PUMS SPORDER variable) |
age | Integer | No | Age of person in years (For US data, see PUMS AGEP variable) |
income | Integer | No | Person annual income in 2013 dollars (For US data, |
education | Integer | No | Person educational attainment (For US data, |
race_id | Integer | No | Race code of person (For US data, |
hh_relationship | Integer | No | Person relationship to household (For |
sex | Integer | No | Person gender (1 : male, 2 : female) (For US data, |
student_status | Boolean | No | Person student status (For US data, see |
worker_status | Boolean | No | Person worker status (For US data, see |
hours | Integer | No | Person hours worked per week in past 12 months (For |
work_at_home | Boolean | No | Person worker status, where 1 if person works |
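A quick consistency check between the households and persons tables can be sketched as follows. The helper name `persons_mismatches` is hypothetical, and the check assumes the households `persons` column should equal the number of person records sharing that household_id.

```python
from collections import Counter

def persons_mismatches(households, persons):
    """Compare each household's persons count with its person records.

    Returns (mismatched, orphans):
      mismatched maps household_id to (declared_count, actual_count),
      orphans lists household_ids found only in the persons table.
    """
    counts = Counter(p["household_id"] for p in persons)
    known = {h["household_id"] for h in households}
    mismatched = {
        h["household_id"]: (h["persons"], counts.get(h["household_id"], 0))
        for h in households
        if h["persons"] != counts.get(h["household_id"], 0)
    }
    orphans = sorted(set(counts) - known)
    return mismatched, orphans

# Hypothetical sample data.
households = [
    {"household_id": 1, "persons": 2},
    {"household_id": 2, "persons": 1},
]
persons = [
    {"person_id": 1, "household_id": 1},
    {"person_id": 2, "household_id": 1},
    {"person_id": 3, "household_id": 3},
]
mismatched, orphans = persons_mismatches(households, persons)
```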
Employment
Employment can be represented in one of two ways: 1) as disaggregated jobs or 2) as establishments. Either can be used to represent employment in the region; both are not required. Establishment data can be more difficult to collect and clean, but it facilitates a more behavioral way to represent employment and allows for the use of future firmographic models.
Jobs
The jobs table contains one record for each job in the region. A job is tied to the zone table using the zone_id. Multiple job records can exist in the same zone. The sector_id references the employment sector a job pertains to and is tied to the sector_id column of the job sector IDs table. The sector_id and any other optional columns can be utilized in model adjustments and employment control totals. The home_based field is optional and indicates whether the job is based in a residential unit. See Table 8 for a description of the jobs table columns. An unlimited number of optional columns may be added to this table; however, the model will by default interpret the columns listed as optional below.
Column Name | Data Type | Required | Description |
---|---|---|---|
job_id | Integer | Yes | Unique job record identifier |
zone_id | Integer | Yes | Unique zonal identifier that corresponds to the |
sector_id | Integer | Yes | Employment sector ID. Must correspond to the sector_ids |
home_based | Boolean | No | Where 1 denotes the job must locate in a |
occupation | Integer | No | Identifier of occupation category |
recent_mover | Boolean | No | 1 if job moved within last 5 years, else 0 |
establishment_id | Integer | No | If jobs were generated from establishments |
Establishments (Optional)
The establishment table contains one record for each business establishment or firm in the region. The employees attribute indicates how many jobs are in each establishment. An establishment is tied to the zones table using the zone_id. Multiple establishment records can exist in the same zone. The sector_id references the employment sector an establishment pertains to. See Table 9 for a description of the establishment table columns. An unlimited number of optional columns may be added to this table; however, the model will by default interpret the columns listed as optional below.
Column Name | Data Type | Required | Description |
---|---|---|---|
establishment_id | Integer | Yes | Unique establishment record identifier. |
employees | Integer | Yes | Total number of employees in establishment |
sector_id | Integer | Yes | Employment sector ID. Must correspond to the sector_ids |
zone_id | Integer | Yes | Unique zonal identifier that corresponds to the |
home_based | Boolean | No | Where 1 denotes the establishment must locate in a |
recent_mover | Boolean | No | 1 if establishment moved within last 5 years, else 0 |
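Because each establishment records its number of employees, a jobs table can in principle be derived from an establishment table by emitting one job record per employee. The sketch below shows one way to do this; the function name `jobs_from_establishments` and the sequential job_id numbering are assumptions, not a documented UrbanSim procedure.

```python
def jobs_from_establishments(establishments, first_job_id=1):
    """Expand establishments into one job record per employee."""
    jobs = []
    next_id = first_job_id
    for est in establishments:
        for _ in range(est["employees"]):
            jobs.append({
                "job_id": next_id,
                "zone_id": est["zone_id"],
                "sector_id": est["sector_id"],
                # home_based is optional on establishments; default to False.
                "home_based": est.get("home_based", False),
                # Keep the link back to the source establishment.
                "establishment_id": est["establishment_id"],
            })
            next_id += 1
    return jobs

# Hypothetical sample data.
establishments = [
    {"establishment_id": 1, "employees": 2, "sector_id": 5, "zone_id": 10},
    {"establishment_id": 2, "employees": 1, "sector_id": 7, "zone_id": 11, "home_based": True},
]
jobs = jobs_from_establishments(establishments)
```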
Job Sector IDs
The job sector ID table denotes all the unique employment sector IDs that are utilized in either the jobs or establishment tables. This table must have the sector_id and corresponding sector name as columns. This table will be used in UrbanCanvas as a reference in the user interface and to generate indicators. An unlimited number of optional columns may be added to this table; however, the model will by default interpret the columns listed as optional below.
Column Name | Data Type | Required | Description |
---|---|---|---|
sector_id | Integer | Yes | Unique employment sector ID. Represents the corresponding |
name | String | Yes | Name of employment sector. This name will be used as a |
Capacities (Optional but recommended)
The capacity table holds base year zone level residential unit and employment capacities as implied by zoning and other development constraints. These capacities represent the upper limit of what can be built in each zone when new residential and non-residential supply is added to a zone. This means that the residential location choice models can place as many units on a zone as there is capacity, but cannot exceed the capacity, and the employment location choice models can place as many jobs in a zone as there is capacity, but cannot exceed the capacity. If a particular zone has a high probability of attracting units or jobs but capacity has been reached, those units or jobs will be placed in other zones where remaining capacity exists.
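The capacity behavior described above can be sketched as a simple sequential allocation: each unit or job is drawn from the zone probabilities, restricted to zones that still have remaining capacity. This is only an illustration under stated assumptions and not UrbanSim's allocation algorithm; the function name `allocate_with_capacity` and the sample probabilities and capacities are hypothetical.

```python
import random

def allocate_with_capacity(n_agents, probabilities, remaining_capacity, seed=0):
    """Place agents one at a time, never exceeding any zone's capacity."""
    rng = random.Random(seed)
    remaining = dict(remaining_capacity)
    placed = {zone_id: 0 for zone_id in remaining}
    for _ in range(n_agents):
        # Only zones with capacity left are eligible for this draw.
        open_zones = [z for z in probabilities if remaining.get(z, 0) > 0]
        if not open_zones:
            break  # region-wide capacity is exhausted
        # Assumes strictly positive probabilities for the open zones.
        weights = [probabilities[z] for z in open_zones]
        zone_id = rng.choices(open_zones, weights=weights, k=1)[0]
        placed[zone_id] += 1
        remaining[zone_id] -= 1
    return placed

# Hypothetical inputs: zone 1 is the most attractive but has little capacity.
probabilities = {1: 0.7, 2: 0.2, 3: 0.1}
capacity = {1: 2, 2: 5, 3: 5}
placed = allocate_with_capacity(10, probabilities, capacity)
```

Once zone 1 fills, later draws are redistributed over zones 2 and 3 in proportion to their probabilities.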
In the default zone model, capacities can be represented by two general building type segments: 1) residential and 2) non-residential building types, in units of residential units and number of jobs, respectively. Zones that are fully or partly undevelopable can be represented with a capacity of 0 or with a capacity scaled down according to the undevelopable area, respectively. You may also expand these two default segments into building type specific capacities or other more detailed segments as needed. See the capacity table data schema below for how to specify capacity using general building types or specific building types. If using specific building types, column names must end in building type IDs that correspond to the building_type_ids in the building types table. Contact us here for more information. An unlimited number of optional columns may be added to this table; however, the model will by default interpret the columns listed as optional below.
The capacity table is the most direct and recommended method of providing base year capacities. If a stand-alone capacity table is being used, you must remove any capacity columns on the zone attribute table. Once the base year capacities are set, future capacity can be modified when crafting scenarios using regional development constraint data. See this table for a summary of ways to specify capacity.
Column Name | Data Type | Required | Description |
---|---|---|---|
zone_id | Integer | Yes | Unique zonal identifier that corresponds to the |
residential_unit_capacity | Integer | Yes | Total residential unit capacity of zone |
employment_capacity | Integer | Yes | Total employment capacity of zone as implied |
res_unit_capacity_btype_[BUILDING_TYPE_ID] | Integer | Yes | Total residential unit capacity of zone for structure type as implied |
emp_capacity_btype_[BUILDING_TYPE_ID] | Integer | Yes | Total employment capacity of zone for structure type as implied |
Rents and Prices
The market price of built space is a key input to UrbanSim and is also a modeled output of UrbanSim. Prices play a role in the demand-side models as an explanatory variable in household and employment location choices. They also play a role in the supply-side model, as an explanatory variable in the real estate development location choice model. Models of real estate rents and prices are estimated so that rents and prices are updated endogenously in the model system. If the model system has a price equilibration component, prices in the model are further updated based on a demand-price equilibration routine.
Rent and price data can be appended as columns to the zones and/or residential units table, or supplied as separate uploaded datasets. Price observations need not be comprehensive for all built space; a sample of prices can be used to estimate a model that can then be applied to predict prices more broadly. It is suggested that you upload separate rent and price data using the rent and price data uploader, following the data schema for each building and transaction type listed in the sections below. Once uploaded, you may manage each dataset separately from the rest of your base data. You can choose to use specific data in UrbanSim or simply choose to upload and explore these data on the map.
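The idea of estimating a model from a sample of observations and then applying it more broadly can be illustrated with a one-variable least squares fit. This toy sketch is not the regression specification UrbanSim uses: the single explanatory variable (unit size), the helper name `fit_line`, and the sample values are all assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical sample of observed rents (per unit per month) by unit size.
observed_sqm = [50, 75, 100]
observed_rent = [1000, 1400, 1800]

slope, intercept = fit_line(observed_sqm, observed_rent)

# Apply the fitted model to a unit that has no observed rent.
predicted_rent = slope * 60 + intercept
```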
Residential Rents
Observed data on the rental price of residential space provides a price signal for the renter side of the residential real estate market. Based on the supplied data, regression models of residential rents are estimated and the predicted values are then input to both the supply model and renter segments of the household location choice model. It is not required to connect these data directly to zones; however, you may use the optional identification columns below to tie data directly to your other base data tables, or use x (longitude) and y (latitude) columns to tie data directly to zone locations. Coordinates must be in the World Geodetic System 1984 (WGS84) coordinate system. When coordinates are present in the data, the data will automatically be available to view on the map as a point layer.
Column Name | Data Type | Required | Description |
---|---|---|---|
year | Integer | Yes | Year of the transaction |
rent | Float | Yes | Predicted or observed rent per unit per month |
x | Float | No | Longitude coordinate of transaction address or centroid. |
y | Float | No | Latitude coordinate of transaction address or centroid. |
unit_id | Integer | No | Unit ID of the transaction. Corresponds to the residential unit table unit_id column. |
building_type_id | Integer | No | ID of mixed-use or residential building |
zone_id | Integer | No | Zone ID of the transaction. Corresponds to the |
residential_sqft | Integer | No | Amount of residential square footage |
residential_sqm | Integer | No | Amount of residential square meters |
year_built | Integer | No | Year of building construction |
bathrooms | Float | No | Number of bathrooms |
bedrooms | Float | No | Number of bedrooms |
Note
Supply only one of the residential_sqft and residential_sqm columns; you do not need to provide both.
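A record-level check for uploaded rent data might look like the sketch below. It is a hypothetical helper, not the UrbanCanvas uploader's validation: the function name `validate_rent_record` and the error messages are assumptions, as is the rule that x and y must be supplied together. The longitude and latitude bounds are the valid ranges for WGS84 coordinates, which helps catch projected coordinates uploaded by mistake.

```python
def validate_rent_record(record):
    """Return a list of problems found in one residential rent record (dict)."""
    errors = []
    for required in ("year", "rent"):
        if record.get(required) is None:
            errors.append(f"missing required column: {required}")
    x, y = record.get("x"), record.get("y")
    # Assumption: a point needs both coordinates to be mapped.
    if (x is None) != (y is None):
        errors.append("x and y must be supplied together")
    elif x is not None:
        if not -180 <= x <= 180:
            errors.append("x (longitude) is outside the WGS84 range")
        if not -90 <= y <= 90:
            errors.append("y (latitude) is outside the WGS84 range")
    # Only one of the two area columns may be supplied.
    if record.get("residential_sqft") is not None and record.get("residential_sqm") is not None:
        errors.append("supply residential_sqft or residential_sqm, not both")
    return errors

# Hypothetical records: the second uses projected coordinates and both area columns.
good = {"year": 2020, "rent": 1500.0, "x": -122.27, "y": 37.80, "residential_sqm": 70}
bad = {"year": 2020, "rent": 1500.0, "x": 564000.0, "y": 4184000.0,
       "residential_sqft": 750, "residential_sqm": 70}

good_errors = validate_rent_record(good)
bad_errors = validate_rent_record(bad)
```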
Residential Prices
Observed data on the sale price of residential space provides a price signal for the ownership side of the residential real estate market. Based on the supplied data, regression models of residential sale prices are estimated and the predicted values are then input to both the supply model and owner segments of the household location choice model. It is not required to connect these data directly to zones; however, you may use the optional identification columns below to tie data directly to your other base data tables, or use x (longitude) and y (latitude) columns to tie data directly to zone locations. Coordinates must be in the World Geodetic System 1984 (WGS84) coordinate system. When coordinates are present in the data, the data will automatically be available to view on the map as a point layer.
Column Name | Data Type | Required | Description |
---|---|---|---|
year | Integer | Yes | Year of the transaction |
price | Float | Yes | Predicted or observed price per unit |
x | Float | No | Longitude coordinate of transaction address or centroid. |
y | Float | No | Latitude coordinate of transaction address or centroid. |
building_type_id | Integer | No | ID of mixed-use or residential building |
zone_id | Integer | No | Zone ID of the transaction. Corresponds to the |
residential_sqft | Integer | No | Amount of residential square footage |
residential_sqm | Integer | No | Amount of residential square meters |
year_built | Integer | No | Year of building construction |
bathrooms | Float | No | Number of bathrooms |
bedrooms | Float | No | Number of bedrooms |
Note
Supply only one of the residential_sqft and residential_sqm columns; you do not need to provide both.
Non-Residential Rents
Observed data on the rental price of non-residential space provides a price signal for the non-residential real estate market. Based on the supplied data, regression models of non-residential rents are estimated and the predicted values are then input to both the supply model and employment location choice model. It is not required to connect these data directly to zones; however, you may use the optional identification columns below to tie data directly to your other base data tables, or use x (longitude) and y (latitude) columns to tie data directly to zone locations. Coordinates must be in the World Geodetic System 1984 (WGS84) coordinate system. When coordinates are present in the data, the data will automatically be available to view on the map as a point layer.
Column Name | Data Type | Required | Description |
---|---|---|---|
year | Integer | Yes | Year of the transaction |
rent | Float | Yes | Predicted or observed rent per square foot per year |
x | Float | No | Longitude coordinate of transaction address or centroid. |
y | Float | No | Latitude coordinate of transaction address or centroid. |
building_type_id | Integer | No | ID of mixed-use or non-residential building |
zone_id | Integer | No | Zone ID of the transaction. Corresponds to the |
non_residential_sqft | Integer | No | Amount of non-residential square footage |
residential_sqm | Integer | No | Amount of residential square meters |
year_built | Integer | No | Year of building construction |
Note
Supply only one floor-area column: either non_residential_sqft or residential_sqm, as listed in the table above. Both columns are not required.
Non-Residential Prices¶
Observed data on the sale price of non-residential space provides a price signal for the non-residential real estate market. Based on the supplied data, regression models of non-residential sale prices are estimated, and the predicted values are then input to both the supply model and the employment location choice model. These data do not have to be connected directly to zones; however, you may use the optional identification columns below to tie the data to your other base data tables, or use the x (longitude) and y (latitude) columns to tie the data to zone locations. Coordinates must be in the World Geodetic System 1984 (WGS84) coordinate system. When coordinates are present, the data are automatically available to view on the map as a point layer.
Column Name | Data Type | Required | Description |
---|---|---|---|
year | Integer | Yes | Year of the transaction |
price | Float | Yes | Predicted or observed price per square foot |
x | Float | No | Longitude coordinate of transaction address or centroid. |
y | Float | No | Latitude coordinate of transaction address or centroid. |
building_type_id | Integer | No | ID of mixed-use or non-residential building |
zone_id | Integer | No | Zone ID of the transaction. Corresponds to the |
non_residential_sqft | Integer | No | Amount of non-residential square footage |
residential_sqm | Integer | No | Amount of residential square meters |
year_built | Integer | No | Year of building construction |
Note
Supply only one floor-area column: either non_residential_sqft or residential_sqm, as listed in the table above. Both columns are not required.
Model Calibration Data¶
Under construction.
User Uploaded Scenario Inputs¶
There are a number of zone model scenario inputs that can be uploaded into the platform using the uploader tool. You can then select each data type to use in a scenario.
Annual Household Control Totals¶
A key input to UrbanSim is an assumption about total regional household growth in the forecast period. When composing a scenario, this input can be expressed as a growth rate, or alternatively, as an uploaded control totals table. A household control totals table (annual household control totals) gives the total number of households in the region by year, for every year between the model base-year (2010) and the forecast year (e.g. 2050). This information is typically based on a macro-economic forecast or data from the state demographer. The regional household totals can optionally be broken down by demographic group, giving control over how aggregate household characteristics are transitioned over time in the simulation. You may use any of the household characteristics in the households table to segment your control total table.
Both regional and sub-regional control totals are supported in the control total uploader. If control totals are sub-regional, include a sub-regional location identifier column that corresponds to a location identifier in the zones table (e.g. county_id).
Note
To utilize sub-regional control totals in your model please contact us here.
Column Name | Data Type | Required | Description |
---|---|---|---|
year | Integer | Yes | Year |
total_number_of_households | Integer | Yes | Total number of households in year |
total_number_of_people | Integer | No | Optional field, total number of people in household |
[SUB-REGIONAL_LOCATION_ID] | Integer | No | Optional field, required field if applying sub-regional control |
The minimum requirement for a household control totals table is that it contains two columns: year and total_number_of_households. For example:
year | total_number_of_households |
---|---|
2010 | 1000 |
2011 | 1100 |
2012 | 1200 |
2013 | 1300 |
2014 | 1400 |
For additional control over how aggregate household characteristics transition over time, annual household totals can be segmented by adding additional columns and rows. For example, by adding persons_min and persons_max columns to the table, we segment the total number of households by household size:
year | total_number_of_households | persons_min | persons_max |
---|---|---|---|
2010 | 300 | 1 | 1 |
2010 | 200 | 2 | 2 |
2010 | 100 | 3 | 3 |
2010 | 250 | 4 | -1 |
2011 | 330 | 1 | 1 |
2011 | 220 | 2 | 2 |
2011 | 110 | 3 | 3 |
2011 | 270 | 4 | -1 |
The first row indicates that there are 300 households of size one in year 2010. UrbanSim recognizes that persons is an attribute of the households table, and the _min and _max column naming convention communicates to the model that the control total in the first row applies to households with a minimum of one person and a maximum of one person. Similarly, row 2 indicates that there are 200 households of size two in year 2010, and row 5 indicates that in 2011 there are 330 households of size one. In columns with name containing _min or _max, a value of -1 means no limit. So row 4 of the table pertains to households with four or more persons (no upper bound).
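The _min/_max convention described above can be sketched in a few lines of Python. This is only an illustration of how a row's bounds might be read, not UrbanSim's actual implementation: the function names `matches` and `target_for` are hypothetical, and both bounds are assumed to be inclusive (which is consistent with persons_min = 1 and persons_max = 1 meaning "exactly one person").

```python
def matches(household, control_row):
    """True if the household falls inside the segment described by control_row."""
    for column, bound in control_row.items():
        if column.endswith("_min"):
            attribute = column[: -len("_min")]
            # -1 means "no limit"; otherwise the minimum is treated as inclusive.
            if bound != -1 and household[attribute] < bound:
                return False
        elif column.endswith("_max"):
            attribute = column[: -len("_max")]
            # -1 means "no limit"; otherwise the maximum is treated as inclusive.
            if bound != -1 and household[attribute] > bound:
                return False
    return True

# The four 2010 rows of the household-size example table.
rows_2010 = [
    {"total_number_of_households": 300, "persons_min": 1, "persons_max": 1},
    {"total_number_of_households": 200, "persons_min": 2, "persons_max": 2},
    {"total_number_of_households": 100, "persons_min": 3, "persons_max": 3},
    {"total_number_of_households": 250, "persons_min": 4, "persons_max": -1},
]

def target_for(household):
    """Control total of the first 2010 row whose bounds contain the household."""
    return next(
        row["total_number_of_households"]
        for row in rows_2010
        if matches(household, row)
    )

print(target_for({"persons": 1}))  # 300
print(target_for({"persons": 6}))  # 250
```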
Another example of a segmented control totals table, this time where the segmentation of household totals is by income:
year | total_number_of_households | income_min | income_max |
---|---|---|---|
2010 | 500 | 0 | 30000 |
2010 | 900 | 30000 | -1 |
2011 | 480 | 0 | 30000 |
2011 | 950 | 30000 | -1 |
This example table buckets households into two income-based categories: those with <=30,000 in income, and those with >30,000 in income. The low-income bucket shrinks from 500 households in 2010 to 480 households in 2011. The high-income bucket grows from 900 households in 2010 to 950 households in 2011. This example region has 1400 (500 + 900) total households in year 2010, and 1430 (480 + 950) total households in year 2011.
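The regional totals quoted above are simply the per-year sums of the segment rows. A short sketch of that arithmetic, with the rows of the income example typed in as tuples (the variable names are illustrative only):

```python
from collections import defaultdict

# (year, total_number_of_households, income_min, income_max)
rows = [
    (2010, 500, 0, 30000),
    (2010, 900, 30000, -1),
    (2011, 480, 0, 30000),
    (2011, 950, 30000, -1),
]

# Sum the segment totals within each year to recover the regional total.
totals = defaultdict(int)
for year, households, _income_min, _income_max in rows:
    totals[year] += households

print(dict(totals))  # {2010: 1400, 2011: 1430}
```

Running the same sum on any segmented table is a quick way to confirm that the segments add up to the regional total you intend.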
Additional columns added to the household control totals table do not have to be expressed in _min and _max terms if the added column pertains to a known household attribute and no minimum/maximum attribute values are needed in the group definitions. For example:
year | total_number_of_households | tenure |
---|---|---|
2010 | 700 | 1 |
2010 | 800 | 2 |
2011 | 730 | 1 |
2011 | 820 | 2 |
This table indicates that in year 2010 there are 700 households with tenure = 1 (own) and 800 households with tenure = 2 (rent).
As the examples above show, annual household control totals tables give you the ability to break down regional household totals by detailed demographic characteristics. This table is the mechanism by which one can simulate demographic trends such as decreasing average household size, increasing average income, and increasing average age of household head.
To utilize a control totals table in UrbanCanvas, create a CSV file representing the table, and then use the Upload feature to upload and name the table. In Uploads, look for the Household Control Totals uploader. The uploaded control totals table can then be referenced when composing a scenario. Scenarios can reflect alternative demographic assumptions if they point to different control totals tables. For example, one scenario can reference a baseline household control totals table, and a different scenario can reference a household control totals table that shows a decline in home-ownership. Remember that household control totals tables need to contain information for every year between the base year and the forecast year.
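Because every year between the base year and the forecast year must be present, a quick coverage check before uploading can save a round trip. The sketch below uses the Python standard library; `missing_years` is a hypothetical helper, and it assumes the year range is inclusive of both the base year and the forecast year.

```python
import csv
import io

def missing_years(csv_text, base_year, forecast_year):
    """Return the years in [base_year, forecast_year] that have no rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    present = {int(row["year"]) for row in reader}
    return [y for y in range(base_year, forecast_year + 1) if y not in present]

# Hypothetical table with the 2012 row accidentally left out.
table = """year,total_number_of_households
2010,1000
2011,1100
2013,1300
2014,1400
"""
print(missing_years(table, 2010, 2014))  # [2012]
```

For a segmented table, the same check could be applied to each segment separately.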
Annual Employment Control Totals¶
A key input to UrbanSim is an assumption about total regional employment growth in the forecast period. When composing a scenario, this input can be expressed as a single growth rate, or alternatively, as an uploaded control totals table. An employment control totals table (annual employment control totals) gives the total number of jobs in the region by year, for every year between the model base-year (2010) and the forecast year (e.g. 2050). This information is typically based on a macro-economic forecast or data from the state demographer. The regional employment totals can optionally be broken down by sector, giving control over how the sectoral make-up of the economy changes over time in the simulation. You may use the sector_ids as well as any other characteristic available in the jobs table (or if you are using optional establishments, the sector_ids in the user supplied establishments table) to segment your control total table.
Both regional and sub-regional control totals are supported in the control total uploader. If control totals are sub-regional, include a sub-regional location identifier column that corresponds to a location identifier in the zones table (e.g. county_id).
Note
To utilize sub-regional control totals in your model please contact us here.
Column Name | Data Type | Required | Description |
---|---|---|---|
year | Integer | Yes | Year |
total_number_of_jobs | Integer | Yes | Total number of jobs in year |
sector_id | Integer | No | Optional field, required if segmenting by this attribute, |
home_based | Boolean | No | Optional field, required if segmenting by this attribute, |
occupation | Integer | No | Optional field, required if segmenting by this attribute, |
recent_mover | Boolean | No | Optional field, required if segmenting by this attribute, |
[SUB-REGIONAL_LOCATION_ID] | Integer | No | Optional field, required field if applying sub-regional control |
The minimum requirement for an employment control totals table is that it contains two columns: year and total_number_of_jobs. For example:
year | total_number_of_jobs |
---|---|
2010 | 800 |
2011 | 900 |
2012 | 1000 |
2013 | 1100 |
2014 | 1200 |
For additional control of the employment sector break-down over time, annual employment totals can be segmented by adding an additional column named sector_id:
year | total_number_of_jobs | sector_id |
---|---|---|
2010 | 100 | 11 |
2010 | 100 | 21 |
2010 | 100 | 22 |
2010 | 100 | 23 |
2010 | 100 | 3133 |
2010 | 100 | 42 |
2010 | 100 | 4445 |
2010 | 100 | 4849 |
2010 | 100 | 51 |
2010 | 100 | 52 |
2010 | 100 | 53 |
2010 | 100 | 54 |
2010 | 100 | 55 |
2010 | 100 | 56 |
2010 | 100 | 61 |
2010 | 100 | 62 |
2010 | 100 | 71 |
2010 | 100 | 72 |
2010 | 100 | 81 |
2010 | 100 | 92 |
2011 | 200 | 11 |
2011 | 200 | 21 |
2011 | 200 | 22 |
2011 | 200 | 23 |
2011 | 200 | 3133 |
2011 | 200 | 42 |
2011 | 200 | 4445 |
2011 | 200 | 4849 |
2011 | 200 | 51 |
2011 | 200 | 52 |
2011 | 200 | 53 |
2011 | 200 | 54 |
2011 | 200 | 55 |
2011 | 200 | 56 |
2011 | 200 | 61 |
2011 | 200 | 62 |
2011 | 200 | 71 |
2011 | 200 | 72 |
2011 | 200 | 81 |
2011 | 200 | 92 |
Each employment sector in this mock example has 100 regional jobs in year 2010 and 200 regional jobs in year 2011. The annual employment control totals table is the mechanism by which sectoral growth and decline is simulated. For example, one might create a scenario-specific control totals table to simulate the impact of a rapid increase in healthcare jobs over the forecast horizon.
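As an illustration of a scenario-specific table, the sketch below takes baseline rows and makes one sector grow faster, then writes the result as CSV text ready to save and upload. Everything here is hypothetical: the helper `boost_sector`, the extra 10% annual growth, and the assumption that sector_id 62 denotes healthcare in your sector coding.

```python
import csv
import io

def boost_sector(rows, sector_id, base_year, extra_annual_growth):
    """Return new rows in which one sector compounds faster than the baseline."""
    boosted = []
    for row in rows:
        jobs = row["total_number_of_jobs"]
        if row["sector_id"] == sector_id:
            years_elapsed = row["year"] - base_year
            # Compound the extra growth over the years since the base year.
            jobs = round(jobs * (1 + extra_annual_growth) ** years_elapsed)
        boosted.append({**row, "total_number_of_jobs": jobs})
    return boosted

# Two sectors taken from the mock table above (61 and 62), for two years.
baseline = [
    {"year": 2010, "total_number_of_jobs": 100, "sector_id": 61},
    {"year": 2010, "total_number_of_jobs": 100, "sector_id": 62},
    {"year": 2011, "total_number_of_jobs": 200, "sector_id": 61},
    {"year": 2011, "total_number_of_jobs": 200, "sector_id": 62},
]
scenario = boost_sector(baseline, sector_id=62, base_year=2010, extra_annual_growth=0.10)

# Serialize with the column names the employment control totals table uses.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["year", "total_number_of_jobs", "sector_id"])
writer.writeheader()
writer.writerows(scenario)
print(buffer.getvalue())
```

In this sketch the base-year rows are unchanged and only the 2011 row for sector 62 rises, from 200 to 220 jobs.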
To utilize an employment control totals table in UrbanCanvas, create a CSV file representing the table, and then use the Upload feature to upload and name the table. In Uploads, look for the Employment Control Totals uploader. The uploaded control totals table can then be referenced when composing a scenario. Scenarios can reflect alternative employment assumptions if they point to different control totals tables. For example, one scenario can reference a baseline employment control totals table, and a different scenario can reference an employment control totals table that shows certain sectors growing faster or declining faster. Remember that control totals tables need to contain information for every year between the base year and the forecast year.
Travel Model Zones¶
In cases where the base data zone geography does not correspond to the zones used in your travel model, UrbanCanvas offers a travel model zone uploader. Travel model zones (i.e. traffic analysis zones, or TAZs) are the unit of geography used by most regional travel models. UrbanSim is typically run in order to feed land use inputs to a regional travel model, so UrbanSim output needs to be summarized at the travel model zone level. This can be achieved by downloading the zone-level simulation results and manually aggregating them to TAZs. As a convenience, users can instead upload their TAZ geography, and UrbanSim will then automatically return results at the TAZ level (i.e. perform the zone-to-TAZ aggregation). The file format should be a zipped (.zip) shapefile. The only requirement for the shapefile is that it has an integer field named zone_id containing the travel model zone ID values. The TAZ uploader displays the percentage of base data geometry polygons that were successfully intersected with the TAZ polygons. Once uploaded, travel model zones can be assigned to specific scenarios; after the simulation, the resulting indicators are viewable on the map or downloadable as a CSV at the specified TAZ geography. The zone_ids of the TAZs selected for a scenario must correspond to the zone_ids in the skims selected for that scenario.
Note
When a TAZ file is uploaded, each zone is automatically assigned a TAZ ID by intersecting the zone polygon with the TAZ polygons and selecting the TAZ with the largest area of overlap.
Column Name | Data Type | Description |
---|---|---|
zone_id | Integer | Travel model zone ID |
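The largest-area-overlap assignment and the zone-to-TAZ aggregation can be illustrated with plain dictionaries. This is a sketch of the idea, not the platform's implementation: the overlap areas would normally come from a GIS polygon intersection, and the function names, IDs, and values below are all hypothetical.

```python
from collections import defaultdict

def assign_taz(overlaps):
    """overlaps: iterable of (zone_id, taz_id, overlap_area). Returns {zone_id: taz_id}."""
    best = {}
    for zone_id, taz_id, area in overlaps:
        # Keep the TAZ that covers the largest share of each zone.
        if zone_id not in best or area > best[zone_id][1]:
            best[zone_id] = (taz_id, area)
    return {zone_id: taz_id for zone_id, (taz_id, _area) in best.items()}

def aggregate_to_taz(zone_values, zone_to_taz):
    """Sum a zone-level indicator up to the TAZ each zone was assigned to."""
    totals = defaultdict(int)
    for zone_id, value in zone_values.items():
        totals[zone_to_taz[zone_id]] += value
    return dict(totals)

# Hypothetical overlap areas: zone 1 lies mostly in TAZ 10, zone 3 mostly in TAZ 20.
overlaps = [(1, 10, 0.8), (1, 20, 0.2), (2, 20, 1.0), (3, 10, 0.4), (3, 20, 0.6)]
zone_to_taz = assign_taz(overlaps)
print(zone_to_taz)  # {1: 10, 2: 20, 3: 20}

households = {1: 120, 2: 80, 3: 50}
print(aggregate_to_taz(households, zone_to_taz))  # {10: 120, 20: 130}
```

Note that under this rule a zone split across several TAZs contributes all of its households to the single TAZ with the largest overlap.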
Travel Model Skims¶
Travel model skims contain zone to zone travel times (sometimes along with other measures of impedance), as generated by a regional travel model. It is desirable to have skims represented in UrbanSim for the calculation of zonal accessibility metrics, and to facilitate feedback between the travel model and UrbanSim. Skims can be uploaded at any time, but a pre-requisite for using the skims in the model is to already have travel model zones uploaded. Skims should be formatted as a CSV file with, at the minimum, the following fields: from_zone_id, to_zone_id. Each row represents one zone-to-zone pair. Any number of impedance columns with any arbitrary column name are accepted. An example of an impedance measure is: AM peak period travel time from origin zone to destination zone, in minutes.
When used in scenarios, the zone_ids for the skims selected for the scenario must correspond to the zone_ids in the corresponding travel model zones that have been selected for the scenario. A single skim CSV file should represent travel times for a single year (e.g. 2010) or for a single range of years (e.g. 2010-2015). The year or year range that each skim represents is specified when skims are selected for use in a scenario, so skim CSV files do not need to include a year column. If the CSV does contain a year column, it may contain only a single year value. If travel times differ between years, upload a separate skim CSV file for each year or year range.
Note
Upon upload of base-year skims, contact us here so that we know to re-run the model specification routine for your region and incorporate travel time sensitivity into the location choice models.
If you would like to automate the two way feedback between your travel model and UrbanSim by integrating the two systems, this can be achieved with a consulting contract. See here for more information or contact us here.
Column Name | Data Type | Description |
---|---|---|
from_zone_id | Integer | Origin zone ID |
to_zone_id | Integer | Destination zone ID |
[ARBITRARY_COL_NAME] | Float | Any number of columns with any arbitrary |
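The skim requirements described above (the two ID columns, zone IDs that match the travel model zones, and at most one year value) can be checked before upload. The following is a minimal sketch with the Python standard library; `check_skims` and the impedance column name `am_peak_minutes` are hypothetical.

```python
import csv
import io

def check_skims(csv_text, taz_ids):
    """Return a list of problems found in a skim CSV, given the known TAZ IDs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    problems = [
        f"missing column: {name}"
        for name in ("from_zone_id", "to_zone_id")
        if name not in columns
    ]
    if problems:
        return problems
    years, unknown = set(), set()
    for row in reader:
        for key in ("from_zone_id", "to_zone_id"):
            if int(row[key]) not in taz_ids:
                unknown.add(int(row[key]))
        if "year" in row:
            years.add(row["year"])
    if unknown:
        problems.append(f"zone IDs not in travel model zones: {sorted(unknown)}")
    if len(years) > 1:
        problems.append(f"more than one year value: {sorted(years)}")
    return problems

# Hypothetical skim file: zone 30 is not among the uploaded travel model zones.
skims = """from_zone_id,to_zone_id,am_peak_minutes
10,10,2.5
10,20,14.0
20,30,9.5
"""
print(check_skims(skims, taz_ids={10, 20}))
# ['zone IDs not in travel model zones: [30]']
```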
Regional Zoning¶
Development capacity in the zone model is represented by three zone-level fields: residential_unit_capacity, employment_capacity, and allowed_building_types. These fields are initially set to high placeholder values, with no restrictions on the type of buildings allowed, representing near-unconstrained development capacity. The primary mechanism for constraining development capacity is the Constraints feature. Constraint records can be added to zones to reflect zoning, undevelopable areas, and other development regulations. As a convenience for users who have a regional zoning layer, an uploader is available to bulk-load region-wide development constraint information to the platform. Upon upload of zoning data, implied zone-level capacities are calculated: the max_dua column is used to derive residential unit capacity, and the max_far column, together with a square-meters-per-job conversion factor set in the constraints user interface, is used to derive employment capacity. Constraint records are created for each zone. The zoning layer should be in the form of a zipped (.zip) shapefile. The shapefile should have the following fields: id, zoning, max_dua, max_far, allowed_building_types. If you have multiple polygon records with the same zoning designation ID, we suggest you dissolve those polygons to create a set of unique record IDs.
Zoning data is typically one of the more challenging categories of data to collect owing to the many forms it can come in from member jurisdictions. If your regional zoning data may have data gaps, contact us here to review potential solutions.
In jurisdictions that may not utilize FAR in their zoning code, you may be able to derive FAR if the zoning data has lot coverage and building height or stories information. Otherwise, GIS data on zoning, building footprints, and parcels for a sample of locations in the jurisdiction could be used to estimate lot coverage, which can then be combined with height information to derive an estimated FAR.
In jurisdictions that may not utilize DUA in their zoning code, you may be able to derive DUA through assumptions about average square meters per unit in the jurisdiction from census data or other sources.
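One possible way to carry out these derivations is sketched below. It rests on simplifying assumptions that are not prescribed by UrbanSim: FAR is approximated as lot coverage (as a fraction of the lot) multiplied by the number of stories, and density is approximated by treating all permitted floor area as residential and dividing by an assumed average unit size. The function names and input values are hypothetical.

```python
SQM_PER_HECTARE = 10_000.0
SQM_PER_ACRE = 4_046.8564224

def estimate_far(lot_coverage, stories):
    """Approximate FAR as the covered share of the lot times the number of floors."""
    return lot_coverage * stories

def estimate_units_per_hectare(far, avg_sqm_per_unit):
    """Assumption: all permitted floor area is residential, with no circulation loss."""
    return far * SQM_PER_HECTARE / avg_sqm_per_unit

# Hypothetical zoning designation: 50% lot coverage, four stories, 100 sqm units.
far = estimate_far(lot_coverage=0.5, stories=4)
per_hectare = estimate_units_per_hectare(far, avg_sqm_per_unit=100.0)
per_acre = per_hectare * SQM_PER_ACRE / SQM_PER_HECTARE

print(far)                 # 2.0
print(per_hectare)         # 200.0
print(round(per_acre, 1))  # 80.9
```

In practice you would likely reduce the result to account for non-residential floor area, common areas, and parking, so treat these figures as an upper bound.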
Note
The zoning uploader does not yet fully support the conversion of uploaded zoning data to constraint records to be utilized in the zone-level model. Contact us here to request a conversion of your zoning file to constraint records and UrbanSim will apply your file as a new set of constraints. The data schema for constraint bulk uploads can be found here for data represented as zoning data or here for data represented as UrbanSim constraints.
Note
Zones that are completely intersected by a zoning polygon will have that zoning polygon’s max_dua value set for the zone residential capacity and the polygon max_far is converted to employment capacity using this equation (1) or (2). Zones that are partially intersected by one or more zoning polygons are assigned capacities proportional to the area by which the intersecting polygons overlap the zone.
Imperial units:
Metric units:
Column Name | Data Type | Unit | Required | Description |
---|---|---|---|---|
id | Integer | | Yes | Unique identifier for the zoning designation |
zoning | String | | Yes | Name of zoning designation |
min_dua | Float | Units per acre or hectare | Yes | Minimum dwelling units per acre or hectare |
max_far | Float | | Yes | Maximum floor-area ratio (FAR) in this zoning designation |
allow_bldg | String | | Yes | Semicolon (;) delimited list of building_type_id’s |
Other User Uploaded Data¶
Summary Geography¶
Arbitrary polygon shapefiles can be uploaded and used to tag development projects with the corresponding polygon IDs for use outside of the platform, and to generate summary statistics of the attributes of development projects that fall within the polygons. See development projects reports for more information. The summary geography layer should be in the form of a zipped (.zip) shapefile. The shapefile should have the following field: polygon_id.
Note
When a summary geography file is uploaded, each zone is automatically assigned a summary geography polygon_id by intersecting the zone polygon with the summary geography polygons and selecting the polygon with the largest area of overlap.
Column Name | Data Type | Description |
---|---|---|
polygon_id | Integer | Polygon ID |
Bulk upload of development projects¶
Bulk upload of constraints¶
See the bulk upload of constraints page.
Bulk upload of adjustments¶
See the bulk upload of adjustments page.
Footnotes
References
Ye, Xin, Karthik Konduri, Ram M. Pendyala, Bhargava Sana, and Paul Waddell. 2009. “A Methodology to Match Distributions of Both Household and Person Attributes in the Generation of Synthetic Populations.” Transportation Research Board 88th Annual Meeting.