Chapter 28: Open Data and Public Data Sources

Chapter Overview

This chapter introduces open data ecosystems and public data sources available for Community Mapping. It covers discovery, licensing, access, and ethical use of datasets from municipal, provincial, state, national, and international sources — including census, public health, transportation, environmental, housing, and economic data. The chapter emphasizes critical data literacy: open data is powerful but incomplete, often reflecting institutional priorities and political choices rather than lived reality. Effective Community Mapping requires knowing what public data can and cannot show, and where to look when official datasets fail.

Learning Outcomes

By the end of this chapter, you will be able to:

Define open data and explain its relationship to Community Mapping
Identify major municipal, provincial, national, and international open data sources
Evaluate data licensing, update frequency, granularity, and accessibility for mapping purposes
Navigate census, public health, transportation, environmental, housing, and economic data portals
Recognize the structural limits of public data and the communities most often undercounted or excluded
Apply critical data literacy to assess quality, bias, and gaps in official datasets
Inventory and evaluate open data sources for a chosen community

Key Terms

Open Data: Data made freely available by governments or institutions for reuse without restrictive licensing, often in machine-readable formats.
Data Portal: A centralized online platform where governments or organizations publish datasets, APIs, and documentation for public access.
Data Granularity: The level of detail or resolution in a dataset (e.g., national, provincial, municipal, census tract, postal code, address-level).
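Open Government Data Charter: An international framework promoting open data as a public good. The term here covers the G8 Open Data Charter (2013) and the International Open Data Charter (2015) that built on it, which national and local governments have since adopted.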
Census Geography: Standardized spatial units (e.g., dissemination areas, census tracts, counties) used to aggregate population data.
Data Licensing: Legal terms defining how data may be used, shared, modified, and attributed (e.g., Creative Commons, Open Government Licence).

28.1 What Is Open Data?

Open data is data made freely available by governments, institutions, or organizations for reuse without restrictive licensing. The defining principle is that open data can be accessed, used, modified, and shared by anyone, for any purpose — including commercial use — with minimal restrictions beyond attribution or integrity requirements.

Open data is not the same as "public information." Governments have long published reports, statistics, and maps — but often in formats designed for reading, not reuse. A PDF report is public, but it is not open data. A locked-down web map you can view but not download is not open data. Open data means the raw datasets themselves are available, typically in structured, machine-readable formats like CSV, JSON, GeoJSON, Shapefile, or API endpoints.
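To make "machine-readable" concrete, the following minimal sketch parses a small GeoJSON document using only Python's standard library. The feature names and coordinates are invented for illustration and do not come from any real portal.

```python
import json

# A tiny, made-up GeoJSON FeatureCollection (illustrative values only).
SAMPLE_GEOJSON = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"name": "Example Library", "kind": "library"},
     "geometry": {"type": "Point", "coordinates": [-79.40, 43.65]}},
    {"type": "Feature",
     "properties": {"name": "Example Community Center", "kind": "community_center"},
     "geometry": {"type": "Point", "coordinates": [-79.38, 43.67]}}
  ]
}
"""

def point_features(geojson_text):
    """Return (name, longitude, latitude) for every Point feature."""
    collection = json.loads(geojson_text)
    rows = []
    for feature in collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Point":
            continue  # lines and polygons are skipped in this simple sketch
        lon, lat = geometry["coordinates"][:2]  # GeoJSON order is [lon, lat]
        rows.append((feature["properties"]["name"], lon, lat))
    return rows

for name, lon, lat in point_features(SAMPLE_GEOJSON):
    print(f"{name}: lat={lat}, lon={lon}")
```

A PDF table with the same two locations would need manual retyping before it could be mapped; the structured version can be read, filtered, and plotted by a program directly.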

The open data movement emerged in the early 2000s, driven by demands for government transparency, accountability, and civic innovation. The G8 Open Data Charter (2013) and the International Open Data Charter that built on it in 2015 established core principles: data should be open by default, timely and comprehensive, accessible and usable, comparable and interoperable across jurisdictions, and available for improved governance and citizen engagement. By the mid-2020s, thousands of municipalities, regional governments, national governments, and international organizations had launched open data portals.

For Community Mapping, open data is a foundational resource. It provides baseline demographic, geographic, economic, environmental, and institutional data that would be prohibitively expensive or time-consuming to collect independently. Census data reveals population characteristics. Transit data shows routes and schedules. Zoning data maps land use regulations. Environmental data documents air quality, water quality, and climate risks. Service directory data lists nonprofit locations and contact information.

But open data is not neutral, comprehensive, or sufficient on its own. What gets published as open data reflects institutional priorities, political pressures, and capacity constraints. Some governments publish dozens of high-quality datasets updated monthly; others publish a handful of outdated spreadsheets. Some publish granular, address-level data; others only aggregate to large geographic units. Some focus on infrastructure and services; others prioritize economic indicators. What is left out of open data — informal economies, undocumented residents, sacred sites, grassroots organizing — is often as revealing as what is included.

Open data is a tool. Like any tool, it can be used well or poorly, ethically or exploitatively. This chapter teaches you how to find it, evaluate it, and use it — while staying alert to its limits.

28.2 Municipal Open Data

Municipal open data portals are often the most useful starting point for Community Mapping at neighborhood, ward, or city scale. Municipal datasets are typically more granular, more locally relevant, and more frequently updated than provincial or national data.

What municipal open data portals publish:
Most large and mid-sized Canadian municipalities publish datasets on infrastructure (roads, sidewalks, bike lanes, public buildings), transit (routes, stops, schedules, ridership), land use (zoning, building permits, property assessments), environment (tree inventory, parks, waste collection routes, air quality monitors), services (recreation centers, libraries, fire stations, community centers), and civic operations (budget data, 311 service requests, election results). Some publish business license registries, restaurant inspection results, crime statistics, and real-time data like parking availability or snow plow locations.

Examples of Canadian municipal portals:
Toronto's Open Data Portal (open.toronto.ca) publishes 500+ datasets including TTC routes, neighborhood profiles, 311 requests, and tree canopy data. Vancouver's Open Data Portal (opendata.vancouver.ca) includes development permits, street trees, public art, and community center locations. Halifax, Calgary, Edmonton, Ottawa, and Montreal all maintain municipal open data portals with dozens to hundreds of datasets. Smaller municipalities often publish fewer datasets but may include essential service locations, transit, and land use data.

Typical formats and access:
Municipal datasets are usually downloadable as CSV, Shapefile, GeoJSON, or KML. Many portals also provide API access for real-time or programmatic data retrieval. Most use open licenses (e.g., a municipal version of the Open Government Licence, or Creative Commons) that permit reuse with attribution.
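Service location datasets often arrive as CSV files with coordinate columns. As a rough sketch, the code below converts such a file into GeoJSON so it can be loaded into a GIS or web map. The column names (`latitude`, `longitude`) and the sample rows are assumptions; real portals use varying column names, so check the data dictionary first.

```python
import csv
import io
import json

# Made-up CSV export; column names vary by portal and are assumptions here.
SAMPLE_CSV = """name,latitude,longitude
Example Fire Station,43.70,-79.42
Example Recreation Center,43.66,-79.35
"""

def csv_to_geojson(csv_text, lat_field="latitude", lon_field="longitude"):
    """Convert CSV rows with coordinate columns into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lat = float(row.pop(lat_field))
        lon = float(row.pop(lon_field))
        features.append({
            "type": "Feature",
            "properties": dict(row),  # remaining columns become attributes
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
        })
    return {"type": "FeatureCollection", "features": features}

collection = csv_to_geojson(SAMPLE_CSV)
print(json.dumps(collection, indent=2))
```

Note the coordinate order: GeoJSON stores longitude before latitude, which is a frequent source of points landing in the wrong hemisphere.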

Common gaps in municipal open data:
Municipalities rarely publish data on informal community assets (grassroots groups, mutual aid networks, community gardens), disaggregated demographic data (due to privacy concerns or lack of collection), tenant data (landlords, evictions, rental prices), or sensitive operational data (child welfare, policing tactics). Service location data often omits privately funded or informal services. Address-level data may be suppressed for privacy or security reasons.

Best practices for using municipal open data:
Always check the "last updated" date. A dataset from 2015 may no longer reflect current conditions. Read the metadata and data dictionary to understand definitions, units, and limitations. Validate findings against on-the-ground observation or local knowledge. Cross-reference multiple datasets where possible (e.g., census demographics + municipal service locations + community input). Contact the municipal open data team if datasets are missing, outdated, or unclear — they often respond to user requests.
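The "last updated" check can be automated when a portal exposes catalogue metadata. The sketch below assumes records shaped like those returned by CKAN-based portals, where each dataset carries a `metadata_modified` timestamp. The titles, dates, and the two-year staleness threshold are all illustrative assumptions. A recent metadata timestamp also does not guarantee that the underlying data was refreshed, so treat the result as a first screen only.

```python
from datetime import datetime, timedelta

# Hypothetical catalogue records shaped like CKAN dataset metadata.
# Titles and dates are invented for illustration.
RECORDS = [
    {"title": "Street Tree Inventory", "metadata_modified": "2024-03-01T12:00:00"},
    {"title": "Community Center Locations", "metadata_modified": "2015-06-15T09:30:00"},
]

def stale_datasets(records, as_of, max_age_days=730):
    """Return titles whose last-modified timestamp is older than max_age_days."""
    cutoff = as_of - timedelta(days=max_age_days)
    stale = []
    for record in records:
        modified = datetime.fromisoformat(record["metadata_modified"])
        if modified < cutoff:
            stale.append(record["title"])
    return stale

# A fixed reference date keeps the example reproducible.
print(stale_datasets(RECORDS, as_of=datetime(2025, 1, 1)))
# ['Community Center Locations']
```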

Municipal open data is powerful for mapping access, service distribution, infrastructure gaps, and land use patterns. But it reflects what the municipality tracks and prioritizes, not necessarily what communities care most about.

28.3 Provincial, State, and National Data

Provincial, state, and national datasets provide broader geographic coverage, longer time series, and population-level indicators that municipal data cannot. They are essential for understanding regional trends, comparing communities, and analyzing demographic or economic patterns at scale.

Provincial and territorial data in Canada:
Most Canadian provinces and territories operate their own open data portals. Examples include Ontario's Data Catalogue (data.ontario.ca), BC Data Catalogue (catalogue.data.gov.bc.ca), and Alberta Open Data (open.alberta.ca). Provincial datasets typically include health system data (hospital locations, public health units, immunization rates), education data (school locations, enrollment, performance metrics), justice data (court locations, recidivism rates), natural resource data (forestry, mining, water use), and economic data (employment, business registrations, exports).

Provincial datasets are often more comprehensive than municipal data but less granular. A provincial school dataset may include enrollment and location but not playground safety or after-school programs. A health system dataset may show hospital beds but not wait times or cultural competency.

National data in Canada:
The Government of Canada's open data portal (open.canada.ca) publishes datasets from federal departments and agencies. Key sources include Statistics Canada (census, labor force surveys, health indicators), Infrastructure Canada (funding allocations), Environment and Climate Change Canada (climate data, species at risk), Transport Canada (aviation, rail, marine data), and Immigration, Refugees and Citizenship Canada (settlement services, immigration statistics).

State and national data in the United States:
The U.S. federal government operates Data.gov, one of the world's largest open data portals, with 250,000+ datasets from agencies including the Census Bureau, Bureau of Labor Statistics (BLS), Bureau of Transportation Statistics (BTS), Environmental Protection Agency (EPA), Department of Housing and Urban Development (HUD), Centers for Disease Control and Prevention (CDC), and National Oceanic and Atmospheric Administration (NOAA). Most U.S. states operate their own open data portals, with varying scope and quality.

Access and licensing:
Canadian federal datasets use the Open Government Licence – Canada, and most provinces use closely modeled versions of it (e.g., Open Government Licence – Ontario); both permit reuse with attribution. U.S. federal datasets are typically in the public domain. State and provincial licensing terms vary but are generally permissive.

Using national and provincial data for Community Mapping:
National and provincial datasets work best for contextualizing local findings, comparing communities, identifying regional disparities, and analyzing trends over time. Census data (discussed in section 28.4) is the backbone of most demographic analysis. Economic indicators help assess labor markets and income distribution. Environmental data supports climate resilience planning. Health data informs service access analysis.

But national and provincial data are often aggregated to large geographic units (e.g., health regions, electoral districts) that obscure neighborhood-level variation. A provincial dataset showing low unemployment may hide pockets of severe job loss. National climate data may miss hyper-local heat island effects. Effective Community Mapping combines provincial/national baselines with local granularity.
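A short worked example shows how aggregation hides local variation. The neighborhood names and figures below are invented; the point is only that a regional rate is a population-weighted average and can look moderate while one area is in crisis.

```python
# Hypothetical neighborhood figures: (name, labor force, unemployed).
NEIGHBORHOODS = [
    ("North", 12000, 480),
    ("Center", 15000, 600),
    ("Riverside", 3000, 540),
]

def unemployment_rate(labor_force, unemployed):
    """Unemployment rate as a percentage of the labor force."""
    return 100.0 * unemployed / labor_force

# The regional figure pools every neighborhood into one number.
total_force = sum(force for _, force, _ in NEIGHBORHOODS)
total_unemployed = sum(count for _, _, count in NEIGHBORHOODS)
regional_rate = unemployment_rate(total_force, total_unemployed)

# The local figures keep each neighborhood separate.
local_rates = {name: unemployment_rate(force, count)
               for name, force, count in NEIGHBORHOODS}

print(f"Regional: {regional_rate:.1f}%")  # Regional: 5.4%
for name, rate in local_rates.items():
    print(f"{name}: {rate:.1f}%")         # Riverside comes out at 18.0%
```

Here the regional rate of 5.4% is dominated by the two larger neighborhoods, and Riverside's 18.0% disappears into the average.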

28.4 Census Sources

Census data is the single most important public data source for demographic analysis in Community Mapping. Census datasets provide population counts, age distributions, household composition, income, education, language, immigration status, Indigenous identity, housing tenure, employment, and commuting patterns — all at fine geographic resolution.

Statistics Canada Census:
Canada conducts a national census every five years (most recently 2021, next in 2026). Statistics Canada publishes census data through multiple access points: data tables (searchable by topic and geography), census profiles (pre-formatted summaries for specific places), thematic maps (choropleth visualizations), and boundary files (GIS-ready geographies). Census data is available at multiple scales: national, provincial/territorial, census division, census subdivision (municipalities), census tract, dissemination area (DA), and dissemination block (DB).

Census tracts typically contain 2,500–8,000 people and are designed for urban areas. Dissemination areas contain 400–700 people and are the smallest standard geography for which all census data is released. Dissemination blocks are the smallest units but have limited data due to privacy suppression rules.

U.S. Census Bureau:
The United States conducts a decennial census (most recently 2020, next in 2030) and an annual American Community Survey (ACS) that provides detailed demographic estimates between census years. Census data is available at multiple scales: national, state, county, census tract, block group, and block. The Census Bureau also provides specialized datasets on business patterns, commuting flows, and economic indicators.

Census limitations:
As discussed in Chapter 20.1, census data systematically undercounts marginalized populations, including people experiencing homelessness, undocumented residents, Indigenous people in remote areas, and highly mobile populations. Census categories (e.g., for race, ethnicity, gender, family structure) reflect institutional definitions that may not match community self-identification. Census data is typically a year or more old by the time it is released and stays in use until the next census, so demographic change may have occurred since collection.

For Community Mapping, census data provides essential baseline demographics — but should be cross-validated with local knowledge, recent administrative data (e.g., school enrollment trends, housing starts), and qualitative input from residents. Census data shows patterns; it does not explain causes or capture lived experience.

Accessing census data:
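In Canada, use Statistics Canada's Census Program website (statcan.gc.ca/census) to search data tables and download boundary files; the independently maintained CensusMapper site (censusmapper.ca) offers interactive maps of the same data. In the U.S., use the Census Bureau's data.census.gov portal, or IPUMS USA (maintained by the University of Minnesota) for advanced queries on census microdata. Many GIS platforms and open-source libraries can import census tables and join them to boundary files by geographic identifier.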
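Census tables and boundary files are published separately and must be joined on a shared geographic identifier before anything can be mapped. The sketch below illustrates that join with the standard library. The identifier column `geo_id`, the area codes, and the values are hypothetical; real files use agency-specific identifiers documented in their metadata.

```python
import csv
import io

# Hypothetical census extract keyed by a geographic identifier.
CENSUS_CSV = """geo_id,population,median_household_income
DA001,520,61000
DA002,610,48000
"""

# Hypothetical boundary features (geometry omitted to keep the sketch short).
BOUNDARIES = [
    {"type": "Feature", "properties": {"geo_id": "DA001"}, "geometry": None},
    {"type": "Feature", "properties": {"geo_id": "DA002"}, "geometry": None},
    {"type": "Feature", "properties": {"geo_id": "DA003"}, "geometry": None},
]

def join_census(features, csv_text, key="geo_id"):
    """Attach census attributes to boundary features; return unmatched IDs."""
    table = {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}
    unmatched = []
    for feature in features:
        geo_id = feature["properties"][key]
        row = table.get(geo_id)
        if row is None:
            unmatched.append(geo_id)  # boundary with no census record
            continue
        feature["properties"]["population"] = int(row["population"])
        feature["properties"]["median_household_income"] = int(
            row["median_household_income"]
        )
    return unmatched

unmatched = join_census(BOUNDARIES, CENSUS_CSV)
print("No census record for:", unmatched)
```

Reporting the unmatched areas matters: a boundary with no data may reflect suppression or a vintage mismatch between the table and the boundary file, and it should appear on the map as "no data" rather than as zero.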

28.5 Public Health Sources

Public health data supports Community Mapping focused on health equity, disease burden, healthcare access, environmental health, and social determinants of health.

Canadian public health data:
The Public Health Agency of Canada (PHAC) publishes national data on infectious diseases, chronic disease prevalence, vaccination rates, and health inequities. Provincial health authorities publish data on hospital locations, emergency department wait times, ambulance response zones, public health unit boundaries, and community health indicators. Some provinces (e.g., Ontario, British Columbia) publish neighborhood-level health profiles showing disease rates, life expectancy, and risk factors by census tract or health region.

U.S. public health data:
The Centers for Disease Control and Prevention (CDC) publishes datasets on disease surveillance, health behaviors (via the Behavioral Risk Factor Surveillance System, BRFSS), environmental health hazards, and social determinants of health. County-level health rankings (from the University of Wisconsin Population Health Institute) provide comparative health indicators. State health departments publish hospital and clinic locations, health disparities reports, and environmental health data.

What public health data includes:
Typical datasets include: hospital and clinic locations (including emergency departments, community health centers, and specialty services), disease incidence and prevalence (e.g., diabetes, heart disease, respiratory illness), mortality data (by cause and geography), vaccination coverage, maternal and child health indicators (e.g., birth outcomes, breastfeeding rates), substance use data (overdose rates, treatment access), and environmental health monitoring (air quality, water quality, lead exposure, vector-borne disease).

Limits of public health data:
Public health datasets often aggregate to large geographic units (health regions, counties) to protect privacy, obscuring neighborhood-level disparities. Data on marginalized populations (e.g., people experiencing homelessness, sex workers, undocumented residents) is sparse or absent. Mental health data is particularly limited due to stigma and fragmented service systems. Indigenous health data is often incomplete, colonial in framing, or withheld due to past misuse.
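One common privacy practice behind these gaps is small-cell suppression, in which counts below a threshold are withheld. The sketch below computes a crude rate per 1,000 residents and suppresses small counts. The threshold of 5 and the area figures are assumptions for illustration; each agency sets its own disclosure rules.

```python
# Hypothetical case counts and populations by area.
AREAS = [
    {"area": "A", "cases": 42, "population": 5200},
    {"area": "B", "cases": 3, "population": 650},
    {"area": "C", "cases": 18, "population": 2400},
]

def crude_rate_per_1000(cases, population):
    """Cases per 1,000 residents, with no age adjustment."""
    return 1000.0 * cases / population

def publishable_rates(areas, min_count=5):
    """Return {area: rate or None}; None marks a suppressed value."""
    results = {}
    for record in areas:
        if record["cases"] < min_count:
            results[record["area"]] = None  # too few cases to release
        else:
            rate = crude_rate_per_1000(record["cases"], record["population"])
            results[record["area"]] = round(rate, 1)
    return results

print(publishable_rates(AREAS))
# {'A': 8.1, 'B': None, 'C': 7.5}
```

Area B is the smallest community, and it is the one that vanishes from the published table. On a map, a suppressed value should be shown as "not released", never as zero.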

Using public health data for Community Mapping:
Public health data is most useful when combined with social determinants data (income, education, housing, food access) to understand upstream causes of poor health outcomes. A map showing high diabetes rates becomes actionable when layered with food desert data, walkability scores, and recreation access. Public health data can also identify vulnerable populations for targeted outreach (e.g., seniors living alone in heat-vulnerable areas) or service gaps (e.g., neighborhoods far from mental health services).

As Chapter 25 emphasizes, always validate public health data findings with local providers and community members. Official data may lag real-time conditions or miss informal health supports.

28.6 Transportation Sources

Transportation data supports Community Mapping focused on mobility, accessibility, equity, and climate.

Transit data:
Most public transit agencies publish General Transit Feed Specification (GTFS) data, a standardized format for routes, stops, and schedules; a companion specification, GTFS Realtime, covers live service status such as vehicle positions, trip updates, and service alerts. GTFS data is available for most Canadian and U.S. transit systems and can be imported into GIS or routing tools. Some agencies also publish ridership statistics, service reliability metrics, and accessibility features (e.g., wheelchair-accessible stops).
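A GTFS feed is a zip archive of plain CSV text files, so it can be explored without specialized software. The sketch below counts scheduled departures per stop from two of the standard files, `stops.txt` and `stop_times.txt`. The stop names, times, and trips are invented, and the count ignores service calendars (`calendar.txt`), so it is a rough indicator of service level, not a per-day frequency.

```python
import csv
import io
from collections import Counter

# Invented excerpts using standard GTFS column names.
STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
S1,Main St at 1st Ave,43.6500,-79.3800
S2,Main St at 5th Ave,43.6540,-79.3800
"""

STOP_TIMES_TXT = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
T1,08:00:00,08:00:00,S1,1
T1,08:05:00,08:05:00,S2,2
T2,08:30:00,08:30:00,S1,1
T2,08:35:00,08:35:00,S2,2
T3,09:00:00,09:00:00,S1,1
"""

def departures_per_stop(stops_text, stop_times_text):
    """Count stop_times rows for each stop, keyed by stop name."""
    names = {row["stop_id"]: row["stop_name"]
             for row in csv.DictReader(io.StringIO(stops_text))}
    counts = Counter(row["stop_id"]
                     for row in csv.DictReader(io.StringIO(stop_times_text)))
    # Stops with no scheduled service still appear, with a count of zero.
    return {names[stop_id]: counts.get(stop_id, 0) for stop_id in names}

print(departures_per_stop(STOPS_TXT, STOP_TIMES_TXT))
# {'Main St at 1st Ave': 3, 'Main St at 5th Ave': 2}
```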

Road and active transportation data:
Municipalities and regional governments publish road network data (including street centerlines, speed limits, traffic volumes, and collision locations), sidewalk and crosswalk inventories, bike lane networks, and pedestrian infrastructure. Some cities publish walkability or bikeability scores, traffic calming measures, and Vision Zero (road safety) data.

Transportation planning data:
Regional transportation authorities publish long-range transportation plans, traffic modeling data, commuter flow patterns, park-and-ride locations, and transportation project priorities. Census data includes commuting mode share (e.g., car, transit, bike, walk) and travel times by origin-destination.

What transportation data reveals:
Transportation datasets can map transit deserts (areas with low service frequency or long walking distances to stops), accessibility barriers (lack of sidewalks, steep grades, missing curb cuts), collision hotspots, and mode-shift potential (areas where active transportation is feasible but infrastructure is absent). They can also assess equity: whether low-income neighborhoods have comparable transit access to wealthier areas.
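A first-pass screen for transit deserts measures how far each location is from its nearest stop. The sketch below uses the haversine formula for straight-line distance. The coordinates and the 400-meter threshold are assumptions for illustration, and straight-line distance understates real walking distance where the street network is indirect or sidewalks are missing.

```python
import math

EARTH_RADIUS_M = 6371000  # mean Earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def nearest_stop_distance(location, stops):
    """Distance in meters from a (lat, lon) location to the closest stop."""
    return min(haversine_m(location[0], location[1], stop[0], stop[1])
               for stop in stops)

# Invented coordinates: two stops and one residential location.
STOPS = [(43.6500, -79.3800), (43.6540, -79.3800)]
HOME = (43.6500, -79.3900)

distance = nearest_stop_distance(HOME, STOPS)
is_far = distance > 400  # the 400 m cutoff is an assumption, not a standard
print(f"Nearest stop: {distance:.0f} m, beyond threshold: {is_far}")
```

Running the same calculation for every residential block, then mapping the blocks beyond the threshold, gives a candidate list of areas to check against resident experience.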

Limits of transportation data:
Official data focuses on formal infrastructure and services. It rarely captures informal transportation (e.g., ride-sharing within immigrant communities, volunteer driver networks, shuttle services provided by religious institutions). It may not include subjective barriers like safety concerns, harassment on transit, or cultural discomfort. Accessibility data often focuses on wheelchair access but not other disabilities (e.g., visual impairment, cognitive disabilities).

Using transportation data for Community Mapping:
Transportation data is most powerful when combined with demographic and service location data. A map showing seniors with mobility limitations living far from grocery stores and without transit access identifies a concrete intervention point. A map showing high pedestrian injury rates in neighborhoods with incomplete sidewalks supports infrastructure advocacy. Transportation equity analysis can compare transit service levels and travel times across income or racial demographics to document disparities.

28.7 Environmental Sources

Environmental data supports Community Mapping focused on climate resilience, environmental justice, public health, and land stewardship.

Climate and weather data:
Environment and Climate Change Canada publishes historical and forecast weather data, climate normals (30-year averages), and climate change projections. The U.S. National Oceanic and Atmospheric Administration (NOAA) provides similar datasets. Many cities publish urban heat island maps, tree canopy data, and climate vulnerability assessments.

Air and water quality data:
Environmental agencies publish air quality monitoring data (PM2.5, ozone, nitrogen dioxide), water quality data (for drinking water systems, rivers, lakes, and coastal areas), and pollution source inventories (industrial emissions, contaminated sites). Some jurisdictions publish environmental justice screening tools that combine pollution data with demographic data to identify overburdened communities.

Natural hazards and risk data:
Governments publish flood risk maps, wildfire hazard zones, earthquake risk data, coastal erosion projections, and landslide susceptibility maps. Many jurisdictions now publish climate adaptation plans with mapped vulnerabilities (e.g., heat-vulnerable neighborhoods, flood-prone infrastructure).

Land use and biodiversity data:
Environmental datasets include protected area boundaries, wetland inventories, species at risk habitat, forest cover, agricultural land classifications, and invasive species monitoring.

What environmental data supports:
Environmental datasets help Community Mapping projects assess exposure to environmental hazards (e.g., air pollution, flood risk, extreme heat), identify green infrastructure gaps (e.g., lack of tree canopy in low-income neighborhoods), support climate adaptation planning, and document environmental injustices (disproportionate exposure of marginalized communities to pollution or hazards).

Limits of environmental data:
Monitoring networks are unevenly distributed, often favoring wealthier urban areas over low-income neighborhoods, rural areas, or Indigenous territories. Cumulative exposure (multiple pollutants or stressors acting together) is rarely captured. Citizen science and community-based monitoring often fill gaps but may not be integrated into official datasets.

Using environmental data for Community Mapping:
Environmental data is most actionable when combined with demographic and health data. A map showing heat vulnerability should layer surface temperature data with senior populations, low-income households, housing quality (e.g., lack of air conditioning), and tree canopy coverage. Environmental justice analysis requires comparing pollution exposure or hazard risk across racial and economic demographics.
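The layering described above is often implemented as a composite index. The sketch below rescales each indicator to a 0-to-1 range and averages them, inverting tree canopy because more canopy means less vulnerability. The neighborhoods, values, choice of indicators, and equal weighting are all assumptions; a real index should be designed and validated with the community it describes.

```python
# Hypothetical indicators per neighborhood.
INDICATORS = {
    "Eastside": {"surface_temp_c": 38.0, "pct_seniors": 22.0,
                 "pct_low_income": 30.0, "pct_tree_canopy": 8.0},
    "Hillcrest": {"surface_temp_c": 33.0, "pct_seniors": 14.0,
                  "pct_low_income": 10.0, "pct_tree_canopy": 35.0},
    "Midtown": {"surface_temp_c": 36.0, "pct_seniors": 12.0,
                "pct_low_income": 20.0, "pct_tree_canopy": 17.0},
}

# Indicators where a HIGHER value means LOWER vulnerability.
PROTECTIVE = {"pct_tree_canopy"}

def min_max(values):
    """Rescale values to 0..1; a constant column maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(value - low) / (high - low) for value in values]

def vulnerability_index(indicators):
    """Equal-weight average of rescaled indicators, from 0 (low) to 1 (high)."""
    areas = list(indicators)
    fields = list(next(iter(indicators.values())))
    totals = {area: 0.0 for area in areas}
    for field in fields:
        scaled = min_max([indicators[area][field] for area in areas])
        for area, value in zip(areas, scaled):
            totals[area] += (1.0 - value) if field in PROTECTIVE else value
    return {area: round(total / len(fields), 2) for area, total in totals.items()}

index = vulnerability_index(INDICATORS)
for area, score in sorted(index.items(), key=lambda item: -item[1]):
    print(area, score)
```

Because min-max scaling is relative to the areas included, scores are only comparable within one run; adding or removing a neighborhood changes every value.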

28.8 Housing Sources

Housing data is critical for Community Mapping focused on affordability, displacement risk, homelessness, housing quality, and equitable development.

Census housing data:
Census datasets include housing tenure (owned vs. rented), housing type (single-family, apartment, etc.), housing costs (mortgage or rent as a percentage of income), and overcrowding. In Canada, they also report core housing need: households whose housing falls below at least one of the adequacy, suitability, or affordability standards and who would have to spend 30% or more of their before-tax income to access acceptable local housing. The Canadian census also includes structural condition indicators and Indigenous housing data.

Municipal housing data:
Many municipalities publish building permit data, development applications, property assessments, zoning maps, and affordable housing inventories. Some cities publish data on supportive housing locations, emergency shelters, and encampment counts.

National housing data:
In Canada, the Canada Mortgage and Housing Corporation (CMHC) publishes rental market data (vacancy rates, average rents, rental housing starts) and housing market indicators. In the U.S., the Department of Housing and Urban Development (HUD) publishes data on subsidized housing locations, Fair Market Rents, homelessness counts, and housing discrimination complaints.

What housing data reveals:
Housing datasets can map affordability crises (neighborhoods where median rent exceeds 30% of median income), displacement risk (rapid rent increases, property speculation, evictions), housing quality issues (structural deficiencies, overcrowding), and gaps in affordable or supportive housing supply.
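The 30% affordability screen is simple arithmetic: annualize the median monthly rent and divide by median annual household income. The sketch below applies it to two invented neighborhoods. It assumes both medians describe the same population, which is often not true in practice, since published income medians usually cover all households while rents apply only to renters.

```python
# Hypothetical figures: median monthly rent and median annual household income.
NEIGHBORHOODS = {
    "Eastside": {"median_monthly_rent": 1650, "median_annual_income": 52000},
    "Hillcrest": {"median_monthly_rent": 2100, "median_annual_income": 98000},
}

def rent_to_income_ratio(median_monthly_rent, median_annual_income):
    """Share of annual income that twelve months of median rent would consume."""
    return (12 * median_monthly_rent) / median_annual_income

def unaffordable(neighborhoods, threshold=0.30):
    """Return {name: ratio} for neighborhoods above the threshold."""
    flagged = {}
    for name, values in neighborhoods.items():
        ratio = rent_to_income_ratio(values["median_monthly_rent"],
                                     values["median_annual_income"])
        if ratio > threshold:
            flagged[name] = round(ratio, 3)
    return flagged

print(unaffordable(NEIGHBORHOODS))
# {'Eastside': 0.381}
```

Hillcrest has the higher rent but passes the screen because incomes there are higher, which is why rent levels alone are a poor proxy for affordability.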

Limits of housing data:
Official data systematically undercounts informal housing (e.g., basement suites, garage conversions, couch-surfing, vehicles), hidden homelessness (people staying temporarily with friends or family), and housing in unregulated or illegal conditions. Eviction data is often incomplete or inaccessible. Landlord ownership data (to identify corporate concentration or speculation) is rarely public. Indigenous housing data is often collected using colonial definitions that do not reflect community standards or needs.

Using housing data for Community Mapping:
Housing data is most powerful when combined with displacement indicators (gentrification, property flipping, rent escalation), demographic vulnerability (low-income households, seniors, families with children), and service locations (shelters, legal aid, tenant advocacy). Anti-displacement mapping can identify neighborhoods at risk before displacement accelerates, supporting proactive policy intervention.

28.9 Economic Sources

Economic data supports Community Mapping focused on employment, income, local business ecosystems, economic development, and economic justice.

Census economic data:
Census datasets include labor force participation, unemployment, occupation, industry, income (median household income, income distribution, low-income rates), and commuting patterns.

National economic data:
In Canada, Statistics Canada publishes labor force survey data (monthly employment statistics), business patterns data (business counts by sector and size), and trade data. In the U.S., the Bureau of Labor Statistics (BLS) publishes employment and wage data by occupation, industry, and geography. The U.S. Census Bureau's County Business Patterns provides business counts and employment by industry.

Municipal business data:
Some municipalities publish business license registries (showing business name, location, and type), commercial vacancy data, and local procurement spending.

What economic data reveals:
Economic datasets can map employment concentrations, job deserts (areas with few local jobs), wage disparities, business ecosystem diversity (or lack thereof), and economic vulnerability (e.g., neighborhoods with high unemployment and low business density).

Limits of economic data:
Official data systematically excludes informal economies: the cash-based work, gig labor, barter networks, and under-the-table employment that sustain many marginalized communities. As Hernando de Soto documented for Peru in The Other Path (1989), informal economies can be larger and more dynamic than the formal sector in low-income areas, yet they remain invisible in official statistics. Self-employment data is incomplete. Worker misclassification (e.g., gig workers labeled as independent contractors) distorts employment counts. Small business data often lags reality due to reporting delays.

Using economic data for Community Mapping:
Economic data is most useful when combined with community knowledge of informal economies. A neighborhood that appears economically depressed in census data may have vibrant informal markets, home-based businesses, and mutual aid networks. Local economic mapping should include interviews, business surveys, and participatory asset mapping to capture what official data misses. Economic development mapping can identify underserved markets, business clustering opportunities, or workforce development needs.

28.10 Limits of Public Data

Public data is powerful, but it is not comprehensive, neutral, or sufficient on its own. Understanding what public data cannot show is as important as knowing what it can.

Marginalized communities are systematically undercounted.
People experiencing homelessness, undocumented residents, highly mobile populations, and those living in informal or overcrowded housing are consistently missed or undercounted in census and administrative data. Indigenous communities, especially in remote or northern areas, may be undercounted or aggregated in ways that obscure community-specific needs. Racialized communities, low-income households, and people with precarious immigration status often distrust government data collection and may avoid participation.

Informal economies are invisible.
Cash-based work, gig labor, under-the-table employment, barter networks, community care (unpaid caregiving, childcare, elder care), and grassroots mutual aid do not appear in official economic statistics. Yet these informal systems are often the primary economic infrastructure for marginalized communities. Mapping that relies solely on public data will systematically misrepresent economic activity in low-income and immigrant neighborhoods.

Private-sector data is hoarded.
Corporations, tech platforms, and private service providers collect vast amounts of data on purchasing, mobility, social connections, health, and behavior — data that would be invaluable for Community Mapping. But this data is proprietary, inaccessible, and often exploitative in its collection. Public data cannot fill these gaps, yet private actors use data asymmetry to extract profit and power from communities.

Data portals are frequently abandoned.
Not all open data portals are maintained. Datasets may be published once and never updated. Links rot. Formats change without notice. Smaller municipalities and under-resourced agencies often lack capacity to maintain data infrastructure. A dataset from 2017 labeled "current" may no longer reflect reality. Always check last-updated dates and validate findings against recent observations.

Equity-disaggregated data is often suppressed.
When data is disaggregated by race, income, disability, or other equity dimensions, it has power to reveal disparities — and to create political pressure for change. For this reason, equity data is often suppressed under the justification of "privacy protection," even when aggregation methods could safely release it. In practice, suppression often protects institutions from accountability, not individuals from harm.

Sacred and sensitive knowledge must not be public.
Not all community knowledge should appear in open data. Indigenous sacred sites, culturally significant locations, medicinal plant harvesting areas, and places of spiritual importance must not be mapped publicly without explicit, informed, ongoing consent from the communities who hold authority over that knowledge. Open data frameworks often assume transparency is always good; this assumption is colonial and dangerous. As discussed in Chapter 9, some places must remain protected, and data sovereignty must rest with the communities whose knowledge it is.

Data definitions reflect institutional priorities, not community realities.
What gets counted, how it is categorized, and at what scale it is reported all reflect the priorities and assumptions of the institutions collecting the data. A "household" may not match how families define themselves. A "park" in city data may be a fenced-off lawn with no amenities, not a usable community space. A "service" may be listed as available but culturally unwelcoming or operationally inaccessible. Public data is always mediated by the perspective of those with power to define categories.

Effective Community Mapping recognizes these limits and compensates for them.
This means combining public data with participatory data collection, qualitative research, community validation, and local knowledge. It means being transparent about what your map shows and what it does not. It means questioning the categories, definitions, and geographic units that official data imposes. It means centering the voices and authority of those who live the reality that data attempts to represent — and often misses.

28.11 Synthesis and Implications

This chapter has introduced the landscape of open and public data available for Community Mapping, from municipal and census sources through specialized datasets on health, transportation, environment, housing, and economics. These datasets are essential tools — but they are tools, not truth.

The core implications to carry forward:

Open data provides baseline infrastructure. Census demographics, service locations, transit networks, environmental hazards, and land use data form the foundation for most Community Mapping. Know where to find it, how to access it, and how to integrate it into GIS workflows.
Not all data portals are equal. Some jurisdictions publish high-quality, frequently updated, granular datasets with good documentation. Others publish minimal, outdated, poorly documented data. Assess each dataset critically: when was it last updated? What is its geographic resolution? Who collected it and why? What is missing?
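Licensing matters. Most Canadian and U.S. government open data is released under permissive terms, such as Canada's Open Government Licence or public-domain status for many U.S. federal datasets, but always check before using data for advocacy, commercial purposes, or redistribution. Attribution requirements vary.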
Public data reflects institutional priorities. What governments choose to publish reveals what they value and monitor. What they omit reveals what they ignore or suppress. Reading the gaps in open data is as important as reading the data itself.
Marginalized communities are systematically undercounted. People experiencing homelessness, undocumented residents, workers in informal economies, people living in hidden housing, and communities that distrust government data collection are largely invisible in most public datasets. Community Mapping that relies solely on official data will reproduce this invisibility.
Validation is essential. Ground-truth public data findings against on-the-ground observation, local knowledge, service provider input, and participatory research. A dataset may be accurate in what it counts but miss what matters most.
Equity analysis requires combining multiple datasets. No single dataset reveals the full pattern of disparities. Mapping equity requires layering demographic data with service access, environmental exposure, housing quality, economic opportunity, and health outcomes, and then interpreting those patterns through a justice lens.
Data sovereignty and consent are non-negotiable. Not all community knowledge should be public. Indigenous data governance frameworks, such as the First Nations principles of OCAP (ownership, control, access, and possession), affirm that communities have authority over their own data. Always ask: whose knowledge is this, who has the right to share it, and what harm might result from public disclosure?
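
The point about layering datasets can be illustrated with a small sketch that joins two tract-level tables on a shared tract identifier. All tract IDs, figures, and the income cutoff are hypothetical. One design choice is an assumption worth stating: a tract that is absent from the clinic layer is recorded as unknown (None) rather than as zero, so that gaps in the data stay visible.

```python
def layer_by_tract(demographics, clinic_counts):
    """Join demographic and clinic data on tract ID, keeping missing values explicit."""
    layered = []
    for tract, demo in demographics.items():
        clinics = clinic_counts.get(tract)  # None if the tract is missing from this layer
        if clinics is None:
            rate = None
        else:
            rate = round(clinics / demo["population"] * 10_000, 1)
        layered.append({"tract": tract, **demo, "clinics": clinics, "clinics_per_10k": rate})
    return layered

# Hypothetical census-style layer.
demographics = {
    "T1": {"population": 4000, "median_income": 38000},
    "T2": {"population": 5000, "median_income": 72000},
    "T3": {"population": 2500, "median_income": 35000},
}
# Hypothetical service layer; T3 does not appear in it at all.
clinic_counts = {"T1": 0, "T2": 2}

layered = layer_by_tract(demographics, clinic_counts)

# Lower-income tracts with no recorded clinic (cutoff is illustrative only).
low_access = [r["tract"] for r in layered if r["median_income"] < 40000 and r["clinics"] == 0]
# Tracts where the service layer is silent and follow-up is needed.
unknown = [r["tract"] for r in layered if r["clinics"] is None]
print(low_access, unknown)
```

The second list matters as much as the first. A tract with no record in the service layer may have no clinics, or it may simply not have been surveyed, and only validation with local knowledge can tell the two apart.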

Public data is a starting point, not an endpoint. The most powerful Community Mapping combines the scale and structure of public datasets with the depth and nuance of participatory, qualitative, and community-controlled knowledge.

28.12 Open Data Inventory

Purpose:
This exercise builds practical data literacy by having you inventory, evaluate, and document open data sources relevant to a chosen community. You will learn to navigate data portals, assess data quality, identify gaps, and compile a reusable resource for future Community Mapping work.

Materials Needed:

Computer with internet access
Spreadsheet software (Excel, Google Sheets, LibreOffice Calc)
Template: Open Data Inventory Spreadsheet (columns: Dataset Name | Source | URL | Geographic Coverage | Granularity | Format | License | Last Updated | Update Frequency | Contact | Notes | Usability Rating)
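
If you prefer to generate the template rather than build it by hand, the sketch below writes the columns listed above to a CSV file that any spreadsheet program can open. The function name and the example row are hypothetical. The demonstration writes to an in-memory buffer; the comment shows how the same function could write to a file on disk.

```python
import csv
import io

COLUMNS = [
    "Dataset Name", "Source", "URL", "Geographic Coverage", "Granularity",
    "Format", "License", "Last Updated", "Update Frequency", "Contact",
    "Notes", "Usability Rating",
]

def write_inventory(fileobj, rows=()):
    """Write the inventory header and any rows; missing fields are left blank."""
    writer = csv.DictWriter(fileobj, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)

# Hypothetical example row with only some fields filled in.
example_rows = [
    {"Dataset Name": "Example transit stops", "Format": "CSV", "Usability Rating": 4},
]

buffer = io.StringIO()
write_inventory(buffer, example_rows)
header_line = buffer.getvalue().splitlines()[0]
print(header_line)

# To save to disk instead:
# with open("open_data_inventory.csv", "w", newline="", encoding="utf-8") as f:
#     write_inventory(f, example_rows)
```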

Steps:

Choose a community. Select a municipality, neighborhood, region, or Indigenous territory you want to map. Be specific (e.g., "City of Halifax" or "Jane-Finch neighborhood, Toronto").
Identify relevant data portals. Find and bookmark the open data portals for:
- The municipality (if applicable)
- The province/state
- The national government (open.canada.ca or Data.gov)
- (Optional) Regional authorities (e.g., health region, transit agency, conservation authority)
Inventory at least 10 datasets relevant to Community Mapping in your chosen area. Prioritize diversity: include census, health, transportation, environment, housing, and/or economic datasets. For each dataset, record:
- Dataset Name: Official title
- Source: Organization that published it
- URL: Direct link to dataset or portal page
- Geographic Coverage: What area does it cover? (e.g., entire province, specific municipality, census tracts)
- Granularity: What is the smallest geographic unit? (e.g., address-level, postal code, dissemination area, municipality)
- Format: File format (CSV, Shapefile, GeoJSON, API, PDF, etc.)
- License: What license governs use? (e.g., Open Government Licence, Creative Commons, public domain)
- Last Updated: When was the dataset last updated?
- Update Frequency: How often is it updated? (e.g., monthly, annually, one-time)
- Contact: Email or contact point for questions
- Notes: Key limitations, surprises, or observations (e.g., "missing Indigenous community data," "excellent metadata," "broken download link")
- Usability Rating: Your assessment (1-5 scale: 1 = unusable, 5 = excellent)
Identify gaps. Review your inventory and write a 1-paragraph reflection on:
- What kinds of data are abundant?
- What kinds of data are missing or inadequate?
- Which communities or topics are well-represented?
- Which communities or topics are invisible?
Test one dataset. Download one dataset in a format you can open (CSV or Shapefile). Open it, review its structure, and write 2-3 sentences assessing its quality: Is it well-documented? Are column names clear? Are there obvious errors or missing values? Is it usable for mapping?
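
For the final step, a quick profile of a downloaded CSV can support your written assessment. The following is a minimal sketch that counts rows and empty cells per column. The sample data is hypothetical, and a real file would be opened from disk rather than from a string.

```python
import csv
import io

def profile_csv(fileobj):
    """Count data rows and empty cells per column in a CSV file object."""
    reader = csv.DictReader(fileobj)
    missing = {name: 0 for name in (reader.fieldnames or [])}
    rows = 0
    for row in reader:
        rows += 1
        for name in missing:
            value = row.get(name)
            # Short rows yield None; blank cells yield an empty string.
            if value is None or value.strip() == "":
                missing[name] += 1
    return {"rows": rows, "columns": list(missing), "missing": missing}

# Hypothetical sample standing in for a downloaded file.
sample = (
    "name,address,lat,lon\n"
    "Central Library,12 Main St,44.65,-63.57\n"
    "North Clinic,,44.70,\n"
)

report = profile_csv(io.StringIO(sample))
print(report["rows"], report["missing"])

# For a real download:
# with open("dataset.csv", newline="", encoding="utf-8") as f:
#     report = profile_csv(f)
```

Counts like these answer only part of the question. A column with no empty cells can still contain placeholder values, outdated entries, or categories that do not match how the community describes itself, so read a sample of rows as well.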

Deliverable:
A completed Open Data Inventory spreadsheet with at least 10 datasets, plus a 1-page written reflection on gaps, surprises, and lessons learned.

Time Estimate: 2-3 hours

Safety and Ethics Notes:
Do not publish datasets that contain personal information, even if they are publicly available. If you find data that should not be public (e.g., addresses of vulnerable individuals, culturally sensitive locations), do not share it and consider reporting the issue to the data publisher. Document only what is openly and ethically available.

Key Takeaways

Open data provides essential baseline datasets for Community Mapping, including census demographics, service locations, infrastructure, environmental data, and economic indicators.
Municipal, provincial, national, and international data portals vary widely in quality, granularity, update frequency, and accessibility — always assess critically.
Census data is foundational but systematically undercounts marginalized populations and reflects institutional definitions that may not match community realities.
Public data is not neutral or comprehensive — it reflects institutional priorities, omits informal economies and hidden populations, and often suppresses equity-disaggregated data.
Effective Community Mapping combines public data with participatory research, local knowledge, qualitative inquiry, and community validation to fill gaps and center lived experience.
Data sovereignty and consent are non-negotiable — not all knowledge should be public, and communities must retain authority over their own data.

Plain-Language Summary

Open data means information that governments, institutions, and organizations make freely available for anyone to use. For Community Mapping, open data includes things like census information (how many people live where, and their ages, incomes, and languages), service locations (where the libraries, clinics, and transit stops are), environmental data (air quality, flood risks, green space), and economic data (businesses, jobs, income levels).

Open data is useful because it gives you a starting point — baseline facts about a community that would be hard to collect yourself. But open data is not complete or neutral. Governments publish data about what they track and care about, which often means they leave out informal community assets, undocumented residents, informal economies, and the knowledge held by people who don't trust government data collection.

Good Community Mapping uses open data as one tool among many. You combine official datasets with local knowledge, participatory research, and validation from people who live in the community. You question what's missing from the data. You check whether the data is up to date. And you make sure that sensitive or sacred knowledge stays protected and under community control, not published openly where it could cause harm.

End of Chapter 28.
