What is secondary emissions data?

When building a carbon footprint, companies rarely have direct data for every emission source. Secondary emissions data fills those gaps using published averages and emissions factors rather than figures collected from a specific supplier or meter. Understanding what it is, where it comes from, and when to replace it helps you judge how much to trust a carbon footprint, including your own.

Quick Answer: Secondary emissions data refers to emissions figures derived from published average datasets or emissions factors, rather than direct measurement or supplier-specific information. It is used in carbon accounting when primary data (actual activity or spend data from a specific source) is unavailable. Secondary data is a legitimate and widely used input under the GHG Protocol, but it carries more uncertainty than primary data and should be replaced over time as data quality improves.

What Is Secondary Emissions Data?

Secondary emissions data is emissions information taken from third-party databases, published research, or industry-average emissions factors, rather than collected directly from the source of the emissions. In carbon accounting, it sits in contrast to primary emissions data, which comes from actual measurements, invoices, meter readings, or supplier-specific product carbon footprints.

When a company calculates its carbon footprint, it rarely has direct data for every emission source. Secondary data fills those gaps. A company that cannot get electricity consumption figures for a rented office, for example, might use a floor-area-based emissions factor derived from published building energy benchmarks. The result is an estimate, but a structured, methodology-aligned one.

Secondary data is not a shortcut or a sign of poor practice. The GHG Protocol explicitly accommodates its use, particularly in early-stage footprints or where primary data collection is disproportionately difficult.

Where Does Secondary Emissions Data Come From?

Secondary data is drawn from a range of published sources. The most commonly used include:

National government databases, such as the UK Government's Greenhouse Gas Conversion Factors (published annually by DESNZ, and previously by BEIS), which provide emissions factors for electricity, fuel, transport, and waste.
The IPCC Emission Factor Database (EFDB), used as a reference across many international frameworks.
Ecoinvent, a widely used life cycle inventory database covering thousands of materials, processes, and supply chain activities.
Industry-specific databases, including sector average figures published by trade bodies or research institutions.
Spend-based emissions factors, which convert financial spend in a given category into an estimated emissions figure using economic input-output models.

The reliability of secondary data varies by source. Factors from well-maintained government databases (updated annually to reflect grid changes, for example) are generally more reliable than older or less frequently updated sources. The age of the data matters: an emissions factor from 2015 for electricity generation may not reflect the current grid mix and will produce less accurate results.

How Is Secondary Data Used in a Carbon Footprint?

Secondary data is used across all three scopes of a GHG inventory, but it is most prevalent in Scope 3, where direct data from suppliers and value chain partners is often unavailable.

In practice, secondary data is applied by multiplying an activity metric (a quantity of something: kilometres travelled, tonnes of material purchased, pounds spent in a category) by a published emissions factor. The result is a CO2e figure for that activity.

For example:

Tonnes of waste sent to landfill × a landfill emissions factor = Scope 3 waste emissions
Total spend on business flights × a spend-based air travel factor = an estimated Scope 3 travel figure
Kilowatt-hours of electricity consumed × a grid emissions factor = Scope 2 emissions
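
The multiplication described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `EmissionsFactor` class and `estimate_emissions` function are hypothetical names, and the factor values are placeholders chosen for easy arithmetic, not figures from any published dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmissionsFactor:
    """A published factor: kg CO2e emitted per one unit of activity."""
    name: str
    kg_co2e_per_unit: float
    unit: str  # unit of the activity metric, e.g. "kWh" or "tonne"


def estimate_emissions(activity_amount: float, factor: EmissionsFactor) -> float:
    """Multiply an activity metric by an emissions factor; returns kg CO2e."""
    if activity_amount < 0:
        raise ValueError("activity amount cannot be negative")
    return activity_amount * factor.kg_co2e_per_unit


# Placeholder factors for illustration only (NOT real published values).
grid_factor = EmissionsFactor("grid electricity", 0.25, "kWh")
landfill_factor = EmissionsFactor("waste to landfill", 500.0, "tonne")

electricity_kg = estimate_emissions(12_000, grid_factor)  # 12,000 kWh consumed
waste_kg = estimate_emissions(4, landfill_factor)         # 4 tonnes landfilled

print(f"Electricity: {electricity_kg:.0f} kg CO2e")
print(f"Waste: {waste_kg:.0f} kg CO2e")
```

In a real footprint, the factor values would come from one of the published sources listed earlier, and the unit of the activity metric would need to match the unit the factor was published in.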

The accuracy of the output depends on how closely the activity metric matches the assumptions built into the emissions factor. A spend-based factor for "road freight" will produce a broader estimate than an activity-based factor that uses actual tonne-kilometres. This is why secondary data is often described as a starting point, not a final answer.

Why Does the Distinction Between Primary and Secondary Data Matter?

The distinction matters because it directly affects the accuracy and credibility of a carbon footprint. A footprint built entirely on secondary data is less precise than one that incorporates primary data for the highest-emitting activities. For companies seeking external assurance, setting Science Based Targets, or responding to procurement requirements such as PPN 006, the quality of underlying data becomes a practical concern, not just a technical one.

Frameworks including the GHG Protocol's Corporate Value Chain (Scope 3) Standard encourage companies to prioritise primary data for significant emission sources and use secondary data where primary data is genuinely difficult to obtain. The principle is proportionality: invest data collection effort where it has the most impact on accuracy.

This is also why a good carbon accounting process does not treat a first-year footprint as the finished article. Secondary data establishes a credible baseline. Over time, replacing secondary data with supplier-specific figures, actual energy bills, or product-level carbon data improves the footprint's accuracy and makes year-on-year progress more meaningful.

What Are the Limitations of Secondary Emissions Data?

Secondary data introduces uncertainty at several points:

Representativeness. Industry-average factors may not reflect a specific supplier's actual emissions. A company buying steel from a producer using electric arc furnaces will typically have a lower real-world footprint than a factor based on the global average steel production mix would suggest.

Timeliness. Emissions factors are updated periodically, not continuously. A factor that does not reflect recent changes in energy infrastructure or production methods will produce results that diverge from reality over time.

Geographic specificity. Many secondary datasets are national or regional averages. Using a UK grid factor for operations in a country with a different energy mix will reduce accuracy. Where region-specific factors are available, they should be used in preference to global averages.
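
One way to apply that preference is a lookup that returns a region-specific factor when one exists and falls back to a broader average otherwise, while recording which was used. The sketch below assumes hypothetical region labels and placeholder factor values; `select_grid_factor` is an illustrative name, not part of any library.

```python
# Placeholder grid factors in kg CO2e per kWh (NOT real published values).
REGIONAL_FACTORS = {
    "region_a": 0.2,
    "region_b": 0.6,
}
GLOBAL_AVERAGE_FACTOR = 0.45  # placeholder fallback value


def select_grid_factor(region: str) -> tuple[float, str]:
    """Return (factor, basis), preferring a region-specific factor."""
    if region in REGIONAL_FACTORS:
        return REGIONAL_FACTORS[region], "regional"
    # No regional factor available: fall back, and say so in the result.
    return GLOBAL_AVERAGE_FACTOR, "global average (fallback)"


factor, basis = select_grid_factor("region_b")
fallback_factor, fallback_basis = select_grid_factor("region_c")

print(factor, basis)
print(fallback_factor, fallback_basis)
```

Returning the basis alongside the factor makes it easy to see later which parts of a footprint still rest on a broad average.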

Spend-based factor limitations. Spend-based secondary data is particularly broad. It converts financial spend into emissions using economic relationships that do not account for what was actually purchased, how it was produced, or where it came from. It is useful for establishing a high-level picture of Scope 3 categories where no other data exists, but it should not be the long-term basis for reporting on material emission sources.

None of these limitations make secondary data invalid. They make it important to document which factors were used, from which source, and in which version, so that the assumptions behind a footprint are transparent and auditable.
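
As a rough illustration of what that documentation could look like, the sketch below stores each factor alongside its source, version, and year, and flags factors that are older than a chosen threshold. The `FactorRecord` fields, the source name, and the two-year threshold are all assumptions made for the example, not requirements of any standard.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class FactorRecord:
    """One emissions factor plus the provenance needed for an audit trail."""
    activity: str
    kg_co2e_per_unit: float
    unit: str
    source: str       # dataset the factor was taken from
    version: str      # edition or release of that dataset
    factor_year: int  # year the factor represents


def is_stale(record: FactorRecord, reporting_year: int, max_age_years: int = 2) -> bool:
    """Flag factors older than max_age_years (the threshold is an assumption)."""
    return reporting_year - record.factor_year > max_age_years


# Hypothetical record with a placeholder value and a made-up source name.
record = FactorRecord(
    activity="grid electricity",
    kg_co2e_per_unit=0.25,
    unit="kWh",
    source="Example national conversion factors (hypothetical)",
    version="v1.0",
    factor_year=2015,
)

audit_entry = asdict(record)  # plain dict, ready to log or export
needs_review = is_stale(record, reporting_year=2024)

print(audit_entry)
print("Needs review:", needs_review)
```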

How Should Companies Manage Secondary Data Quality Over Time?

The practical goal is a data improvement roadmap: a clear plan for which secondary data inputs to replace with primary data, in which order, based on their contribution to total emissions.

A company might spend its first year using secondary data across most Scope 3 categories to establish a baseline. In year two, it targets its top three emission sources (often purchased goods and services, business travel, and freight) and works to collect primary data from key suppliers or internal systems. By year three, the footprint is meaningfully more accurate, and the trajectory of improvement is itself a signal of credibility.
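
The prioritisation step in that roadmap can be sketched as a simple ranking: take the categories still calculated from secondary data and order them by their share of total emissions. The category names echo the examples above, but the emissions figures are invented for illustration, and `FootprintLine` and `prioritise_for_primary_data` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FootprintLine:
    category: str
    tco2e: float    # estimated emissions in tonnes CO2e
    data_type: str  # "primary" or "secondary"


def prioritise_for_primary_data(lines, top_n=3):
    """Rank secondary-data categories by share of the total footprint."""
    total = sum(line.tco2e for line in lines)
    if total == 0:
        return []
    secondary = [line for line in lines if line.data_type == "secondary"]
    ranked = sorted(secondary, key=lambda line: line.tco2e, reverse=True)
    return [(line.category, line.tco2e / total) for line in ranked[:top_n]]


# Invented figures for illustration only.
footprint = [
    FootprintLine("purchased goods and services", 500.0, "secondary"),
    FootprintLine("business travel", 200.0, "secondary"),
    FootprintLine("freight", 150.0, "secondary"),
    FootprintLine("electricity", 100.0, "primary"),
    FootprintLine("waste", 50.0, "secondary"),
]

roadmap = prioritise_for_primary_data(footprint)
for category, share in roadmap:
    print(f"{category}: {share:.0%} of total footprint")
```

Ranking by share of the total, rather than by category count, keeps data collection effort focused where it has the most effect on accuracy.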

Seedling supports this process by identifying where secondary data is currently in use across a footprint, flagging high-impact categories where data quality improvements would have the most effect, and providing a clear audit trail of the factors and sources applied. That transparency is what makes a footprint defensible to stakeholders, not just internally useful.

The shift from secondary to primary data is not a one-time project. It is an ongoing part of managing carbon well, and the companies that do it systematically are the ones whose reported reductions reflect genuine change rather than methodology adjustments.
