# Backlog transcripts

> Transcripts from events

<img
  src="https://mintcdn.com/quartr/uS9un6ABvG4dvudU/images/earnings_call_transcripts.webp?fit=max&auto=format&n=uS9un6ABvG4dvudU&q=85&s=6a186be3988a2d71e153c9a086ce38b5"
  alt="Earnings call transcripts"
  noZoom
  style={{
width: '100%',
height: 'auto',
borderRadius: '8px',
marginBottom: '1rem',
}}
  width="1920"
  height="1080"
  data-path="images/earnings_call_transcripts.webp"
/>

## Overview

The transcript dataset provides access to high-accuracy transcripts from regular earnings calls and unique events. Each transcript is associated with an event.

<Info>
  Transcripts are only available for events conducted in English.
</Info>

## How it works

Each transcript record in the API response includes a URL pointing to a JSON file hosted on our CDN. Transcripts are released in two steps:

1. **Raw transcript** (type ID = 15) is published shortly after the event concludes. It includes paragraph breaks but no speaker names.
2. **Edited transcript** (type ID = 22) follows once speaker identification has been completed. It adds speaker names, roles, and company affiliations.
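
Because both types can exist for the same event, a consumer will often want the edited transcript when it is available and the raw one otherwise. The following is a minimal Python sketch of that selection. The record shape used here (dictionaries with `typeId` and `fileUrl` keys) is an assumption made for illustration and may not match the actual API response.

```python
RAW_TYPE_ID = 15     # raw transcript, published first
EDITED_TYPE_ID = 22  # edited transcript, with speaker identification


def pick_transcript(records):
    """Return the edited record if present, else the raw one, else None."""
    by_type = {record["typeId"]: record for record in records}
    return by_type.get(EDITED_TYPE_ID) or by_type.get(RAW_TYPE_ID)


# Hypothetical records for a single event; the field names are assumptions.
records = [
    {"typeId": 15, "fileUrl": "https://example.com/raw.json"},
    {"typeId": 22, "fileUrl": "https://example.com/edited.json"},
]

print(pick_transcript(records)["typeId"])  # prints 22
```
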

## Chapters

Transcripts can be divided into structured segments using [chapters](/datasets/chapters). Use the [chapters endpoint](/api-reference/transcripts/list-transcript-chapters) to retrieve hierarchical sections with titles and timestamps for a given transcript.

## Speaker identification

The edited transcript (type ID = 22) maps each paragraph to a named speaker with their role and company. The JSON structure is the same as the raw transcript, with an added `speaker_mapping` array at the top level. Older transcripts will not be retroactively updated with speaker data.

If your application filters by transcript type, request both type IDs so that raw and edited transcripts are returned: `typeIds=15,22`.
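
As a rough illustration, the query string can be built with Python's standard library. The base URL below is a placeholder, not the real endpoint; only the `typeIds=15,22` parameter comes from this page.

```python
from urllib.parse import urlencode

# Placeholder endpoint for illustration only; substitute the real API URL.
BASE_URL = "https://api.example.com/transcripts"


def transcripts_url(type_ids=(15, 22)):
    """Build a URL that requests several transcript types in one call."""
    value = ",".join(str(type_id) for type_id in type_ids)
    # safe="," keeps the comma literal instead of encoding it as %2C.
    return f"{BASE_URL}?{urlencode({'typeIds': value}, safe=',')}"


print(transcripts_url())
# prints https://api.example.com/transcripts?typeIds=15,22
```
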

<AccordionGroup>
  <Accordion title="Are speaker fields nullable?">
    It’s not always possible to identify who is speaking. In such cases, the `name`, `role`, and `company` fields may be null. This can happen when:

    * The speaker is not clearly identified in the audio.
    * The speaker is not part of the event’s official roster.
    * We are unable to verify the speaker’s identity or role.
  </Accordion>

  <Accordion title="When is speaker identification made available?">
    During high-activity periods like earnings season, some events may be prioritized for speaker attribution based on client interest and market relevance. Speaker data may appear on certain transcripts sooner than others.
  </Accordion>
</AccordionGroup>
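
Since any of the speaker fields may be null, display code should not assume they are present. The sketch below shows one way to build a readable label in Python. The helper name `speaker_label` and the "Speaker N" fallback are illustrative choices, not part of the dataset.

```python
def speaker_label(speaker_data, index):
    """Format a speaker for display, tolerating null name, role, or company."""
    # Fall back to a generic label when the name could not be identified.
    name = speaker_data.get("name") or f"Speaker {index}"
    details = [
        value
        for value in (speaker_data.get("role"), speaker_data.get("company"))
        if value
    ]
    return f"{name} ({', '.join(details)})" if details else name


print(speaker_label({"name": "John Doe", "role": "CEO", "company": "ACME Inc."}, 1))
# prints John Doe (CEO, ACME Inc.)
print(speaker_label({"name": None, "role": None, "company": None}, 2))
# prints Speaker 2
```
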

## Data structure

Transcripts are structured as a hierarchy: the top-level `transcript` object contains the full text and an array of `paragraphs`, each paragraph contains `sentences`, and each sentence contains `words`. Paragraphs, sentences, and words each include start/end timestamps in seconds. Each paragraph also carries a zero-based `speaker` index, and each word carries a `confidence` score.

<CodeGroup>
  ```json Standard theme={null}
  {
    "version": "1.0",
    "event_id": 123456,
    "company_id": 123,
    "transcript": {
      "text": "This is the full transcript text",
      "number_of_speakers": 3,
      "paragraphs": [
        {
          "text": "This is the paragraph text",
          "start": 0,
          "end": 10,
          "speaker": 0,
          "sentences": [
            {
              "text": "This is the sentence text",
              "start": 0,
              "end": 5,
              "words": [
                {
                  "word": "This",
                  "punctuated_word": "This",
                  "start": 0,
                  "end": 5,
                  "confidence": 0.9
                }
              ]
            }
          ]
        }
      ]
    }
  }
  ```

  ```json Edited theme={null}
  {
    "version": "1.0",
    "event_id": 123456,
    "company_id": 123,
    "speaker_mapping": [
      {
        "speaker": 0,
        "speaker_data": {
          "name": "Operator",
          "role": null,
          "company": null
        }
      },
      {
        "speaker": 1,
        "speaker_data": {
          "name": "John Doe",
          "role": "CEO",
          "company": "ACME Inc."
        }
      },
      {
        "speaker": 2,
        "speaker_data": {
          "name": null,
          "role": null,
          "company": null
        }
      }
    ],
    "transcript": {
      "text": "This is the full transcript text",
      "number_of_speakers": 3,
      "paragraphs": [
        {
          "text": "This is the paragraph text",
          "start": 0,
          "end": 10,
          "speaker": 0,
          "sentences": [
            {
              "text": "This is the sentence text",
              "start": 0,
              "end": 5,
              "words": [
                {
                  "word": "This",
                  "punctuated_word": "This",
                  "start": 0,
                  "end": 5,
                  "confidence": 0.9
                }
              ]
            }
          ]
        }
      ]
    }
  }
  ```
</CodeGroup>
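
The following Python sketch walks the hierarchy shown above. It resolves each paragraph's `speaker` index against `speaker_mapping` when that array is present (edited transcripts) and falls back to the index otherwise (raw transcripts). The function names, the "Speaker N" fallback, and the confidence threshold are illustrative assumptions. The sample document is a trimmed copy of the edited example.

```python
def speaker_names(doc):
    """Map speaker index to name (possibly None); empty for raw transcripts."""
    return {
        entry["speaker"]: entry["speaker_data"]["name"]
        for entry in doc.get("speaker_mapping", [])
    }


def iter_paragraphs(doc):
    """Yield (start, end, speaker, text) for every paragraph."""
    names = speaker_names(doc)
    for paragraph in doc["transcript"]["paragraphs"]:
        index = paragraph["speaker"]
        who = names.get(index) or f"Speaker {index}"
        yield paragraph["start"], paragraph["end"], who, paragraph["text"]


def low_confidence_words(doc, threshold=0.8):
    """Collect words whose confidence is below the (arbitrary) threshold."""
    return [
        word["punctuated_word"]
        for paragraph in doc["transcript"]["paragraphs"]
        for sentence in paragraph["sentences"]
        for word in sentence["words"]
        if word["confidence"] < threshold
    ]


doc = {
    "speaker_mapping": [
        {"speaker": 0, "speaker_data": {"name": "Operator", "role": None, "company": None}},
    ],
    "transcript": {
        "paragraphs": [
            {
                "text": "This is the paragraph text",
                "start": 0,
                "end": 10,
                "speaker": 0,
                "sentences": [
                    {
                        "text": "This is the sentence text",
                        "start": 0,
                        "end": 5,
                        "words": [
                            {"word": "This", "punctuated_word": "This",
                             "start": 0, "end": 5, "confidence": 0.9},
                        ],
                    }
                ],
            }
        ]
    },
}

for start, end, who, text in iter_paragraphs(doc):
    print(f"[{start}-{end}s] {who}: {text}")
# prints [0-10s] Operator: This is the paragraph text
```
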

## How to access this data

<CardGroup cols={3}>
  <Card title="REST API" icon="code" href="../rest-api/fetching-data">
    Query transcripts using company and event filters for full control.
  </Card>

  <Card title="Webhooks" icon="webhook" href="webhooks/getting-started">
    Subscribe to webhooks for real-time updates.
  </Card>

  <Card title="Snowflake" icon="snowflake" href="snowflake/getting-started">
    Query the transcripts view directly using SQL.
  </Card>
</CardGroup>
