Extracting logs from Logs Data Platform

Knowledge Base



16.10.2025

Objective

This guide explains how to export logs stored in the Logs Data Platform (LDP) using tooling that speaks the OpenSearch API. It presents two reference implementations:

  • Logstash, suited to long-running pipelines and complex transformations.
  • Elasticdump, a lightweight CLI utility designed for one-off or scheduled exports.

Exporting logs is a common need when you want to analyse data outside the LDP ecosystem or feed it into external BI tools. The next sections explain how to pull documents from your alias with Logstash or with the Elasticdump CLI, letting you choose the tool that best fits your operational model.

The final section explains how to export logs from your cold-stored archives, which is the solution we provide for long-term archiving.

Requirements

  • You are already sending logs on a stream you own — see the quick start tutorial.
  • You know the OpenSearch endpoint of your LDP cluster (https://<ldp-cluster>.logs.ovh.com:9200).
  • Your host can reach TCP port 9200 on the cluster endpoint over TLS.
  • You have credentials for the alias you want to export (basic authentication or IAM bearer token).
  • You can install either Logstash ≥ 8.0 or Elasticdump ≥ 6.0 on the host that will run the export.

Instructions

Alias naming conventions

When you create an OpenSearch alias that points to a Graylog stream, the only customizable part is the suffix after -a-, e.g. ldp-ti-98765-a-your_suffix.

The remainder of the name is generated by the platform:

| IAM status | Generated part | Full alias example |
|---|---|---|
| IAM enabled | the service identifier (e.g. ldp-ti-98765) | ldp-ti-98765-a-logs-export |
| IAM disabled (before IAM migration) | your username (e.g. logs-ab-12345) | logs-ab-12345-a-logs-export |

The <suffix> part (here logs-export) is a free-form string you choose to describe the purpose of the alias.

Create a stream alias

  1. Log in to the OVHcloud Control Panel and go to the Identity, Security & Operations section.
  2. Click on Logs Data Platform under Operations then click on the desired account.
  3. Select the Alias tab and click Add an alias.
  4. Choose a suffix, add a description and save the alias.
  5. Click the menu on the right of the newly created alias and select Attach content to the alias.
  6. Select the Graylog stream(s) you want to export and confirm.

The alias now points to the underlying OpenSearch alias that stores the logs of the chosen stream.

Export logs with Logstash

Prerequisites

| Requirement | Details |
|---|---|
| Logstash ≥ 8.0 | Install Logstash on a host that can reach your LDP cluster. |
| Java 11 or 17 | Required by Logstash 8 (recent releases bundle a JDK). |
| OpenSearch endpoint | https://<ldp-cluster>.logs.ovh.com:9200 |
| Authentication | Basic user/password or IAM bearer token (see the IAM FAQ). |
| Alias name | The alias created in the previous step (e.g. ldp-ti-98765-a-logs-export). |

Install Logstash and the OpenSearch plugins

# Download Logstash (tarball example)
wget https://artifacts.elastic.co/downloads/logstash/logstash-8.8.2-linux-x86_64.tar.gz
tar -zxvf logstash-8.8.2-linux-x86_64.tar.gz
cd logstash-8.8.2

# Install the required plugins
bin/logstash-plugin install logstash-input-opensearch

Logstash pipeline

Create a file pipeline.conf (any location, e.g. config/pipeline.conf):

input {
  opensearch {
    # Host and port only; do not include the https:// scheme
    hosts => ["<ldp-cluster>.logs.ovh.com:9200"]
    # Use basic auth or IAM token with pat_jwt_<prefix> username
    user => "<username>"
    password => "<password>"
    index => "<alias-name>"
    schedule => "*/5 * * * *"          # (Optional) Run every 5 minutes
    query => '{"query":{"bool":{"must":[{"match_all":{}}],"filter":[{"range":{"timestamp":{"gte":"now-1d"}}}]}}}'
    scroll => "5m" # (Optional) The scroll stays open for 5m
    ssl => true
  }
}

filter {
  # (Optional) Convert the timestamp to a readable format
  date {
    match => ["timestamp", "ISO8601"]
    target => "timestamp"
  }
}

output {
  csv {
    path => "/var/log/ldp/export-%{+YYYY-MM-dd}.csv"
    fields => ["timestamp", "host", "log.level", "message"]
    gzip => false # (Optional) Set to true to compress the output file.
    create_if_deleted => true # (Optional) Re‑create the file if it disappears while Logstash runs.
    flush_interval => 2 # (Optional) How often Logstash flushes data to disk.
  }
}
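If your downstream tooling prefers newline-delimited JSON over CSV, the standard Logstash file output with the json_lines codec can be swapped in for the csv block above. This is a minimal sketch; the path is chosen for illustration:

```
output {
  file {
    # One JSON document per line, rotated daily via the date pattern
    path => "/var/log/ldp/export-%{+YYYY-MM-dd}.ndjson"
    codec => "json_lines"
  }
}
```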

Run the pipeline

# From the Logstash root directory
bin/logstash -f config/pipeline.conf --config.reload.automatic

Logstash will connect to the OpenSearch endpoint, read the documents that belong to the alias defined above, and write the selected fields to a daily CSV file under /var/log/ldp/.
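For the long-running pipelines mentioned earlier, you may prefer to run Logstash under systemd rather than in a foreground shell. The unit below is a minimal sketch; the install path, user, and config location are assumptions to adapt to your host:

```
# /etc/systemd/system/ldp-export.service
[Unit]
Description=LDP log export pipeline
After=network-online.target

[Service]
User=logstash
ExecStart=/opt/logstash-8.8.2/bin/logstash -f /opt/logstash-8.8.2/config/pipeline.conf
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with systemctl enable --now ldp-export.service so the export survives reboots.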

Export logs with Elasticdump

Prerequisites

| Requirement | Details |
|---|---|
| Elasticdump ≥ 6.0 | Install on a host that can reach your LDP cluster over HTTPS. |
| Node.js runtime | Elasticdump is a Node.js CLI; install Node.js 18 LTS or later. |
| Network access | Allow outbound TCP connectivity to <ldp-cluster>.logs.ovh.com on port 9200. |
| Authentication | Use either basic auth credentials (legacy users) or an IAM bearer token. |
| TLS trust store | Ensure the system trust store contains public Certificate Authorities, or supply a CA bundle with --input-ca. |
| Alias name | The alias created earlier (e.g. ldp-ti-98765-a-logs-export). |

Install Elasticdump

Elasticdump is an open-source tool designed to export data from Elasticsearch/OpenSearch to a file, or to another Elasticsearch/OpenSearch cluster. It is well suited to migrating from Elasticsearch to OpenSearch and to downloading the data contained in your aliases or indices. It is written in JavaScript and therefore requires a JavaScript runtime.

# Install Node.js using your preferred method (example for Debian/Ubuntu)
sudo apt update
sudo apt install nodejs npm

# Install elasticdump globally
npm install -g elasticdump

# Validate the installation
elasticdump --version

For air-gapped or containerised environments, you can download the official Docker image and run the same commands with docker run --rm -v "$PWD":/work -w /work elasticdump/elasticsearch-dump.

Authenticate to the Logs Data Platform

Elasticdump relies on HTTP headers for authentication. Choose the method that matches your account:

  • Basic authentication — append credentials in the URL: https://<username>:<password>@<ldp-cluster>.logs.ovh.com:9200/<alias>.
  • IAM bearer token — add an Authorization header: --input-headers '{"Authorization":"Bearer <iam-token>"}'.

When using IAM tokens, leave the credentials out of the URL and rely solely on the header. Tokens are short-lived; plan to refresh them before launching long exports.

Export an alias to JSON

Create a JSON file (search-body.json) that defines your query. The example below retrieves events from the last 24 hours:

{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "timestamp": {
              "gte": "now-24h",
              "lte": "now"
            }
          }
        }
      ]
    }
  },
  "sort": [
    {
      "timestamp": "asc"
    }
  ]
}
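If you script your exports, you can generate search-body.json for an explicit time window instead of editing it by hand. This is a sketch; the dates are examples, and python3 is used only to validate the generated JSON:

```shell
#!/bin/sh
# Example window boundaries — replace with the range you need
FROM="2024-05-01T00:00:00Z"
TO="2024-05-02T00:00:00Z"

# Write the search body, substituting the window boundaries
cat > search-body.json <<EOF
{
  "query": {
    "bool": {
      "filter": [
        { "range": { "timestamp": { "gte": "$FROM", "lte": "$TO" } } }
      ]
    }
  },
  "sort": [ { "timestamp": "asc" } ]
}
EOF

# Validate the file before handing it to elasticdump
python3 -m json.tool search-body.json > /dev/null && echo "search-body.json OK"
```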

Run Elasticdump to export the documents:

elasticdump \
  --input https://<user>:<password>@<ldp-cluster>.logs.ovh.com:9200/<alias-name> \
  --output ./ldp-export.json \
  --searchBody @search-body.json \
  --limit 1000 \
  --type data

To authenticate with IAM, use either the header flag or hybrid authentication (a pat_jwt_-prefixed username with the IAM token as the password):

elasticdump \
  --input https://<ldp-cluster>.logs.ovh.com:9200/<alias-name> \
  --input-headers '{"Authorization":"Bearer <iam-token>"}' \
  --output ./ldp-export.json \
  --searchBody @search-body.json \
  --type data

elasticdump \
  --input https://pat_jwt_<any_string_here>:<iam-token>@<ldp-cluster>.logs.ovh.com:9200/<alias-name> \
  --output ./ldp-export.json \
  --searchBody @search-body.json \
  --type data

Elasticdump streams the results to ldp-export.json in newline-delimited JSON format, which can be loaded into analytics tools or archived for compliance.
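Because the output is newline-delimited, each line is one document, so plain shell tools are enough for a quick sanity check. The two records below are fabricated samples standing in for a real export file:

```shell
#!/bin/sh
# Fabricated sample standing in for a real ldp-export.json
cat > ldp-export.json <<'EOF'
{"_index":"ldp","_id":"1","_source":{"timestamp":"2024-05-01T10:00:00Z","message":"first event"}}
{"_index":"ldp","_id":"2","_source":{"timestamp":"2024-05-01T11:00:00Z","message":"second event"}}
EOF

# One line per document, so wc -l gives the document count
doc_count=$(wc -l < ldp-export.json)
echo "documents exported: $doc_count"
```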

$ elasticdump --input https://<user>:<password>@gra2.logs.ovh.com:9200/<alias-name> --output ./ldp-export.json --searchBody @search-body.json --limit 500 --type data
Tue, 07 Oct 2025 14:11:30 GMT | starting dump
Tue, 07 Oct 2025 14:11:30 GMT | got 79 objects from source elasticsearch (offset: 0)
Tue, 07 Oct 2025 14:11:30 GMT | sent 79 objects, 0 offset, to destination file, wrote 79
Tue, 07 Oct 2025 14:11:30 GMT | got 0 objects from source elasticsearch (offset: 500)
Tue, 07 Oct 2025 14:11:30 GMT | Total Writes: 79
Tue, 07 Oct 2025 14:11:30 GMT | dump complete

Handle formats, pagination and large time ranges

To export data in CSV format, use the csv:// scheme in the output path:

elasticdump \
  --input https://pat_jwt_<any_string_here>:<iam-token>@<ldp-cluster>.logs.ovh.com:9200/<alias-name> \
  --output csv://./ldp-export.csv \
  --searchBody @search-body.json \
  --type data

Elasticdump paginates results automatically using the OpenSearch scroll API. Tune the export with the following options:

  • --limit <n> : controls how many documents Elasticdump pulls per batch. Reduce the value (e.g. 200) if you experience timeouts.
  • --maxSockets <n> : adjusts the number of concurrent HTTP connections. Set it to 1 for strict rate limiting or increase it to accelerate exports on aliases with high throughput.
  • --input-parameters '{"scroll":"10m"}' : extends the server-side cursor to 10 minutes, useful for large datasets.

To export specific time windows, modify search-body.json with a range filter and run several commands in sequence:

elasticdump --input https://<user>:<password>@<cluster>/<alias> \
  --output ./ldp-2024-05-01.json \
  --searchBody '{"query":{"range":{"timestamp":{"gte":"2024-05-01","lt":"2024-05-02"}}}}'

elasticdump --input https://<user>:<password>@<cluster>/<alias> \
  --output ./ldp-2024-05-02.json \
  --searchBody '{"query":{"range":{"timestamp":{"gte":"2024-05-02","lt":"2024-05-03"}}}}'
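The per-day commands above can be generated with a small loop instead of being written by hand. This sketch only prints the commands so you can review them before execution; GNU date is assumed, and the credentials remain placeholders:

```shell
#!/bin/sh
# Print one elasticdump command per day in [start, end) — GNU date assumed
gen_export_cmds() {
  d=$1
  end=$2
  while [ "$d" != "$end" ]; do
    next=$(date -u -d "$d + 1 day" +%F)
    echo "elasticdump --input 'https://<user>:<password>@<cluster>/<alias>'" \
         "--output ./ldp-$d.json" \
         "--searchBody '{\"query\":{\"range\":{\"timestamp\":{\"gte\":\"$d\",\"lt\":\"$next\"}}}}'"
    d=$next
  done
}

# Example: three daily exports covering 2024-05-01 through 2024-05-03
cmds=$(gen_export_cmds 2024-05-01 2024-05-04)
echo "$cmds"
```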

The --transform flag lets you adjust each document before writing it to disk. For example, to remove the _id field:

elasticdump --input https://<cluster>/<alias> \
  --output ./ldp.json \
  --searchBody @search-body.json \
  --transform 'delete doc._id; return doc;'

Export logs from archive

To export logs from your archives, we provide a tool to download them: ldp-archive-mirror. This software requires Python ≥ 3.6 to run.

First, install ldp-archive-mirror using pip:

$ pip3 install -U ldp-archive-mirror

Then you can use the binary ldp-mirror:

usage: ldp-mirror [-h] [--app-key KEY] [--app-secret SECRET]
              [--consumer-key KEY] [--ovh-region REGION] [--db DIR]
              [--mirror DIR] [--ldp-host HOST] [--ldp-token TOKEN]
              [--chunk-size CHUNK] [--gpg-passphrase SECRET]
              STREAM_ID [STREAM_ID ...]

LDP archive Mirror CLI - 0.2.0

positional arguments:
  STREAM_ID            LDP Stream UUIDs

optional arguments:
  -h, --help              Show this help message and exit
  --app-key KEY           OVH application key (default: dcd57be8c9dc53ff)
  --app-secret SECRET     OVH application secret (default: d37f35c27e60be58746e81e3351a84db)
  --consumer-key SECRET   OVH consumer key (default: 819fb70c64f91f797daf0ed3990e5ff0)
  --ovh-region REGION     OVH region (default: ovh-eu)
  --db DIR                Where to place the local sqlite database (default: /data/db)
  --mirror DIR            Where to place your archives (default: /data/mirror)
  --ldp-host HOST         If set, push logs of the current application to given LDP hostname
  --ldp-token TOKEN       If set, push logs of the current application to associated LDP stream token
  --chunk-size CHUNK      Download chunk size in bytes (default: 16384)
  --gpg-passphrase SECRET PGP private key passphrase (default: None)

Please go to the GitHub page to set up this software and obtain the latest information about it.

Go further

For more details, refer to the official documentation of the OpenSearch input plugin, the Logstash CSV output plugin, and Elasticdump.
