The Nethound CTI Lab: Building a High-Performance MISP and OpenCTI Intelligence Engine

This document is a lab diary covering the construction of a standalone Cyber Threat Intelligence engine built on an integrated MISP and OpenCTI stack running on Fedora 43 Workstation. The design philosophy, hardware requirements, system tuning, networking architecture, and ingestion pipeline are documented in full, including the specific failures and workarounds encountered during deployment.

1. The Philosophy: Why MISP + OpenCTI?

Effective cyber threat intelligence practice requires tools that complement each other rather than overlap unnecessarily. This stack divides labor into two distinct operational layers: The Archive and The Brain.

MISP: The Technical Archive (The “What”)

MISP (originally the Malware Information Sharing Platform) serves as the foundational data lake.

Raw Collector: MISP excels at ingesting massive, unstructured, or semi-structured OSINT feeds — millions of IP addresses, file hashes, and domains.
The Buffer: While OpenCTI can ingest raw data, flooding its graph database with millions of unvetted, context-less indicators degrades performance severely. MISP holds the raw technical evidence until it is needed and has been filtered.

OpenCTI: The Analytical Processor (The “Who”, “How”, and “Why”)

OpenCTI is a Knowledge Management Platform whose data model is based on the STIX 2.1 standard. It is the engine that transforms raw data into structured, relational intelligence that can be mapped onto knowledge bases such as MITRE ATT&CK.

Correlation Engine: It takes a raw hash from MISP and links it to a living campaign, mapping it to a Malware family, which connects to an Intrusion Set (e.g., APT41), which connects to specific ATT&CK techniques.
Graph Visualization: It provides the visual Knowledge Graph necessary to rapidly understand the scope and relationships within a threat.

2. Architecture and Data Flow

The following diagram describes how data moves through the Nethound CTI lab, from external feed sources through to the OpenCTI analytical platform.

graph TD
    subgraph External
        Feeds["OSINT Feeds<br/>Abuse.ch, CIRCL"]
        APIs["External APIs<br/>VirusTotal, AlienVault"]
    end

    subgraph MISP Stack
        MISPCore[MISP Core]
        MISPDB[(MariaDB)]
    end

    subgraph OpenCTI Stack
        MISPConn[OpenCTI MISP Connector]
        NativeConn["Native Connectors<br/>CISA, AbuseIPDB"]
        Workers[Python Workers x15]
        RabbitMQ[RabbitMQ]
        Elastic[(Elasticsearch)]
        MinIO[(MinIO)]
        Redis[(Redis)]
        Platform[OpenCTI Platform]
    end

    Feeds -->|Ingest| MISPCore
    MISPCore <--> MISPDB

    MISPCore -->|Docker Bridge| MISPConn
    APIs --> NativeConn

    MISPConn -->|STIX 2.1| RabbitMQ
    NativeConn -->|STIX 2.1| RabbitMQ

    RabbitMQ <--> Workers
    Workers -->|API| Platform

    Platform <--> Elastic
    Platform <--> MinIO
    Platform <--> Redis

MariaDB is used exclusively by MISP. OpenCTI relies on Elasticsearch for its primary database, Redis for state management, and MinIO for file storage.

3. Hardware Tiering: Scaling the Lab

CTI stacks — specifically Elasticsearch and Python worker nodes — are notoriously resource-hungry. The table below defines three operational tiers.

| Tier | Hardware | Storage | Target Use Case and Capability |
| --- | --- | --- | --- |
| Small | 4 Cores, 16 GB RAM | 100 GB SSD | The Learner: Limited to 1–2 curated OSINT feeds. Dashboard may feel sluggish during active ingest. |
| Medium | 8 Cores, 32 GB RAM | 500 GB NVMe | The Standard Researcher: 5–10 active feeds. Capable of handling a few months of historical data with smooth daily operation. |
| Nethound | 32 Cores (Ryzen 9), 92 GB RAM* | 11 TB Btrfs Array | The Powerhouse: Full historical ingestion. 15–20 parallel workers. Capable of processing 250,000+ STIX bundles in hours. |

*The system carries 96 GB of physical memory; a portion is hardware-reserved by the integrated Radeon graphics on the Ryzen 9, yielding 92 GB available to the OS.
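
As a rough self-check against the tiers above, the table's thresholds can be turned into a small shell function. This is only a sketch: the function name is illustrative, and treating each tier's core and RAM figures as minimums is an assumption rather than something the table states.

```shell
# cti_tier CORES RAM_GB
# Prints the highest tier whose core and RAM figures are both met.
# Assumption: each tier's figures are treated as minimums.
cti_tier() {
  cores=$1
  ram_gb=$2
  if [ "$cores" -ge 32 ] && [ "$ram_gb" -ge 92 ]; then
    echo "Nethound"
  elif [ "$cores" -ge 8 ] && [ "$ram_gb" -ge 32 ]; then
    echo "Medium"
  elif [ "$cores" -ge 4 ] && [ "$ram_gb" -ge 16 ]; then
    echo "Small"
  else
    echo "Below Small"
  fi
}
```

For example, `cti_tier "$(nproc)" 32` classifies the current core count against a nominal 32 GB of RAM. Storage is left out of the check because the table's storage figures depend on how much history is retained.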

For context on the broader infrastructure this stack sits within, see the Nethound hypervisor build documentation.

4. The Foundation: Docker on Fedora 43

Fedora ships with Podman by default. While Podman is a capable container runtime, the complex inter-container networking required by OpenCTI and MISP is significantly easier to manage using Docker-CE with the Compose plugin.

# Remove conflicting packages
sudo dnf remove docker \
  docker-client \
  docker-client-latest \
  docker-common \
  docker-latest \
  docker-latest-logrotate \
  docker-logrotate \
  docker-selinux \
  docker-engine-selinux \
  docker-engine
 
# Add the official Docker repository (dnf5 syntax, the default on Fedora 41 and later)
sudo dnf -y install dnf-plugins-core
sudo dnf config-manager addrepo --from-repofile=https://download.docker.com/linux/fedora/docker-ce.repo
 
# Install Docker Engine and Compose
sudo dnf install docker-ce \
  docker-ce-cli \
  containerd.io \
  docker-buildx-plugin \
  docker-compose-plugin
 
# Start and enable the service
sudo systemctl enable --now docker
 
# Add your user to the docker group
sudo usermod -aG docker $USER

Log out and back in to apply the group membership change before proceeding.
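
Before moving on, it can help to confirm that the new login session actually picked up the group change and that both the engine and the Compose plugin respond. The sketch below wraps those checks in functions; `has_group` and `verify_docker` are illustrative names, and the `hello-world` image is used only as a generic smoke test.

```shell
# has_group GROUP "space separated group list"
# Succeeds when GROUP appears as a whole word in the list.
has_group() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# verify_docker: run after logging back in; stops at the first failing check.
verify_docker() {
  if ! has_group docker "$(id -nG)"; then
    echo "current session is not in the docker group yet" >&2
    return 1
  fi
  docker version --format '{{.Server.Version}}' || return 1  # engine reachable without sudo
  docker compose version || return 1                         # Compose plugin installed
  docker run --rm hello-world                                # end-to-end pull and run
}
```

Calling `verify_docker` without `sudo` is the point of the exercise: if it only works with `sudo`, the group membership has not been applied to the session.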

5. Fedora 43 Workstation Tuning and Storage

Fedora provides a cutting-edge Linux kernel, but default OS limits are too restrictive for big-data applications. Both Elasticsearch and a 30+ container environment require explicit kernel and filesystem tuning before deployment.

Kernel and Memory Tuning

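Elasticsearch uses mmap to map index files into memory, and its bootstrap checks expect vm.max_map_count to be at least 262144. If the value is too low, the OpenCTI database can fail to start or crash under heavy ingestion. The system-wide file handle limit (fs.file-max) is also set explicitly for large container deployments. Check the current values with sysctl -n first, since recent Fedora releases may already ship defaults at or above these numbers.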

Create /etc/sysctl.d/99-cti-lab.conf with the following content:

# Required by OpenCTI for high-ingestion Elasticsearch environments
vm.max_map_count=1048575
 
# Increase file descriptor limits for massive concurrent networking
fs.file-max=2097152

Apply immediately without a reboot:

sudo sysctl --system
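
To confirm the settings took effect, the live values can be read back and compared against Elasticsearch's documented minimum of 262144 for vm.max_map_count. This is a minimal sketch; the function names are illustrative and nothing runs until `check_cti_sysctls` is called.

```shell
# Elasticsearch's documented minimum for vm.max_map_count
ES_MIN_MAP_COUNT=262144

# meets_minimum ACTUAL REQUIRED: succeeds when ACTUAL >= REQUIRED
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# check_cti_sysctls: read the live kernel values and report on them
check_cti_sysctls() {
  map_count=$(sysctl -n vm.max_map_count) || return 1
  file_max=$(sysctl -n fs.file-max) || return 1
  if meets_minimum "$map_count" "$ES_MIN_MAP_COUNT"; then
    echo "vm.max_map_count=$map_count (ok)"
  else
    echo "vm.max_map_count=$map_count (below $ES_MIN_MAP_COUNT)"
  fi
  echo "fs.file-max=$file_max"
}
```

The fs.file-max value is printed rather than judged, because a sensible floor for it depends on how many containers and connections the host actually carries.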

Storage and Persistence: The 11TB Strategy

A standard root partition will fill within weeks under active CTI feed ingestion. This deployment externalizes Docker’s data root to a dedicated 11 TB Btrfs volume mounted at /mnt/cti-storage.

Modify /etc/docker/daemon.json:

{
  "data-root": "/mnt/cti-storage/docker-data",
  "storage-driver": "btrfs"
}

Restart the Docker daemon to apply:

sudo systemctl restart docker

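Backup Strategy: Because the storage layer is Btrfs, native Btrfs snapshots can capture /mnt/cti-storage/docker-data without stopping the stack, provided that path is its own subvolume. A snapshot is atomic and crash-consistent, which is substantially more reliable than attempting to tar active database volumes. Snapshots share the same disks as the live data, so copy them to separate storage if they are meant to survive an array failure.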
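
A snapshot run might look like the sketch below. It assumes /mnt/cti-storage/docker-data is itself a Btrfs subvolume and uses a hypothetical /mnt/cti-storage/snapshots directory for the results; neither detail is confirmed by this build log, so adjust both to the real layout.

```shell
# snapshot_path DEST_DIR STAMP: build the destination path for one snapshot
snapshot_path() {
  printf '%s/docker-data-%s\n' "$1" "$2"
}

# snapshot_docker_data: take a read-only snapshot of the Docker data root.
# Assumptions: the source is a Btrfs subvolume; the snapshots directory is hypothetical.
snapshot_docker_data() {
  src=/mnt/cti-storage/docker-data
  dest_dir=/mnt/cti-storage/snapshots
  stamp=$(date +%Y%m%d-%H%M%S)
  sudo btrfs subvolume show "$src" >/dev/null || return 1   # fails if src is not a subvolume
  sudo mkdir -p "$dest_dir" || return 1
  sudo btrfs subvolume snapshot -r "$src" "$(snapshot_path "$dest_dir" "$stamp")"
}
```

Btrfs snapshots do not descend into nested subvolumes, and Docker's btrfs storage driver keeps image and container layers in subvolumes of their own. In practice that means a snapshot taken this way captures the named volumes (the database contents that matter most) but not the image layers, which can be pulled again.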

6. Modular Networking Design

Placing 30+ containers into a single docker-compose.yml creates an unmanageable monolith with a single fault domain. The preferred approach uses external bridge networking to maintain clean separation between the MISP and OpenCTI stacks while allowing targeted inter-stack communication.

Implementation: External Bridge Networking

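Create the shared bridge network if it does not already exist. If the MISP stack was started first from a Compose project named misp-docker, Compose has already created misp-docker_default and this step can be skipped: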

docker network create misp-docker_default

Declare the network as external in the OpenCTI compose file:

In docker-compose.yml for the OpenCTI stack, define the network reference at the bottom of the file:
```
networks:
  misp-docker_default:
    external: true
```
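Attach the MISP Connector service to this network by listing misp-docker_default under the service's networks key so it can reach the MISP Core container. Keep the stack's default network in the same list, because a service with an explicit networks key no longer joins default automatically and the connector still has to reach the OpenCTI platform.

Configure internal DNS resolution: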

In the OpenCTI .env file, point the connector at the MISP container by its internal Docker DNS name:
```
MISP_URL=https://misp-docker-misp-core-1
```

The OpenCTI connector resolves the MISP container by name through Docker's embedded DNS, and traffic to the MISP API stays on the shared bridge network. The two stacks remain separate fault domains: a failure in the MISP stack does not cascade into the OpenCTI stack.
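
One way to confirm the wiring is to list the containers attached to the shared network and check that both ends are present. The sketch below assumes the network name used above; the connector's container name is not given in this write-up, so it is passed in as an argument.

```shell
# network_members NETWORK: print one attached container name per line
network_members() {
  docker network inspect "$1" --format '{{range .Containers}}{{.Name}}{{"\n"}}{{end}}'
}

# is_attached NAME: reads container names on stdin, succeeds on an exact match
is_attached() {
  grep -qx -- "$1"
}

# check_bridge CONNECTOR_CONTAINER: both MISP Core and the connector must be on the network
check_bridge() {
  members=$(network_members misp-docker_default) || return 1
  printf '%s\n' "$members" | is_attached misp-docker-misp-core-1 || { echo "MISP Core not attached" >&2; return 1; }
  printf '%s\n' "$members" | is_attached "$1" || { echo "$1 not attached" >&2; return 1; }
  echo "both containers share misp-docker_default"
}
```

Run `docker ps --format '{{.Names}}'` to find the connector's actual container name, then pass it to `check_bridge`.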

7. High-Performance Ingestion: OpenCTI Tuning

Initial linkage of MISP to OpenCTI triggered an ingestion queue exceeding 250,000 items. The following tuning steps were required to process this volume without queue saturation or system instability.

1. Scaling the Database (Elasticsearch)

Elasticsearch requires sufficient heap memory to hold indexes open during massive write operations. Set this in the OpenCTI .env file:

ELASTIC_MEMORY_SIZE=16G
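
As a sanity check on that figure, Elasticsearch's general heap guidance is to stay at or below half of system memory and under the compressed-object-pointer threshold, which its documentation puts at roughly 26 GB to be safe. The sketch below encodes that rule of thumb; the function name and the fixed 26 GB cap are assumptions for illustration, not OpenCTI settings.

```shell
# heap_ceiling_gb RAM_GB
# Prints the smaller of half the RAM and a 26 GB cap (assumed safe compressed-oops limit).
heap_ceiling_gb() {
  half=$(( $1 / 2 ))
  cap=26
  if [ "$half" -lt "$cap" ]; then
    echo "$half"
  else
    echo "$cap"
  fi
}
```

On the 92 GB Nethound tier the ceiling works out to 26, so a 16G heap sits comfortably inside it and leaves the remainder for the filesystem cache and the other containers. On the 16 GB Small tier the same rule caps the heap at 8.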

2. Scaling the Workers

OpenCTI offloads all bundle processing to Python workers. A single worker processes one STIX bundle at a time. With 32 CPU cores available, workers were scaled to 15 replicas in docker-compose.yml:

services:
  worker:
    # Keep the worker tag in step with the OpenCTI platform image version
    image: opencti/worker:latest
    deploy:
      mode: replicated
      replicas: 15
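
After bringing the stack up, the replica count can be confirmed from the command line. This is a sketch that assumes it is run from the OpenCTI compose directory and that the service is named worker as in the snippet above; the helper names are illustrative.

```shell
# count_nonempty: count non-blank lines on stdin (prints 0 for empty input)
count_nonempty() {
  awk 'NF { n++ } END { print n + 0 }'
}

# running_workers: number of worker containers currently in the running state
running_workers() {
  docker compose ps --status running -q worker | count_nonempty
}
```

If `running_workers` prints fewer than 15, `docker compose up -d --scale worker=15` is another way to request the same count without editing the file.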

3. Monitoring Worker Exhaustion

System-level metrics are monitored using btm (bottom). The RabbitMQ queue depth is checked directly to confirm workers are consuming the backlog:

docker exec xtm-rabbitmq-1 rabbitmq-diagnostics -q list_queues name messages_ready

A steadily decreasing messages_ready count confirms the workers are processing bundles at the expected rate.
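
To turn the per-queue listing into a single backlog figure and a rough drain rate, the output can be piped through a couple of small helpers. This is a sketch: the helper names are illustrative, and the container name is the one used in the command above.

```shell
# total_ready: sum the last tab-separated column, skipping any non-numeric header row
total_ready() {
  awk -F '\t' '$NF ~ /^[0-9]+$/ { sum += $NF } END { print sum + 0 }'
}

# drain_per_minute PREV CURR SECONDS: messages consumed per minute between two samples
drain_per_minute() {
  echo $(( ($1 - $2) * 60 / $3 ))
}

# queue_backlog: total messages_ready across all queues
queue_backlog() {
  docker exec xtm-rabbitmq-1 rabbitmq-diagnostics -q list_queues name messages_ready | total_ready
}
```

Taking two `queue_backlog` samples a minute apart and passing them to `drain_per_minute` gives a first estimate of how long the remaining backlog will take. A negative result means connectors are adding bundles faster than the workers are consuming them.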

8. Ingestion: Feeds and Connectors

The lab employs a dual-ingestion strategy that separates bulk raw feed ingest from structured enrichment and contextualization. This mirrors the multi-source collection discipline used in professional CTI operations.

The MISP Pipeline (Raw Threat Feeds)

The following community feeds are ingested through MISP:

Abuse.ch (URLhaus, MalwareBazaar, ThreatFox): High-fidelity, verified malware indicators.
CIRCL OSINT Feed: Broad, contextualized threat intelligence from the Computer Incident Response Center Luxembourg.
DigitalSide-IT: Active lists of malware hashes and C2 infrastructure IPs.

Controlling Data Gravity:

The OpenCTI MISP Connector’s historical pull is bounded to prevent overwhelming Elasticsearch with irrelevant historical data. Set in the connector’s .env:

MISP_IMPORT_FROM_DATE=2024-01-01
MISP_INTERVAL=60

MISP_INTERVAL is in minutes. A 60-minute polling interval provides a practical balance between data freshness and system load for a lab environment.
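
A malformed date or interval here tends to surface only as a connector error at runtime, so it can be worth checking the two values before restarting the connector. The helpers below are an illustrative sketch; `days_ago_date` relies on GNU date, which is what Fedora ships.

```shell
# is_iso_date VALUE: succeeds when VALUE has the YYYY-MM-DD shape
# (shape only; it does not reject impossible dates such as 2024-13-40)
is_iso_date() {
  case "$1" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# polls_per_day MINUTES: how many polling cycles a given interval produces per day
polls_per_day() {
  echo $(( 1440 / $1 ))
}

# days_ago_date N: the date N days before today in YYYY-MM-DD form (GNU date only)
days_ago_date() {
  date -d "$1 days ago" +%F
}
```

With the values above, `polls_per_day 60` gives 24 cycles a day. `days_ago_date 180` is one way to pick a rolling cut-off instead of a fixed calendar date.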

OpenCTI Native Connectors (Enrichment and Context)

The following native OpenCTI connectors are defined in the compose file to add relational context to indicators flowing in from MISP:

CISA KEV: Imports Known Exploited Vulnerabilities directly into the knowledge graph, enabling correlation against the current threat landscape.
AlienVault OTX and VirusTotal: Used for enriching hashes and IPs originating from MISP feeds.
AbuseIPDB: Provides confidence scoring for incoming network observables.

9. Verification and First Look

The following steps confirm the stack is operating correctly end-to-end.

Check the MISP Connector Status: Navigate to Settings → Connectors in OpenCTI. The MISP connector must display as “Active.”
Watch the RabbitMQ Queue Drain: Execute the rabbitmq-diagnostics command from Section 7. The messages_ready value should decrease steadily as the 15 workers process bundles.
Confirm Indicator Ingest: Navigate to Observations → Indicators in OpenCTI. Raw IPs and hashes tagged from Abuse.ch should appear automatically within the first polling cycle.
Validate the Knowledge Graph: Click on a malware family (e.g., Cobalt Strike). A successful integration will render a visual map linking the malware object to specific indicators sourced from MISP and enriched by the AlienVault and VirusTotal connectors.

10. Troubleshooting and Edge Cases

The `MISP_SALT` Reality Check

Despite correct .env configuration, MISP threw errors regarding a missing cryptographic salt in config.php during initial startup. The salt was injected manually using a PHP one-liner executed inside the container:

# Replaces an empty salt with 32 random bytes (hex) and reports whether anything changed
docker exec misp-docker-misp-core-1 php -r "\
\$file = '/var/www/MISP/app/Config/config.php';\
\$content = file_get_contents(\$file);\
\$content = str_replace(\"'salt' => ''\", \"'salt' => '\" . bin2hex(random_bytes(32)) . \"'\", \$content, \$count);\
file_put_contents(\$file, \$content);\
echo (\$count ? 'Salt updated successfully' : 'No empty salt found; config left unchanged') . PHP_EOL;\
"

After patching the config, the Redis cache was flushed and the MISP Core container restarted to pick up the change:

docker exec misp-docker-redis-1 redis-cli FLUSHALL
docker restart misp-docker-misp-core-1
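
Once the container is back up, it is worth confirming that the empty salt is really gone without printing the new value. The sketch below looks for the same literal string the PHP one-liner replaces; the function names are illustrative.

```shell
# salt_is_empty: reads config text on stdin, succeeds if the salt is still the empty string
salt_is_empty() {
  grep -q "'salt' => ''"
}

# check_misp_salt: inspect the live config inside the MISP Core container
check_misp_salt() {
  if docker exec misp-docker-misp-core-1 cat /var/www/MISP/app/Config/config.php | salt_is_empty; then
    echo "salt is still empty" >&2
    return 1
  fi
  echo "salt is set"
}
```

If `check_misp_salt` still reports an empty salt, the config file may format the entry differently from the string the one-liner searches for, in which case the replacement never matched.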

SSL/TLS Justification: `MISP_SSL_VERIFY=false`

MISP generates a self-signed certificate on startup. In the OpenCTI MISP connector .env, SSL verification is explicitly bypassed:

MISP_SSL_VERIFY=false

This is an intentional configuration for this deployment. Because the modular external bridge network is used, traffic between the OpenCTI connector and the MISP API never leaves the Docker virtual network on the host. On a fully internal, non-routable network, running a local CA just to issue a trusted certificate would add administrative overhead for little tangible security gain, so verification is bypassed instead. This decision should be revisited if MISP is ever exposed outside the Docker network boundary. For a broader discussion of trust boundary controls, see operational security fundamentals.

Long-Term Database Pruning

Without pruning, the MISP database will grow without bound. Navigate to MISP → Administration → Server Settings → MISP and enable the pruning feature. Configure it to purge attributes from low-confidence categories after 90 days. This keeps the 11 TB array healthy over long operational periods.
