This document is a lab diary covering the construction of a standalone Cyber Threat Intelligence engine built on an integrated MISP and OpenCTI stack running on Fedora 43 Workstation. The design philosophy, hardware requirements, system tuning, networking architecture, and ingestion pipeline are documented in full, including the specific failures and workarounds encountered during deployment.
1. The Philosophy: Why MISP + OpenCTI?
Effective cyber threat intelligence practice requires tools that complement each other rather than overlap unnecessarily. This stack divides labor into two distinct operational layers: The Archive and The Brain.
MISP: The Technical Archive (The “What”)
MISP (Malware Information Sharing Platform) serves as the foundational data lake.
- Raw Collector: MISP excels at ingesting massive, unstructured, or semi-structured OSINT feeds — millions of IP addresses, file hashes, and domains.
- The Buffer: While OpenCTI can ingest raw data, flooding its graph database with millions of unvetted, context-less indicators degrades performance severely. MISP holds the raw technical evidence until it is needed and has been filtered.
OpenCTI: The Analytical Processor (The “Who”, “How”, and “Why”)
OpenCTI is a knowledge management platform built on the STIX 2.1 data model. It is the engine that transforms raw data into structured, relational intelligence that can be mapped onto frameworks such as MITRE ATT&CK.
- Correlation Engine: It takes a raw hash from MISP and links it to a living campaign, mapping it to a Malware family, which connects to an Intrusion Set (e.g., APT41), which connects to specific ATT&CK techniques.
- Graph Visualization: It provides the visual Knowledge Graph necessary to rapidly understand the scope and relationships within a threat.
2. Architecture and Data Flow
The following diagram describes how data moves through the Nethound CTI lab, from external feed sources through to the OpenCTI analytical platform.
graph TD
    subgraph External
        Feeds["OSINT Feeds<br/>Abuse.ch, CIRCL"]
        APIs["External APIs<br/>VirusTotal, AlienVault"]
    end
    subgraph MISP Stack
        MISPCore[MISP Core]
        MISPDB[(MariaDB)]
    end
    subgraph OpenCTI Stack
        MISPConn[OpenCTI MISP Connector]
        NativeConn["Native Connectors<br/>CISA, AbuseIPDB"]
        Workers[Python Workers x15]
        RabbitMQ[RabbitMQ]
        Elastic[(Elasticsearch)]
        MinIO[(MinIO)]
        Redis[(Redis)]
        Platform[OpenCTI Platform]
    end
    Feeds -->|Ingest| MISPCore
    MISPCore <--> MISPDB
    MISPCore -->|Docker Bridge| MISPConn
    APIs --> NativeConn
    MISPConn -->|STIX 2.1| RabbitMQ
    NativeConn -->|STIX 2.1| RabbitMQ
    RabbitMQ <--> Workers
    Workers --> Elastic
    Workers --> MinIO
    Workers --> Redis
    Platform <--> Elastic
    Platform <--> Redis
MariaDB is used exclusively by MISP. OpenCTI relies on Elasticsearch for its primary database, Redis for state management, and MinIO for file storage.
3. Hardware Tiering: Scaling the Lab
CTI stacks — specifically Elasticsearch and Python worker nodes — are notoriously resource-hungry. The table below defines three operational tiers.
| Tier | Hardware | Storage | Target Use Case and Capability |
|---|---|---|---|
| Small | 4 Cores, 16 GB RAM | 100 GB SSD | The Learner: Limited to 1–2 curated OSINT feeds. Dashboard may feel sluggish during active ingest. |
| Medium | 8 Cores, 32 GB RAM | 500 GB NVMe | The Standard Researcher: 5–10 active feeds. Capable of handling a few months of historical data with smooth daily operation. |
| Nethound | 32 Cores (Ryzen 9), 92 GB RAM* | 11 TB Btrfs Array | The Powerhouse: Full historical ingestion. 15–20 parallel workers. Capable of processing 250,000+ STIX bundles in hours. |
The system carries 96 GB of physical memory; a portion is hardware-reserved by the integrated Radeon graphics on the Ryzen 9, yielding 92 GB available to the OS.
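As a quick sanity check before deployment, a host can be matched against the tiers in the table. The helper below is a minimal sketch: `cti_tier` is a hypothetical name, and it treats each row's core and RAM figures as minimum thresholds, which is an assumption rather than something the table states.

```shell
# cti_tier CORES RAM_GB
# Hypothetical helper: maps a host onto the tiers in the hardware table,
# treating each row's cores and RAM as minimum thresholds (an assumption).
cti_tier() {
    cores=$1
    ram_gb=$2
    if [ "$cores" -ge 32 ] && [ "$ram_gb" -ge 92 ]; then
        echo "Nethound"
    elif [ "$cores" -ge 8 ] && [ "$ram_gb" -ge 32 ]; then
        echo "Medium"
    elif [ "$cores" -ge 4 ] && [ "$ram_gb" -ge 16 ]; then
        echo "Small"
    else
        echo "Below minimum"
    fi
}

# Example: a host with 8 cores and 32 GB of RAM
#   cti_tier 8 32
```

The OS-visible memory figure is usually a little below the installed amount (as with the 92 GB of 96 GB here), so pass the installed size when comparing against the table.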
For context on the broader infrastructure this stack sits within, see the Nethound hypervisor build documentation.
4. The Foundation: Docker on Fedora 43
Fedora ships with Podman by default. While Podman is a capable container runtime, the complex inter-container networking required by OpenCTI and MISP is significantly easier to manage using Docker-CE with the Compose plugin.
# Remove conflicting packages
sudo dnf remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-engine
# Add the official Docker repository
sudo dnf -y install dnf-plugins-core
# Fedora 41 and later ship DNF5, where the addrepo subcommand replaces --add-repo
sudo dnf config-manager addrepo --from-repofile=https://download.docker.com/linux/fedora/docker-ce.repo
# Install Docker Engine and Compose
sudo dnf install docker-ce \
docker-ce-cli \
containerd.io \
docker-buildx-plugin \
docker-compose-plugin
# Start and enable the service
sudo systemctl enable --now docker
# Add your user to the docker group
sudo usermod -aG docker $USER
Log out and back in to apply the group membership change before proceeding.
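Whether the new group membership is active in the current session can be confirmed before going further. The following is a minimal sketch; `has_group` and `check_docker_access` are hypothetical helper names, and the check relies only on the standard `id -nG` command.

```shell
# has_group GROUP "GROUP LIST"
# Returns 0 if GROUP appears as a whole word in the space-separated list.
has_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_docker_access
# Reports whether the current login session already carries the docker group.
# `id -nG` prints the group names of the current session.
check_docker_access() {
    if has_group docker "$(id -nG)"; then
        echo "docker group active in this session"
    else
        echo "docker group not active yet; log out and back in"
        return 1
    fi
}
```

Once the group is active, `docker run --rm hello-world` and `docker compose version` confirm that the engine and the Compose plugin both respond without `sudo`.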
5. Fedora 43 Workstation Tuning and Storage
Fedora provides a cutting-edge Linux kernel, but default OS limits are too restrictive for big-data applications. Both Elasticsearch and a 30+ container environment require explicit kernel and filesystem tuning before deployment.
Kernel and Memory Tuning
Elasticsearch uses mmap to map index files into memory. If vm.max_map_count is left at its default, the OpenCTI database will crash under heavy ingestion. The file descriptor limit also requires a substantial increase for large container deployments.
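Before changing anything, the live values can be compared against the targets this lab uses. The helper below is a minimal sketch; `report_sysctl` is a hypothetical name, and the thresholds passed in the examples are simply the two values this deployment sets.

```shell
# report_sysctl KEY REQUIRED [CURRENT]
# Prints whether KEY is at or above REQUIRED. CURRENT defaults to the live
# value read with `sysctl -n`; passing it explicitly is handy for dry runs.
report_sysctl() {
    key=$1
    required=$2
    current=${3:-$(sysctl -n "$key")}
    if [ "$current" -ge "$required" ]; then
        echo "$key=$current ok"
    else
        echo "$key=$current below $required"
    fi
}

# Example against the running kernel:
#   report_sysctl vm.max_map_count 1048575
#   report_sysctl fs.file-max 2097152
```

Running the same two checks again after the settings are applied confirms that they took effect.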
Create /etc/sysctl.d/99-cti-lab.conf with the following content:
# Required by OpenCTI for high-ingestion Elasticsearch environments
vm.max_map_count=1048575
# Increase file descriptor limits for massive concurrent networking
fs.file-max=2097152
Apply immediately without a reboot:
sudo sysctl --system
Storage and Persistence: The 11 TB Strategy
A standard root partition will fill within weeks under active CTI feed ingestion. This deployment externalizes Docker’s data root to a dedicated 11 TB Btrfs volume mounted at /mnt/cti-storage.
Modify /etc/docker/daemon.json:
{
"data-root": "/mnt/cti-storage/docker-data",
"storage-driver": "btrfs"
}
Restart the Docker daemon to apply:
sudo systemctl restart docker
Backup Strategy: Because the storage layer is Btrfs, native Btrfs snapshots provide zero-downtime backups of the entire /mnt/cti-storage/docker-data subvolume. This is substantially more reliable than attempting to tar active database volumes.
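A snapshot of the data root could be taken along the following lines. This is a sketch under stated assumptions: `docker-data` must itself be a Btrfs subvolume, and the `/mnt/cti-storage/snapshots` directory and both function names are hypothetical. Only `btrfs subvolume snapshot -r` is taken from the Btrfs tooling itself.

```shell
# snapshot_dest SNAP_DIR [STAMP]
# Builds a destination path such as
# /mnt/cti-storage/snapshots/docker-data-20250101-120000.
snapshot_dest() {
    snap_dir=$1
    stamp=${2:-$(date +%Y%m%d-%H%M%S)}
    echo "$snap_dir/docker-data-$stamp"
}

# snapshot_docker_data
# Takes a read-only snapshot of the Docker data root.
# Assumes docker-data is a subvolume and the snapshots directory
# (a hypothetical location) already exists on the same filesystem.
snapshot_docker_data() {
    src=/mnt/cti-storage/docker-data
    dest=$(snapshot_dest /mnt/cti-storage/snapshots)
    sudo btrfs subvolume snapshot -r "$src" "$dest"
}
```

Two caveats apply. Btrfs snapshots do not descend into nested subvolumes, so image layers that Docker's btrfs driver stores as separate subvolumes are not captured, whereas named volumes (plain directories under `volumes/`) are. A snapshot of a running database is also crash-consistent rather than application-consistent, so restores should be tested.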
6. Modular Networking Design
Placing 30+ containers into a single docker-compose.yml creates an unmanageable monolith with a single fault domain. The preferred approach uses external bridge networking to maintain clean separation between the MISP and OpenCTI stacks while allowing targeted inter-stack communication.
Implementation (Option 3: External Bridge Networking)
- Create the shared bridge network:
  docker network create misp-docker_default
- Declare the network as external in the OpenCTI compose file:
  In docker-compose.yml for the OpenCTI stack, define the network reference at the bottom of the file:
  networks:
    misp-docker_default:
      external: true
  Attach the MISP Connector service to this network so it can reach the MISP Core container.
- Configure internal DNS resolution:
  In the OpenCTI .env file, point the connector at the MISP container by its internal Docker DNS name:
  MISP_URL=https://misp-docker-misp-core-1
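The OpenCTI connector resolves the MISP container through Docker's embedded DNS, and traffic between the connector and the MISP API stays on the shared bridge network. Because the two stacks are otherwise defined and managed separately, each remains its own fault domain: a failure in the MISP stack does not cascade into the OpenCTI stack.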
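Name resolution and reachability across the bridge can be checked with a throwaway container attached to the same network. The sketch below makes two choices of its own: it uses the public `curlimages/curl` image, and it probes MISP's `/users/login` page. The `-k` flag skips certificate verification because MISP serves a self-signed certificate. `probe_misp` is a hypothetical helper name.

```shell
# probe_misp [NETWORK] [HOST]
# Runs a disposable curl container on the shared bridge and prints the HTTP
# status code returned by MISP. Any HTTP status means the name resolved
# and the web server answered.
probe_misp() {
    network=${1:-misp-docker_default}
    host=${2:-misp-docker-misp-core-1}
    docker run --rm --network "$network" curlimages/curl \
        -sk -o /dev/null -w '%{http_code}\n' "https://$host/users/login"
}

# Example:
#   probe_misp
```

A curl error instead of a status code points at DNS or network attachment. `docker network inspect misp-docker_default` lists the containers currently attached to the bridge.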
7. High-Performance Ingestion: OpenCTI Tuning
Initial linkage of MISP to OpenCTI triggered an ingestion queue exceeding 250,000 items. The following tuning steps were required to process this volume without queue saturation or system instability.
1. Scaling the Database (Elasticsearch)
Elasticsearch requires sufficient heap memory to hold indexes open during massive write operations. Set this in the OpenCTI .env file:
ELASTIC_MEMORY_SIZE=16G
2. Scaling the Workers
OpenCTI offloads all bundle processing to Python workers. A single worker processes one STIX bundle at a time. With 32 CPU cores available, workers were scaled to 15 replicas in docker-compose.yml:
worker:
  image: opencti/worker:latest
  deploy:
    mode: replicated
    replicas: 15
3. Monitoring Worker Exhaustion
System-level metrics are monitored using btm (bottom). The RabbitMQ queue depth is checked directly to confirm workers are consuming the backlog:
docker exec xtm-rabbitmq-1 rabbitmq-diagnostics -q list_queues name messages_ready
A steadily decreasing messages_ready count confirms the workers are processing bundles at the expected rate.
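Two samples of the queue depth are enough for a rough estimate of how long the backlog will take to clear. The helpers below are a minimal sketch with hypothetical names (`sum_ready`, `drain_eta_minutes`); the estimate assumes the drain rate stays constant, which it rarely does exactly during mixed ingestion.

```shell
# sum_ready
# Reads `list_queues name messages_ready` output on stdin and prints the
# total of the last column. Non-numeric lines (such as headers) are ignored.
sum_ready() {
    awk '$NF ~ /^[0-9]+$/ { total += $NF } END { print total + 0 }'
}

# drain_eta_minutes PREVIOUS CURRENT INTERVAL_SECONDS
# Estimates the minutes until the queue is empty from two samples taken
# INTERVAL_SECONDS apart. Prints "stalled" if the backlog did not shrink.
drain_eta_minutes() {
    drained=$(( $1 - $2 ))
    if [ "$drained" -le 0 ]; then
        echo "stalled"
        return 0
    fi
    echo $(( $2 * $3 / drained / 60 ))
}

# Example:
#   q='docker exec xtm-rabbitmq-1 rabbitmq-diagnostics -q list_queues name messages_ready'
#   before=$($q | sum_ready); sleep 60; after=$($q | sum_ready)
#   drain_eta_minutes "$before" "$after" 60
```

A result of "stalled" across several samples suggests the workers are exhausted or blocked, and is the cue to look at worker logs and Elasticsearch load.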
8. Ingestion: Feeds and Connectors
The lab employs a dual-ingestion strategy that separates bulk raw feed ingest from structured enrichment and contextualization. This mirrors the multi-source collection discipline used in professional CTI operations.
The MISP Pipeline (Raw Threat Feeds)
The following community feeds are ingested through MISP:
- Abuse.ch (URLhaus, MalwareBazaar, ThreatFox): High-fidelity, verified malware indicators.
- CIRCL OSINT Feed: Broad, contextualized threat intelligence from the Computer Incident Response Center Luxembourg.
- DigitalSide-IT: Active lists of malware hashes and C2 infrastructure IPs.
Controlling Data Gravity:
The OpenCTI MISP Connector’s historical pull is bounded to prevent overwhelming Elasticsearch with irrelevant historical data. Set in the connector’s .env:
MISP_IMPORT_FROM_DATE=2024-01-01
MISP_INTERVAL=60
MISP_INTERVAL is in minutes. A 60-minute polling interval provides a practical balance between data freshness and system load for a lab environment.
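The trade-off behind the interval is simple arithmetic, and the date bound is easy to mistype. The helpers below are a small sketch for checking both values before they go into the connector's .env. Both function names are hypothetical, and the date check only validates the YYYY-MM-DD shape, not calendar correctness.

```shell
# polls_per_day INTERVAL_MINUTES
# Number of MISP polls per day for a given MISP_INTERVAL (1440 minutes/day).
polls_per_day() {
    echo $(( 1440 / $1 ))
}

# valid_import_date DATE
# Returns 0 when DATE has the YYYY-MM-DD shape used by MISP_IMPORT_FROM_DATE.
# Format check only; it does not reject impossible dates such as 2024-02-31.
valid_import_date() {
    case "$1" in
        [0-9][0-9][0-9][0-9]-[0-1][0-9]-[0-3][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   polls_per_day 60                # 24 polls per day
#   valid_import_date 2024-01-01 && echo "date format ok"
```

Halving the interval to 30 minutes doubles the number of polls to 48 per day, which is worth weighing against worker capacity during a large backlog.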
OpenCTI Native Connectors (Enrichment and Context)
The following native OpenCTI connectors are defined in the compose file to add relational context to indicators flowing in from MISP:
- CISA KEV: Imports Known Exploited Vulnerabilities directly into the knowledge graph, enabling correlation against the current threat landscape.
- AlienVault OTX and VirusTotal: Used for enriching hashes and IPs originating from MISP feeds.
- AbuseIPDB: Provides confidence scoring for incoming network observables.
9. Verification and First Look
The following steps confirm the stack is operating correctly end-to-end.
- Check the MISP Connector Status: Navigate to Settings → Connectors in OpenCTI. The MISP connector must display as “Active.”
- Watch the RabbitMQ Queue Drain: Execute the rabbitmq-diagnostics command from Section 7. The messages_ready value should decrease steadily as the 15 workers process bundles.
- Confirm Indicator Ingest: Navigate to Observations → Indicators in OpenCTI. Raw IPs and hashes tagged from Abuse.ch should appear automatically within the first polling cycle.
- Validate the Knowledge Graph: Click on a malware family (e.g., Cobalt Strike). A successful integration will render a visual map linking the malware object to specific indicators sourced from MISP and enriched by the AlienVault and VirusTotal connectors.
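Before working through the UI checks, it can save time to confirm that the key containers are actually running. The helper below is a minimal sketch; `missing_containers` is a hypothetical name, and the container names in the example are the ones that appear elsewhere in this diary, not an exhaustive list for either stack.

```shell
# missing_containers NAME...
# Reads running container names on stdin (one per line) and prints each
# expected NAME that is not among them. No output means all are running.
missing_containers() {
    running=$(cat)
    for name in "$@"; do
        if ! printf '%s\n' "$running" | grep -qxF -- "$name"; then
            echo "$name"
        fi
    done
}

# Example:
#   docker ps --format '{{.Names}}' | missing_containers \
#       misp-docker-misp-core-1 misp-docker-redis-1 xtm-rabbitmq-1
```

Any name printed here explains a failed UI check faster than the dashboard will.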
10. Troubleshooting and Edge Cases
The MISP_SALT Reality Check
Despite correct .env configuration, MISP threw errors regarding a missing cryptographic salt in config.php during initial startup. The salt was injected manually using a PHP one-liner executed inside the container:
docker exec misp-docker-misp-core-1 php -r "\
\$file = '/var/www/MISP/app/Config/config.php';\
\$content = file_get_contents(\$file);\
\$content = str_replace(\"'salt' => ''\", \"'salt' => '\" . bin2hex(random_bytes(32)) . \"'\", \$content);\
file_put_contents(\$file, \$content);\
echo 'Salt updated successfully' . PHP_EOL;\
"
After patching the config, the Redis cache was flushed and the MISP Core container restarted to pick up the change:
docker exec misp-docker-redis-1 redis-cli FLUSHALL
docker restart misp-docker-misp-core-1
SSL/TLS Justification: MISP_SSL_VERIFY=false
MISP generates a self-signed certificate on startup. In the OpenCTI MISP connector .env, SSL verification is explicitly bypassed:
MISP_SSL_VERIFY=false
This is an intentional and justified configuration for this deployment. Because the modular external bridge network is used, traffic between the OpenCTI connector and the MISP API never leaves the Docker virtual network. Bypassing verification in a fully internal, non-routable network eliminates the administrative overhead of managing a local CA for zero tangible security gain. This decision should be revisited if MISP is ever exposed outside the Docker network boundary. For a broader discussion of trust boundary controls, see operational security fundamentals.
Long-Term Database Pruning
Without pruning, the MISP database will grow without bound. Navigate to MISP → Administration → Server Settings → MISP and enable the pruning feature. Configure it to purge attributes from low-confidence categories after 90 days. This keeps the 11 TB array healthy over long operational periods.
