
Critical Security Mistakes Blockchain Engineers Make

A single bug in blockchain infrastructure can compromise an entire network. These mistakes have caused chain splits, fund losses, and network outages. Learn from them.

For: blockchain engineer
Updated: March 13, 2026

Consensus Mistakes

Errors in consensus implementation.

critical

Using operations that produce different results on different machines: floating point, unordered maps, system time.

What happens: Different nodes compute different state. Chain split. Network partition.

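Fix: Audit all code paths for determinism. Use integer or fixed-point arithmetic instead of floats, iterate only over deterministically ordered data structures, and use block time instead of system time.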
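
As a rough illustration of the fix above, here is a minimal Python sketch. The names `apply_fee`, `state_digest`, and `BASIS_POINTS` are hypothetical and not taken from any client; the point is only that integer arithmetic and sorted iteration give the same bytes on every machine.

```python
import hashlib

BASIS_POINTS = 10_000  # assumed fixed-point denominator for this sketch


def apply_fee(amount: int, fee_bps: int) -> int:
    """Integer-only fee calculation; floor division is identical everywhere."""
    return amount * fee_bps // BASIS_POINTS


def state_digest(balances: dict, block_time: int) -> str:
    """Hash balances in sorted key order so insertion order never matters.

    block_time comes from the block header, never from the local clock.
    """
    h = hashlib.sha256()
    h.update(block_time.to_bytes(8, "big"))
    for account in sorted(balances):
        name = account.encode("utf-8")
        h.update(len(name).to_bytes(4, "big"))  # length prefix avoids ambiguity
        h.update(name)
        h.update(balances[account].to_bytes(32, "big"))
    return h.hexdigest()
```

Two nodes that built the same balances in a different order still produce the same digest, because the iteration order is fixed by `sorted` rather than by the container.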

critical

Fork choice rule that does not match specification or has edge case bugs.

What happens: Nodes follow different chains. Network cannot reach consensus.

Fix: Extensive testing against spec test vectors. Fuzzing fork choice code.

critical

Marking blocks as final when they are not, or not finalizing when they should be.

What happens: Reorging 'final' blocks breaks applications. Not finalizing stalls the chain.

Fix: Follow finality specs exactly. Test all finality edge cases.

critical

Not slashing validators who violate rules, or slashing innocent validators.

What happens: Attacks go unpunished or honest validators lose stake.

Fix: Implement slashing conditions exactly per spec. Extensive testing.

P2P Network Mistakes

Peer-to-peer networking vulnerabilities.

critical

Accepting blocks, transactions, or state from peers without cryptographic verification.

What happens: Malicious peers can corrupt node state. False blocks accepted.

Fix: Verify all signatures and proofs. Never trust, always verify.

critical

A node can be surrounded by attacker-controlled peers and isolated from the honest network (an eclipse attack).

What happens: Attacker controls node's view of the network. Double-spend possible.

Fix: Diverse peer selection. Monitor peer behavior. Detect isolation.

high

Not rate limiting messages, not validating before heavy processing.

What happens: Network brought down by spam. Nodes crash or fall behind.

Fix: Validate cheaply first. Rate limit per peer. Disconnect bad actors.
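
One way to combine a cheap check with per-peer rate limiting is a token bucket per peer. The sketch below is an assumption-laden example: `PeerLimiter`, `MAX_MESSAGE_BYTES`, and the default rates are made up for illustration. Floats are acceptable here because this is local networking policy, not consensus state.

```python
import time

MAX_MESSAGE_BYTES = 1_000_000  # hypothetical cap; tune per protocol


class TokenBucket:
    def __init__(self, rate: float, capacity: float, now: float):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.updated = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class PeerLimiter:
    def __init__(self, rate: float = 10.0, capacity: float = 20.0):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def should_process(self, peer_id: str, message: bytes, now=None) -> bool:
        if now is None:
            now = time.monotonic()
        # Cheap size check first, before any parsing or signature work.
        if not message or len(message) > MAX_MESSAGE_BYTES:
            return False
        bucket = self.buckets.get(peer_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, now)
            self.buckets[peer_id] = bucket
        return bucket.allow(now)
```

A caller would typically count repeated `False` results per peer and disconnect peers that keep exceeding their budget.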

high

Small request triggers large response, enabling DDoS amplification.

What happens: Attackers use nodes to amplify attacks against other targets.

Fix: Limit response sizes. Authenticate requesters when appropriate.

Cryptographic Mistakes

Errors in cryptographic implementation.

critical

Implementing cryptographic primitives instead of using audited libraries.

What happens: Implementation bugs lead to key leakage, signature forgery, or a complete break of the scheme.

Fix: Use established, audited libraries. Have crypto code reviewed.

critical

Operations involving secrets that take variable time based on secret values.

What happens: Timing attacks leak private keys.

Fix: Use constant-time comparison and arithmetic for all secret data.
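
For the comparison part of this fix, Python's standard library provides `hmac.compare_digest`, which is designed to avoid leaking where two values first differ. A minimal sketch, with `compute_mac` and `verify_mac` as hypothetical helper names:

```python
import hashlib
import hmac


def compute_mac(key: bytes, message: bytes) -> bytes:
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_mac(key: bytes, message: bytes, received_tag: bytes) -> bool:
    expected = compute_mac(key, message)
    # Do not use `expected == received_tag`: ordinary equality can return
    # early at the first differing byte, which leaks timing information.
    return hmac.compare_digest(expected, received_tag)
```

This covers comparison only. Constant-time arithmetic on secret scalars should come from an audited cryptographic library rather than from application code.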

critical

Using non-cryptographic RNG for key generation or protocol randomness.

What happens: Predictable keys or protocol values. Complete security break.

Fix: Always use cryptographically secure RNG (OS-provided or audited library).
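
In Python, for example, the `secrets` module draws from the operating system's CSPRNG, while the `random` module is explicitly not suitable for security purposes. The helper name `generate_seed` and the 16-byte minimum below are assumptions for this sketch:

```python
import secrets


def generate_seed(num_bytes: int = 32) -> bytes:
    """Return key material from the OS-provided CSPRNG."""
    if num_bytes < 16:
        raise ValueError("seed too short for key material")
    return secrets.token_bytes(num_bytes)
```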

critical

Accidentally logging private keys or including them in error messages.

What happens: Private keys leaked to log files or monitoring systems.

Fix: Never format private keys as strings. Audit all logging paths.
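
One possible way to make accidental logging harder is to wrap key bytes in a type whose string forms are always redacted. `SecretKey` and `expose` are hypothetical names for this sketch, not an API from any particular library:

```python
class SecretKey:
    """Holds key bytes; every string conversion returns a redacted marker."""

    __slots__ = ("_raw",)

    def __init__(self, raw: bytes):
        self._raw = bytes(raw)

    def expose(self) -> bytes:
        # The only way to reach the raw bytes; easy to grep for in audits.
        return self._raw

    def __repr__(self) -> str:
        return "SecretKey(<redacted>)"

    __str__ = __repr__

    def __format__(self, spec: str) -> str:
        # Covers f-strings and str.format, whatever format spec is given.
        return repr(self)
```

With this in place, `logger.info("key=%s", key)` or an exception message that embeds the object prints only the marker. It does not replace auditing logging paths, since code can still log `key.expose()` directly.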

State Management Mistakes

Errors in state storage and sync.

critical

Different clients computing different state roots for same transactions.

What happens: Chain split. Clients cannot agree on state.

Fix: Exhaustive testing. Use execution spec tests. Fuzzing.

high

Not handling database errors, crashes, or corruption properly.

What happens: Data corruption. Node unable to sync or stuck.

Fix: Use atomic operations. Implement recovery procedures.
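
For file-based state, one common pattern for an atomic update is to write to a temporary file in the same directory, flush it to disk, and then rename it over the target. The sketch below uses only the Python standard library; `atomic_write` is a hypothetical helper, and a production node would normally rely on its database's own transactions instead.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so readers see the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomic rename within one filesystem
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A crash before `os.replace` leaves the previous file untouched. For full durability across power loss, the containing directory may also need an fsync, which this sketch omits.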

high

Removing historical data that is still needed for reorgs or validation.

What happens: Node cannot handle reorgs. Unable to provide historical data.

Fix: Prune conservatively. Keep enough history to cover the maximum reorg depth.

critical

Downloading state from peers and using it without checking proofs.

What happens: Malicious peer provides false state. Node operates on corrupted data.

Fix: Verify state proofs against block headers from trusted source.
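
To make the idea concrete, here is a sketch of proof verification against a trusted root. It assumes a simple binary SHA-256 Merkle tree with domain-separated leaf and node hashes. That layout is an assumption for illustration only; real chains use their own structures (for example, Ethereum uses a Merkle Patricia Trie), so the hashing rules must come from the relevant spec.

```python
import hashlib


def leaf_hash(value: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + value).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_proof(leaf: bytes, index: int, siblings: list, root: bytes) -> bool:
    """Recompute the root from a leaf and its sibling path.

    `root` must come from a block header obtained from a trusted source,
    never from the peer that supplied the leaf and the proof.
    """
    node = leaf_hash(leaf)
    for sibling in siblings:
        if index & 1:
            node = node_hash(sibling, node)  # current node is a right child
        else:
            node = node_hash(node, sibling)  # current node is a left child
        index >>= 1
    return index == 0 and node == root
```

Downloaded state chunks would be rejected as soon as one proof fails, and the peer that served them treated as faulty.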

Operations Mistakes

Errors in node operations and deployment.

critical

Admin APIs (send transactions, control node) accessible without auth.

What happens: Anyone can control the node. Funds stolen.

Fix: Firewall admin APIs. Require authentication. Bind them to localhost only.

high

Storing validator signing keys on internet-connected machine.

What happens: Keys stolen if machine compromised. Slashing possible.

Fix: Use remote signer or HSM. Air-gapped key storage.

high

Running nodes without visibility into their health or behavior.

What happens: Problems not detected until too late. Missed attestations.

Fix: Prometheus metrics. Grafana dashboards. PagerDuty alerts.

high

Upgrading nodes without testing the procedure in staging.

What happens: Botched upgrade takes nodes offline. Missed rewards or slashing.

Fix: Test upgrades on testnet. Have rollback plan. Monitor closely.

Protocol Implementation Mistakes

Errors when implementing protocol specifications.

critical

Ambiguous spec language interpreted differently between clients.

What happens: Client incompatibility. Network splits during edge cases.

Fix: Coordinate with other teams. Use reference tests. Fuzzing against other clients.

high

No way to identify or negotiate protocol version between peers.

What happens: Incompatible clients cannot communicate. Hard fork coordination issues.

Fix: Version all protocol messages. Implement negotiation handshake.
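
A negotiation handshake can be as simple as each side advertising the versions it supports and both picking the highest one in common. The function below is a hypothetical sketch of that selection step, not the handshake of any specific protocol:

```python
from typing import Iterable, Optional


def negotiate_version(local: Iterable[int], remote: Iterable[int]) -> Optional[int]:
    """Return the highest version both peers support, or None if disjoint."""
    common = set(local) & set(remote)
    return max(common) if common else None
```

When the result is `None`, the connection should be closed with an explicit "incompatible version" reason rather than left to fail on the first undecodable message.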

critical

Encoding/decoding that differs from specification or other clients.

What happens: Messages misinterpreted. Invalid blocks created or rejected incorrectly.

Fix: Use spec-compliant serialization. Test against cross-client fixtures.

critical

Gas costs for operations that differ from specification.

What happens: Different execution results. Out-of-gas at different points.

Fix: Follow EIP gas costs exactly. Benchmark actual costs vs charged.

high

Not validating that values fit within expected ranges.

What happens: Overflow, underflow, or crashes on malformed input.

Fix: Check all input ranges. Use bounded integer types.
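
Python integers are unbounded, so a client written in it would have to enforce bounds explicitly. The sketch below shows the kind of checks meant here for a 64-bit unsigned field; the helper names are assumptions. In languages with fixed-width integers, the equivalent is checked arithmetic rather than silently wrapping operations.

```python
U64_MAX = 2**64 - 1


def parse_u64(raw: bytes) -> int:
    """Decode exactly 8 big-endian bytes; reject any other length."""
    if len(raw) != 8:
        raise ValueError(f"expected 8 bytes, got {len(raw)}")
    return int.from_bytes(raw, "big")


def checked_add_u64(a: int, b: int) -> int:
    """Add two u64 values, failing loudly instead of wrapping around."""
    for value in (a, b):
        if not 0 <= value <= U64_MAX:
            raise ValueError("operand outside u64 range")
    total = a + b
    if total > U64_MAX:
        raise OverflowError("u64 addition overflow")
    return total
```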

Memory and Resource Mistakes

Resource management vulnerabilities.

high

Allocating memory based on peer-provided sizes without limits.

What happens: Memory exhaustion. Node crash. Denial of service.

Fix: Cap all allocations. Validate sizes before allocating.
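
For a length-prefixed wire format, this means checking the declared length against a hard cap before reading or allocating anything. A minimal sketch, assuming a 4-byte big-endian length prefix and a hypothetical `MAX_FRAME_BYTES` limit:

```python
import io
import struct

MAX_FRAME_BYTES = 1 << 20  # hypothetical 1 MiB cap; set per message type


def read_frame(stream) -> bytes:
    """Read one length-prefixed frame, refusing oversized declarations."""
    header = stream.read(4)
    if len(header) != 4:
        raise ValueError("truncated length prefix")
    (length,) = struct.unpack(">I", header)
    if length > MAX_FRAME_BYTES:
        # Reject before allocating: the peer controls this number.
        raise ValueError(f"frame of {length} bytes exceeds cap")
    payload = stream.read(length)
    if len(payload) != length:
        raise ValueError("truncated payload")
    return payload
```

With a real socket, a single `read` may return fewer bytes than requested, so the payload read would need a loop; the cap check stays the same.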

medium

Not cleaning up connections, file handles, or memory.

What happens: Gradual degradation until the node slows down or fails.

Fix: Use RAII patterns. Monitor resource usage. Regular restarts if needed.

high

Long-running operations blocking critical network handling.

What happens: Node appears offline. Misses messages. Falls behind.

Fix: Async everywhere. Move heavy work to background threads.

medium

No cap on peer connections allowing resource exhaustion.

What happens: Too many connections consume resources. Degrades performance.

Fix: Set reasonable connection limits. Quality-based peer selection.

high

Allowing attackers to fill caches with useless data.

What happens: Legitimate data evicted from cache. Performance drops.

Fix: Cache admission policies. Per-peer cache quotas.
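
A per-peer quota can be sketched as a cache where each peer may only evict its own entries. `QuotaCache` below is a hypothetical design, and its first-writer-wins rule is one possible admission policy among several:

```python
from collections import OrderedDict, defaultdict


class QuotaCache:
    def __init__(self, per_peer_quota: int):
        self.per_peer_quota = per_peer_quota
        self._entries = {}                        # key -> (peer_id, value)
        self._by_peer = defaultdict(OrderedDict)  # peer_id -> keys, oldest first

    def put(self, peer_id: str, key: str, value) -> None:
        if key in self._entries:
            return  # first writer wins; peers cannot overwrite each other
        keys = self._by_peer[peer_id]
        if len(keys) >= self.per_peer_quota:
            # Over quota: evict this peer's own oldest entry, nobody else's.
            oldest, _ = keys.popitem(last=False)
            del self._entries[oldest]
        keys[key] = None
        self._entries[key] = (peer_id, value)

    def get(self, key: str):
        entry = self._entries.get(key)
        return None if entry is None else entry[1]
```

A peer that floods the cache with junk only cycles through its own quota, so entries contributed by other peers stay in place.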

Testing and Validation Mistakes

Quality assurance errors.

high

Not using fuzz testing on parsers and state transitions.

What happens: Edge case bugs not found until exploited in production.

Fix: Fuzz all input parsing. Fuzz state transitions. Coverage-guided fuzzing.

high

Not testing against other implementations before release.

What happens: Incompatibilities discovered on mainnet.

Fix: Sync against all major clients on testnet. Use Hive testing.

medium

Not adding tests for previously found bugs.

What happens: Old bugs reappear after refactoring.

Fix: Add regression test for every bug fix. Never delete tests.
