TIL: Optimizing Rust Compile Times

Today I discovered comprehensive strategies for reducing Rust compilation times, which is crucial for maintaining developer productivity in large Rust projects.

Understanding Rust Compilation Performance

Why Rust Compilation is Slow

Tips for faster Rust Compile Times explores the fundamental reasons behind Rust’s compilation characteristics:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66


// Rust's compilation model leads to inherent trade-offs
// Let's examine what makes compilation slow and how to optimize it

// 1. Monomorphization - Generic code is compiled for each concrete type
use std::collections::HashMap;

// This generic function will be compiled separately for each type used
fn process_collection<T: Clone + std::fmt::Debug>(items: &[T]) -> Vec<T> {
    println!("Processing {} items", items.len());
    items.iter().cloned().collect()
}

// Each usage creates a new monomorphized version
fn demonstrate_monomorphization() {
    let numbers = vec![1, 2, 3, 4, 5];
    let strings = vec!["a", "b", "c"];
    let floats = vec![1.0, 2.0, 3.0];
    
    // Three separate compiled versions of process_collection
    let _processed_numbers = process_collection(&numbers);     // process_collection<i32>
    let _processed_strings = process_collection(&strings);     // process_collection<&str>
    let _processed_floats = process_collection(&floats);       // process_collection<f64>
}

// 2. Dependency graph complexity - Each crate must be compiled in order
// Visualization of a typical dependency chain:
/*
your_app
├── serde (1.0.126)
│   └── serde_derive (1.0.126)
├── tokio (1.8.1)
│   ├── tokio-macros (1.3.0)
│   ├── pin-project-lite (0.2.6)
│   ├── bytes (1.0.1)
│   └── parking_lot (0.11.1)
│       └── parking_lot_core (0.8.3)
│           └── smallvec (1.6.1)
└── clap (2.33.3)
    ├── bitflags (1.2.1)
    ├── textwrap (0.11.0)
    └── ansi_term (0.11.0)
*/

// 3. Trait resolution complexity
trait ComplexTrait<T> {
    type Output;
    fn complex_method(&self, input: T) -> Self::Output;
}

// The compiler must resolve all these relationships
impl<T: Clone + std::fmt::Debug> ComplexTrait<T> for Vec<T> {
    type Output = String;
    
    fn complex_method(&self, _input: T) -> Self::Output {
        format!("Processed {} items", self.len())
    }
}

// Usage requires complex trait resolution
fn use_complex_trait<C, T>(container: &C, item: T) -> C::Output 
where 
    C: ComplexTrait<T>,
    T: Clone + std::fmt::Debug,
{
    container.complex_method(item)
}

Optimization Strategies

1. Dependency Management

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29


# Cargo.toml optimization strategies

[dependencies]
# Use specific features instead of default-features
serde = { version = "1.0", default-features = false, features = ["derive"] }
tokio = { version = "1.0", default-features = false, features = ["rt-multi-thread", "net"] }

# Avoid unnecessary dependencies
# Instead of pulling in a large crate for one function:
# regex = "1.0"  # Heavy dependency

# Consider lighter alternatives or std library functions
# Use std::str for simple pattern matching when possible

[dev-dependencies]
# Keep test dependencies separate to avoid including them in release builds
criterion = "0.3"
proptest = "1.0"

[build-dependencies]
# Only include build dependencies when necessary
cc = "1.0"

# Profile-specific dependencies
[target.'cfg(unix)'.dependencies]
libc = "0.2"

[target.'cfg(windows)'.dependencies]
winapi = { version = "0.3", features = ["processenv"] }

2. Feature Flag Optimization

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43


// lib.rs - Use feature flags to conditionally compile code
#[cfg(feature = "serde")]
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
pub struct MyStruct {
    pub name: String,
    pub value: i32,
}

impl MyStruct {
    pub fn new(name: String, value: i32) -> Self {
        Self { name, value }
    }
    
    #[cfg(feature = "json")]
    pub fn to_json(&self) -> Result<String, serde_json::Error> {
        serde_json::to_string(self)
    }
    
    #[cfg(feature = "yaml")]
    pub fn to_yaml(&self) -> Result<String, serde_yaml::Error> {
        serde_yaml::to_string(self)
    }
}

// Optional expensive computations
#[cfg(feature = "advanced-math")]
pub mod advanced_math {
    pub fn complex_calculation(input: f64) -> f64 {
        // Expensive mathematical operations
        input.powi(10) + input.ln() + input.sin().cos()
    }
}

#[cfg(not(feature = "advanced-math"))]
pub mod advanced_math {
    pub fn complex_calculation(input: f64) -> f64 {
        // Simple fallback
        input * 2.0
    }
}

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14


# Cargo.toml feature configuration
[features]
default = ["std"]
std = []
serde = ["dep:serde"]
json = ["serde", "dep:serde_json"]
yaml = ["serde", "dep:serde_yaml"]
advanced-math = []

# Optional dependencies (only compiled when features are enabled)
[dependencies]
serde = { version = "1.0", optional = true }
serde_json = { version = "1.0", optional = true }
serde_yaml = { version = "0.8", optional = true }

3. Incremental Compilation Configuration

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26


# Set up environment for faster compilation
export CARGO_INCREMENTAL=1
export RUSTC_WRAPPER=sccache  # Use sccache for compilation caching

# Create .cargo/config.toml for project-specific settings
mkdir -p .cargo
cat > .cargo/config.toml << 'EOF'
[build]
# Use all available CPU cores
jobs = 0

# Enable incremental compilation
incremental = true

# Use faster linker on Linux
[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-C", "link-arg=-fuse-ld=lld"]

# Use faster linker on macOS
[target.x86_64-apple-darwin]
rustflags = ["-C", "link-arg=-fuse-ld=lld"]

[cargo-new]
vcs = "git"
EOF

4. Compilation Profiles

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32


# Cargo.toml - Optimized compilation profiles

[profile.dev]
# Faster debug builds
opt-level = 1           # Some optimization for better debug performance
debug = true           # Keep debug info
incremental = true     # Enable incremental compilation
codegen-units = 256    # Parallel code generation (trades runtime speed for compile speed)

[profile.dev.package."*"]
# Compile dependencies with optimization even in debug mode
opt-level = 3
debug = false

[profile.release]
# Production optimizations
opt-level = 3
lto = "thin"           # Link-time optimization (slower compile, faster runtime)
codegen-units = 1      # Better runtime optimization
panic = "abort"        # Smaller binary size
strip = true           # Remove debug symbols

[profile.bench]
# Benchmarking profile
inherits = "release"
debug = true           # Keep debug info for profiling
lto = "fat"           # Aggressive LTO for maximum performance

[profile.test]
# Fast test compilation
inherits = "dev"
opt-level = 1

5. Workspace Optimization

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25


# Workspace Cargo.toml for multi-crate projects
[workspace]
members = [
    "core",
    "cli",
    "web", 
    "utils"
]

# Shared dependencies across workspace
[workspace.dependencies]
serde = { version = "1.0", features = ["derive"] }
tokio = { version = "1.0", features = ["rt-multi-thread"] }
anyhow = "1.0"
thiserror = "1.0"

# Individual crate Cargo.toml can inherit workspace dependencies
# core/Cargo.toml
[dependencies]
serde = { workspace = true }
tokio = { workspace = true }
anyhow = { workspace = true }

# Local dependencies
utils = { path = "../utils" }

6. Build Script Optimization

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51


// build.rs - Efficient build scripts
use std::env;
use std::path::PathBuf;

fn main() {
    // Only rebuild when specific files change
    println!("cargo:rerun-if-changed=proto/");
    println!("cargo:rerun-if-changed=build.rs");
    
    // Skip expensive operations in debug builds
    let profile = env::var("PROFILE").unwrap_or_default();
    if profile == "debug" {
        println!("cargo:warning=Skipping expensive build operations in debug mode");
        return;
    }
    
    // Use parallel processing when possible
    let num_jobs = env::var("NUM_JOBS")
        .ok()
        .and_then(|s| s.parse().ok())
        .unwrap_or_else(|| std::thread::available_parallelism().unwrap().get());
    
    // Example: Protocol buffer compilation
    let proto_files = ["proto/api.proto", "proto/messages.proto"];
    
    // Configure protobuf compilation
    let mut config = prost_build::Config::new();
    config.out_dir("src/generated/");
    
    // Only recompile if proto files changed
    for proto_file in &proto_files {
        if is_newer_than_target(proto_file) {
            config.compile_protos(&[proto_file], &["proto/"]).unwrap();
        }
    }
}

fn is_newer_than_target(source: &str) -> bool {
    use std::fs;
    use std::time::SystemTime;
    
    let source_time = fs::metadata(source)
        .and_then(|m| m.modified())
        .unwrap_or(SystemTime::UNIX_EPOCH);
    
    let target_time = fs::metadata("src/generated/")
        .and_then(|m| m.modified())
        .unwrap_or(SystemTime::UNIX_EPOCH);
    
    source_time > target_time
}

7. Advanced Optimization Techniques

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83


// Advanced techniques for compile-time optimization

// 1. Reduce monomorphization by using trait objects
trait ProcessItem {
    fn process(&self) -> String;
}

// Instead of generic functions that get monomorphized:
fn process_items_generic<T: ProcessItem>(items: &[T]) -> Vec<String> {
    items.iter().map(|item| item.process()).collect()
}

// Use trait objects to reduce compilation:
fn process_items_dynamic(items: &[&dyn ProcessItem]) -> Vec<String> {
    items.iter().map(|item| item.process()).collect()
}

// 2. Use const generics sparingly
struct Matrix<const N: usize, const M: usize> {
    data: [[f64; M]; N],
}

// This creates many monomorphized versions - use judiciously
impl<const N: usize, const M: usize> Matrix<N, M> {
    fn new() -> Self {
        Self { data: [[0.0; M]; N] }
    }
}

// 3. Prefer composition over deep trait bounds
// Instead of:
// fn complex_function<T: Clone + Debug + Send + Sync + Serialize + DeserializeOwned>

// Consider creating a marker trait:
trait ComplexRequirements: Clone + std::fmt::Debug + Send + Sync {}

// Automatically implement for types that satisfy requirements
impl<T> ComplexRequirements for T 
where T: Clone + std::fmt::Debug + Send + Sync {}

fn complex_function<T: ComplexRequirements>(input: T) -> T {
    input
}

// 4. Use macros to reduce boilerplate without runtime cost
macro_rules! impl_display_for_enum {
    ($enum_name:ident { $($variant:ident),* }) => {
        impl std::fmt::Display for $enum_name {
            fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
                match self {
                    $(Self::$variant => write!(f, stringify!($variant)),)*
                }
            }
        }
    };
}

#[derive(Debug)]
enum Status {
    Ready,
    Processing, 
    Complete,
    Error,
}

impl_display_for_enum!(Status { Ready, Processing, Complete, Error });

// 5. Strategic use of inline annotations
#[inline]
fn small_frequently_called_function(x: i32) -> i32 {
    x * 2 + 1
}

#[inline(never)]
fn large_rarely_called_function() -> Vec<String> {
    // Large function that shouldn't be inlined
    (0..1000).map(|i| format!("Item {}", i)).collect()
}

#[inline(always)]
fn tiny_critical_function(x: u32) -> u32 {
    x.rotate_left(1)
}

Measurement and Monitoring

Compilation Time Analysis

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21


# Measure compilation times
cargo clean
time cargo build

# Analyze what takes the most time
cargo build --timings
# This creates a cargo-timing.html file with detailed analysis

# Profile individual crate compilation
cargo build -Z timings --package your_crate_name

# Use cargo-llvm-lines to see which functions generate the most LLVM IR
cargo install cargo-llvm-lines
cargo llvm-lines | head -20

# Use cargo-bloat to analyze binary size
cargo install cargo-bloat
cargo bloat --release

# Analyze compilation bottlenecks
cargo +nightly build -Z time-passes 2>&1 | head -20

Dependency Analysis

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78


// analyze_deps.rs - Script to analyze dependency compilation impact
use std::process::Command;
use std::time::Instant;
use std::collections::HashMap;

fn main() {
    let dependencies = vec![
        "serde",
        "tokio", 
        "clap",
        "reqwest",
        "diesel",
    ];
    
    let mut compile_times = HashMap::new();
    
    for dep in dependencies {
        println!("Testing compilation time for {}", dep);
        
        // Create minimal test project
        let output = Command::new("cargo")
            .args(&["new", "--bin", &format!("test-{}", dep)])
            .output()
            .expect("Failed to create test project");
        
        if !output.status.success() {
            eprintln!("Failed to create project for {}", dep);
            continue;
        }
        
        // Add dependency
        let cargo_toml = format!(
            r#"
[package]
name = "test-{}"
version = "0.1.0"
edition = "2021"

[dependencies]
{} = "*"
"#,
            dep, dep
        );
        
        std::fs::write(format!("test-{}/Cargo.toml", dep), cargo_toml)
            .expect("Failed to write Cargo.toml");
        
        // Measure compilation time
        let start = Instant::now();
        let output = Command::new("cargo")
            .args(&["build"])
            .current_dir(&format!("test-{}", dep))
            .output()
            .expect("Failed to compile");
        
        let compile_time = start.elapsed();
        
        if output.status.success() {
            compile_times.insert(dep.to_string(), compile_time);
            println!("{}: {:?}", dep, compile_time);
        } else {
            eprintln!("Failed to compile {}", dep);
        }
        
        // Cleanup
        std::fs::remove_dir_all(format!("test-{}", dep))
            .unwrap_or_default();
    }
    
    // Sort by compilation time
    let mut sorted_times: Vec<_> = compile_times.iter().collect();
    sorted_times.sort_by_key(|(_, time)| *time);
    
    println!("\nCompilation times (fastest to slowest):");
    for (dep, time) in sorted_times {
        println!("{}: {:?}", dep, time);
    }
}

Continuous Integration Optimization

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70


# .github/workflows/ci.yml - Optimized CI for Rust projects
name: CI

on: [push, pull_request]

env:
  CARGO_TERM_COLOR: always
  RUSTFLAGS: "-D warnings"

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        rust-version: [stable, beta]
    
    steps:
    - uses: actions/checkout@v3
    
    # Cache Rust toolchain
    - name: Cache Rust toolchain
      uses: actions/cache@v3
      with:
        path: |
          ~/.cargo/registry
          ~/.cargo/git
          ~/.rustup
        key: ${{ runner.os }}-rust-${{ matrix.rust-version }}-${{ hashFiles('**/Cargo.lock') }}
        restore-keys: |
          ${{ runner.os }}-rust-${{ matrix.rust-version }}-
          ${{ runner.os }}-rust-
    
    # Install Rust
    - name: Install Rust
      uses: actions-rs/toolchain@v1
      with:
        toolchain: ${{ matrix.rust-version }}
        profile: minimal
        override: true
        components: rustfmt, clippy
    
    # Cache cargo build
    - name: Cache cargo build
      uses: actions/cache@v3
      with:
        path: target
        key: ${{ runner.os }}-cargo-${{ matrix.rust-version }}-${{ hashFiles('**/Cargo.lock') }}
        restore-keys: |
          ${{ runner.os }}-cargo-${{ matrix.rust-version }}-
          ${{ runner.os }}-cargo-
    
    # Check formatting first (fastest)
    - name: Check formatting
      run: cargo fmt --all -- --check
    
    # Run clippy (catches issues early)
    - name: Run clippy
      run: cargo clippy --all-targets --all-features -- -D warnings
    
    # Build (use --locked to ensure reproducible builds)
    - name: Build
      run: cargo build --locked --all-features
    
    # Test
    - name: Run tests
      run: cargo test --all-features
    
    # Build documentation
    - name: Build docs
      run: cargo doc --no-deps --all-features

Performance Benchmarking

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26


// benches/compilation_benchmark.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Benchmark functions that might affect compile time
fn expensive_generic_function<T: Clone + std::fmt::Debug>(items: &[T]) -> Vec<T> {
    items.iter().cloned().collect()
}

fn optimized_function(items: &[i32]) -> Vec<i32> {
    items.to_vec()
}

fn benchmark_monomorphization(c: &mut Criterion) {
    let data = vec![1, 2, 3, 4, 5];
    
    c.bench_function("generic function", |b| {
        b.iter(|| expensive_generic_function(black_box(&data)))
    });
    
    c.bench_function("specialized function", |b| {
        b.iter(|| optimized_function(black_box(&data)))
    });
}

criterion_group!(benches, benchmark_monomorphization);
criterion_main!(benches);

💡 Rust Compilation Optimization Summary

Key Strategies for Faster Rust Compilation:

Minimize dependencies and use feature flags to avoid unused code
Configure profiles appropriately for development vs release
Use incremental compilation and caching tools like sccache
Optimize workspace structure for parallel compilation
Reduce monomorphization through trait objects and type erasure
Profile compilation to identify bottlenecks
Cache aggressively in CI/CD pipelines

Practical Implementation

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39


// Example of a compile-time optimized Rust project structure
// src/lib.rs

#![warn(missing_docs)]
#![cfg_attr(not(feature = "std"), no_std)]

// Feature-gated modules
#[cfg(feature = "serde")]
pub mod serialization;

#[cfg(feature = "async")]
pub mod async_utils;

// Core functionality always available
pub mod core;
pub mod utils;

// Re-exports for convenience
pub use core::*;

// Conditional re-exports
#[cfg(feature = "serde")]
pub use serialization::*;

#[cfg(feature = "async")]
pub use async_utils::*;

// Type aliases to reduce monomorphization
pub type Result<T> = std::result::Result<T, Box<dyn std::error::Error + Send + Sync>>;
pub type HashMap<K, V> = std::collections::HashMap<K, V, ahash::RandomState>;

// Prelude module for common imports
pub mod prelude {
    pub use crate::core::*;
    pub use crate::Result;
    
    #[cfg(feature = "serde")]
    pub use crate::serialization::*;
}

This comprehensive approach to Rust compilation optimization demonstrates that while Rust compilation can be slow, there are many strategies available to significantly improve build times while maintaining the language’s safety and performance guarantees.

Today’s exploration of Rust compilation optimization reinforced that development velocity in systems programming languages requires careful attention to build performance, and that strategic choices in dependencies, features, and project structure can dramatically impact developer experience.

Understanding Rust Compilation Performance#

Why Rust Compilation is Slow#

Optimization Strategies#

1. Dependency Management#

2. Feature Flag Optimization#

3. Incremental Compilation Configuration#

4. Compilation Profiles#

5. Workspace Optimization#

6. Build Script Optimization#

7. Advanced Optimization Techniques#

Measurement and Monitoring#

Compilation Time Analysis#

Dependency Analysis#

Continuous Integration Optimization#

Performance Benchmarking#

Practical Implementation#