Next-gen solver: Compiler "hang" & eventual OOM when facing complex mutually recursive structs & many projections #126196

Closed
fmease opened this issue Jun 9, 2024 · 7 comments
Labels
A-associated-items (Area: Associated items such as associated types and consts.)
A-traits (Area: Trait system)
C-bug (Category: This is a bug.)
E-needs-mcve (Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example)
I-compilemem (Issue: Problems and improvements with respect to memory usage during compilation.)
I-compiletime (Issue: Problems and improvements with respect to compile times.)
T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
T-types (Relevant to the types team, which will review and decide on the PR/issue.)
WG-trait-system-refactor (The Rustc Trait System Refactor Initiative)

Comments

@fmease
Member

fmease commented Jun 9, 2024

Context / History

The original issue is a rustdoc[old solver] compiletime issue: #114891.
I was able to reduce that issue to 211 LOC: #114891 (comment).

I then looked into switching rustdoc's blanket impl synthesizer to use the next solver: #125907.
Leaving aside that this switch is blocked on perf, it would actually make #114891 (comment) hang and allocate memory until the OS's OOM killer steps in.

Finally, the reproducer below (to be minimized) is the rustdoc[next solver] "MCVE" turned into a rustc[next solver] issue by "reifying" the obligations registered by blanket_impl as Rust code.

Fixing this issue would partially unblock #125907 / #114891.
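
A minimal sketch of what "reifying" an obligation as Rust code can look like: the obligation `Ty: Trait` is written as a helper function with a `T: Trait` bound, and a turbofish call asks the compiler to prove it. `WasmNotSync` and `check` mirror the shape used in the reproducer; `Handle` is a hypothetical stand-in type, and returning the type name is an assumption made only so the call has an observable result.

```rust
use std::sync::Arc;

// Same shape as the marker trait in the reproducer: a trait with a blanket impl.
pub trait WasmNotSync: Sync {}
impl<T: Sync> WasmNotSync for T {}

// Hypothetical stand-in for a type under inspection (not from the issue).
pub struct Handle<A>(pub Arc<A>);

// The obligation `T: WasmNotSync`, written as a bound. Calling `check::<Ty>()`
// makes the compiler prove `Ty: WasmNotSync` during type checking.
fn check<T: WasmNotSync>() -> &'static str {
    std::any::type_name::<T>()
}

fn main() {
    // Concrete type: `Handle<u32>: WasmNotSync` holds because `Arc<u32>` is `Sync`.
    println!("proved for {}", check::<Handle<u32>>());

    // Would not compile: `Cell<u32>` is not `Sync`, so the obligation fails.
    // check::<Handle<std::cell::Cell<u32>>>();

    // The reproducer leaves the parameter as an inference variable instead,
    // which is expected to be rejected as ambiguous in the same way:
    // check::<Handle<_>>();
}
```

The bound does all the work here: the function body is irrelevant to the trait solver, so an empty body (as in the reproducer) serves equally well.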

Reproducer

Old solver: error "type annotations needed" (cannot infer type of the type parameter T declared on the function check).
Next solver: Compiler hang & eventual OOM.

Reproducer (215 LOC)
use std::sync::{Mutex, RwLock};
use std::{
    collections::HashMap,
    sync::{Arc, Weak},
};

pub trait Api: Clone {
    type Instance: Instance<A = Self>;
    type Surface: Surface<A = Self>;
    type Adapter: Adapter<A = Self>;
    type Device: DeviceTr<A = Self>;

    type Queue: QueueTr<A = Self>;
    type CommandEncoder: CommandEncoder<A = Self>;

    type CommandBuffer: WasmNotSendSync;

    type Buffer: WasmNotSendSync + 'static;
    type Texture: WasmNotSendSync + 'static;
    type SurfaceTexture: WasmNotSendSync + std::borrow::Borrow<Self::Texture>;
    type TextureView: WasmNotSendSync;
    type Sampler: WasmNotSendSync;
    type QuerySet: WasmNotSendSync;
    type Fence: WasmNotSendSync;

    type BindGroupLayout: WasmNotSendSync;
    type BindGroup: WasmNotSendSync;
    type PipelineLayout: WasmNotSendSync;
    type ShaderModule: WasmNotSendSync;
    type RenderPipeline: WasmNotSendSync;
    type ComputePipeline: WasmNotSendSync;

    type AccelerationStructure: WasmNotSendSync + 'static;
}

pub trait Instance: Sized + WasmNotSendSync {
    type A: Api;
}

pub trait Surface: WasmNotSendSync {
    type A: Api;
}

pub trait Adapter: WasmNotSendSync {
    type A: Api;
}

pub trait DeviceTr: WasmNotSendSync {
    type A: Api;
}

pub trait QueueTr: WasmNotSendSync {
    type A: Api;
}

pub trait CommandEncoder: WasmNotSendSync {
    type A: Api;
}

pub trait WasmNotSendSync: WasmNotSend + WasmNotSync {}
impl<T: WasmNotSend + WasmNotSync> WasmNotSendSync for T {}

pub trait WasmNotSend: Send {}
impl<T: Send> WasmNotSend for T {}

pub trait WasmNotSync: Sync {}
impl<T: Sync> WasmNotSync for T {}

trait HalApi: Api + 'static {}

struct BindGroup<A: HalApi> {
    raw: A::BindGroup,
    device: Arc<Device<A>>,
    layout: Arc<A>,
    info: ResourceInfo<BindGroup<A>>,
    used: BindGroupStates<A>,
    used_buffer_ranges: Vec<A>,
    used_texture_ranges: Vec<A>,
}

struct BindGroupStates<A: HalApi> {
    buffers: BufferBindGroupState<A>,
    textures: TextureBindGroupState<A>,
    views: TextureView<A>,
    samplers: Sampler<A>,
}

type UsageScopePool<A> = Mutex<Vec<(BufferUsageScope<A>, TextureUsageScope<A>)>>;

struct Tracker<A: HalApi> {
    buffers: BufferTracker<A>,
    textures: TextureTracker<A>,
    views: TextureView<A>,
    samplers: Sampler<A>,
    bind_groups: crate::BindGroup<A>,
    compute_pipelines: A,
    render_pipelines: A,
    bundles: A,
    query_sets: QuerySet<A>,
}

struct BufferBindGroupState<A: HalApi> {
    buffers: Mutex<Vec<Arc<Buffer<A>>>>,
}
struct BufferUsageScope<A: HalApi> {
    metadata: Buffer<A>,
}

struct BufferTracker<A: HalApi> {
    metadata: Buffer<A>,
}

struct TextureBindGroupState<A: HalApi> {
    textures: Mutex<Vec<A>>,
}
struct TextureUsageScope<A: HalApi>(A);

struct TextureTracker<A: HalApi> {
    _phantom: std::marker::PhantomData<A>,
}

struct ResourceInfo<T> {
    marker: std::marker::PhantomData<T>,
}

struct Buffer<A: HalApi> {
    raw: A::Buffer,
    device: Arc<Device<A>>,
    info: ResourceInfo<Buffer<A>>,
    bind_groups: Mutex<Vec<Weak<BindGroup<A>>>>,
}

struct DestroyedBuffer<A: HalApi> {
    raw: Option<A::Buffer>,
    device: Arc<Device<A>>,
    bind_groups: Vec<Weak<BindGroup<A>>>,
}

struct StagingBuffer<A: HalApi> {
    raw: Mutex<Option<A::Buffer>>,
    device: Arc<Device<A>>,
    info: ResourceInfo<StagingBuffer<A>>,
}

enum TextureInner<A: HalApi> {
    Native(A::Texture),
    Surface(Option<A::SurfaceTexture>),
}

enum TextureClearMode<A: HalApi> {
    RenderPass(Vec<Option<A::TextureView>>),
    Surface(Option<A::TextureView>),
}

struct Texture<A: HalApi> {
    inner: TextureInner<A>,
    device: Arc<Device<A>>,
    info: ResourceInfo<Texture<A>>,
    clear_mode: RwLock<TextureClearMode<A>>,
    views: Mutex<Vec<Weak<TextureView<A>>>>,
    bind_groups: Mutex<Vec<Weak<BindGroup<A>>>>,
}

struct DestroyedTexture<A: HalApi> {
    raw: Option<A::Texture>,
    views: Vec<Weak<TextureView<A>>>,
    bind_groups: Vec<Weak<BindGroup<A>>>,
    device: Arc<Device<A>>,
}

struct TextureView<A: HalApi> {
    raw: A::TextureView,
    parent: Arc<Texture<A>>,
    device: Arc<Device<A>>,
    info: ResourceInfo<TextureView<A>>,
}

struct Sampler<A: HalApi> {
    raw: Option<A::Sampler>,
    device: Arc<Device<A>>,
    info: ResourceInfo<Self>,
}

struct QuerySet<A: HalApi> {
    raw: Option<A::QuerySet>,
    device: Arc<Device<A>>,
    info: ResourceInfo<Self>,
}

struct Device<A: HalApi> {
    raw: Option<A::Device>,
    adapter: Arc<A>,
    queue: Weak<Queue<A>>,
    queue_to_drop: A::Queue,
    zero_buffer: Option<A::Buffer>,
    info: ResourceInfo<Device<A>>,
    command_allocator: A,
    fence: RwLock<Option<A::Fence>>,
    trackers: Mutex<Tracker<A>>,
    life_tracker: Mutex<LifetimeTracker<A>>,
    temp_suspected: Mutex<Option<ResourceMaps<A>>>,
    pending_writes: Mutex<Option<PendingWrites<A>>>,
    usage_scopes: UsageScopePool<A>,
}

struct Queue<A: HalApi> {
    device: Option<Arc<Device<A>>>,
    raw: Option<A::Queue>,
    info: ResourceInfo<Queue<A>>,
}

struct EncoderInFlight<A: HalApi> {
    marker: std::marker::PhantomData<A>,
}
struct PendingWrites<A: HalApi> {
    command_encoder: A::CommandEncoder,
    dst_buffers: HashMap<i32, Arc<Buffer<A>>>,
    dst_textures: HashMap<i32, Arc<Texture<A>>>,
    executing_command_buffers: Vec<A::CommandBuffer>,
}

struct ResourceMaps<A: HalApi> {
    buffers: HashMap<i32, Arc<Buffer<A>>>,
    staging_buffers: HashMap<i32, Arc<StagingBuffer<A>>>,
    textures: HashMap<i32, Arc<Texture<A>>>,
    texture_views: HashMap<i32, Arc<TextureView<A>>>,
    samplers: HashMap<i32, Arc<Sampler<A>>>,
    bind_groups: HashMap<i32, Arc<BindGroup<A>>>,
    bind_group_layouts: HashMap<i32, Arc<A>>,
    render_pipelines: HashMap<i32, Arc<A>>,
    compute_pipelines: HashMap<i32, Arc<A>>,
    pipeline_layouts: HashMap<i32, Arc<A>>,
    render_bundles: HashMap<i32, Arc<A>>,
    query_sets: HashMap<i32, Arc<QuerySet<A>>>,
    destroyed_buffers: HashMap<i32, Arc<DestroyedBuffer<A>>>,
    destroyed_textures: HashMap<i32, Arc<DestroyedTexture<A>>>,
}
struct ActiveSubmission<A: HalApi> {
    last_resources: ResourceMaps<A>,
    mapped: Vec<Arc<Buffer<A>>>,
    encoders: Vec<EncoderInFlight<A>>,
}

struct LifetimeTracker<A: HalApi> {
    mapped: Vec<Arc<Buffer<A>>>,
    future_suspected_buffers: Vec<Arc<Buffer<A>>>,
    future_suspected_textures: Vec<Arc<Texture<A>>>,
    suspected_resources: ResourceMaps<A>,
    active: Vec<ActiveSubmission<A>>,
    ready_to_map: Vec<Arc<Buffer<A>>>,
}

fn main() {
    check::<BindGroup<_>>();
    fn check<T: WasmNotSync>() {}
}
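
For contrast with the reproducer above, which leaves the parameter as `_`, here is a much smaller sketch of the same ingredients (mutually recursive structs behind `Arc`/`Weak`, an associated-type projection as a field, and a `Sync` query) with a concrete type plugged in. The trimmed `HalApi` trait and the `Dummy` backend are hypothetical and not part of the issue; the sketch says nothing about how the next solver behaves on the full reproducer.

```rust
use std::sync::{Arc, Mutex, Weak};

// Hypothetical, heavily trimmed version of the reproducer's API trait.
pub trait HalApi {
    type Buffer: Send + Sync;
}

// Hypothetical backend, used only to make the query concrete.
pub struct Dummy;
impl HalApi for Dummy {
    type Buffer = u64;
}

// Two structs that mention each other, as `Device` and `Buffer` do above.
pub struct Device<A: HalApi>(pub Arc<Buffer<A>>);
pub struct Buffer<A: HalApi>(
    pub A::Buffer,                     // projection, normalizes to `u64` for `Dummy`
    pub Mutex<Vec<Weak<Device<A>>>>,   // back-reference closing the cycle
);

// Proving `Device<Dummy>: Sync` requires `Buffer<Dummy>: Send + Sync`, which in
// turn requires `Device<Dummy>: Send + Sync` again. Auto traits are treated
// coinductively, so such a cycle counts as success rather than an error.
fn check<T: Sync>() -> bool {
    true
}

fn main() {
    assert!(check::<Device<Dummy>>());
    assert!(check::<Buffer<Dummy>>());
}
```

With a concrete backend there is no inference variable left, so the "type annotations needed" ambiguity described for the old solver does not arise in this sketch.
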
@fmease fmease added the A-traits, A-associated-items, T-compiler, I-compilemem, C-bug, I-hang, E-needs-mcve, T-types, and WG-trait-system-refactor labels Jun 9, 2024
@rustbot rustbot added the needs-triage label Jun 9, 2024
@fmease fmease removed the needs-triage label Jun 9, 2024
@fmease
Member Author

fmease commented Jun 9, 2024

cc @lcnr

@fmease
Member Author

fmease commented Jun 9, 2024

Of course, I will try to minimize this in the coming hours or days.

@fmease
Member Author

fmease commented Jun 9, 2024

Slightly smaller:

Reproducer (78 LOC)
use std::sync::{Mutex, Arc, Weak};
use std::collections::HashMap;

pub trait Api: Clone {
    type Surface: Surface<A = Self>;
    type Buffer: WasmNotSendSync + 'static;
    type Texture: WasmNotSendSync + 'static;
    type SurfaceTexture: WasmNotSendSync + std::borrow::Borrow<Self::Texture>;
    type TextureView: WasmNotSendSync;
    type Sampler: WasmNotSendSync;
    type QuerySet: WasmNotSendSync;
}

pub trait Surface: WasmNotSendSync {
    type A: Api;
}

pub trait WasmNotSendSync: WasmNotSend + WasmNotSync {}
impl<T: WasmNotSend + WasmNotSync> WasmNotSendSync for T {}

pub trait WasmNotSend: Send {}
impl<T: Send> WasmNotSend for T {}

pub trait WasmNotSync: Sync {}
impl<T: Sync> WasmNotSync for T {}

trait HalApi: Api + 'static {}

struct BindGroup<A: HalApi>(Box<Device<A>>);

struct Device<A: HalApi>(LifetimeTracker<A>);

struct LifetimeTracker<A: HalApi>(
    Vec<Arc<Texture<A>>>,
    ResourceMaps<A>,
    Vec<ActiveSubmission<A>>,
);

struct Buffer<A: HalApi>(
    A::Buffer,
    Arc<Device<A>>,
    std::marker::PhantomData<Buffer<A>>,
    Mutex<Vec<Weak<BindGroup<A>>>>,
);

struct StagingBuffer<A: HalApi>(
    Mutex<Option<A::Buffer>>,
    Arc<Device<A>>,
    std::marker::PhantomData<StagingBuffer<A>>,
);

struct Texture<A: HalApi>(
    Arc<Device<A>>,
    std::marker::PhantomData<Texture<A>>,
    Mutex<Vec<Weak<TextureView<A>>>>,
    Mutex<Vec<Weak<BindGroup<A>>>>,
);

struct TextureView<A: HalApi>(
    A::TextureView,
    Arc<Texture<A>>,
    Arc<Device<A>>,
    std::marker::PhantomData<TextureView<A>>,
);

struct Sampler<A: HalApi>(
    Option<A::Sampler>,
    Arc<Device<A>>,
    std::marker::PhantomData<Self>,
);

struct ResourceMaps<A: HalApi>(
    HashMap<i32, Arc<Buffer<A>>>,
    HashMap<i32, Arc<StagingBuffer<A>>>,
    HashMap<i32, Arc<Texture<A>>>,
    HashMap<i32, Arc<TextureView<A>>>,
    HashMap<i32, Arc<Sampler<A>>>,
    HashMap<i32, Arc<BindGroup<A>>>,
    HashMap<i32, Arc<A>>,
    HashMap<i32, Arc<A>>,
    HashMap<i32, Arc<A>>,
    HashMap<i32, Arc<A>>,
    HashMap<i32, Arc<A>>,
);

struct ActiveSubmission<A: HalApi>(
    ResourceMaps<A>,
    Vec<Arc<Buffer<A>>>,
    Vec<std::marker::PhantomData<A>>,
);

fn main() {
    check::<BindGroup<_>>();
    fn check<T: WasmNotSync>() {}
}

Even smaller:

Reproducer (61 LOC)
use std::sync::{Mutex, Arc, Weak};

pub trait Api {
    type Surface: Surface<A = Self>;
    type Buffer: WasmNotSendSync + 'static;
    type Texture: WasmNotSendSync + 'static;
    type SurfaceTexture: WasmNotSendSync + std::borrow::Borrow<Self::Texture>;
    type TextureView: WasmNotSendSync;
    type Sampler: WasmNotSendSync;
    type QuerySet: WasmNotSendSync;
}

pub trait Surface: WasmNotSendSync {
    type A: Api;
}

pub trait WasmNotSendSync: WasmNotSend + WasmNotSync {}
impl<T: WasmNotSend + WasmNotSync> WasmNotSendSync for T {}

pub trait WasmNotSend: Send {}
impl<T: Send> WasmNotSend for T {}

pub trait WasmNotSync: Sync {}
impl<T: Sync> WasmNotSync for T {}

trait HalApi: Api {}

struct BindGroup<A: HalApi>(Box<Device<A>>);

struct Device<A: HalApi>(ResourceMaps<A>);

struct Buffer<A: HalApi>(
    A::Buffer,
    Arc<Device<A>>,
    std::marker::PhantomData<Buffer<A>>,
    Mutex<Vec<Weak<BindGroup<A>>>>,
);

struct StagingBuffer<A: HalApi>(
    Mutex<Option<A::Buffer>>,
    Arc<Device<A>>,
    std::marker::PhantomData<StagingBuffer<A>>,
);

struct Texture<A: HalApi>(
    Arc<Device<A>>,
    std::marker::PhantomData<Texture<A>>,
    Mutex<Vec<Weak<TextureView<A>>>>,
    Mutex<Vec<Weak<BindGroup<A>>>>,
);

struct TextureView<A: HalApi>(
    A::TextureView,
    Arc<Texture<A>>,
    Arc<Device<A>>,
    std::marker::PhantomData<TextureView<A>>,
);

struct Sampler<A: HalApi>(
    Option<A::Sampler>,
    Arc<Device<A>>,
    std::marker::PhantomData<Self>,
);

struct ResourceMaps<A: HalApi>(
    Arc<Buffer<A>>,
    Arc<StagingBuffer<A>>,
    Arc<TextureView<A>>,
    Arc<Sampler<A>>,
    Arc<BindGroup<A>>,
);

fn main() {
    check::<BindGroup<_>>();
    fn check<T: WasmNotSync>() {}
}

@fmease
Member Author

fmease commented Jun 9, 2024

"Even smaller" (no longer I-hang but still I-compiletime (32.31s on my machine) + I-compilemem (9GB+)):

"Reproducer" (53 LOC)
use std::sync::{Mutex, Arc, Weak};

pub trait HalApi {
    type Surface: Surface<A = Self>;
    type Buffer: WasmNotSendSync;
    type Texture: WasmNotSendSync;
    type SurfaceTexture: WasmNotSendSync;
    type TextureView: WasmNotSendSync;
    type Sampler: WasmNotSendSync;
}

pub trait Surface: WasmNotSendSync {
    type A: HalApi;
}

pub trait WasmNotSendSync: WasmNotSend + WasmNotSync {}
impl<T: WasmNotSend + WasmNotSync> WasmNotSendSync for T {}

pub trait WasmNotSend: Send {}
impl<T: Send> WasmNotSend for T {}

pub trait WasmNotSync: Sync {}
impl<T: Sync> WasmNotSync for T {}

struct BindGroup<A: HalApi>(Box<Device<A>>);

struct Device<A: HalApi>(ResourceMaps<A>);

struct Buffer<A: HalApi>(
    A::Buffer,
    Arc<Device<A>>,
    std::marker::PhantomData<Buffer<A>>,
    Mutex<Vec<Weak<BindGroup<A>>>>,
);

struct Texture<A: HalApi>(
    Arc<Device<A>>,
    std::marker::PhantomData<Texture<A>>,
    Mutex<Vec<Weak<TextureView<A>>>>,
    Mutex<Vec<Weak<BindGroup<A>>>>,
);

struct TextureView<A: HalApi>(
    A::TextureView,
    Arc<Texture<A>>,
    Arc<Device<A>>,
    std::marker::PhantomData<TextureView<A>>,
);

struct Sampler<A: HalApi>(
    Option<A::Sampler>,
    Arc<Device<A>>,
    std::marker::PhantomData<Self>,
);

struct ResourceMaps<A: HalApi>(
    Arc<Buffer<A>>,
    Arc<TextureView<A>>,
    Arc<Sampler<A>>,
    Arc<BindGroup<A>>,
);

fn main() {
    check::<BindGroup<_>>();
    fn check<T: WasmNotSync>() {}
}

Minimization is prone to distortion (at least with the approach I was just pursuing). I'm stopping for now; I'll pick it up some other time and try different, less aggressive approaches.

@fmease fmease changed the title Jun 9, 2024
from: Next-gen solver: Compiler hang & eventual OOM when facing complex mutually recursive structs & many projections
to: Next-gen solver: Compiler "hang" & eventual OOM when facing complex mutually recursive structs & many projections
@fmease fmease added the I-compiletime label and removed the I-hang label Jun 9, 2024
@fmease
Member Author

fmease commented Jun 9, 2024

(Replacing I-hang with I-compiletime because I don't think this is a "true" I-hang (spinning/looping indefinitely without memory accumulating). For all practical intents and purposes, I would consider the first three reproducers to be I-hang on my machine, since proper termination is not observable before OOM death. However, I presume that's only due to resource constraints. On a more powerful machine, I'd wager the original reproducer might eventually terminate properly.)

@lcnr
Contributor

lcnr commented Jun 11, 2024

Should be fixed by #125981.

@fmease
Member Author

fmease commented Aug 15, 2024

Fixed by #128828.

@fmease fmease closed this as completed Aug 15, 2024

3 participants