  1. EvoTest: Evolutionary Test-Time Learning for Self-Improving Agentic ...

    Sep 16, 2025 · A fundamental limitation of current AI agents is their inability to learn complex skills on the fly at test time, often behaving like “clever but clueless interns” in novel environments. This …

  2. CLEVER: A Curated Benchmark for Formally Verified Code Generation

    Jul 8, 2025 · TL;DR: We introduce CLEVER, a hand-curated benchmark for verified code generation in Lean. It requires full formal specs and proofs. No few-shot method solves all stages, making it a …

  3. Submissions | OpenReview

    Jan 22, 2025 · Leaving the barn door open for Clever Hans: Simple features predict LLM benchmark answers Lorenzo Pacchiardi, Marko Tesic, Lucy G Cheke, Jose Hernandez-Orallo 27 Sept 2024 …

  4. HashAttention: Semantic Sparsity for Faster Inference

    May 1, 2025 · Using clever mathematical tricks and learned functions, we represent the tokens in a compact format that allows fast comparisons using simple bitwise operations. HashAttention speeds …

  5. Progressive Growing of GANs for Improved Quality, Stability, and ...

    Feb 15, 2018 · We train generative adversarial networks in a progressive fashion, enabling us to generate high-resolution images with high quality.

  6. STAIR: Improving Safety Alignment with Introspective Reasoning

    May 1, 2025 · One common approach is training models to refuse unsafe queries, but this strategy can be vulnerable to clever prompts, often referred to as jailbreak attacks, which can trick the AI into …

  7. Bridging Symmetry and Robustness: On the Role of Equivariance in...

    Sep 18, 2025 · As a result, the CLEVER-certified robustness bounds derived under the assumption of norm invariance do not apply directly to scale-equivariant networks. Instead, these bounds must be …

  8. Do Histopathological Foundation Models Eliminate Batch Effects? A ...

    Oct 11, 2024 · Deep learning has led to remarkable advancements in computational histopathology, e.g., in diagnostics, biomarker prediction, and outcome prognosis. Yet, the lack of annotated data …

  9. Contrastive Learning Via Equivariant Representation - OpenReview

    Sep 25, 2024 · In this paper, we revisit the roles of augmentation strategies and equivariance in improving CL's efficacy. We propose CLeVER (Contrastive Learning Via Equivariant …

  10. We introduce CLEVER, the first curated benchmark for evaluating the generation of specifications and formally verified code in Lean. The benchmark comprises 161 programming problems; it evaluates …