Byte Latent Transformer: Patches Scale Better Than Tokens (2024) NEW
The Platonic Representation Hypothesis (2024)
Helpless infants are learning a foundation model (2024)
Position: LLMs Can’t Plan, But Can Help Planning in LLM-Modulo Frameworks (2024) NEW
$100K or 100 Days: Trade-offs when Pre-Training with Academic Resources (2024) NEW
Convolutional architectures are cortex-aligned de novo (2023)
SemanticCMC: Contrastive Learning of Meaningful Object Associations from Temporal Co-occurrence Patterns in Naturalistic Movies (2023)
Deep learning-based rigid motion correction for magnetic resonance imaging: A survey (2023) NEW
Abductive Knowledge Induction From Raw Data (2021) NEW
Scaling Laws for Neural Language Models (2020) NEW
Attention Is All You Need (2017)
Principles of Philosophy (2017)
Why Brains Are Not Computers, Why Behaviorism Is Not Satanism, and Why Dolphins Are Not Aquatic Apes (2015) NEW
The remarkable, yet not extraordinary, human brain as a scaled-up primate brain and its associated cost (2012)
Universal Intelligence: A Definition of Machine Intelligence (2007) NEW
Centaur - TODO NEW
Gemini paper - TODO NEW
Scaling Monosemanticity - TODO NEW