The conference hall in Rio de Janeiro will smell like coffee and solder this week, but the real heat is coming from booth #204. Apple is walking into ICLR 2026 with two demos that quietly answer a question the research community has been circling for months: how much of frontier AI can you actually run on a device you already own?
Section 1: What Apple Is Showing at ICLR 2026
The Fourteenth International Conference on Learning Representations (ICLR) runs from April 23 to 27 in Rio de Janeiro, Brazil; exhibition hours are 9:30 AM to 5:30 PM BRT, Thursday through Saturday. Apple is sponsoring the conference again and has set up shop at booth #204.
Two demos anchor Apple's presence. The first is local LLM inference on a MacBook Pro equipped with the M5 Max chip, running entirely through MLX, Apple's open-source array framework purpose-built for Apple silicon. The stack includes MLX, mlx-lm, and the model weights, all open source. The demo runs a quantized frontier coding model natively inside Xcode's development environment. No cloud calls, no API keys, no network latency.
The second demo is SHARP, which processes either pre-recorded images or images captured live during the demo. Visitors select an image, SHARP processes it, and the resulting 3D Gaussian point cloud renders on an iPad Pro with the M5 chip. The pipeline is fast enough to feel interactive.
Apple's organizational footprint at ICLR 2026 is substantial. Carl Vondrick serves as General Chair. Alexander Toshev and Vladlen Koltun are Senior Area Chairs. The Area Chairs list includes Carl Vondrick, Eugene Ndiaye, Fartash Faghri, Jiatao Gu, Joao Monteiro, Miguel Angel Bautista, Philipp Krähenbühl, Pierre Ablin, Shuangfei Zhai, Yizhe Zhang, and Zhe Gan.
Workshop leadership includes Arno Blaas as Workshop Co-Organizer and Nicholas Apostoloff and Nivedha Sivakumar as Workshop Reviewers for "I Can't Believe It's Not Better: Challenges in Applied Deep Learning (ICBINB) 2026." Shirley Zou is Workshop Co-Organizer for "AI with Recursive Self-Improvement 2026."
The reviewer roster spans dozens of Apple researchers: Adam Golinski, Anastasiia Filippova, Andrew Silva, Andrew Szot, Arnav Kundu, Arno Blaas, Artem Sevastopolsky, Arwen Bradley, Barry-John Theobald, Chen Chen, Cheng-Yu Hsieh, Devon Hjelm, Gregor Bachmann, Honor Chen, Luca Zappella, Manjot Bilkhu, Meng Cao, Michael Kirchhof, Miguel Sarabia, Mohamad Shahbazi, Nicholas Apostoloff, Nikhil Bhendawade, Nivedha Sivakumar, Noam Elata, Omar Attia, Parth Thakkar, Parshin Shojaee, Peter Grasch, Ping Wang, Ran Liu, Raviteja Vemulapalli, Richard Bai, Roy Xie, Vikramjit Mitra, Vimal Thilak, and Zijin Gu.
Several accepted papers carry Apple co-authors. One comes from a UCSD and Apple team: Murray Kang, Yizhe Zhang, Nikki Kuang, Nicklas Majamaki, Navdeep Jaitly, Yian Ma, and Lianhui Qin. Another comes from HKUST and Apple: Wei Liu, Ruochen Zhou, Yiyun Deng, Yuzhen Huang, Jaunting Liu, Yuntian Deng, Yizhe Zhang, and Junxian He. A third comes from the University of Pennsylvania and Apple: Wenrui Ma, Ran Liu, Ellen Zippi, Chris Sandino, Juri Minxha, Behrooz Mahasseni, Erdrin Azemi, Ali Moin, and Eva Dyer. Joao Monteiro, Anastasiia Filippova, David Grangier, and Marco Cuturi of Apple round out the list with a paper of their own.
Section 2: What's Actually Different This Year
Apple has been showing up at ICLR for years, but the 2026 edition marks a shift in emphasis. Previous years leaned heavily on theoretical contributions and dataset releases. This year, the booth demos are the story.
The MLX demo on the M5 Max MacBook Pro is not a toy. Running a quantized frontier coding model entirely inside Xcode means the development loop — write code, test against an LLM, iterate — can happen without leaving the IDE or touching a server. For the research community, the fact that the full stack (MLX, mlx-lm, and model weights) is open source means anyone can fork it, modify the quantization scheme, or swap in a different architecture. Apple is not just showing a capability; it is handing out the tools to replicate and extend it.
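To see why "swap in a different quantization scheme" is a tractable weekend project rather than a research program, here is a minimal sketch of the core idea behind quantizing model weights. This is illustrative only: it assumes simple symmetric 4-bit quantization with a per-row scale, whereas MLX's actual scheme (grouped affine quantization) is more elaborate.

```python
# Illustrative symmetric 4-bit quantization of one row of weights.
# Hypothetical sketch; MLX's real scheme is grouped affine quantization.

def quantize_4bit(weights):
    """Map floats to integers in [-7, 7] plus a per-row scale factor."""
    scale = max(abs(w) for w in weights) / 7 or 1.0
    q = [max(-7, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_4bit(q, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]

weights = [0.42, -1.31, 0.07, 0.88]
q, scale = quantize_4bit(weights)
restored = dequantize_4bit(q, scale)

# Each restored weight sits within half a quantization step of the original.
assert all(abs(a - b) <= scale / 2 + 1e-9 for a, b in zip(weights, restored))
```

The payoff is storage: each weight shrinks from 16 or 32 bits to 4 bits plus a shared scale, which is what lets a frontier-scale model fit in a laptop's unified memory in the first place.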
The SHARP demo on the iPad Pro M5 is the other half of the same thesis. Real-time 3D Gaussian point cloud generation from a single image, running on a tablet, is the kind of capability that usually requires a workstation with a discrete GPU. Apple is betting that on-device inference for vision tasks is ready for prime time, and the M5's neural engine is the reason.
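For readers who have not worked with Gaussian splats: each element of such a point cloud is a soft, colored blob rather than a hard point. The sketch below shows the kind of data one splat carries and how its falloff is evaluated; it is an assumption-laden simplification (axis-aligned covariance, names like `Gaussian3D` invented here), not SHARP's actual representation.

```python
# Hypothetical sketch of a single 3D Gaussian splat.
# Real splats carry a full covariance (scale + rotation) and view-dependent
# color; this axis-aligned version is for illustration only.
import math
from dataclasses import dataclass

@dataclass
class Gaussian3D:
    mean: tuple          # (x, y, z) center of the blob
    inv_cov_diag: tuple  # inverse of a diagonal covariance (axis-aligned)
    color: tuple         # RGB in [0, 1]
    opacity: float       # alpha in [0, 1]

    def density(self, p):
        """Unnormalized Gaussian falloff at point p."""
        d2 = sum(ic * (pi - mi) ** 2
                 for ic, pi, mi in zip(self.inv_cov_diag, p, self.mean))
        return self.opacity * math.exp(-0.5 * d2)

g = Gaussian3D(mean=(0.0, 0.0, 1.0),
               inv_cov_diag=(4.0, 4.0, 1.0),
               color=(0.8, 0.2, 0.2),
               opacity=0.9)

# The blob is densest at its center and decays smoothly away from it.
assert g.density((0.0, 0.0, 1.0)) > g.density((0.5, 0.0, 1.0))
```

Rendering then amounts to projecting millions of these blobs to the screen and alpha-blending them front to back, a workload that maps naturally onto a mobile GPU, which is why a tablet-class chip is plausible hardware for it.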
Compare this to the broader ICLR program. Most papers still train on clusters and evaluate on static benchmarks. Apple is showing something rarer: a production-ready inference path that fits in a backpack. The contrast between the cloud-dependent research paradigm and Apple's local-first approach is the quiet tension running through the conference this year.
Resolution
By the time the exhibition hall closes on Saturday, the takeaway for researchers walking past booth #204 will be that Apple has turned its silicon advantage into a research distribution channel — and that the next frontier model you run might never leave your laptop.