Am Neumarkt 😱

Machine learning and other gibberish
See also: https://sharing.leima.is
Archives: https://datumorphism.leima.is/amneumarkt/

07:53 · Nov 21, 2025 · Fri

#dl

Introducing more symmetries in attention

https://github.com/NVIDIA/torch-harmonics

https://neurips.cc/virtual/2025/loc/san-diego/poster/117783

GitHub - NVIDIA/torch-harmonics: Differentiable signal processing on the sphere for PyTorch

07:41 · Oct 5, 2025 · Sun

#dl

Park, Chanwook, Sourav Saha, Jiachen Guo, Hantao Zhang, Xiaoyu Xie, Miguel A. Bessa, Dong Qian, et al. 2025. “Unifying Machine Learning and Interpolation Theory via Interpolating Neural Networks.” Nature Communications 16 (1): 1–12.
https://www.nature.com/articles/s41467-025-63790-8

07:04 · Jun 28, 2025 · Sat

#dl

A few cool ideas in this model.

Introducing Gemma 3n: The developer guide - Google Developers Blog
https://developers.googleblog.com/en/introducing-gemma-3n-developer-guide/

Learn how to build with Gemma 3n, a mobile-first architecture, MatFormer technology, Per-Layer Embeddings, and new audio and vision encoders.

08:43 · Jun 14, 2025 · Sat

#dl

So TensorFlow and JAX are deprecated in the transformers package.

https://github.com/huggingface/transformers/pull/38758

Deprecate TF + JAX by Rocketknight1 · Pull Request #38758 · huggingface/transformers

The time has finally come 🔫 🥃

07:42 · Feb 21, 2025 · Fri

#dl

Sure, nobody uses it anyway.

21:31 · Oct 3, 2024 · Thu

#dl

PyTorch Native Architecture Optimization: torchao | PyTorch
https://pytorch.org/blog/pytorch-native-architecture-optimization/

18:07 · Oct 3, 2024 · Thu

#dl

https://soumith.ch/blog/2024-10-02-training-10k-scale.md.html

05:48 · Jul 20, 2024 · Sat

#dl

There is this new toolkit called SCALE. It can compile CUDA code to run on AMD GPUs.

https://docs.scale-lang.com/manual/how-to-use/

I don't know who is more pissed off, NVIDIA or AMD.

09:09 · Mar 16, 2024 · Sat

#dl

This repo is really nice.

yuanchenyang/smalldiffusion: Simple and readable code for training and sampling from diffusion models
https://github.com/yuanchenyang/smalldiffusion

08:30 · Nov 13, 2023 · Mon

#dl

Google & USC benchmarked a prompt-based forecasting method, and the results are amazing.

Cao D, Jia F, Arik SO, Pfister T, Zheng Y, Ye W, et al. TEMPO: Prompt-based Generative Pre-trained Transformer for time series forecasting. arXiv [cs.LG]. 2023. Available: http://arxiv.org/abs/2310.04948

06:13 · Jun 12, 2023 · Mon

#dl

https://twitter.com/armin_kekic/status/1667181047480479751?s=20

Armin Kekić (@armin_kekic) on X

Glad we can finally share this publicly: new paper on how @ZalandoTech uses deep-learning based forecasting for algorithmic pricing. A rare insight into how the machine learning is used to solve real-world problems. 🧵 1/

📃: https://t.co/Li9aJCiZ3g

09:27 · Mar 28, 2023 · Tue

#dl

I am experimenting with torch 2.0 and looking for potential training-time improvements in Lightning. The following article is a very good introduction.

https://lightning.ai/pages/community/tutorial/how-to-speed-up-pytorch-model-training/

How to Speed Up PyTorch Model Training

Learn how to improve the training performance of your PyTorch model without compromising its accuracy.

14:53 · Mar 16, 2023 · Thu

#dl

https://github.com/Lightning-AI/lightning/releases/tag/2.0.0

You can now compile a LightningModule with torch.compile (torch 2.0).

import torch
import lightning as L

# LitModel stands for your own LightningModule subclass (not defined here)
model = LitModel()
# This will compile forward and {training,validation,test,predict}_step
compiled_model = torch.compile(model)
trainer = L.Trainer()
trainer.fit(compiled_model)
