- Blazingly fast whisper transcriptions with Inference Endpoints — mfuntowicz, freddyaboulton, Steveeeeeeen, reach-vb, erikkaum, michellehbn • May 13, 2025
- Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference — mfuntowicz, hlarcher • Jan 16, 2025
- Hugging Face on AMD Instinct MI300 GPU — fxmarty, mohitsha, seungrokj, mfuntowicz • May 21, 2024
- CPU Optimized Embeddings with 🤗 Optimum Intel and fastRAG — peterizsak, mber, danf, echarlaix, mfuntowicz, moshew • Mar 15, 2024
- Accelerating SD Turbo and SDXL Turbo Inference with ONNX Runtime and Olive — sschoenmeyer, tlwu, mfuntowicz • Jan 15, 2024
- AMD + 🤗: Large Language Models Out-of-the-Box Acceleration with AMD GPU — fxmarty, IlyasMoutawwakil, mohitsha, echarlaix, seungrokj, mfuntowicz • Dec 5, 2023
- Optimum-NVIDIA: Unlocking blazingly fast LLM inference in just 1 line of code — laikh-nvidia, mfuntowicz • Dec 5, 2023
- Accelerating over 130,000 Hugging Face models with ONNX Runtime — sschoenmeyer, mfuntowicz • Oct 4, 2023
- Case Study: Millisecond Latency using Hugging Face Infinity and modern CPUs — philschmid, jeffboudier, mfuntowicz • Jan 13, 2022
- Scaling up BERT-like model Inference on modern CPU — Part 2 — echarlaix, jeffboudier, mfuntowicz, michaelbenayoun • Nov 4, 2021
- Introducing Optimum: The Optimization Toolkit for Transformers at Scale — mfuntowicz, echarlaix, michaelbenayoun, jeffboudier • Sep 14, 2021