
Google DeepMind releases QAT-optimized Gemma 4 models aimed at on-device use on phones and laptops

June 5, 2026, 4:18 PM · blog.google

Editor's summary

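Google DeepMind has released new checkpoints for the Gemma 4 family trained with Quantization-Aware Training (QAT). The release covers the Q4_0 quantization format and a mobile-specific format, and Google says the mobile format brings the memory footprint of Gemma 4 E2B down to 1 GB.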
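As a rough illustration of why a 4-bit format shrinks weight memory, the sketch below estimates storage under a Q4_0-style layout. It assumes the layout commonly used by GGML-based tools (blocks of 32 four-bit weights plus one 16-bit scale, about 4.5 bits per weight). The parameter count is a hypothetical placeholder, not a figure from the announcement, and real memory use also depends on activations, the KV cache, and any layers kept at higher precision.

```python
# Assumption: a Q4_0-style block holds 32 four-bit weights and one 16-bit scale.
BLOCK_SIZE = 32
BITS_PER_QUANT = 4
SCALE_BITS = 16


def bits_per_weight_q4_0() -> float:
    """Effective bits per weight once the per-block scale is included."""
    return (BLOCK_SIZE * BITS_PER_QUANT + SCALE_BITS) / BLOCK_SIZE


def weight_bytes(n_params: int, bits_per_weight: float) -> float:
    """Bytes needed to store n_params weights at the given precision."""
    return n_params * bits_per_weight / 8


# Hypothetical parameter count for illustration only (not a Gemma 4 figure).
HYPOTHETICAL_PARAMS = 1_000_000_000

full = weight_bytes(HYPOTHETICAL_PARAMS, 16)
quant = weight_bytes(HYPOTHETICAL_PARAMS, bits_per_weight_q4_0())
print(f"16-bit weights: {full / 1e9:.2f} GB")
print(f"Q4_0-style weights: {quant / 1e9:.2f} GB")
print(f"reduction factor: {full / quant:.2f}x")
```

The per-block scale is why the effective cost sits slightly above 4 bits per weight. The sketch covers weights only, so it should not be read as a reproduction of the 1 GB figure quoted above.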

Context

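The announcement shows that the race to slim down LLMs is moving beyond simply shrinking model size and toward whether models can actually run on-device on consumer hardware. Because QAT can improve memory use and inference efficiency with less quality loss than post-training quantization (PTQ), it is likely to influence how quickly local AI features spread in mobile apps and on laptops.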
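The difference between PTQ and QAT can be sketched with a "fake quantization" round trip. PTQ applies this rounding once, after training is finished. QAT runs the same rounding inside the forward pass during training (typically with a straight-through estimator for gradients) so the weights can adapt to the error it introduces. The code below is a minimal, generic sketch of symmetric 4-bit block rounding. The function name and sample weights are illustrative assumptions, and it is not Google's training recipe or the exact Q4_0 encoding.

```python
def fake_quantize_block(block, bits=4):
    """Round a block of floats onto a signed integer grid, then map back.

    Generic symmetric scheme with one scale per block (an assumption for
    illustration; not the exact Q4_0 encoding).
    """
    qmax = 2 ** (bits - 1) - 1      # 7 for 4 bits
    qmin = -(2 ** (bits - 1))       # -8 for 4 bits
    absmax = max(abs(x) for x in block)
    if absmax == 0.0:
        return list(block)          # nothing to scale
    scale = absmax / qmax
    dequantized = []
    for x in block:
        q = max(qmin, min(qmax, round(x / scale)))  # integer code
        dequantized.append(q * scale)               # value the model "sees"
    return dequantized


# Hypothetical weights, chosen only to show the rounding error.
weights = [0.70, -0.35, 0.12, 0.05, -0.61, 0.33, 0.0, -0.08]
seen_by_model = fake_quantize_block(weights)
errors = [abs(w - d) for w, d in zip(weights, seen_by_model)]
print("max rounding error:", max(errors))
```

With PTQ, the model was never trained against these rounded values, so the error lands on it all at once. With QAT, the loss is computed on the rounded values, which is why it can preserve more quality at the same bit width.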

Article

Gemma 4 QAT models: Optimizing compression for mobile and laptop efficiency
