As of March 22, 2026, it is no longer accurate to explain Edge AI as just a list of WebLLM, CoreML, TensorFlow Lite, and Raspberry Pi. The current on-device stack has changed:
- Google AI Edge, MediaPipe generative AI tasks, and the LLM Inference API;
- ExecuTorch;
- WebGPU and Transformers.js v4;
- ONNX Runtime GenAI, which now matters as a generate-loop runtime, not just classic ONNX inference.

So in 2026 it is more useful to understand Edge AI as on-device inference architecture, not as a set of disconnected SDKs.
The old definition, "Edge AI is when the model runs directly on the user's device or near it: WebLLM / CoreML / TensorFlow Lite / Jetson", is now too flat. The current edge stack is better explained through actual deployment lanes: Google AI Edge + MediaPipe, ExecuTorch, Core ML, ONNX Runtime GenAI, Transformers.js v4, and the browser WebGPU/WebNN direction.

The current edge discussion no longer boils down to the question "can a small model run on the device at all".
The right questions today are architectural: which deployment lane fits the target platform, what latency and memory budget the model must respect, and what the fallback path is on weaker hardware.
In other words, edge today is system design, not just model compression.
Google's current edge site now explicitly puts generative AI tasks front and center.
Official story:
This is important because the older MediaPipe-only framing now looks too narrow. The current mental model should be:
- Google AI Edge = the umbrella;
- MediaPipe LLM Inference = one of the main practical on-device lanes.

The official Android and iOS guides for LLM Inference make the current mobile lane very concrete.
Important facts:
This matters because MediaPipe today is no longer just about vision demos. It is a serious practical route for mobile generative AI.
Official Apple docs still make Core ML the foundation for on-device model integration.
The useful current framing is:
Core ML is not "one more inference library"; it is the platform foundation for on-device model integration.

So when teams say "Apple edge AI", what they usually really mean is:
ExecuTorch docs now position it clearly as PyTorch’s solution for efficient AI inference on edge devices:
This is a big shift from older summaries that often still mention PyTorch Mobile more prominently.
Current practical implication:
ExecuTorch is now the most current official PyTorch lane to consider.

Official docs and the v4 preview blog show that browser AI has matured:
- `device: 'webgpu'` support.

This matters because browser AI in 2026 is no longer just a "cute local inference demo". It is increasingly practical for:
Even with better runtimes, browser AI still lives under:
So browser edge is strongest when:
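These constraints can be folded into a simple capability check. The sketch below is illustrative: `pickBackend` and its `caps` argument are hypothetical names, and in a real page you would populate `caps` by probing `navigator.gpu` (WebGPU) and the emerging WebNN API before choosing a runtime.

```js
// Hypothetical helper: choose an inference backend from detected
// browser capabilities, preferring the fastest available lane.
function pickBackend(caps) {
  if (caps.webgpu) return "webgpu"; // fastest lane when available
  if (caps.webnn) return "webnn";   // emerging standard lane
  return "wasm";                    // universal CPU fallback
}

console.log(pickBackend({ webgpu: true, webnn: false })); // "webgpu"
console.log(pickBackend({ webgpu: false, webnn: false })); // "wasm"
```

The point of the ordering is graceful degradation: the same model path can serve strong and weak devices, which is exactly where browser edge is strongest.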
Official ONNX Runtime GenAI docs describe a preview generate() API with:
This is important because the old ONNX framing was mostly:
Current GenAI framing is richer:
That makes it more relevant for real edge and embedded generative applications.
One common mistake is to compare edge frameworks by model brand.
Current better comparison is by deployment lane:
- MediaPipe LLM Inference
- Core ML
- ExecuTorch
- Transformers.js
- ONNX Runtime GenAI

This is much closer to how real engineering decisions are made.
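Lane-based thinking can be made concrete as a lookup from target platform to candidate lanes. This is a hedged sketch: `lanesFor` and its mapping are illustrative assumptions, not official guidance, though the lane names match the list above.

```js
// Hypothetical helper: map a deployment target to candidate lanes.
// The mapping is illustrative, not an official recommendation.
function lanesFor(target) {
  const lanes = {
    android: ["MediaPipe LLM Inference", "ExecuTorch", "ONNX Runtime GenAI"],
    ios: ["Core ML", "ExecuTorch", "MediaPipe LLM Inference"],
    browser: ["Transformers.js"],
    embedded: ["ExecuTorch", "ONNX Runtime GenAI"],
  };
  return lanes[target] ?? [];
}

console.log(lanesFor("browser")); // [ 'Transformers.js' ]
```

Starting from the target, not the model brand, is what keeps the comparison honest.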
Current edge AI is usually chosen for four reasons:
Not every app needs all four. But when at least two of them matter strongly, edge becomes very compelling.
Current sweet spots:
Edge still struggles more when:
That is why many mature 2026 systems are hybrid:
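A hybrid system usually hinges on one small routing decision. The sketch below is assumption-laden: `route`, its request fields, and the 8B-parameter threshold are all hypothetical, chosen only to show the shape of an edge/cloud split.

```js
// Hypothetical hybrid router: prefer the device, escalate to cloud
// only when connectivity allows and the task needs a larger model.
function route(req) {
  if (!req.online) return "edge";            // offline: edge is the only option
  if (req.private) return "edge";            // sensitive data stays on device
  if (req.neededParamsB > 8) return "cloud"; // beyond a typical device budget
  return "edge";
}

console.log(route({ online: true, private: false, neededParamsB: 70 })); // "cloud"
```

Note that privacy and connectivity override model size: the cases where edge struggles (big models, long contexts) only win the routing decision when nothing forces the request to stay local.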
- Core ML, MediaPipe, or ExecuTorch;
- ONNX Runtime GenAI, especially if model conversion/export already sits in your pipeline.

For the browser lane, a minimal Transformers.js call looks like this:

```js
import { pipeline } from "@huggingface/transformers";

// Load the default sentiment-analysis model and run it on WebGPU.
// Passing null for the model id uses the task's default checkpoint.
const pipe = await pipeline("sentiment-analysis", null, {
  device: "webgpu",
});
```
1. Pick a supported or convertible on-device model.
2. Validate memory footprint on target devices.
3. Benchmark latency and battery impact.
4. Add fallback or smaller model path for weaker hardware.
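Step 2 of the checklist can be roughed out with arithmetic before touching a device. The helper below is a hypothetical sketch: it estimates only the weight footprint from parameter count and quantization width, deliberately ignoring KV cache and runtime overhead, which real budgets must also include.

```js
// Hypothetical estimator: weight footprint in GiB for a model with
// `paramsBillions` parameters quantized to `bitsPerWeight` bits.
function weightFootprintGiB(paramsBillions, bitsPerWeight) {
  const bytes = paramsBillions * 1e9 * (bitsPerWeight / 8);
  return bytes / 2 ** 30;
}

// e.g. a 3B model at 4-bit quantization:
console.log(weightFootprintGiB(3, 4).toFixed(2)); // "1.40"
```

A quick estimate like this often decides step 4 as well: if the full model does not fit a mid-range phone, the smaller fallback model is planned from day one.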
1. What most accurately describes Edge AI in 2026?
2. When is `ExecuTorch` especially appropriate?
3. Why can browser AI in 2026 no longer be dismissed as a toy?