
Don't miss out! Join us at our next Flagship Conference: KubeCon + CloudNativeCon events in Hong Kong, China (June 10-11); Tokyo, Japan (June 16-17); Hyderabad, India (August 6-7); Atlanta, US (November 10-13). Connect with our current graduated, incubating, and sandbox projects as the community gathers to further the education and advancement of cloud native computing. Learn more at https://kubecon.io

Manage Cloud Native LLM Workloads Across Edge and Cloud Seamlessly Using KubeEdge and WasmEdge - Vivian Hu, Second State & Fei Xu, Huawei Cloud

LLMs are moving beyond data centers to edge devices. While this migration promises reduced latency and enhanced privacy, it also brings challenges: maintaining accuracy within limited resources and deploying across heterogeneous devices.

The integration of KubeEdge and WasmEdge addresses these challenges. WasmEdge is a lightweight, portable runtime (under 50 MB) with no external dependencies. KubeEdge Sedna orchestrates the edge-cloud collaboration: it monitors inference accuracy and automatically routes requests to cloud-based models when edge processing does not meet accuracy thresholds.
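The accuracy-driven routing described above can be sketched in a few lines. This is a hypothetical illustration, not Sedna's actual API: the `InferenceResult` type, `route` function, and the fixed threshold are assumptions made for clarity.

```rust
// Hypothetical sketch of accuracy-threshold routing as performed by a
// joint-inference controller: answer locally when the edge model is
// confident enough, otherwise forward to the larger cloud model.
// All names here are illustrative, not Sedna's real API.

struct InferenceResult {
    answer: String,
    confidence: f64, // edge model's self-reported confidence, 0.0..=1.0
}

fn route(edge_result: &InferenceResult, threshold: f64) -> &'static str {
    if edge_result.confidence >= threshold {
        "edge" // edge answer meets the accuracy bar; serve it directly
    } else {
        "cloud" // fall back to the larger cloud-hosted model
    }
}

fn main() {
    let r = InferenceResult {
        answer: "draft answer".to_string(),
        confidence: 0.72,
    };
    // With a 0.8 threshold, this low-confidence result is sent to the cloud.
    println!("{}", route(&r, 0.8));
}
```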

This session will demonstrate how small LLMs provide fast, local inference at the edge. When higher accuracy is needed, Sedna seamlessly transitions requests to larger models in the cloud. The inference workload is written in Rust and compiled to Wasm, enabling deployment across edge and cloud without any code changes.
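The portability claim rests on keeping the workload free of platform-specific code, so the same Rust source builds both natively and for a Wasm target such as `wasm32-wasip1` and runs unchanged under WasmEdge. A minimal sketch of such an entry point, with the model call stubbed out (a real workload would call into an actual inference library):

```rust
// Minimal sketch of a portable inference entry point: plain Rust with no
// platform-specific dependencies, so the identical source compiles to a
// native binary or to Wasm (e.g. `cargo build --target wasm32-wasip1`)
// and runs under WasmEdge on either edge or cloud nodes.

fn infer(prompt: &str) -> String {
    // Stub standing in for the actual LLM call.
    format!("echo: {}", prompt)
}

fn main() {
    println!("{}", infer("hello"));
}
```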

The solution has been deployed in production across multiple industries, including aerospace and bank branches.
