linjinyu/MNN: MNN is a lightweight deep neural network inference engine. This repository is a mirror intended to improve download speeds within China and is synchronized once a day. Original repository: https://github.com/alibaba/MNN

News 🔥

[2025/06/11] New app MNN-TaoAvatar released: you can talk with a 3D avatar offline, with the LLM, ASR, TTS, A2BS and NNR models all running locally on your device. MNN-TaoAvatar
[2025/05/30] The MNN Chat app now supports DeepSeek-R1-0528-Qwen3, Qwen3-30B-A3B, SmoVLM and FastVLM. MNN Chat App
[2025/05/12] The Android app now supports Qwen2.5-Omni 3B and 7B. MNN Chat App

History News

[2025/04/30] The Android app now supports Qwen3 and dark mode. MNN Chat App
[2025/02/18] The iOS multimodal LLM app is released. MNN LLM iOS
[2025/02/11] The Android app now supports DeepSeek R1 1.5B.
[2025/01/23] We released our full multimodal LLM Android app, MNN-LLM-Android, including text-to-text, image-to-text, audio-to-text, and text-to-image generation.

Intro

MNN is a highly efficient and lightweight deep learning framework. It supports inference and training of deep learning models and has industry-leading performance for inference and training on-device. At present, MNN has been integrated into more than 30 apps of Alibaba Inc, such as Taobao, Tmall, Youku, DingTalk, Xianyu, etc., covering more than 70 usage scenarios such as live broadcast, short video capture, search recommendation, product searching by image, interactive marketing, equity distribution, security risk control. In addition, MNN is also used on embedded devices, such as IoT.

MNN-LLM is a large language model runtime solution built on the MNN engine. The mission of this project is to deploy LLMs locally on everyone's platform (mobile phone, PC, IoT). It supports popular large language models such as Qianwen, Baichuan, Zhipu, LLAMA, and others. MNN-LLM User guide

MNN-Diffusion is a stable diffusion model runtime solution developed based on the MNN engine. The mission of this project is to deploy stable diffusion models locally on everyone's platforms. MNN-Diffusion User guide

Inside Alibaba, MNN works as the basic module of the compute container in the Walle System, the first end-to-end, general-purpose, and large-scale production system for device-cloud collaborative machine learning, which was published at the top systems conference OSDI’22. The key design principles of MNN and the extensive benchmark results (vs. TensorFlow, TensorFlow Lite, PyTorch, PyTorch Mobile, TVM) can be found in the OSDI paper. The scripts and instructions for benchmark testing are located in the “/benchmark” directory. If MNN or the design of Walle helps your research or production use, please cite our OSDI paper as follows:

@inproceedings {proc:osdi22:walle,
    author = {Chengfei Lv and Chaoyue Niu and Renjie Gu and Xiaotang Jiang and Zhaode Wang and Bin Liu and Ziqi Wu and Qiulin Yao and Congyu Huang and Panos Huang and Tao Huang and Hui Shu and Jinde Song and Bin Zou and Peng Lan and Guohuan Xu and Fei Wu and Shaojie Tang and Fan Wu and Guihai Chen},
    title = {Walle: An {End-to-End}, {General-Purpose}, and {Large-Scale} Production System for {Device-Cloud} Collaborative Machine Learning},
    booktitle = {16th USENIX Symposium on Operating Systems Design and Implementation (OSDI 22)},
    year = {2022},
    isbn = {978-1-939133-28-1},
    address = {Carlsbad, CA},
    pages = {249--265},
    url = {https://www.usenix.org/conference/osdi22/presentation/lv},
    publisher = {USENIX Association},
    month = jul,
}

Documentation and Workbench

MNN's documentation is hosted on Read the Docs.

You can also follow docs/README to build the HTML documentation locally.

MNN Workbench could be downloaded from MNN's homepage, which provides pretrained models, visualized training tools, and one-click deployment of models to devices.

Key Features

Lightweight

Optimized for devices, with no dependencies; it can be easily deployed to mobile devices and a variety of embedded devices.
iOS platform: the static library with all options enabled for armv7+arm64 is about 12MB, and linked executables grow by about 2MB.
Android platform: the core .so is about 800KB (armv7a, c++_shared).
Building with MNN_BUILD_MINI reduces package size by about 25%, at the cost of requiring a fixed model input size.
Supports FP16 / Int8 quantization, which can reduce model size by 50%-70%.
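
To make the size figures above concrete, here is a minimal sketch of how FP16 and Int8 storage shrink a block of FP32 weights. It uses plain Python and a textbook symmetric per-tensor Int8 scheme; the function names `quantize_int8` and `dequantize_int8` are hypothetical, and this is not MNN's actual quantization code or file format.

```python
import struct

def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> int8 values plus one scale.

    Illustrative only; MNN's own quantizer is not shown here.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest step and clamp to the symmetric int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float values from int8 values and the scale."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.02, 1.0, -0.33, 0.75, -0.9, 0.1]

# Serialize the same weights at three precisions (little-endian, no padding).
fp32_bytes = struct.pack(f"<{len(weights)}f", *weights)
fp16_bytes = struct.pack(f"<{len(weights)}e", *weights)
q, scale = quantize_int8(weights)
int8_bytes = struct.pack(f"<{len(q)}b", *q) + struct.pack("<f", scale)

restored = dequantize_int8(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(len(fp32_bytes), len(fp16_bytes), len(int8_bytes))
print(max_error <= scale / 2 + 1e-9)
```

FP16 halves the weight storage, and Int8 approaches a 75% reduction on large tensors because the per-tensor scale becomes negligible. A real model file also holds data that is not quantized, which is consistent with an overall figure below that upper bound.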

Versatility

Supports TensorFlow, Caffe, ONNX and TorchScript models, and common neural network types such as CNN, RNN, GAN and Transformer.
Supports AI models with multiple inputs or outputs, every kind of dimension format, dynamic inputs, and control flow.
MNN supports nearly all OPs used in AI models. The converter supports 178 TensorFlow OPs, 52 Caffe OPs, 163 TorchScript OPs and 158 ONNX OPs.
Supports iOS 8.0+, Android 4.3+, and embedded devices with POSIX interface.
Supports hybrid computing on multiple devices. Currently supports CPU and GPU.

High performance

Implements core computing with lots of optimized assembly code to make full use of the ARM / x64 CPU.
Uses Metal / OpenCL / Vulkan to support GPU inference on mobile.
Uses CUDA and Tensor Cores to support NVIDIA GPUs for better performance.
Convolution and transposed convolution algorithms are efficient and stable. The Winograd convolution algorithm is widely used to speed up symmetric convolutions such as 3x3, 4x4, 5x5, 6x6 and 7x7.
About 2x faster on the ARM v8.2 architecture with FP16 half-precision support, and about 2.5x faster when using sdot on ARM v8.2 and VNNI.
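
As an illustration of why Winograd helps, the sketch below implements the textbook 1D Winograd F(2,3) transform in plain Python and checks it against a direct sliding-window convolution. It is an assumption-laden teaching example: MNN's real kernels are 2D and hand-optimized, and the function names here are hypothetical.

```python
def direct_conv1d(signal, kernel):
    """Sliding-window correlation: 3 multiplications per output for a 3-tap kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def winograd_f23(d, g):
    """Winograd F(2,3): 2 outputs from 4 inputs and a 3-tap kernel.

    Uses 4 multiplications instead of the 6 needed by the direct form.
    """
    # Filter transform; in practice this is precomputed once per kernel.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # Input transform followed by element-wise products.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    # Output transform.
    return [m1 + m2 + m3, m2 - m3 - m4]

def winograd_conv1d(signal, kernel):
    """Tile the signal into overlapping 4-element windows with stride 2."""
    n_out = len(signal) - 2
    if len(kernel) != 3 or n_out < 2 or n_out % 2 != 0:
        raise ValueError("sketch handles 3-tap kernels and an even output count only")
    out = []
    for i in range(0, n_out, 2):
        out.extend(winograd_f23(signal[i:i + 4], kernel))
    return out

signal = [1, 2, 3, 4, 5, 6]
kernel = [1, 2, 3]
print(direct_conv1d(signal, kernel))    # [14, 20, 26, 32]
print(winograd_conv1d(signal, kernel))  # [14.0, 20.0, 26.0, 32.0]
```

The saving grows in 2D: nesting F(2,3) to produce a 2x2 output tile from a 3x3 kernel needs 16 multiplications instead of 36.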

Ease of use

Supports using MNN's OPs for numerical computing, similar to NumPy.
Provides a lightweight image processing module, similar to OpenCV, that is only about 100KB.
Supports building and training models on PC / mobile.
MNN Python API helps ML engineers to easily use MNN to infer, train, and process images, without dipping their toes in C++ code.

The architectures and precisions MNN supports are shown below:

S: Supported and works well, deeply optimized, recommended
A: Supported and works well, usable
B: Supported but buggy or not optimized, not recommended
C: Not supported

Architecture / Precision		Normal	FP16	BF16	Int8
CPU	Native	B	C	B	B
	x86/x64-SSE4.1	A	C	C	A
	x86/x64-AVX2	S	C	C	A
	x86/x64-AVX512	S	C	C	S
	ARMv7a	S	S (ARMv8.2)	S	S
	ARMv8	S	S (ARMv8.2)	S (ARMv8.6)	S
GPU	OpenCL	A	S	C	S
	Vulkan	A	A	C	A
	Metal	A	S	C	S
	CUDA	A	S	C	A
NPU	CoreML	A	C	C	C
	HIAI	A	C	C	C
	NNAPI	B	B	C	B
	QNN	C	B	C	C
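
One way to use the table programmatically is to encode it as a lookup and filter by rating. The sketch below transcribes the ratings above into a dictionary; the names `SUPPORT` and `backends_rated` are hypothetical helpers for illustration and are not part of MNN. Hardware caveats such as "ARMv8.2" are noted in comments rather than modeled.

```python
PRECISIONS = ("Normal", "FP16", "BF16", "Int8")

# Ratings transcribed from the table above, in PRECISIONS order.
SUPPORT = {
    ("CPU", "Native"): "BCBB",
    ("CPU", "x86/x64-SSE4.1"): "ACCA",
    ("CPU", "x86/x64-AVX2"): "SCCA",
    ("CPU", "x86/x64-AVX512"): "SCCS",
    ("CPU", "ARMv7a"): "SSSS",   # FP16 rating requires ARMv8.2
    ("CPU", "ARMv8"): "SSSS",    # FP16 requires ARMv8.2, BF16 requires ARMv8.6
    ("GPU", "OpenCL"): "ASCS",
    ("GPU", "Vulkan"): "AACA",
    ("GPU", "Metal"): "ASCS",
    ("GPU", "CUDA"): "ASCA",
    ("NPU", "CoreML"): "ACCC",
    ("NPU", "HIAI"): "ACCC",
    ("NPU", "NNAPI"): "BBCB",
    ("NPU", "QNN"): "CBCC",
}

def backends_rated(precision, min_level="A"):
    """Return backend names whose rating for `precision` is at least `min_level`."""
    order = "SABC"  # best to worst, matching the legend
    column = PRECISIONS.index(precision)
    threshold = order.index(min_level)
    return [
        name
        for (_, name), ratings in SUPPORT.items()
        if order.index(ratings[column]) <= threshold
    ]

print(backends_rated("Int8", min_level="S"))
print(backends_rated("FP16", min_level="A"))
```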

Tools

Based on MNN (the tensor compute engine), we provide a series of tools for inference, training and general computation.

MNN-Converter: Converts models from other frameworks, such as TensorFlow (Lite), Caffe, ONNX and TorchScript, to MNN models for inference, and performs graph optimization to reduce computation.
MNN-Compress: Compresses models to reduce size and improve performance / speed.
MNN-Express: Supports models with control flow, and uses MNN's OPs for general-purpose computing.
MNN-CV: An OpenCV-like library built on MNN, and therefore much more lightweight.
MNN-Train: Supports training MNN models.

How to Discuss and Get Help From the MNN Community

The group discussions are predominantly in Chinese, but English speakers are welcome and will receive help.

Dingtalk discussion groups:

Group #1 (Full): 23329087

Group #2 (Full): 23350225

Group #3: QR code:

Historical Paper

The preliminary version of MNN, a mobile inference engine with a focus on manual optimization, was published at MLSys 2020. Please cite this paper if MNN previously helped your research:

@inproceedings{alibaba2020mnn,
  author = {Jiang, Xiaotang and Wang, Huan and Chen, Yiliu and Wu, Ziqi and Wang, Lichuan and Zou, Bin and Yang, Yafeng and Cui, Zongyang and Cai, Yu and Yu, Tianhang and Lv, Chengfei and Wu, Zhihua},
  title = {MNN: A Universal and Efficient Inference Engine},
  booktitle = {MLSys},
  year = {2020}
}

License

Apache 2.0

Acknowledgement

MNN participants: Taobao Technology Department, Search Engineering Team, DAMO Team, Youku and other Alibaba Group employees.

MNN refers to the following projects:
