LMDeploy-Jetson Community
Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function independently without continuous internet access.
[中文] | [English]
This project focuses on adapting LMDeploy for NVIDIA Jetson series edge-computing boards, enabling InternLM series LLMs for Offline Embodied Intelligence (OEI).
- [2024/3/15] Updated support for LMDeploy v0.2.5.
- [2024/2/26] This project has been included in the LMDeploy community.
- Recruiting community managers (Contact: an.hongjun@foxmail.com)
- Collecting benchmark data for more Jetson board models (please submit a PR directly), such as:
- Jetson Nano
- Jetson TX2
- Jetson AGX Xavier
- Jetson Orin Nano
- Jetson AGX Orin
- Recruiting developers to create Jetson-specific whl distributions
- README optimization, etc.
- ✅: Verified and runnable
- ❌: Verified but not runnable
- ⭕️: Pending verification
| Models | InternLM-7B | InternLM-20B | InternLM2-1.8B | InternLM2-7B | InternLM2-20B |
| --- | --- | --- | --- | --- | --- |
| Orin AGX (32G)<br>JetPack 5.1 | ✅<br>Mem: ??/??<br>14.68 token/s | ✅<br>Mem: ??/??<br>5.82 token/s | ✅<br>Mem: ??/??<br>56.57 token/s | ✅<br>Mem: ??/??<br>14.56 token/s | ✅<br>Mem: ??/??<br>6.16 token/s |
| Orin NX (16G)<br>JetPack 5.1 | ✅<br>Mem: 8.6G/16G<br>7.39 token/s | ✅<br>Mem: 14.7G/16G<br>3.08 token/s | ✅<br>Mem: 5.6G/16G<br>22.96 token/s | ✅<br>Mem: 9.2G/16G<br>7.48 token/s | ✅<br>Mem: 14.8G/16G<br>3.19 token/s |
| Xavier NX (8G)<br>JetPack 5.1 | ❌ | ❌ | ✅<br>Mem: 4.35G/8G<br>28.36 token/s | ❌ | ❌ |
If you have other Jetson series boards, feel free to run the benchmark and submit your results via a Pull Request (PR) to become one of the community contributors!
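
For reference, here is a minimal throughput-measurement sketch, assuming LMDeploy's Python `pipeline` API and a hypothetical local model path; the `generate_token_len` field on the response object is likewise an assumption based on LMDeploy's `Response` type:

```python
import time

from lmdeploy import GenerationConfig, pipeline

# Hypothetical path to a W4A16-quantized model (see tutorial step S1).
pipe = pipeline('./internlm2-chat-7b-w4a16')

start = time.time()
resp = pipe(['Please introduce yourself.'],
            gen_config=GenerationConfig(max_new_tokens=512))[0]
elapsed = time.time() - start

# generate_token_len is the number of tokens generated for this request,
# so tokens divided by wall-clock time gives the token/s figure above.
print(f'{resp.generate_token_len / elapsed:.2f} token/s')
```

For comparable numbers, please report the same `token/s` metric and the peak memory usage shown in the table.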
- Updating benchmark data for more Jetson board models.
- Creating Jetson-specific whl distributions.
- Following up on updates to the LMDeploy version.
S1. Quantize the model on a server with W4A16
S2. Install Miniconda on Jetson
S3. Install CMake 3.29.0 on Jetson
S4. Install RapidJSON on Jetson
S5. Install PyTorch 2.1.0 on Jetson
S6. Port LMDeploy 0.2.5 to Jetson
S7. Run InternLM offline on Jetson (a minimal sketch follows below)
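
Once steps S1–S6 are complete, offline inference (S7) boils down to loading the quantized weights through LMDeploy's `pipeline` API. Below is a minimal sketch, assuming LMDeploy v0.2.5 with the TurboMind backend; the model path is hypothetical and the exact config fields may differ across versions:

```python
from lmdeploy import GenerationConfig, TurbomindEngineConfig, pipeline

# model_format='awq' tells the TurboMind backend to load the W4A16 weights
# produced in step S1; a small cache_max_entry_count leaves more memory
# headroom on Jetson, where CPU and GPU share unified RAM.
engine_cfg = TurbomindEngineConfig(model_format='awq',
                                   cache_max_entry_count=0.2)

# Hypothetical local path: no network access is needed at runtime.
pipe = pipeline('./internlm2-chat-7b-w4a16', backend_config=engine_cfg)

resp = pipe(['What is Offline Embodied Intelligence?'],
            gen_config=GenerationConfig(max_new_tokens=256))[0]
print(resp.text)
```

On unified-memory devices like Jetson, lowering `cache_max_entry_count` trades KV-cache capacity for free memory, which is consistent with the memory figures in the benchmark table above.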
- InternDog: an offline embodied-intelligence guide dog based on InternLM2. [Github] [Bilibili]
If this project is helpful to your work, please cite it using the following format:
```bibtex
@misc{2024lmdeployjetson,
    title={LMDeploy-Jetson: Opening a new era of Offline Embodied Intelligence},
    author={LMDeploy-Jetson Community},
    url={https://github.com/BestAnHongjun/LMDeploy-Jetson},
    year={2024}
}
```