[ACL 2025 main] FR-Spec: Frequency-Ranked Speculative Sampling
CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge techniques in sparse architecture, speculative sampling and qua…
Efficient Training (including pre-training and fine-tuning) for Big Models
Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main)
An Open-Source Framework for Prompt-Learning.
MiniCPM4: Ultra-Efficient LLMs on End Devices, achieving a 5x+ speedup on typical end-side chips