
Starred repositories

Spot the conversation: speaker diarisation in the wild

168 17 Updated Jul 26, 2022

A toolkit to calculate speech audio quality. Not affiliated with the original authors

Python 73 6 Updated Aug 13, 2024

A 10000+ hours dataset for Chinese speech recognition

Shell 617 56 Updated Jan 9, 2026

Let AI agents browse the web. An autonomous toolkit for browser-based AI agents.

TypeScript 9,479 866 Updated Apr 2, 2026

OpenBrowser is an open-source, AI-native browser built on Chromium — a truly privacy-first alternative to ChatGPT Atlas, Perplexity Comet, and Dia.

TypeScript 56 14 Updated Feb 24, 2026

Lightpanda: the headless browser designed for AI and automation

Zig 31,482 1,393 Updated Jun 29, 2026

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

Python 7,449 651 Updated Jun 28, 2026

MOSS-TTS-Nano is an open-source multilingual tiny speech generation model from MOSI.AI and the OpenMOSS team. With only 0.1B parameters, it is designed for realtime speech generation, can run direc…

Python 3,790 481 Updated Jun 2, 2026
Python 26 Updated May 22, 2026

Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multilingual speech/music/song recognition, language detection and timestamp prediction.

Python 2,991 303 Updated Jun 26, 2026

MOSS-Audio is an open-source foundation model for unified audio understanding, enabling speech, sound, music, captioning, QA, and reasoning in real-world scenarios.

Python 585 41 Updated Jun 2, 2026

MOSS-Speech is a true speech-to-speech large language model without text guidance.

Python 138 7 Updated Feb 13, 2026

[CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation

Python 1,854 182 Updated Jun 26, 2026

🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning

Python 25,346 4,929 Updated Jun 28, 2026

High-fidelity world models for general embodied intelligence, such as data engines and world simulators.

Python 1,856 76 Updated Jun 24, 2026

A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling.

3,090 127 Updated Jun 28, 2026

[ECCV 2026] VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model

Python 430 32 Updated May 2, 2026

Official codebase for Fast-WAM: Do World Action Models Need Test-time Future Imagination?

Python 1,044 113 Updated Apr 3, 2026

RynnVLA-002: A Unified Vision-Language-Action and World Model

Python 1,080 64 Updated Dec 2, 2025

Official code of Motus: A Unified Latent Action World Model

Python 1,172 65 Updated Jan 5, 2026

GigaWorld-Policy: An Efficient Action-Centered World–Action Model

Python 1,294 101 Updated Apr 20, 2026

GigaBrain-0: A World Model-Powered Vision-Language-Action Model

Python 2,546 200 Updated Mar 10, 2026

Code to pretrain, fine-tune, and evaluate DreamZero and run sim & real-world evals

Python 2,343 199 Updated Apr 19, 2026

[RSS 2026] Causal video-action world model for generalist robot control

Python 1,392 124 Updated Apr 29, 2026

A Curated List of Vision-Language-Action (VLA) and World Action Models (WAM) Research and Beyond

795 27 Updated Jun 21, 2026

A Pragmatic VLA Foundation Model

Python 1,522 159 Updated Jun 11, 2026

A flexible and efficient codebase for training visually-conditioned language models (VLMs)

Python 998 1,137 Updated Jul 4, 2024

Rust bindings for the Python interpreter

Rust 15,860 981 Updated Jun 26, 2026

A ranked collection of classic securities books from Douban

Python 189 44 Updated Mar 25, 2021