Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Personalize Any Characters with a Scalable Diffusion Transformer
A bitmap programming font optimized for coziness
A free/open source client and automation tool for Ragnarok Online
A simple Python Pydantic model for Honkai
Contexts Optical Compression
OCRmyPDF adds an OCR text layer to scanned PDF files
AI Agent Application Development Framework
A framework to enable multimodal models to operate a computer
Awesome multilingual OCR toolkits based on PaddlePaddle
Ark pixel font - Open source Pan-CJK pixel font
Crowdsourcing platform for full text transcription and tagging
Official inference repo for FLUX.2 models
Industrial-level controllable zero-shot text-to-speech system
pytablewriter is a Python library to write a table in various formats
A collection of CV and resume templates written in LaTeX
SOTA Open Source TTS
A Powerful Native Multimodal Model for Image Generation
A ranked list of awesome machine learning Python libraries
Iconic fonts in PyQt and PySide applications
Official code for Style Aligned Image Generation via Shared Attention
A deep learning toolkit for Text-to-Speech, battle-tested in research
Converts text to speech in real time
Towards Human-Level Text-to-Speech through Style Diffusion
ktrain is a Python library that makes deep learning AI more accessible