PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKT…

Python 3,870 843 Updated May 29, 2022

Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"

Python 199 18 Updated Apr 17, 2025

[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).

Python 811 104 Updated Feb 3, 2025

Collection of reinforcement learning algorithms

Python 2,841 565 Updated Jun 17, 2024

A collection of Deep Reinforcement Learning algorithms implemented with PyTorch to solve Atari games and classic control tasks like CartPole, LunarLander, and MountainCar.

Python 121 12 Updated Feb 21, 2024

A collection of papers from the ongoing line of work that started with World Models.

191 6 Updated Jul 6, 2024

Large Language Model Text Generation Inference

Python 10,713 1,248 Updated Dec 19, 2025

Source code for the paper "Empowering LLM to use Smartphone for Intelligent Task Automation"

Python 431 61 Updated Mar 22, 2024

PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion

Python 59 9 Updated Feb 29, 2024

[COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild

Python 4,649 513 Updated Nov 18, 2024

redroid (Remote-Android) is a multi-arch, GPU enabled, Android in Cloud solution. Track issues / docs here

Shell 5,817 409 Updated Jun 29, 2025

AgentTuning: Enabling Generalized Agent Abilities for LLMs

Python 1,471 106 Updated Oct 31, 2023

FireAct: Toward Language Agent Fine-tuning

Python 287 22 Updated Oct 22, 2023

ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum.

Scala 322 33 Updated Dec 3, 2025

[ICLR 2024 Spotlight] Text2Reward: Reward Shaping with Language Models for Reinforcement Learning

Jupyter Notebook 192 12 Updated Dec 17, 2024

Vim motions on speed!

Vim Script 7,707 366 Updated Feb 5, 2024

a state-of-the-art-level open visual language model | multimodal pre-trained model

Python 6,713 449 Updated May 29, 2024

A guidance language for controlling large language models.

Jupyter Notebook 21,065 1,130 Updated Dec 17, 2025

AgentSims is an easy-to-use infrastructure for researchers from all disciplines to test the specific capacities they are interested in.

Python 917 117 Updated Nov 18, 2023

Clean PyTorch implementations of imitation and reward learning algorithms

Python 1,665 293 Updated Jan 7, 2025

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"

Python 1,264 205 Updated Nov 26, 2025
Python 121 36 Updated Jul 10, 2025

ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems

Python 464 138 Updated Jun 17, 2024

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 41,100 4,672 Updated Dec 24, 2025

GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)

Python 7,681 606 Updated Jul 25, 2023

ChatGLM-6B: An Open Bilingual Dialogue Language Model

Python 41,225 5,220 Updated Jun 27, 2024

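A collection of AI art resources (including platforms usable in China and abroad, usage tutorials, parameter tutorials, deployment tutorials, industry news, and more): Stable Diffusion, AnimateDiff, Stable Cascade, Stable SDXL Turbo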

11,708 950 Updated Aug 14, 2024

The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 53,030 6,184 Updated Sep 18, 2024

Source code and data for the ACL 2019 long paper "Semantic Parsing with Dual Learning".

Python 23 7 Updated Feb 21, 2021

Source code and data for the journal paper "Dual learning for semi-supervised natural language understanding" in TASLP 2020.

Python 9 Updated Apr 1, 2021