Qwen 3

Blog: Qwen3: Think Deeper, Act Faster
Hugging Face: https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f
arXiv: https://arxiv.org/pdf/2505.09388

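Kevin 吴嘉文 · about 10 minutes

Notes on Whisper Audio Processing

Audio formats and encoding

What is sound?

Physical level: vibration of air molecules → a wave of sound pressure that varies over time.
Analog signal: a continuous waveform, continuous in both time and amplitude.

Computers, however, can only process discrete numbers, so the sound has to be sampled and quantized into a digital signal.

Sampling and quantization (the first step of digitization)

The nature of sound: a continuous analog signal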
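
The two steps can be illustrated with a short script. This is a minimal sketch using only the standard library: the function name `sample_and_quantize`, the 440 Hz test tone, the 16 kHz sample rate, and the 16-bit depth are all assumptions chosen for illustration, not values taken from the post.

```python
import math


def sample_and_quantize(freq_hz, sample_rate, duration_s, bit_depth=16):
    """Sample a sine wave at discrete times, then round each value to an integer level."""
    n_samples = round(sample_rate * duration_s)
    # Largest positive value of a signed integer with this bit depth (32767 for 16 bits).
    max_level = 2 ** (bit_depth - 1) - 1
    samples = []
    for n in range(n_samples):
        t = n / sample_rate                               # sampling: discrete time steps
        amplitude = math.sin(2 * math.pi * freq_hz * t)   # "analog" value in [-1, 1]
        samples.append(round(amplitude * max_level))      # quantization: discrete amplitude
    return samples


# 10 ms of a 440 Hz tone, sampled at 16 kHz with 16-bit quantization.
pcm = sample_and_quantize(440.0, 16000, 0.01)
print(len(pcm), pcm[:5])
```

Raising the sample rate gives finer resolution in time, while raising the bit depth gives finer resolution in amplitude.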

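Kevin 吴嘉文 · about 10 minutes

Notes on Qwen Models (Part 1)

Key points about the Qwen models from around 2024

Qwen 1.5 series

Open-source models; official blog 1, official blog 2

Kevin 吴嘉文 · about 19 minutes

MCP Basic Concepts

MCP GitHub home page, MCP official documentation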

MCP Server

# server.py
from mcp.server.fastmcp import FastMCP
from mcp.server.fastmcp.prompts import base

# Create an MCP server
mcp = FastMCP("Demo")


# Add an addition tool
@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b


# Add a dynamic greeting resource
@mcp.resource("greeting://{name}")
def get_greeting(name: str) -> str:
    """Get a personalized greeting"""
    return f"Hello, {name}!"
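
A rough illustration of what the `greeting://{name}` template implies: a concrete URI such as `greeting://Kevin` has to be matched against the template, and the captured `name` is then passed to the function. The sketch below uses only the standard library and is not how FastMCP is implemented; `compile_template`, `ROUTES`, and `read_resource` are hypothetical names introduced for this example.

```python
import re


def compile_template(template: str) -> re.Pattern:
    """Turn a template like 'greeting://{name}' into a regex with one named group per placeholder."""
    # re.split with a capturing group alternates: literal, placeholder name, literal, ...
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)            # literal text, matched verbatim
        else:
            pattern += f"(?P<{part}>[^/]+)"       # placeholder, matches one path segment
    return re.compile(pattern)


def get_greeting(name: str) -> str:
    """Get a personalized greeting"""
    return f"Hello, {name}!"


# Hypothetical routing table: (compiled template, handler function).
ROUTES = [(compile_template("greeting://{name}"), get_greeting)]


def read_resource(uri: str) -> str:
    """Find the first template that matches the URI and call its handler."""
    for pattern, handler in ROUTES:
        match = pattern.fullmatch(uri)
        if match:
            return handler(**match.groupdict())
    raise KeyError(f"no resource matches {uri!r}")


print(read_resource("greeting://Kevin"))
```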

Kevin 吴嘉文 · about 3 minutes

RLHF|DPO, GRPO

This post summarizes the main features and highlights of DPO and GRPO, along with links to related resources.

DPO

Paper: Direct Preference Optimization: Your Language Model Is Secretly a Reward Model. arXiv preprint arXiv:2305.18290, 2023.
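
For orientation, the per-example objective from that paper can be written in a few lines. This is a minimal sketch, assuming the summed log-probabilities of the chosen and rejected responses under the policy and the frozen reference model have already been computed; the function name `dpo_loss`, its parameter names, and the default `beta=0.1` are illustrative choices, not taken from the post.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair (chosen vs. rejected response)."""
    # Log-ratio of policy to reference for each response.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # Scaled margin: how much more the policy favors the chosen response than the reference does.
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) rewritten as log(1 + exp(-margin)).
    return math.log1p(math.exp(-margin))


# The loss falls as the policy shifts probability toward the chosen response.
print(dpo_loss(-10.0, -12.0, -11.0, -11.0))
```

No separate reward model appears here: the log-ratio against the reference model plays that role.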

First, a quick recap of PPO: RLHF with PPO goes through two steps, reward model tuning and reinforcement learning.

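Kevin 吴嘉文 · about 4 minutes

AUTOGEN | Getting Started and Source Code Analysis

AUTOGEN is an open-source platform for creating and managing automated conversational agents. These agents can handle a variety of tasks, such as answering questions, executing functions, and even interacting with other agents.

This post introduces Conversation Agent, a key component of Autogen, and gives a brief source-code analysis of how its Multi-Agent functionality is implemented.

References: the official documentation; code reference: Autogen source 5a5c0f2.

Kevin 吴嘉文 · about 10 minutes

Semantic Kernel | Getting Started and Analysis

Semantic Kernel

This post introduces the concepts of Kernel, Plugin, KernelFunction, Semantic Memory, Planner, Services, and reliability in Semantic Kernel.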

1. Kernel

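Kevin 吴嘉文 · about 15 minutes

RLHF Basics

This post is compiled from the Reinforcement Learning Course published by HuggingFace. It records the fundamentals of reinforcement learning as groundwork for understanding RLHF (Reinforcement Learning from Human Feedback). Note that it covers only basic reinforcement learning concepts and the basics of RLHF, and is not a comprehensive reinforcement learning tutorial.

Kevin 吴嘉文 · about 22 minutes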