
Baidu Netdisk User Service Research

Date: 2026-05-11

Implementation note: the first UniDesk-integrated version now lives in this repo as a main-server private user service, with backend source at src/components/microservices/baidu-netdisk, Compose service baidu-netdisk, config id baidu-netdisk, and frontend page src/components/frontend/src/baidu-netdisk.tsx. It keeps the recommended v1 boundary: JSON control API through UniDesk microservice proxy, OAuth Device Code login, PostgreSQL-backed encrypted token/task state, and staging-directory upload/download jobs instead of browser byte streaming.

Environment setup note: Baidu app credentials are account-owned secrets and must be supplied out of band. The local encryption key and exact host configuration steps are documented in docs/issue/baidu-netdisk-env-setup.md.

Goal

Create a UniDesk user service that connects to Baidu Netdisk in a containerized way and exposes file storage operations such as login, browse, upload, download, move, rename and delete. The user-facing login should feel similar to the ClaudeQQ page: the UniDesk frontend shows a login card/QR, backend state, recent transfer jobs and explicit raw JSON buttons, while the business backend remains private behind the UniDesk microservice proxy.

Recommended route: build a small pure-backend user service named baidu-netdisk and integrate it as a UniDesk user service. Use the official Baidu Netdisk OAuth/API directly in the backend for the first version; AList or CLI tools can be added later as transfer workers, but not as the primary auth path or frontend.

Why this route:

  • The official API supports OAuth scopes basic,netdisk, QR-like login flows, file listing, metadata/dlink retrieval, multipart upload and file management.
  • A ClaudeQQ-like login UI maps well to Baidu's Device Code flow: backend requests a device_code, frontend displays qrcode_url and user_code, backend polls token status at the documented interval.
  • The UniDesk proxy currently handles text JSON request/response bodies, with a 1 MiB incoming body limit and an 8 MiB response body limit. Large file bytes should therefore not be pushed through /api/microservices/*/proxy in v1.
  • Keeping tokens and jobs in PostgreSQL gives restart recovery and avoids storing credentials in local JSON files.

Important UniDesk Constraint

Current microservice.http is suitable for control-plane JSON, not bulk binary file transfer:

  • backend-core reads non-GET bodies using req.text() and rejects bodies larger than 1 MiB.
  • provider-gateway reads upstream responses using response.text() and caps returned body text at 8 MiB.
  • The proxy only forwards content-type, not range, content-disposition or arbitrary binary headers.

So v1 should expose APIs such as POST /api/transfers/upload-from-path and POST /api/transfers/download-to-path, where the backend container reads/writes files from a mounted staging directory. If browser-to-local uploads/downloads are required, add a separate binary streaming capability to backend-core/provider-gateway in a future change set. That future gateway change would trigger the provider-gateway version-bump rule.
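The constraint above can be written down as a small guard. This is a minimal sketch, assuming the two limits quoted in the text; the type and function names are hypothetical and not part of backend-core or provider-gateway.

```typescript
// Limits are the ones quoted in the text; all names here are hypothetical.
const MAX_PROXY_REQUEST_BYTES = 1024 * 1024; // 1 MiB incoming body limit
const MAX_PROXY_RESPONSE_BYTES = 8 * 1024 * 1024; // 8 MiB response body limit

interface PlannedCall {
  requestBodyBytes: number;
  expectedResponseBytes: number;
  carriesFileBytes: boolean;
}

// Returns true only when a call is safe to send through the JSON proxy.
function fitsJsonProxy(call: PlannedCall): boolean {
  if (call.carriesFileBytes) return false; // file bytes go via the staging directory
  if (call.requestBodyBytes > MAX_PROXY_REQUEST_BYTES) return false;
  if (call.expectedResponseBytes > MAX_PROXY_RESPONSE_BYTES) return false;
  return true;
}
```

A request such as POST /api/transfers/upload-from-path passes this check because it carries only a small JSON body naming a staging path, while the file bytes never cross the proxy.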

Baidu Netdisk Access Model

Login

Use Device Code as the default container login:

  1. POST /api/auth/device/start calls GET https://openapi.baidu.com/oauth/2.0/device/code?response_type=device_code&client_id=<appKey>&scope=basic,netdisk.
  2. Backend stores device_code, user_code, verification_url, qrcode_url, expires_in and interval in PostgreSQL.
  3. UniDesk frontend displays the QR code and user code.
  4. GET /api/auth/device/status?sessionId=... returns the current login state; the backend polls the Baidu token endpoint no more often than the returned interval allows, and never more often than once every 5 seconds.
  5. On success, backend stores access_token, refresh_token, expires_in, scope, account metadata and refresh timestamps.
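Steps 1 and 4 can be sketched as two small helpers. This is an illustration only: the function names are hypothetical, and the 5-second floor is taken from the text above rather than from a verified Baidu limit.

```typescript
// Builds the device-code request URL from step 1. `appKey` is the Baidu app key.
function buildDeviceCodeUrl(appKey: string): string {
  const url = new URL("https://openapi.baidu.com/oauth/2.0/device/code");
  url.searchParams.set("response_type", "device_code");
  url.searchParams.set("client_id", appKey);
  url.searchParams.set("scope", "basic,netdisk"); // serialized as basic%2Cnetdisk
  return url.toString();
}

const MIN_POLL_SECONDS = 5; // floor stated in step 4

// Delay to wait between token polls, given the `interval` Baidu returned.
function pollDelayMs(intervalSeconds: number): number {
  const interval = Number.isFinite(intervalSeconds) ? intervalSeconds : MIN_POLL_SECONDS;
  return Math.max(MIN_POLL_SECONDS, interval) * 1000;
}

// True when enough time has passed since the last upstream poll.
function mayPollNow(lastPollAtMs: number, nowMs: number, intervalSeconds: number): boolean {
  return nowMs - lastPollAtMs >= pollDelayMs(intervalSeconds);
}
```

With a gate like mayPollNow, the frontend can call the status endpoint as often as it likes while the backend answers from stored session state between upstream polls.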

Authorization Code mode is also possible and can pass qrcode=1, but it requires a redirect URI and callback endpoint. Device Code is simpler for a private container behind UniDesk because the browser only needs to display a QR URL and poll backend state.

Token handling requirements:

  • Access token lifetime is 30 days in the Baidu docs.
  • Refresh token is long-lived but the Netdisk docs say it is single-use: after a refresh, store the new refresh token immediately and never retry with the old one in a loop.
  • Use a PostgreSQL row lock or advisory lock around refresh to avoid two workers spending the same refresh token concurrently.
  • Never log tokens or dlinks; health/status endpoints should only expose redacted auth state.
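The last requirement can be enforced at the logging boundary. The following is a minimal sketch of a redaction helper; the key list, the placeholder string and the function name are assumptions, not the service's actual logger.

```typescript
// Field names treated as secret. The list is an assumption for illustration.
const SENSITIVE_KEYS = new Set(["access_token", "refresh_token", "device_code", "dlink"]);
const PLACEHOLDER = "[redacted]";

// Recursively copies a value, masking secret fields and token query parameters.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? PLACEHOLDER : redact(inner);
    }
    return out;
  }
  if (typeof value === "string") {
    // Masks tokens embedded in URLs such as "...?access_token=abc&x=1".
    return value.replace(/(access_token|refresh_token)=[^&\s]+/gi, `$1=${PLACEHOLDER}`);
  }
  return value;
}
```

Passing every log payload and every /health or /api/auth/status body through such a helper keeps the redaction rule in one place instead of relying on each call site.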

Scope and Remote Root

Use scope=basic,netdisk. Official docs for download still describe third-party app data under /apps/<productName> and visible to users as /我的应用数据/<productName>, but the file-list docs also define dir as an absolute path defaulting to /. On 2026-05-13, the current UniDesk Baidu application and authorized account were tested directly against the official APIs: listing /, uploading a tiny temporary file to /unidesk-root-probe-*.txt, obtaining its dlink, downloading it back, verifying MD5, and deleting it all succeeded with errno=0. Therefore UniDesk now defaults UNIDESK_BAIDU_NETDISK_APP_ROOT to / and treats it as the remote working root. Operators can still set UNIDESK_BAIDU_NETDISK_APP_ROOT=/apps/<name> to re-enable an app-folder sandbox.
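Whether the root is / or /apps/<name>, requests need to be confined to it. Below is a minimal sketch of that confinement, assuming client paths are interpreted relative to the configured working root; resolveRemotePath is a hypothetical helper name.

```typescript
import { posix } from "node:path";

// Resolves a client-supplied path against the working root and rejects escapes.
// Assumption: `requested` is relative to the root, even if it starts with "/".
function resolveRemotePath(root: string, requested: string): string {
  const normalizedRoot = posix.normalize("/" + root).replace(/\/+$/, "") || "/";
  const resolved = posix.join(normalizedRoot, requested).replace(/\/+$/, "") || "/";
  const prefix = normalizedRoot === "/" ? "/" : normalizedRoot + "/";
  if (resolved !== normalizedRoot && !resolved.startsWith(prefix)) {
    throw new Error(`path escapes working root: ${requested}`);
  }
  return resolved;
}
```

With the default root of / every normalized path is accepted, while a root of /apps/<name> rejects ../ traversal and sibling folders that merely share the same name prefix.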

Browse and Metadata

Useful official endpoints:

  • User info: GET /rest/2.0/xpan/nas?method=uinfo.
  • Quota: GET /api/quota.
  • List directory: GET /rest/2.0/xpan/file?method=list with dir, paging and sort parameters.
  • File metadata and download URL: GET /rest/2.0/xpan/multimedia?method=filemetas&fsids=[...]&dlink=1.

Upload

Official multipart upload sequence:

  1. Compute full-file MD5 and per-part MD5 list. For normal users, part size is fixed at 4 MiB. Docs list higher part and total file limits for paid membership tiers.
  2. POST /rest/2.0/xpan/file?method=precreate with path, size, isdir=0, autoinit=1, rtype and block_list.
  3. GET /rest/2.0/pcs/file?method=locateupload&appid=250528&uploadid=...&upload_version=2.0 and choose an HTTPS upload domain from servers.
  4. POST https://<upload-domain>/rest/2.0/pcs/superfile2?method=upload&type=tmpfile&path=...&uploadid=...&partseq=N as multipart form file=@chunk for each required part.
  5. POST /rest/2.0/xpan/file?method=create with the same path, size, isdir, rtype, uploadid and ordered block_list.
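Step 1 is the only part of the sequence that is pure computation, so it can be sketched without any network calls. The example below hashes an in-memory buffer for brevity; a real transfer job would presumably stream the staged file instead. The function names and the returned shape are hypothetical.

```typescript
import { createHash } from "node:crypto";

const PART_SIZE_BYTES = 4 * 1024 * 1024; // 4 MiB part size for normal users, per the text

function md5Hex(data: Uint8Array): string {
  return createHash("md5").update(data).digest("hex");
}

interface UploadPlan {
  size: number;
  contentMd5: string; // MD5 of the whole file
  blockList: string[]; // ordered MD5 of each part, as sent in block_list
}

// Splits the content into fixed-size parts and hashes each part in order.
function planUpload(content: Uint8Array, partSize: number = PART_SIZE_BYTES): UploadPlan {
  const blockList: string[] = [];
  for (let offset = 0; offset < content.length; offset += partSize) {
    blockList.push(md5Hex(content.subarray(offset, offset + partSize)));
  }
  return { size: content.length, contentMd5: md5Hex(content), blockList };
}
```

The index of each blockList entry corresponds to the partseq value used in step 4, and the same ordered list is sent again in step 5.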

For small files, the official single-step upload endpoint can be a convenience path, but the multipart path is enough for all sizes and gives progress/resume semantics.

Download

Official download sequence:

  1. Get fs_id from list/search.
  2. Request file metadata with dlink=1.
  3. Fetch dlink&access_token=<token> using User-Agent: pan.baidu.com.
  4. Follow 302 redirects, use Range to resume interrupted transfers, and stay within the documented 8-hour dlink lifetime.
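Steps 3 and 4 translate into a request description like the one below. This is a sketch under stated assumptions: the helper names are hypothetical, the token is appended as a plain query parameter as the text describes, and the caller is assumed to record when the dlink was issued.

```typescript
interface DownloadRequest {
  url: string;
  headers: Record<string, string>;
}

const DLINK_LIFETIME_MS = 8 * 60 * 60 * 1000; // documented 8-hour dlink lifetime

// Builds the URL and headers for fetching a dlink, optionally resuming mid-file.
function buildDownloadRequest(dlink: string, accessToken: string, resumeFromByte = 0): DownloadRequest {
  const separator = dlink.includes("?") ? "&" : "?";
  const url = `${dlink}${separator}access_token=${encodeURIComponent(accessToken)}`;
  const headers: Record<string, string> = { "User-Agent": "pan.baidu.com" };
  if (resumeFromByte > 0) {
    headers["Range"] = `bytes=${resumeFromByte}-`; // open-ended range resumes to end of file
  }
  return { url, headers };
}

// True while a dlink issued at `issuedAtMs` is still inside its lifetime.
function isDlinkUsable(issuedAtMs: number, nowMs: number): boolean {
  return nowMs - issuedAtMs < DLINK_LIFETIME_MS;
}
```

A job that resumes after the lifetime has passed would request fresh metadata with dlink=1 and continue from bytes_done with a new Range request.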

Because browsers cannot safely set the required User-Agent and current UniDesk proxy cannot stream large binary responses, the backend should download to staging storage in v1. Browser download can be offered later via a binary streaming proxy endpoint or a backend-owned short-lived internal file endpoint if the gateway/core are upgraded to stream bytes safely.

Third-Party Technology Options

  1. Official API in custom backend (recommended for v1)

    • Best fit for UniDesk security and UI conventions.
    • Precise control over token rotation, path sandboxing, transfer jobs and PostgreSQL persistence.
    • Needs implementation of multipart upload/download resume, but the API flow is straightforward.
  2. AList as a sidecar or reference implementation

    • AList already has a Baidu Netdisk driver and supports storage mounting through its own server.
    • Useful if we want WebDAV-like access or want to validate behavior quickly.
    • Treat it as an internal sidecar behind the baidu-netdisk backend; do not expose AList WebUI as the UniDesk frontend.
    • Watch license/upgrade/security posture before embedding it into production.
  3. bypy as a Python worker

    • Good for app-folder upload/download automation and quick scripts.
    • Can run in a worker container for batch operations if we accept Python dependency and app-folder assumptions.
    • Less ideal as the primary service API because UniDesk still needs its own auth state, job model and structured frontend.
  4. BaiduPCS-Go as a worker

    • Strong CLI for batch transfers and resume behavior.
    • Could be invoked from the service for jobs after controlled login/config injection.
    • Avoid making CLI config files the credential authority; PostgreSQL should remain authoritative.
  5. Unofficial or cracked web APIs

    • Avoid. They are unstable, hard to validate, and may violate Baidu terms or trigger account risk controls.

Proposed User Service Contract

Backend APIs

Expose a pure JSON control API first:

  • GET /health: service, storage, auth, queue and Baidu API reachability summary.
  • GET /api/auth/status: redacted configured/logged-in/auth-session summary.
  • POST /api/auth/device/start: start QR/device login.
  • GET /api/auth/device/status?sessionId=...: login state and QR metadata. OAuth authorization_pending and slow_down responses are normal pending states and must not be surfaced as frontend HTTP errors.
  • POST /api/auth/refresh: force token refresh for diagnostics.
  • POST /api/auth/logout: revoke local tokens and stop jobs.
  • GET /api/account: user info and quota.
  • GET /api/files?dir=/&start=0&limit=100: directory listing under the configured remote working root.
  • GET /api/files/meta?fsids=...&dlink=0|1: metadata, optionally with a dlink, which is redacted by default.
  • POST /api/folders: create folder through method=create&isdir=1.
  • POST /api/files/manage: copy/move/rename/delete using method=filemanager.
  • POST /api/transfers/upload-from-path: read a file inside the mounted staging directory and upload it to Baidu.
  • POST /api/transfers/download-to-path: download a Baidu file to the staging directory.
  • POST /api/self-test: create a tiny staging fixture, upload it, verify it appears in /api/files, download it back to staging and compare MD5.
  • GET /api/transfers: list transfer jobs.
  • GET /api/transfers/{id}: job detail, progress, retry and last error.
  • POST /api/transfers/{id}/cancel and POST /api/transfers/{id}/retry.
  • GET /logs: recent structured service logs with tokens/dlinks redacted.
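The rule for GET /api/auth/device/status, that pending OAuth responses are states and not errors, can be sketched as a small classifier. The type names and the backOff flag are hypothetical; only authorization_pending and slow_down are taken from the text, and any other error code is treated as a failure here.

```typescript
type DeviceLoginStatus = "pending" | "authorized" | "failed";

// Subset of the token endpoint response that matters for classification.
interface TokenPollBody {
  access_token?: string;
  error?: string;
}

interface PollOutcome {
  status: DeviceLoginStatus;
  backOff: boolean; // true when the next upstream poll should be delayed further
  message?: string;
}

function classifyTokenPoll(body: TokenPollBody): PollOutcome {
  if (typeof body.access_token === "string" && body.access_token.length > 0) {
    return { status: "authorized", backOff: false };
  }
  if (body.error === "authorization_pending") {
    return { status: "pending", backOff: false };
  }
  if (body.error === "slow_down") {
    return { status: "pending", backOff: true };
  }
  return { status: "failed", backOff: false, message: body.error ?? "unknown_error" };
}
```

The status endpoint can then answer HTTP 200 with status "pending" in both normal cases, so the frontend poll loop only sees an error when login has actually failed.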

If a future binary proxy is added, extend with:

  • POST /api/uploads/sessions + chunk PUT/POST endpoints.
  • GET /api/downloads/{jobId}/stream with Range support.

PostgreSQL Tables

Minimum schema:

  • baidu_netdisk_accounts(id, baidu_uid, username, avatar_url, vip_type, root_path, created_at, updated_at).
  • baidu_netdisk_tokens(account_id, access_token_ciphertext, refresh_token_ciphertext, expires_at, scope, generation, last_refresh_at).
  • baidu_netdisk_auth_sessions(id, device_code_ciphertext, user_code, verification_url, qrcode_url, expires_at, poll_interval_seconds, status, error, created_at, updated_at).
  • baidu_netdisk_transfer_jobs(id, account_id, direction, status, local_path, remote_path, fs_id, size_bytes, bytes_done, part_size, block_list_json, uploadid, retry_count, error, created_at, updated_at).
  • baidu_netdisk_transfer_events(id, job_id, level, message, data_json, created_at).

Token encryption key should come from an environment variable such as BAIDU_NETDISK_TOKEN_KEY; no secrets should be committed.
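The *_ciphertext columns imply authenticated encryption at rest, although the text does not name an algorithm. The sketch below assumes AES-256-GCM from Node's built-in crypto module with a 32-byte key; the storage layout (IV, then auth tag, then ciphertext, base64-encoded) and the function names are illustrative choices, not the service's actual format.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const IV_BYTES = 12; // recommended nonce size for GCM
const TAG_BYTES = 16; // default GCM authentication tag size

// Encrypts a token; output layout is base64(iv || tag || ciphertext).
function encryptToken(plaintext: string, key: Buffer): string {
  const iv = randomBytes(IV_BYTES);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return Buffer.concat([iv, tag, body]).toString("base64");
}

// Decrypts a value produced by encryptToken; throws if the data was tampered with.
function decryptToken(encoded: string, key: Buffer): string {
  const raw = Buffer.from(encoded, "base64");
  const iv = raw.subarray(0, IV_BYTES);
  const tag = raw.subarray(IV_BYTES, IV_BYTES + TAG_BYTES);
  const body = raw.subarray(IV_BYTES + TAG_BYTES);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(body), decipher.final()]).toString("utf8");
}
```

In this sketch the key would be decoded once at startup from the environment variable, for example from a base64 string that decodes to exactly 32 bytes, and never written to logs or the database.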

Container and Deployment

If deployed on D601, use the normal compute-node user-service boundary:

{
  "id": "baidu-netdisk",
  "name": "Baidu Netdisk",
  "providerId": "D601",
  "description": "Containerized Baidu Netdisk storage gateway with QR/device login and transfer jobs.",
  "repository": {
    "url": "https://github.com/pikasTech/baidu-netdisk-unidesk",
    "commitId": "<commit>",
    "dockerfile": "Dockerfile",
    "composeFile": "docker-compose.unidesk.yml",
    "composeService": "baidu-netdisk",
    "containerName": "baidu-netdisk-backend"
  },
  "backend": {
    "nodeBaseUrl": "http://host.docker.internal:3295",
    "nodeBindHost": "127.0.0.1",
    "nodePort": 3295,
    "proxyMode": "provider-gateway-http",
    "frontendOnly": true,
    "public": false,
    "allowedMethods": ["GET", "HEAD", "POST", "DELETE"],
    "allowedPathPrefixes": ["/health", "/logs", "/api/"],
    "healthPath": "/health",
    "timeoutMs": 30000
  },
  "development": {
    "providerId": "D601",
    "sshPassthrough": true,
    "worktreePath": "/home/ubuntu/baidu-netdisk-unidesk"
  },
  "frontend": {
    "route": "/apps/baidu-netdisk",
    "integrated": true
  }
}

If deployed on the main server, use a Compose-internal base URL such as http://baidu-netdisk:4244 and add a service to the root docker-compose.yml. Main-server deployment is justified only if UniDesk needs central storage on the main server; otherwise D601 or another compute node is cleaner.

Frontend Page

Add a dedicated src/components/frontend/src/baidu-netdisk.tsx page and route tab:

  • Login panel: QR image from qrcode_url, user code, expires timer, poll status, refresh QR and logout buttons.
  • Account cards: username, UID, quota used/total, VIP state, remote working root path.
  • File browser: breadcrumb rooted at the configured working root, now / by default, paginated table, folder creation, rename/delete/move controls.
  • Transfer panel: upload-from-path form, download-to-path form, job rows, progress bars, speed, ETA, retry/cancel buttons.
  • Safety text: private backend mapping, token storage redacted, no direct public Baidu token exposure.
  • Raw JSON only behind explicit buttons, following existing user-service conventions.

Acceptance Plan

Focused checks after implementation:

  • bun scripts/cli.ts microservice list shows baidu-netdisk, private backend, target provider and container summary.
  • bun scripts/cli.ts microservice health baidu-netdisk returns ok=true, service=baidu-netdisk, storage=postgres and redacted auth state.
  • bun scripts/cli.ts microservice proxy baidu-netdisk /api/auth/device/start --method POST returns a login session with QR/user-code metadata but no token.
  • After manual QR authorization, /api/account and /api/files?dir=<root> return user/quota/list data.
  • Upload/download tests can use bun scripts/cli.ts microservice proxy baidu-netdisk /api/self-test --method POST --raw; the response must include a remote path in the working root, an fsId, succeeded upload/download jobs, and matching expectedMd5/downloadedMd5.
  • Public port probes must fail for the service port; frontend access only through UniDesk.
  • Playwright verifies /app/baidu-netdisk/ renders shell, login card, account/quota, file browser, transfer panel and no naked JSON.
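The self-test acceptance criteria can be expressed as a checker over the response body. This is a sketch only: expectedMd5, downloadedMd5 and fsId are named in the text, while remotePath, uploadStatus, downloadStatus and the "succeeded" value are assumed field names and values for illustration.

```typescript
interface SelfTestResult {
  remotePath: string; // assumed field name
  fsId: number | string;
  uploadStatus: string; // assumed field name
  downloadStatus: string; // assumed field name
  expectedMd5: string;
  downloadedMd5: string;
}

// Returns a list of human-readable problems; an empty list means the check passed.
function selfTestProblems(result: SelfTestResult, workingRoot: string): string[] {
  const problems: string[] = [];
  const root = workingRoot.replace(/\/+$/, "") || "/";
  const prefix = root === "/" ? "/" : root + "/";
  if (!result.remotePath.startsWith(prefix)) problems.push("remote path is outside the working root");
  if (String(result.fsId).length === 0) problems.push("fsId is missing");
  if (result.uploadStatus !== "succeeded") problems.push("upload job did not succeed");
  if (result.downloadStatus !== "succeeded") problems.push("download job did not succeed");
  if (result.expectedMd5.length === 0 || result.expectedMd5.toLowerCase() !== result.downloadedMd5.toLowerCase()) {
    problems.push("MD5 mismatch");
  }
  return problems;
}
```

Returning a list of problems instead of a boolean lets an e2e script print every failed criterion in one run.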

Full regression should later add microservice:catalog-baidu-netdisk, microservice:baidu-netdisk-health, login-state checks and frontend:baidu-netdisk-integrated-visible to scripts/src/e2e.ts.

Open Risks

  • Baidu app review/permissions may block upload/download until the app is approved and scoped correctly.
  • Device Code QR expires quickly; frontend needs clear countdown and refresh behavior.
  • Refresh token is single-use per Netdisk docs; a race can force re-login if token rotation is not serialized.
  • Browser direct download is not a v1 fit because official dlink download requires User-Agent: pan.baidu.com and current UniDesk proxy cannot stream large binary safely.
  • Large upload/download jobs need resumable local job records and cleanup of temporary chunks/staged files.
  • Using AList, BaiduPCS-Go or bypy may introduce third-party license and maintenance risk; any configs or token caches they need should be derived from UniDesk PostgreSQL and never treated as authoritative.

Sources