Schema-Align: A lightweight skeleton unifier with kinematic constraints for cross-dataset human action recognition

Roman V. Kovalevych
Mykhaylo V. Lobachev

Abstract

Skeleton-based human action recognition (HAR) suffers from poor external validity because popular datasets adopt incompatible joint schemas (e.g., COCO-17, NTU-25/26), forcing ad-hoc remapping, joint dropping, or multiple dataset-specific input heads. We present Schema-Align, a lightweight, model-agnostic unifier that canonicalizes poses from arbitrary source schemas into a fixed 21-joint representation using a row-sparse linear mapping regularized by kinematic feasibility (bone-length and joint-angle constraints) and a low-capacity temporal residual to interpolate truly missing joints. The unifier is pretrained without action labels on mixed pose streams via cycle consistency, temporal predictability, and confidence-weighted losses, then plugged in before any HAR backbone (GCN/MSG3D/CTR-GCN/Transformer) with negligible latency overhead (<1%). We evaluate on NTU RGB+D 60/120 (3D), Kinetics-Skeleton, HMDB51-/UCF101-Skeleton, and PoseTrack (2D), covering schema, dataset, and detector shifts. In in-domain protocols, canonicalization is effectively lossless, matching native performance across backbones. In cross-dataset transfer, Schema-Align consistently reduces the accuracy drop relative to intersect-and-pad and dense linear remaps, and outperforms dataset-specific heads, particularly when the source and target schemas diverge (e.g., COCO↔NTU). Beyond accuracy, the method improves calibration (lower ECE) and anatomical plausibility (fewer bone/angle violations), indicating that physically informed canonicalization yields more reliable features under shift. Ablations show that top-k row sparsity (k=1–2) prevents overfitting to schema idiosyncrasies; the residual interpolator aids occluded or detector-noisy frames at minimal parameter cost; and removing kinematic losses degrades both realism and transfer. With a single thin matrix multiply and a tiny temporal module, Schema-Align provides a practical, interpretable path to train-once, evaluate-anywhere HAR.
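
The abstract gives no formulas, so the following is only an illustrative sketch of two ingredients it names: a top-k row-sparse linear map from a source schema to 21 canonical joints, and a bone-length penalty. It is not the authors' implementation. The function names, the random weight matrix, the bone list, and the reference lengths are all hypothetical; the 17-joint source size simply mirrors COCO-17, and the squared-error form of the penalty is an assumption.

```python
import numpy as np

def topk_row_sparse(W, k=2):
    """Keep the k largest-magnitude entries in each row of W and zero the rest."""
    W = np.asarray(W, dtype=float)
    out = np.zeros_like(W)
    idx = np.argsort(-np.abs(W), axis=1)[:, :k]   # (rows, k) column indices to keep
    rows = np.arange(W.shape[0])[:, None]          # (rows, 1), broadcasts against idx
    out[rows, idx] = W[rows, idx]
    return out

def canonicalize(pose, W):
    """Map pose (T, J_src, C) to (T, J_canon, C) with one matrix multiply per frame."""
    return np.einsum("cj,tjd->tcd", W, pose)

def bone_length_penalty(pose, bones, ref_lengths):
    """Mean squared deviation of bone lengths from reference lengths.

    bones is a list of (parent, child) joint indices (hypothetical skeleton).
    """
    parents = np.array([p for p, _ in bones])
    children = np.array([c for _, c in bones])
    lengths = np.linalg.norm(pose[:, children] - pose[:, parents], axis=-1)  # (T, B)
    return float(np.mean((lengths - np.asarray(ref_lengths, dtype=float)) ** 2))

# Hypothetical usage: 17 source joints mapped to 21 canonical joints.
rng = np.random.default_rng(0)
W = topk_row_sparse(rng.random((21, 17)), k=2)      # stand-in for a learned mapping
pose_src = rng.normal(size=(8, 17, 3))              # 8 frames of 3D source poses
pose_canon = canonicalize(pose_src, W)
penalty = bone_length_penalty(pose_canon, [(0, 1), (1, 2), (2, 3)], np.ones(3))
print(pose_canon.shape, penalty)
```

With k=1 or k=2, each canonical joint depends on at most one or two source joints, which is what makes such a mapping easy to inspect; in a trained unifier the penalty would presumably enter the loss alongside the other terms the abstract lists.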

Article Details

Section

Articles

Author Biographies

Roman V. Kovalevych, Odesa Polytechnic National University. 1, Shevchenko Ave. Odesa, 65044, Ukraine

Postgraduate Student of the Department of Artificial Intelligence and Data Analysis

Mykhaylo V. Lobachev, Odesa Polytechnic National University. 1, Shevchenko Ave. Odesa, 65044, Ukraine

PhD, Professor, Head of the Institute of Artificial Intelligence and Robotics

How to Cite

Schema-Align: A lightweight skeleton unifier with kinematic constraints for cross-dataset human action recognition. (2025). Informatics. Culture. Technology, 2, 266–272. https://doi.org/10.15276/ict.02.2025.40

References