3GPP AI/ML for 6G: Use Cases, Architecture & Standardization
IMT-2030 Roadmap  ·  TR 38.843  ·  TR 22.874  ·  TR 23.700-80  ·  Release 18–20 Study Items
NN-based channel estimation & TR 38.843 UC1/UC2
CSI feedback compression via autoencoders
ML-driven beam management & blockage prediction
AI-based sub-meter indoor positioning
Deep RL resource scheduling & multi-cell coordination
BS sleep-mode ML for 50% energy reduction
Semantic & task-oriented communications (JSCC)
End-to-end learned air interface design
Federated learning & split inference at the RAN
AI/ML model lifecycle: training, deployment, monitoring
Generative models for 6G channel synthesis
Digital-twin + ray-tracing channel fusion
NMSE / BER / spectral efficiency KPIs for AI evaluation
Rel-18 AI/ML for NR work items (RAN1/RAN2/SA2)
IMT-2030 native AI objectives & 6G timeline
Open challenges: generalization gap & standardization

This whitepaper surveys the 3GPP standardisation and research landscape for AI/ML integration in 6G systems. We cover 14 technical domains — from channel estimation and CSI feedback compression to federated learning and semantic communications — each grounded in active 3GPP study items and published research. System-level KPI targets from ITU-R IMT-2030 provide the performance context throughout.

ⓘ Citation Framework — 3GPP Normative vs. Academic Research Examples

This whitepaper draws from two distinct source categories. Readers and implementors should be aware of the difference:

✓ 3GPP Normative / Study Items
  • Use-case definitions: TR 38.843 UC1/UC2/UC3
  • KPI targets & evaluation methodology
  • Signalling & feedback formats (CSI reporting, model transfer TR 22.874)
  • AI/ML architecture integration (TR 23.700-80, O-RAN WG2)
  • IMT-2030 performance requirements (ITU-R M.2160)
  • Data collection for AI/ML (TR 37.817)
ⓘ Academic Research Examples (Not 3GPP-Mandated)
  • Specific NN architectures: ChannelNet, CsiNet, TransNet [A1–A3]
  • Semantic comms: DeepJSCC [A4]
  • E2E autoencoder transceiver [A5]
  • FL algorithm: FedAvg, FedProx [A6, A7]
  • RL schedulers: DQN, MADDPG, MAPPO [A8–A10]
  • Privacy: DP, DP-SGD, FGSM [A11–A13]

Key principle: 3GPP standardises what an AI model must achieve (KPIs, conformance envelopes, signalling) — it does not mandate specific neural-network architectures. The academic models listed above are peer-reviewed, publicly available examples illustrating how 3GPP requirements can be met. They are cited in the References section under entries [11]–[14] and [A1]–[A13] and are not proprietary.

Spectral Efficiency: 3× Rel-17
Energy Efficiency improvement: 100×
E2E Latency: < 1 ms
Positioning Accuracy (indoor): < 10 cm
Connection Density: 10⁷ /km²
Peak Rate: 1 Tbps
AI Integration: Native (not plug-in)
6G Deployment Timeline: 2030
§1 IMT-2030 Vision

§1.1 — The 6G Imperative

3GPP Releases 15–17 delivered the three pillars of 5G NR: eMBB (enhanced Mobile Broadband), URLLC (Ultra-Reliable Low-Latency Communications), and mMTC (massive Machine-Type Communications). These addressed the 2020 decade's connectivity needs. Yet the horizon of 2030 demands something qualitatively different: a network where artificial intelligence is not a post-hoc optimisation layer bolted on top of engineered signal-processing blocks, but a first-class participant in every link budget, every scheduling decision, and every interference mitigation event.

The International Telecommunication Union Radiocommunication Sector (ITU-R) codified this vision in Recommendation ITU-R M.2160 (11/2023) — Framework and overall objectives of the future development of IMT for 2030 and beyond, which established the technical and conceptual baseline for what is broadly termed 6G. The document's four overarching design principles — sustainability, security and resilience, connecting the unconnected, and ubiquitous intelligence — already signal that AI is not optional but structural.

Against this backdrop, IMT-2030 targets represent step-changes rather than incremental improvements over IMT-2020.

Analogy — Driver vs. Passenger: 5G treated AI as a passenger: it could observe the journey and occasionally suggest a route, but the vehicle (OFDM waveform, LDPC codec, MIMO precoder) ran on fixed mathematical rails. 6G treats AI as the driver. Just as 5G was designed with software-defined networking in mind from the outset, 6G is designed from the ground up with ML inference as a first-class network function — owning scheduling, beam management, channel estimation, and even waveform adaptation in real time.

The engineering implication is profound: classical signal-processing designs rest on tractable mathematical models (Gaussian channels, i.i.d. noise, linear precoding capacity bounds). AI-native 6G must operate reliably in the regime where these models break down — near-field propagation at sub-THz frequencies, extreme multipath in dense urban environments, and heterogeneous ISAC scenarios where the channel is simultaneously a communications medium and a sensing target.

Key insight: The 100× energy efficiency target cannot be achieved by hardware scaling alone. The ITRS roadmap suggests silicon efficiency improves ~2× per process node. Meeting 100× by 2030 requires algorithmic gains — specifically, ML-driven sleep-mode prediction, interference-aware dynamic spectrum sharing, and learned resource allocation that anticipates traffic rather than reacting to it.

§1.2 — Key 6G Usage Scenarios (ITU-R M.2160)

ITU-R M.2160 defines six usage scenarios for IMT-2030. Unlike the IMT-2020 triangle (eMBB / URLLC / mMTC), the IMT-2030 usage map is a hexagon, explicitly adding ISAC and AI/ML Communication as peer scenarios alongside classical broadband and IoT paradigms.

| Scenario | Full Name | Key Applications | AI/ML Requirement |
|---|---|---|---|
| IMMB | Immersive Mobile Broadband | Holographic telepresence, extended reality (XR), 8K 360° video, tactile internet | Predictive pre-fetching, view-dependent compression, AI-driven beamforming for Tbps links |
| HRLLC | Hyper-Reliable Low-Latency Comms | Tele-surgery, autonomous vehicles (V2X), industrial automation, smart grid control | AI-HARQ failure prediction, proactive resource reservation, learned reliability models |
| MC | Massive Communication | Industrial IoT, smart city sensors, AMI, environmental monitoring, livestock tracking | Traffic prediction, anomaly detection, ML-driven sleep scheduling for 10-year battery |
| UC | Ubiquitous Connectivity | Rural broadband via HAPS/satellite, maritime, aviation, underserved regions | AI-driven HAPS beam control, satellite-terrestrial handover prediction, coverage ML models |
| ISAC | Integrated Sensing & Communications | Vehicular radar, weather sensing, gesture recognition, simultaneous localization and mapping | Joint waveform optimization (AI-designed ISAC waveform), clutter suppression NN, target classification |
| AIAC | AI/ML Communication | AI model distribution OTA, federated learning over-the-air, semantic communications | AI is the payload: model transfer protocol, gradient compression, semantic encoding/decoding |

Of these, AIAC is novel to the IMT-2030 framework. It elevates AI from a network management tool to an explicit communication scenario: the network must support efficient transfer of trained models, gradients for federated learning, and semantically compressed data representations. This creates a feedback loop — AI improves the network, and the network carries AI.

Standards note — AIAC & Semantic Comms: Semantic communications — where the transmitter encodes meaning rather than raw bits — are studied in 3GPP SA1 TR 22.874 under AI/ML requirements and in multiple RAN1 study items for Rel-20/6G. The key challenge is defining a shared semantic knowledge base (SKB) between transmitter and receiver, analogous to a shared codebook but at the concept level.

§1.3 — 6G KPI Targets: IMT-2030 vs. IMT-2020

The table below compares the minimum technical performance requirements for IMT-2020 (as defined in ITU-R M.2410) against the IMT-2030 targets (ITU-R M.2160), with the AI enabler mechanism that bridges the gap identified for each KPI.

| KPI | IMT-2020 (5G NR) | IMT-2030 (6G) | Primary AI Enabler |
|---|---|---|---|
| Peak DL Rate | 20 Gbps | 1 Tbps (scenario-dep.) | AI near-field beamforming, sub-THz beam prediction |
| Spectral Efficiency | 30 bps/Hz (system) | ~100 bps/Hz (system) | AI-driven precoding, learned interference coordination |
| Energy Efficiency | Baseline (IMT-2020) | 100× per-bit improvement | ML sleep-mode prediction, traffic forecasting |
| User-plane Latency | 1 ms (URLLC) | 0.1 ms air-interface | Predictive pre-scheduling, proactive resource allocation |
| Reliability | 99.999% (5 nines) | 99.99999% (7 nines) | AI-HARQ, proactive link failure prediction |
| Positioning Accuracy | 10 cm (indoor, FR1) | 1–10 cm (indoor/outdoor) | AI fingerprinting, multi-modal sensor fusion |
| Connection Density | 10⁶ devices/km² | 10⁶–10⁸ devices/km² | ML-based access control, grant-free scheduling |
| Mobility | 500 km/h | 500–1000 km/h | AI channel prediction for high-Doppler environments |
| Area Traffic Capacity | 10 Mbps/m² | 30–50 Mbps/m² | AI spatial reuse, 3D cell shaping |
| Coverage | ~99% (terrestrial) | 99.999% (incl. HAPS/LEO) | AI-driven HAPS beam steering, LEO handover ML |
Key takeaway: 6G targets represent 1–3 orders of magnitude improvement over 5G. Meeting these simultaneously requires AI-driven adaptation at every layer of the protocol stack — no single traditional technique achieves all targets at once.

[1] ITU-R M.2160 (11/2023) — Framework and overall objectives of the future development of IMT for 2030 and beyond. Geneva: ITU-R, 2023.

[2] ITU-R M.2410-0 (11/2017) — Minimum requirements related to technical performance for IMT-2020 radio interface(s). Geneva: ITU-R, 2017.

§1.4 — The AI-Native Principle

"AI-native" is a loaded term that risks meaning everything and nothing. In the 6G context, ITU-R M.2160 and the emergent 3GPP 6G study items give it a precise technical meaning: AI/ML inference is a specified network function with defined interfaces, lifecycle management, and fallback behaviour — not an implementation detail hidden inside a vendor's baseband DSP.

Levels of AI Integration

We can characterise AI integration into wireless systems at three levels of increasing architectural depth:

Level 1 — Algorithm Replacement

A classical DSP block (e.g., MMSE channel estimator, Viterbi detector) is replaced by a trained neural network with the same I/O interface. The surrounding protocol stack is unchanged. This is the dominant paradigm in 3GPP Rel-18 AI work (TR 38.843 UC3: NN-based DMRS channel estimation).

Advantage: backward-compatible, low standardisation cost.
Limitation: gains bounded by the original algorithm's envelope.

Level 2 — Parameter Optimisation

AI optimises configuration parameters of existing algorithms in real time: beamforming codebook selection, HARQ round-trip prediction, handover thresholds, sleep mode timers. The algorithm structure is fixed; AI chooses the operating point. Prevalent in Rel-17 RAN Intelligence (TR 37.817) and Rel-18/19 network automation.

Advantage: moderate standardisation cost, deployable incrementally.
Limitation: cannot escape sub-optimality of the underlying algorithm.

Level 3 — Protocol Redesign

AI drives the protocol structure itself: semantic communication replaces bit-pipe abstraction; joint source-channel coding replaces the separation theorem's idealization; inference-driven scheduling replaces HARQ-RTT-bounded link adaptation. This is the 6G Phase-1 (2028+) ambition.

Advantage: fundamental capacity gains, new application scenarios.
Limitation: requires complete re-standardisation; interoperability risk.

Level 0 (reference) — Monitoring / SON

AI operates purely in the OAM plane: fault detection, KPI anomaly, capacity planning. No real-time RAN interaction. This covers Rel-15/16 SON/MDT and remains commercially widespread in 5G SA deployments today.

Advantage: no air-interface standardisation needed.
Limitation: cannot affect per-slot or per-beam decisions.

The Fundamental Learning Objective

Regardless of the level, every AI/ML component in the network can be framed as solving a regularised empirical risk minimisation problem. For a model parameterised by \(\boldsymbol{\theta} \in \mathbb{R}^p\):

Eq. 1.1 — Generalisation Objective (ERM)
$$\hat{\boldsymbol{\theta}} = \operatorname*{arg\,min}_{\boldsymbol{\theta}} \; \underbrace{\mathbb{E}_{(\mathbf{x},y)\sim\mathcal{D}} \!\left[\mathcal{L}\!\left(f_{\boldsymbol{\theta}}(\mathbf{x}),\, y\right)\right]}_{\text{generalisation loss}} \;+\; \lambda\,\Omega(\boldsymbol{\theta})$$

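where \(f_{\boldsymbol{\theta}}\) is the model, \(\mathcal{L}\) is the task loss (for example, squared error for channel estimation), \(\mathcal{D}\) is the data distribution encountered at deployment, \(\Omega(\boldsymbol{\theta})\) is a regulariser such as \(\|\boldsymbol{\theta}\|_2^2\), and \(\lambda \ge 0\) sets its weight. Because \(\mathcal{D}\) is unknown, training replaces the expectation with an average over samples drawn from \(\mathcal{D}_\text{train}\); that substitution is what makes the problem empirical risk minimisation.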

Critical challenge — Generalisation across scenarios: In practice, the distribution \(\mathcal{D}\) seen at deployment differs from the training distribution \(\mathcal{D}_\text{train}\). A channel estimator trained on 3GPP CDL-C (clustered delay line) may degrade by 5–8 dB NMSE when deployed in a real LoS-heavy industrial environment. This distribution shift problem is the primary barrier to commercial 6G AI deployment and motivates the federated learning, online adaptation, and model-lifecycle management work in 3GPP SA2/RAN3.
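To make Eq. 1.1 and the distribution-shift problem concrete, here is a minimal NumPy sketch. It assumes a linear model, squared loss and \(\Omega(\boldsymbol{\theta}) = \|\boldsymbol{\theta}\|_2^2\), for which the regularised ERM solution is available in closed form. The synthetic data, the size of the shift and all function names are illustrative assumptions, not taken from any 3GPP document.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Minimise mean squared loss + lam * ||theta||^2 in closed form."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(p), X.T @ y / n)

def mse(theta, X, y):
    return float(np.mean((X @ theta - y) ** 2))

rng = np.random.default_rng(0)
theta_true = np.array([1.0, -2.0, 0.5])

# Samples from the training distribution D_train
X_train = rng.normal(size=(500, 3))
y_train = X_train @ theta_true + 0.1 * rng.normal(size=500)
theta_hat = fit_ridge(X_train, y_train, lam=1e-2)

# Deployment distribution D: the input-output mapping has drifted (hypothetical shift)
theta_shifted = theta_true + np.array([0.5, 0.0, -0.5])
X_dep = rng.normal(size=(500, 3))
y_dep = X_dep @ theta_shifted + 0.1 * rng.normal(size=500)

train_loss = mse(theta_hat, X_train, y_train)
deploy_loss = mse(theta_hat, X_dep, y_dep)
print(f"train loss {train_loss:.4f}, deployment loss {deploy_loss:.4f}")
```

The model fits its training distribution well but loses accuracy once the underlying mapping changes. This is the same mechanism, in miniature, as a channel estimator trained on one channel family and deployed on another.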

Key Technical Challenges for AI-Native 6G

| Challenge | Technical Description | 3GPP Work Item |
|---|---|---|
| Distribution shift | Model trained in scenario A performs poorly in scenario B; environment non-stationarity (mobility, frequency, time) | TR 38.858 (Rel-19): online model adaptation; TR 23.700-80: model lifecycle |
| Inference latency | NN inference must complete within slot duration (~125 µs for 120 kHz SCS); constrained by UE compute budget | TR 38.843: inference endpoint definition (network-side vs. UE-side) |
| Model size / OTA transfer | Transferring a 10 MB model over the air consumes significant DL overhead; compression and delta-update mechanisms needed | TR 22.874: model transfer requirements; SA1 requirements for model metadata |
| Explainability | Regulators and operators require interpretable decisions; black-box NN scheduling is unacceptable for safety-critical HRLLC | Open study; ETSI ENI GS ENI 010 explainability framework |
| Privacy / data governance | Training data (channel measurements, UE locations) is sensitive; federated learning needed to keep data on-device | TR 23.700-80: federated learning support; 3GPP SA3 security requirements |
The AI-native protocol stack hierarchy: Traditional wireless stacks enforce strict layering: PHY → MAC → RLC → PDCP → SDAP. An AI-native stack allows cross-layer inference: a single neural network observes raw I/Q samples (PHY input), MAC buffer status, and RRC mobility state simultaneously to produce a joint scheduling + beamforming decision. This violates the OSI model but can yield gains inaccessible to layer-by-layer optimisation.
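The inference-latency constraint in the table above comes down to simple arithmetic. The sketch below uses the NR numerology relation (subcarrier spacing of 15·2^μ kHz gives a slot of 1 ms / 2^μ) to compute the slot duration and the sustained compute rate implied by a per-slot FLOP budget. The function names and the 50% safety margin are assumptions chosen for illustration.

```python
# NR numerology: SCS = 15 kHz * 2^mu, slot duration = 1 ms / 2^mu
_MU = {15: 0, 30: 1, 60: 2, 120: 3}

def slot_duration_us(scs_khz: int) -> float:
    """Slot duration in microseconds for a given subcarrier spacing."""
    return 1000.0 / (2 ** _MU[scs_khz])

def fits_in_slot(inference_us: float, scs_khz: int, margin: float = 0.5) -> bool:
    """True if inference uses at most `margin` of the slot (margin is an assumed budget)."""
    return inference_us <= margin * slot_duration_us(scs_khz)

def sustained_gflops(flop_per_slot: float, scs_khz: int) -> float:
    """Compute rate (GFLOP/s) needed to finish `flop_per_slot` within one slot."""
    slot_seconds = slot_duration_us(scs_khz) * 1e-6
    return flop_per_slot / slot_seconds / 1e9

for scs in (15, 30, 60, 120):
    print(scs, "kHz:", slot_duration_us(scs), "us per slot;",
          round(sustained_gflops(1e6, scs), 2), "GFLOP/s for a 1 MFLOP model")
```

Because the slot shrinks by half with each numerology step, the same model needs eight times the sustained compute at 120 kHz SCS as at 15 kHz SCS.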

[3] ITU-R M.2516 (07/2022) — Future technology trends of terrestrial IMT systems towards 2030 and beyond. Geneva: ITU-R, 2022.

[4] Industry Consortium, "A Research Outlook Towards 6G," White Paper, 2024 (updated). Intelligence Everywhere, Distributed Data Infrastructure, Autonomous Operations, AIaaS concepts.

The 6G vision establishes the why; → §2 traces the how — the 3GPP standardization roadmap from Rel-15 MDT through Rel-18/19/20 normative AI work items.

§2 3GPP AI/ML Standardization Roadmap

§2.1 — Release Timeline: From MDT to AI-Native

3GPP's approach to AI/ML has evolved through three distinct phases: (i) measurement and data collection infrastructure (Rel-15/16/17), (ii) AI/ML inference for specific RAN use cases (Rel-18/19, the "5G-Advanced" era), and (iii) AI-native air interface redesign (Rel-20 / 6G Phase-1, 2026–2030).

| Release | Freeze Year | Designation | Key AI/ML Items | Primary TRs / TSs |
|---|---|---|---|---|
| Rel-15 | 2018 | 5G NR Phase 1 | SON (Self-Organising Networks) baseline; MDT (Minimisation of Drive Tests) measurement collection; no dedicated AI/ML inference specs | TS 37.320 (MDT); TS 32.422 (SON) |
| Rel-16 | 2020 | 5G NR Phase 2 | NR V2X with ML-assisted handover candidate; IAB (Integrated Access & Backhaul); MDT enhancements; NWDAF (Network Data Analytics Function) introduced in 5GC | TS 37.320; TS 23.288 (NWDAF); TR 38.867 (V2X) |
| Rel-17 | 2022 | 5G NR Rel-17 | RAN Intelligence framework (TR 37.817): data collection architecture, KPM definition; NWDAF enhancements; Timing Resilience; RAN-CN coordination for AI inference split | TR 37.817 (RAN intelligence); TS 23.288 Rel-17; TR 28.908 (ANS) |
| Rel-18 | 2024 | 5G-Advanced Phase 1 | AI/ML for NR Air Interface (TR 38.843): beam management (UC1), CSI feedback compression (UC2), channel estimation (UC3); Network automation Level 3 (TS 28.xxx series); ISAC Phase 1 (TR 22.837); Ambient IoT; XR & NR-Light enhancements | TR 38.843; TR 22.874; TR 37.817 v18; TS 23.288 v18 |
| Rel-19 | 2025–2026 | 5G-Advanced Phase 2 | AI/ML enhancements for NR (TR 38.858): extended UC coverage, overhead reduction, model compression; Channel modeling for AI (TR 38.901 ext.); SLA-aware AI; Federated learning framework; ISAC Phase 2 | TR 38.858; TR 22.874 v19; TR 23.700-80 enhancements |
| Rel-20 | 2026–2027 | 6G Foundation | 6G system concept study; AI-native air interface feasibility; Semantic communications study item; Sub-THz channel modelling; AI-driven waveform adaptation | TR 38.8xx (6G RAN study); TR 22.8xx (6G SA1 requirements) |
| 6G Ph-1 | 2027–2028 | First 6G Standard | IMT-2030-compliant air interface; AI/ML as mandatory PHY/MAC function; ISAC Phase 3; Semantic comm. pilot specs; Native FL support | TS 38.xxx (6G NR); IMT-2030 compliance evaluation per M.2160 |

§2.2 — TR 38.843: AI/ML for NR Air Interface (Rel-18)

3GPP TR 38.843 ("Study on artificial intelligence (AI) / machine learning (ML) for NR air interface") is the pivotal Rel-18 document that translates IMT-2030 AI ambitions into concrete, evaluable use cases for the 5G NR air interface. It was completed in 2024 and is the first time 3GPP RAN formally studied how AI/ML inference can be integrated into the NR physical layer, covering inference endpoints, model transfer mechanisms, and evaluation criteria. As a Technical Report it is informative rather than normative; normative specification follows in subsequent work items. Three use cases (UCs) are studied in depth.

UC1 — Beam Management

Problem: In FR2 (mmWave) deployments, the beam sweeping overhead from SSB and CSI-RS measurements can consume 10–15% of slot resources. Beam failure is a leading cause of call drops at cell edge.

AI/ML approach: A neural network observes historical RSRP/RSRQ reports from N preceding slots and predicts the best beam index for the next K slots (beam prediction), reducing the beam sweeping interval. A separate classifier monitors BLER and SINR trends to trigger early beam failure detection before the physical layer declares BFR.

Inference endpoint: Either (a) gNB-side inference using UE measurement feedback, or (b) UE-side inference using locally received RSRPs — TR 38.843 evaluates both and concludes that gNB-side inference is preferred for Rel-18 to minimise UE compute requirements.

Evaluation criteria: Beam prediction accuracy (top-1 and top-K), overhead reduction ratio, L1-RSRP prediction NMSE, false alarm and miss detection rates for beam failure.

Key finding: NN-based beam prediction reduces beam sweeping overhead by 30–60% while maintaining >95% top-1 prediction accuracy in clustered urban macro scenarios — under the 3GPP CDL-D channel model. Generalisation to CDL-C (NLOS) requires either separate model training or domain adaptation.
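The top-1 and top-K beam prediction accuracy named in the evaluation criteria can be computed as below. This is a minimal NumPy sketch: the score matrix stands in for the output of any beam predictor, and the three-sample data set is made up purely to show the calculation.

```python
import numpy as np

def top_k_accuracy(scores, best_beam, k):
    """Fraction of samples whose true best beam is among the k highest-scored beams.

    scores:    (n_samples, n_beams) predicted score or RSRP per beam
    best_beam: (n_samples,) index of the genie-aided best beam
    """
    top_k = np.argsort(scores, axis=1)[:, -k:]          # indices of the k largest scores
    hits = (top_k == best_beam[:, None]).any(axis=1)
    return float(hits.mean())

# Toy example: 3 samples, 3 candidate beams (illustrative numbers only)
scores = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.3, 0.2],
                   [0.2, 0.3, 0.5]])
best = np.array([1, 1, 2])

print(top_k_accuracy(scores, best, k=1))  # 2 of 3 samples correct
print(top_k_accuracy(scores, best, k=2))  # all samples correct
```

Top-K accuracy matters operationally because the gNB can sweep only the K predicted beams instead of the full codebook, which is where the overhead reduction comes from.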

UC2 — CSI Feedback Compression

Problem: In FDD massive MIMO, the UE must measure the downlink channel and feed it back to the gNB. For a 32-port antenna panel, the raw CSI occupies hundreds to thousands of bits per subframe — a substantial uplink overhead burden that scales with antenna count.

AI/ML approach: A neural-network autoencoder (encoder at UE, decoder at gNB) compresses the full CSI matrix to a low-dimensional codeword. The encoder and decoder are trained jointly end-to-end to minimise reconstruction NMSE subject to a target bit budget (compression ratio η). CsiNet (Wen et al., 2018) established the canonical framework; subsequent models (CsiNet+, TransNet) add multi-rate training and transformer-based attention for long-range spatial-frequency correlation capture.

Inference endpoint: Encoder at UE (compress); decoder at gNB (reconstruct). Both sides must use a compatible model pair — introducing cross-vendor interoperability requirements not present in UC1 or UC3.

Evaluation criteria: NMSE of reconstructed channel vs. perfect CSI; beamforming gain loss relative to perfect CSIT; uplink feedback overhead in bits; UE encoder inference latency.

Key finding: TransNet achieves −15 dB NMSE at η=1/4 using ~80 bits, matching 3GPP Type II Enhanced codebook performance at approximately 5× lower feedback overhead (CDL-C, 32Tx, 13 subbands). Standardisation of the I/O interface (model ID, codeword format) is the primary Rel-18/19 outcome; the internal encoder/decoder architecture remains implementation-defined.
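The encoder-at-UE / decoder-at-gNB split can be illustrated with a deliberately simple stand-in: a linear autoencoder obtained from principal component analysis. This is not CsiNet or TransNet, it omits codeword quantisation, and it evaluates on the same synthetic data it was fitted to. It only shows the data flow and how reconstruction NMSE is measured at a compression ratio of 1/4. All dimensions and the synthetic channel statistics are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, dim, latent = 2000, 64, 16        # compression ratio 16/64 = 1/4

# Synthetic real-valued "CSI" vectors with low-rank structure plus noise (assumption)
basis = rng.normal(size=(8, dim))
H = rng.normal(size=(n_samples, 8)) @ basis + 0.05 * rng.normal(size=(n_samples, dim))

# "Training": principal directions of the data define a paired encoder/decoder
mean = H.mean(axis=0)
_, _, Vt = np.linalg.svd(H - mean, full_matrices=False)
W = Vt[:latent]                               # shape (latent, dim)

def encode(h):
    """UE side: project CSI onto the latent codeword."""
    return (h - mean) @ W.T

def decode(z):
    """gNB side: reconstruct CSI from the codeword."""
    return z @ W + mean

H_hat = decode(encode(H))
nmse_db = float(10 * np.log10(np.sum((H - H_hat) ** 2) / np.sum(H ** 2)))
print(f"codeword length {latent}, reconstruction NMSE {nmse_db:.1f} dB")
```

Note that `encode` and `decode` share the matrix `W`. That shared state is the linear analogue of the compatible model pair requirement, and it is why UC2 raises cross-vendor interoperability questions that one-sided use cases do not.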

UC3 — Channel Estimation Enhancement

Problem: Classical MMSE channel estimation performance degrades in high-mobility (Doppler) and multi-path-dense scenarios. Pilot density cannot be increased arbitrarily due to spectral efficiency loss.

AI/ML approach: A convolutional or attention-based neural network takes DMRS pilot observations (received pilot tones across frequency and multiple OFDM symbols) and produces an enhanced channel estimate \(\hat{\mathbf{H}}\) across all data subcarriers and time symbols in the slot — effectively performing interpolation, denoising, and Doppler extrapolation simultaneously.

Inference endpoint: UE-side (DL channel estimation) or gNB-side (UL channel estimation from SRS). TR 38.843 focuses primarily on DL, at the UE.

Evaluation criteria: Normalised Mean Square Error (NMSE) of channel estimate vs. perfect-CSI baseline; BLER vs. SNR curves; computational complexity (FLOP/slot) relative to MMSE baseline.

Key finding: NN-based estimators achieve 2–4 dB NMSE gain over LS estimation and 0.5–1.5 dB gain over MMSE at high Doppler (300 km/h, CDL-B), with 10–30 GFLOP/slot at the UE — feasible for mid-range UEs in 2024 silicon.

TR 38.843 Framework Architecture

| Framework Element | UC1 (Beam Mgmt) | UC2 (CSI Feedback) | UC3 (Ch. Estimation) |
|---|---|---|---|
| Inference endpoint | gNB (preferred Rel-18) | UE (encoder); gNB (decoder) | UE (DL); gNB (UL) |
| Training data source | Historical RSRP reports (MDT) | Simulated CDL channels; paired encoder-decoder training | Simulated CDL channels; OTA fine-tuning |
| Model transfer | gNB → UE optional (UE-side variant) | Network → UE (encoder model via Uu); gNB decoder vendor-internal | Network → UE (Xn/Uu signalling) |
| Fallback mode | Classical codebook beam sweeping (TS 38.214) | Type II codebook feedback (TS 38.214) | MMSE/LS estimation (TS 38.211) |
| Open issues (Rel-19) | Generalisation to CDL-C/D; online adaptation; overhead of RSRP history signalling | Cross-vendor encoder/decoder mismatch; quantization loss; online adaptation | Model compression for low-end UEs; SRS-based UL variant |

[5] 3GPP TR 38.843 v18.0.0 — Study on artificial intelligence (AI) / machine learning (ML) for NR air interface. 3rd Generation Partnership Project, 2024.

§2.3 — TR 22.874: AI/ML Model Transfer Requirements (SA1)

3GPP TR 22.874 ("Study on traffic characteristics and performance requirements for AI/ML model transfer in 5GS") is the SA1 requirements document that defines what the system must support for AI model distribution — the "logistics layer" without which inference at the UE is infeasible.

Key requirements established by TR 22.874 include:

Model Metadata

  • Model ID and version number (semantic versioning)
  • Input/output tensor format description (shape, dtype, normalisation)
  • Training dataset descriptor: channel model family, SNR range, UE speed, frequency band — enables the UE to assess model applicability
  • Computational complexity declaration (FLOP count, memory footprint)
  • Validity conditions: the environmental envelope within which the model is certified to meet performance targets
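One way to picture the metadata items above is as a small record that the UE checks before activating a model. The sketch below is a hypothetical data structure: the field names, types and the applicability rule are assumptions for illustration and do not reproduce any 3GPP-defined information element.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelMetadata:
    # All field names are hypothetical, chosen to mirror the list above
    model_id: str
    version: str                 # semantic version string
    input_shape: tuple
    dtype: str
    channel_models: tuple        # training dataset descriptor: channel families
    snr_range_db: tuple          # (min, max) SNR covered in training
    max_speed_kmh: float
    flops: int                   # declared complexity per inference
    memory_bytes: int

    def is_applicable(self, channel_model: str, snr_db: float, speed_kmh: float) -> bool:
        """True if the current conditions fall inside the declared validity envelope."""
        lo, hi = self.snr_range_db
        return (channel_model in self.channel_models
                and lo <= snr_db <= hi
                and speed_kmh <= self.max_speed_kmh)

meta = ModelMetadata(
    model_id="chest-demo", version="1.2.0", input_shape=(2, 14, 72), dtype="float16",
    channel_models=("CDL-B", "CDL-C"), snr_range_db=(-5.0, 25.0), max_speed_kmh=120.0,
    flops=5_000_000, memory_bytes=800_000,
)
print(meta.is_applicable("CDL-C", snr_db=10.0, speed_kmh=30.0))   # inside envelope
print(meta.is_applicable("CDL-D", snr_db=10.0, speed_kmh=30.0))   # untrained channel family
```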

Model Lifecycle Management

  • Model versioning and rollback: network must maintain N–1 model version for rollback when performance degrades below threshold
  • Model activation/deactivation signalling: RRC/MAC-CE mechanisms to switch active model at the UE without service interruption
  • Delta update support: transmit only changed weights (sparse diff) to reduce OTA overhead for incremental re-training
  • Model compression: quantisation (INT8/INT4), pruning, knowledge distillation are supported to reduce transfer size
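The delta-update idea in the list above can be sketched in a few lines: the sender transmits only the indices and new values of the weights that changed, and the receiver patches its stored copy. The flat weight vector, the tolerance parameter and the function names are illustrative assumptions; a real format would also need versioning and integrity protection.

```python
import numpy as np

def make_delta(old, new, tol=0.0):
    """Return (indices, values) for weights whose change exceeds tol."""
    changed = np.flatnonzero(np.abs(new - old) > tol)
    return changed, new[changed]

def apply_delta(old, indices, values):
    """Patch a stored weight vector with a sparse delta; the input is left untouched."""
    patched = old.copy()
    patched[indices] = values
    return patched

old_weights = np.array([0.10, -0.20, 0.30, 0.40, -0.50], dtype=np.float32)
new_weights = old_weights.copy()
new_weights[[1, 3]] = [-0.25, 0.45]          # incremental re-training touched two weights

idx, vals = make_delta(old_weights, new_weights)
restored = apply_delta(old_weights, idx, vals)
print(f"sent {len(idx)} of {old_weights.size} weights")
```

The saving depends on how sparse the update is. Fine-tuning only the last layers of a network is one common way to keep the changed set small.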

Fallback Behaviour

  • Mandatory fallback to rule-based algorithm when AI model performance monitor detects degradation (NMSE > threshold, beam prediction accuracy < threshold)
  • Fallback trigger: can be autonomous (UE-monitored) or network-commanded (gNB sends model deactivation)
  • Fallback must complete within one slot (125 µs at 120 kHz SCS) to avoid service interruption in URLLC scenarios
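A UE-autonomous fallback trigger of the kind described above might look like the following sketch. It averages the monitored NMSE over a short window and latches into the rule-based mode once the average exceeds a threshold. The threshold, window length, dB-domain averaging and class name are all assumptions for illustration.

```python
from collections import deque

class FallbackMonitor:
    """Windowed NMSE monitor that switches from AI inference to the classical fallback."""

    def __init__(self, nmse_threshold_db=-10.0, window=4):
        self.threshold = nmse_threshold_db
        self.history = deque(maxlen=window)
        self.mode = "ai"

    def update(self, nmse_db):
        """Record one NMSE measurement (dB) and return the mode to use next."""
        self.history.append(nmse_db)
        window_full = len(self.history) == self.history.maxlen
        if window_full and sum(self.history) / len(self.history) > self.threshold:
            self.mode = "fallback"            # latched until explicitly reactivated
        return self.mode

    def reactivate(self):
        """Network-commanded return to AI mode, e.g. after a model update."""
        self.history.clear()
        self.mode = "ai"

monitor = FallbackMonitor(nmse_threshold_db=-10.0, window=3)
modes = [monitor.update(v) for v in (-15.0, -14.0, -13.0, -6.0, -5.0, -4.0)]
print(modes)
```

Latching avoids rapid toggling between the two estimators when performance hovers near the threshold. The cost is that a return to AI mode needs an explicit reactivation step.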

Transport Requirements

  • QoS class: AI model transfer uses a dedicated QFI (QoS Flow ID) with high-reliability, large-MTU transport
  • Typical model size range: 100 KB – 50 MB depending on architecture; background transfer preferred to avoid latency spikes
  • Unicast model delivery to UE; multicast delivery to UE groups sharing the same model (cell-specific beam management models)

[6] 3GPP TR 22.874 v18.2.0 — Study on traffic characteristics and performance requirements for AI/ML model transfer in 5GS. 3GPP SA WG1, 2024.

§2.4 — TR 23.700-80: AI/ML Architecture (SA2)

3GPP TR 23.700-80 ("Study on enablers for network automation for the 5G system — Phase 3") is the SA2 document that defines the system-level architecture for deploying AI/ML in the 5G core and RAN. It builds on the NWDAF (Network Data Analytics Function) introduced in Rel-16 and elevates it to a full AI/ML orchestration plane.

Architecture Elements

| Element | Location | Function | Rel. introduced |
|---|---|---|---|
| NWDAF | 5GC | Network data analytics; model training pipeline; analytics service exposure via Nnwdaf API | Rel-16 |
| MTLF | 5GC (NWDAF sub-function) | Model Training Logical Function: manages training jobs, dataset curation, model versioning | Rel-17 |
| AnLF | 5GC (NWDAF sub-function) | Analytics Logical Function: serves inference requests from consumers (AMF, SMF, PCF, gNB) | Rel-17 |
| AI/MLNF | 5GC / RAN boundary | AI/ML Network Function: generic inference server, model repository, lifecycle manager; introduced as architectural evolution in TR 23.700-80 | Rel-18/19 |
| gNB AI | gNB (gNB-CU / gNB-DU) | RAN-side inference: beam management (UC1), CSI feedback decoding (UC2), channel estimation (UC3); receives models via O1/E2 or Xn | Rel-18 (TR 38.843) |
| UE AI | UE | Device-side inference: DL channel estimation (UC3), beam prediction (UC1 UE variant), CSI feedback encoding (UC2); model stored in UE non-volatile memory | Rel-18 (TR 38.843) |

Federated Learning Support

A key architectural feature of TR 23.700-80 (Rel-19 target) is federated learning (FL) support — enabling model training without transmitting raw measurement data to the network:

  1. FL server (hosted in NWDAF/MTLF) initialises a global model \(\boldsymbol{\theta}^{(0)}\) and distributes it to participating UEs via the model transfer mechanism (TR 22.874)
  2. Each UE \(k\) computes local gradients \(\mathbf{g}_k = \nabla_{\boldsymbol{\theta}} \mathcal{L}_k(\boldsymbol{\theta})\) on its local channel measurements
  3. UEs transmit compressed gradients (or model diffs) \(\Delta\boldsymbol{\theta}_k\) to the FL server over the NR uplink
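  4. The FL server performs federated averaging: \(\boldsymbol{\theta}^{(t+1)} = \boldsymbol{\theta}^{(t)} + \eta \sum_k w_k \Delta\boldsymbol{\theta}_k\), where \(\Delta\boldsymbol{\theta}_k\) is UE \(k\)'s local model update relative to \(\boldsymbol{\theta}^{(t)}\), \(\eta\) is the server step size (\(\eta = 1\) gives plain FedAvg; this \(\eta\) is unrelated to the compression ratio in §2.2), and \(w_k = |D_k| / \sum_j |D_j|\) weights each UE by its dataset size. If UEs send raw gradients \(\mathbf{g}_k\) instead of model diffs, the update term takes a negative sign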
  5. Updated global model is pushed back to all UEs; iterate until convergence
Privacy note: While FL prevents raw measurement data from leaving the UE, gradient inversion attacks (Zhu et al., 2019) can reconstruct training data from shared gradients. TR 23.700-80 mandates differential privacy noise injection and secure aggregation protocols to counter this threat in the HRLLC and AIAC scenarios.
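The five-step federated loop above can be sketched in NumPy as follows. Each simulated UE runs a few gradient steps on a local least-squares problem and returns a model diff, and the server applies the size-weighted average. The linear model, learning rates, data sizes and function names are assumptions; gradient compression, differential privacy and secure aggregation are deliberately left out.

```python
import numpy as np

def local_update(theta, X, y, lr=0.1, steps=5):
    """Steps 2-3: a few local gradient steps; returns the model diff (delta theta_k)."""
    local = theta.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ local - y) / len(y)   # gradient of the local mean squared loss
        local -= lr * grad
    return local - theta

def fedavg_round(theta, datasets, eta=1.0):
    """Step 4: theta <- theta + eta * sum_k w_k * delta_k with w_k = |D_k| / sum_j |D_j|."""
    sizes = np.array([len(y) for _, y in datasets], dtype=float)
    weights = sizes / sizes.sum()
    deltas = [local_update(theta, X, y) for X, y in datasets]
    return theta + eta * sum(w * d for w, d in zip(weights, deltas))

rng = np.random.default_rng(0)
theta_true = np.array([2.0, -1.0])

# Three UEs with different amounts of local data; raw data never leaves the "UE"
datasets = []
for n in (50, 100, 200):
    X = rng.normal(size=(n, 2))
    datasets.append((X, X @ theta_true + 0.01 * rng.normal(size=n)))

theta = np.zeros(2)                                  # step 1: global model initialisation
for _ in range(30):                                  # step 5: iterate until convergence
    theta = fedavg_round(theta, datasets)
print(theta)
```

Only `deltas` cross the boundary between `local_update` and `fedavg_round`. That boundary is exactly what the privacy note is about: gradient-inversion attacks operate on these diffs.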

Data Collection Architecture

Training data for network-side AI models flows through the existing MDT/RAN measurement infrastructure (TS 37.320), augmented in Rel-17/18.

[7] 3GPP TR 23.700-80 v19.0.0 — Study on enablers for network automation for the 5G system — Phase 3 (Rel-19). 3GPP SA WG2, 2025.

[8] 3GPP TR 37.817 v17.0.0 — Study on enhancement for data collection for NR and ENDC. 3GPP RAN3, 2022.

§2.5 — Standardization Ecosystem

3GPP does not operate in isolation. The AI/ML for 6G standardisation effort is a multi-body endeavour, with different organisations owning different layers of the stack:

| Body | Scope | Key AI/ML Deliverables | Interface to 3GPP |
|---|---|---|---|
| 3GPP RAN WGs | NR air interface PHY/MAC/RRC | TR 38.843 (UC1/2/3); TR 38.858 (Rel-19 AI enhancements); Rel-20/6G study | N/A (primary body) |
| 3GPP SA WGs | System architecture, services, security | TR 22.874 (model transfer); TR 23.700-80 (AI/ML arch.); SA3 AI security | SA1→RAN (requirements); SA2→RAN3 (architecture) |
| ITU-R WP5D | IMT spectrum, framework, evaluation | M.2160 (IMT-2030 framework); M.2412 (channel models); M.2516 (tech trends) | 3GPP submits RIT to ITU-R for IMT-2030 evaluation (2027–2028) |
| ETSI ENI | Experiential Networked Intelligence (management plane) | GS ENI 001 (terminology); ENI 005 (architecture); ENI 010 (explainability); ENI 019 (closed-loop automation) | Provides AI management framework complementary to 3GPP OAM |
| O-RAN Alliance | Open RAN architecture, xApp/rApp ecosystem | Near-RT RIC (xApp AI: <1 s loop); Non-RT RIC (rApp AI: >1 s loop); O1/A1/E2 interfaces for ML model delivery | O-RAN specs reference 3GPP NR; O-RAN xApp model delivery uses TR 22.874 mechanisms |
| IEEE 802.11bf | Wi-Fi sensing (ISAC for WLAN) | WLAN sensing amendment; AI-based gesture/presence detection | Orthogonal to 3GPP NR; convergence expected in 6G heterogeneous ISAC |
3GPP methodology note — Use-case-driven standardisation: 3GPP's approach to AI standardisation is deliberately use-case driven: the RAN plenary first identifies concrete, measurable scenarios (beam management, channel estimation, positioning), then defines quantitative evaluation criteria (NMSE, beam prediction accuracy, BLER), and only then specifies the interfaces and signalling necessary to support the evaluated mechanisms. This is a deliberate departure from some academic proposals that start from the AI algorithm and work backwards to a use case. The advantage is that every standardised AI feature has a documented performance baseline against which implementations can be tested. The disadvantage is that purely exploratory AI research (e.g., end-to-end learned transceivers) may take 2–3 release cycles longer to reach a normative spec than in an algorithm-first process.

O-RAN xApp/rApp AI Framework

The O-RAN Alliance's RIC (RAN Intelligent Controller) provides a parallel standardisation track for AI at the network management and resource orchestration layer, complementary to and operationally integrated with 3GPP's air-interface AI work.

Convergence point: By 2027, the 3GPP Rel-20 / O-RAN specifications are expected to define a unified RAN AI Function (RAIF) that integrates the O-RAN xApp runtime with the 3GPP NWDAF/AI-MLNF model lifecycle — enabling a single AI/ML model deployed across the O-RAN non-RT RIC (training), near-RT RIC (parameter optimisation, Level 2), and in-gNB inference engine (Level 1 algorithm replacement), all managed via a common O1/E2 model delivery pipeline.

[9] O-RAN Alliance, "O-RAN Architecture Description v07.00," O-RAN.WG1.AD-v07.00, 2023. Near-RT RIC, Non-RT RIC, A1/E2/O1 interfaces.

[10] ETSI GS ENI 010 v1.1.1 — Experiential Networked Intelligence (ENI): Explainability of AI-based network management. ETSI ENI ISG, 2024.

[11] 5G Americas, "5G-Advanced Overview," White Paper, 2024. AI/ML automation, energy efficiency, 6G evolution path.

The standardization roadmap sets the regulatory and specification context. → §3 begins the technical deep-dive, examining AI-based channel estimation — the first and most mature AI use case in the 3GPP TR 38.843 framework.

§3 AI-based Channel Estimation (TR 38.843 Use Case 3)

§3.1 — The Channel Estimation Problem

In 5G NR, the receiver uses Demodulation Reference Signals (DMRS) to estimate the radio channel before data detection. The gNB or UE observes a pilot-bearing received matrix

Eq. 3.1 — MIMO Pilot Signal Model
$$\mathbf{Y}_p = \mathbf{H}\,\mathbf{X}_p + \mathbf{N}$$

where \(\mathbf{H} \in \mathbb{C}^{N_r \times N_t}\) is the channel matrix, \(\mathbf{X}_p\) are known pilot symbols, and \(\mathbf{N}\) is additive white Gaussian noise with variance \(\sigma_n^2\). The goal is to recover \(\hat{\mathbf{H}}\) for all resource elements, including those carrying data.

Classical Estimators

Least-Squares (LS)

Eq. 3.2 — Least-Squares Channel Estimator
$$\hat{\mathbf{H}}_{\text{LS}} = \mathbf{Y}_p\,\mathbf{X}_p^H \!\left(\mathbf{X}_p\,\mathbf{X}_p^H\right)^{-1}$$

Simple, closed-form, no channel statistics needed. Noise-amplifying — NMSE floor limited by pilot SNR alone.

Linear MMSE (LMMSE)

Eq. 3.3 — MMSE Channel Estimator
$$\hat{\mathbf{H}}_{\text{MMSE}} = \mathbf{R}_H\!\left(\mathbf{R}_H + \sigma_n^2\!\left(\mathbf{X}_p\mathbf{X}_p^H\right)^{-1}\right)^{-1} \hat{\mathbf{H}}_{\text{LS}}$$

Optimal in the mean-square sense among linear estimators. Requires the channel covariance matrix \(\mathbf{R}_H = \mathbb{E}[\mathbf{h}\mathbf{h}^H]\), which varies with propagation environment.

Key insight: MMSE is optimal when \(\mathbf{R}_H\) is perfectly known, but in practice this matrix must be estimated from data — often in a non-stationary environment. Neural networks can implicitly learn \(\mathbf{R}_H\) from training examples without any explicit statistical modeling step.
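The two classical estimators can be compared numerically. The sketch below assumes the single-antenna OFDM specialisation of Eqs. 3.2 and 3.3, in which \(\mathbf{X}_p\) is diagonal (one pilot per subcarrier) and \(\mathbf{R}_H\) is the frequency-domain covariance. The exponential covariance model, the function names and all simulation parameters are illustrative assumptions, not values from TR 38.843.

```python
import numpy as np

def ls_estimate(y, x):
    """Eq. 3.2 for diagonal pilots: divide each received pilot by the known symbol."""
    return y / x

def lmmse_estimate(h_ls, x, r_h, noise_var):
    """Eq. 3.3: filter the LS estimate with the channel covariance r_h."""
    xxh_inv = np.diag(1.0 / np.abs(x) ** 2)      # (X_p X_p^H)^{-1} for diagonal X_p
    w = r_h @ np.linalg.inv(r_h + noise_var * xxh_inv)
    return w @ h_ls

def compare_ls_lmmse(n_f=64, rho=0.95, snr_db=0.0, trials=200, seed=0):
    """Monte-Carlo NMSE (Eq. 3.4, in dB) of both estimators under an assumed covariance."""
    rng = np.random.default_rng(seed)
    idx = np.arange(n_f)
    r_h = rho ** np.abs(idx[:, None] - idx[None, :])   # illustrative exponential model
    chol = np.linalg.cholesky(r_h)
    noise_var = 10.0 ** (-snr_db / 10.0)               # unit channel and pilot power
    x = np.full(n_f, np.exp(1j * np.pi / 4))           # unit-modulus pilots
    err_ls = err_mmse = power = 0.0
    for _ in range(trials):
        z = (rng.standard_normal(n_f) + 1j * rng.standard_normal(n_f)) / np.sqrt(2)
        h = chol @ z                                    # channel with covariance r_h
        n = np.sqrt(noise_var / 2) * (rng.standard_normal(n_f)
                                      + 1j * rng.standard_normal(n_f))
        h_ls = ls_estimate(x * h + n, x)
        h_mmse = lmmse_estimate(h_ls, x, r_h, noise_var)
        err_ls += np.sum(np.abs(h - h_ls) ** 2)
        err_mmse += np.sum(np.abs(h - h_mmse) ** 2)
        power += np.sum(np.abs(h) ** 2)
    return 10 * np.log10(err_ls / power), 10 * np.log10(err_mmse / power)

nmse_ls_db, nmse_lmmse_db = compare_ls_lmmse()
print(f"LS: {nmse_ls_db:.1f} dB, LMMSE: {nmse_lmmse_db:.1f} dB")
```

Here the LMMSE filter is given the true covariance, which is the idealised case. Replacing `r_h` with a mismatched matrix is a quick way to see the sensitivity described above.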

Performance Metric

Eq. 3.4 — NMSE Definition
$$\text{NMSE} \;=\; \frac{\mathbb{E}\!\left[\left\|\mathbf{H} - \hat{\mathbf{H}}\right\|_F^2\right]} {\mathbb{E}\!\left[\left\|\mathbf{H}\right\|_F^2\right]}$$

Reported in dB: \(\text{NMSE}_{\text{dB}} = 10\log_{10}(\text{NMSE})\). Lower is better. A gain of 3 dB in NMSE corresponds to halving the mean squared estimation error, directly improving PDSCH throughput.
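As a small illustration of Eq. 3.4 and the 3 dB rule, the following sketch computes the NMSE in dB over a batch of channel realisations. The function name and the random test data are assumptions made for the example.

```python
import numpy as np

def nmse_db(h_true, h_est):
    """Eq. 3.4 over a batch: total squared error divided by total channel power, in dB."""
    err = np.sum(np.abs(h_true - h_est) ** 2)
    ref = np.sum(np.abs(h_true) ** 2)
    return 10.0 * np.log10(err / ref)

rng = np.random.default_rng(0)
h = rng.standard_normal((100, 4, 2)) + 1j * rng.standard_normal((100, 4, 2))
e = rng.standard_normal(h.shape) + 1j * rng.standard_normal(h.shape)

a = nmse_db(h, h + 0.1 * e)
b = nmse_db(h, h + 0.1 * e / np.sqrt(2))   # same error pattern at half the power
print(round(a - b, 2))                      # 3.01
```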

§3.2 — Neural Network Channel Estimator

The AI channel estimator replaces the classical interpolation + MMSE smoothing block. Three broad architecture families are in active study:

1. Interpolation CNN (ChannelNet style)

Takes the sparse pilot observations \(\mathbf{Y}_p \in \mathbb{C}^{N_p \times N_f}\) placed on the DMRS grid and outputs a dense channel estimate across all resource elements:

Eq. 3.5 — DNN Output Tensor Dimension
$$\hat{\mathbf{H}} \in \mathbb{C}^{N_t \times N_r \times N_f \times N_{\text{sym}}}$$

Convolutional layers exploit local time-frequency correlation. Upsampling layers (bilinear or transposed convolution) fill the data REs from pilot REs. Works well in ETU and EPA channels.

2. Denoising DNN

Uses the LS estimate as input and applies a deep residual network to suppress noise:

Eq. 3.6 — DNN Residual Channel Estimator
$$\hat{\mathbf{H}}_{\text{DNN}} = \hat{\mathbf{H}}_{\text{LS}} - f_\theta\!\left(\hat{\mathbf{H}}_{\text{LS}}\right)$$

Residual learning stabilizes training. The network learns the noise pattern, not the channel directly. Lower computational cost vs interpolation CNN.

3. Transformer-based Estimator

Self-attention over pilot positions captures long-range channel correlations that are difficult to encode in local convolutional kernels. Particularly effective for sparse pilot configurations and high-order MIMO (8+ layers).

5G NR DMRS Pilot Density

For DMRS Type 1 (Release 15 baseline): 6 pilot REs per PRB per DMRS symbol, interleaved with data. For a 132-PRB allocation with 1 DMRS symbol per slot:

  • Pilot REs: \(132 \times 6 = 792\)
  • Total REs per slot: \(132 \times 12 \times 14 = 22{,}176\)
  • Interpolation factor: \(\approx 28\times\)

The CNN must interpolate across this 28× gap, exploiting both frequency-domain correlation (coherence bandwidth) and time-domain correlation (coherence time).
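The pilot-density arithmetic above can be reproduced with a few lines. The helper name and its default arguments are assumptions that mirror the numbers in the text (6 pilot REs per PRB per DMRS symbol, 12 subcarriers per PRB, 14 symbols per slot).

```python
def dmrs_interpolation_factor(n_prb, pilots_per_prb=6, sc_per_prb=12,
                              sym_per_slot=14, dmrs_symbols=1):
    """Return (pilot REs, total REs, total/pilot ratio) for one slot."""
    pilot_res = n_prb * pilots_per_prb * dmrs_symbols
    total_res = n_prb * sc_per_prb * sym_per_slot
    return pilot_res, total_res, total_res / pilot_res

print(dmrs_interpolation_factor(132))   # (792, 22176, 28.0)
```

With a second DMRS symbol per slot (`dmrs_symbols=2`) the same helper gives a factor of 14, which shows how additional DMRS trades overhead against interpolation distance.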

Training Loss and Procedure

Eq. 3.7 — MSE Training Loss
$$\mathcal{L}_{\text{MSE}}(\theta) = \frac{1}{N}\sum_{k=1}^{N} \left\|\mathbf{H}_k - \hat{\mathbf{H}}_k(\theta)\right\|_F^2$$

Training data is generated from a 3GPP channel model (CDL-C, ETU, EPA) using a system-level simulator. Complex-valued inputs are represented as two real channels (real + imaginary stacked along the feature axis). Adam optimizer, learning rate \(10^{-3}\) with cosine decay.
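The loss in Eq. 3.7 and the real/imaginary stacking can be sketched in NumPy as follows. The function names, the batch shape, and the choice to let the cosine schedule decay all the way to zero are assumptions for illustration; a training framework would supply its own optimizer and scheduler.

```python
import numpy as np

def to_real_channels(h, axis=1):
    """Stack real and imaginary parts as two real feature channels along `axis`."""
    return np.stack([h.real, h.imag], axis=axis)

def mse_loss(h_true, h_pred):
    """Eq. 3.7: batch mean of the squared Frobenius error (first axis = batch)."""
    sq = np.abs(h_true - h_pred) ** 2
    return float(np.mean(np.sum(sq.reshape(sq.shape[0], -1), axis=1)))

def cosine_lr(step, total_steps, base_lr=1e-3):
    """Cosine decay from base_lr to zero (the end value is an assumption)."""
    return 0.5 * base_lr * (1.0 + np.cos(np.pi * step / total_steps))

rng = np.random.default_rng(0)
shape = (8, 14, 24)   # hypothetical batch: 8 grids of 14 symbols x 24 subcarriers
h = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
h_hat = h + 0.1 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

loss_complex = mse_loss(h, h_hat)
loss_stacked = mse_loss(to_real_channels(h), to_real_channels(h_hat))
print(np.isclose(loss_complex, loss_stacked))   # True
```

The equality holds because the squared magnitude of a complex error equals the sum of the squared real and imaginary errors, so a real-valued network trained on the stacked representation minimises the same objective.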

ChannelNet Architecture
Y_p [real/imag stacked]
 → Conv2D(32, 3×3) → BatchNorm → ReLU
 → Conv2D(32, 3×3) → BatchNorm → ReLU
 → Conv2D(32, 3×3) → BatchNorm → ReLU
 → Bilinear Upsample (×pilot_spacing)
 → Conv2D(16, 3×3) → BatchNorm → ReLU
 → Conv2D(16, 3×3) → BatchNorm → ReLU
 → Conv2D(2, 1×1) [linear]
 → Ĥ (complex channel per RE)

Input: real/imag stacked, shape [2, N_p, N_f]. Output: complex channel per RE, shape [2, N_sym, N_f]. Parameters: ~85K, compared with \(N_f^2 \approx 1{,}584^2 \approx 2.5 \times 10^6\) entries for a classical MMSE covariance matrix.

NMSE vs SNR Performance (ETU-70)

Under the 3GPP ETU-70 channel model (Extended Typical Urban, 70 Hz Doppler), carrier frequency 3.5 GHz, subcarrier spacing 30 kHz (NR μ=1):

  • At SNR = −5 dB: NN → −16.5 dB NMSE; MMSE → −14.5 dB; LS → −0.5 dB
  • At SNR = 0 dB: NN → −21.5 dB; MMSE → −19.0 dB; LS → −5.5 dB
  • At SNR = +10 dB: NN → −28.5 dB; MMSE → −26.5 dB; LS → −15.5 dB
  • At SNR = +20 dB: NN → −35.2 dB; MMSE → −34.5 dB; LS plateaus ≈ −21 dB (interpolation error floor limits further improvement)
The NN advantage over MMSE is 2.0–2.5 dB for SNR ≤ 10 dB, where accurate channel knowledge is most critical for link reliability, and shrinks to 0.7 dB at 20 dB. At high SNR, NN and MMSE continue to improve toward the Cramér-Rao lower bound, while LS with linear interpolation plateaus at an error floor set by the pilot spacing and Doppler spread. This floor is the fundamental limitation of interpolation-based approaches in time-varying channels.

[3] 3GPP TR 38.843 v18.0.0 — Study on Artificial Intelligence (AI)/Machine Learning (ML) for NR Air Interface, §6.3 (Channel Estimation Use Case), 2024.

§3.3 — TR 38.843 Use Case 3 (Channel Estimation)

3GPP's TR 38.843 formally studies AI/ML for the NR air interface. Use Case 3 directly addresses AI-based channel estimation. Key findings from §6.3:

| Parameter | TR 38.843 Specification | Notes |
| --- | --- | --- |
| Inference endpoint | UE-side or network-side | UL: gNB estimates; DL: UE estimates |
| Model input | DMRS measurements (pilot REs) | No waveform changes required |
| Model output | Channel estimate on data REs | Replaces legacy interpolation block |
| Standardization scope | Input/output interface only | Internal AI model not standardized |
| Signaling needed | Model ID, capability flag | RRC/MAC-CE based negotiation |
| Performance target | NMSE gain 1–3 dB vs MMSE | High-Doppler: v > 120 km/h |
| Evaluation channels | CDL-A/B/C/D/E, ETU, AWGN | Per TR 38.901 channel models |
| Fallback mechanism | Revert to MMSE if NMSE > threshold | Robustness requirement |

Standardization philosophy: 3GPP avoids specifying the internal architecture of the AI model. Only the input format (pilot RE locations + values) and output format (channel estimate grid) are standardized, preserving vendor freedom while enabling interoperability.

Evaluation Scenarios

TR 38.843 defines three evaluation scenarios for Use Case 3 (channel estimation):

  • Scenario A — Indoor Hotspot (InH-Office): Low Doppler (pedestrian 3 km/h), CDL-A channel. Baseline methods already perform well; AI gain is modest (~1 dB).
  • Scenario B — Urban Macro (UMa): Medium Doppler (30 km/h), CDL-C. Moderate AI benefit (~1.5–2 dB) due to intra-slot channel variation.
  • Scenario C — High Mobility (V2X): High Doppler (120–500 km/h), ETU-70. Maximum AI benefit (~2–4 dB) because MMSE static-within-slot assumption breaks down.
Model mismatch risk: An AI model trained on CDL-C may degrade below LS performance when deployed in a CDL-A or outdoor LOS environment. TR 38.843 recommends multiple models indexed by scenario, with UE-gNB negotiation to select the appropriate model ID.

§3.4 — Doppler and Multi-path Extension

The channel estimation challenge intensifies for high-mobility UEs (vehicular, high-speed rail, V2X). At 120 km/h and 3.5 GHz carrier frequency, the Doppler spread is:

Eq. 3.8 — Doppler Frequency Calculation
$$f_D = \frac{v \cdot f_c}{c} = \frac{(120/3.6)\,\text{m/s} \times 3.5\times10^9\,\text{Hz}} {3\times10^8\,\text{m/s}} \approx 389\,\text{Hz}$$

For NR μ=1 (slot duration 0.5 ms), the channel coherence time \(T_c \approx 1/(4f_D) \approx 0.64\,\text{ms}\) is comparable to the slot duration. The classical quasi-static assumption — channel constant within one slot — fails.
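The Doppler and coherence-time figures can be checked with a short calculation. The function names are assumptions, and the speed of light is rounded to \(3\times10^8\) m/s as in Eq. 3.8.

```python
C = 3.0e8  # speed of light in m/s, rounded as in Eq. 3.8

def max_doppler_hz(speed_kmh, fc_hz):
    """Maximum Doppler shift f_D = v * f_c / c."""
    return (speed_kmh / 3.6) * fc_hz / C

def coherence_time_s(f_d_hz):
    """Rule-of-thumb coherence time T_c ~ 1 / (4 f_D) used in the text."""
    return 1.0 / (4.0 * f_d_hz)

def speed_kmh_for_doppler(f_d_hz, fc_hz):
    """Inverse relation: UE speed that produces a given maximum Doppler shift."""
    return f_d_hz * C / fc_hz * 3.6

f_d = max_doppler_hz(120, 3.5e9)
print(round(f_d, 1), round(coherence_time_s(f_d) * 1e3, 2))   # 388.9 0.64
```

The helper `speed_kmh_for_doppler` inverts the same relation; for example, a 70 Hz maximum Doppler shift at 3.5 GHz corresponds to about 21.6 km/h.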

Temporal Extrapolation via Recurrent Networks

Neural networks address this by learning the temporal evolution of the channel. A recurrent estimator (LSTM or temporal CNN) takes a history of past channel estimates and predicts the current slot:

Eq. 3.9 — DNN Channel Predictor
$$\hat{\mathbf{h}}(t+\delta) = f_\theta\!\left[\mathbf{h}(t),\,\mathbf{h}(t-1),\,\ldots,\,\mathbf{h}(t-k+1)\right]$$

where \(\mathbf{h}(t) \in \mathbb{C}^{N_r N_t}\) is the vectorised channel (stacked columns of H), and k is the prediction history length (look-back window).

This mirrors the classical Wiener-Hopf predictor:

Eq. 3.10 — Wiener Filter Channel Predictor
$$\hat{\mathbf{h}}(t+\delta) = \mathbf{r}_{hh}^H(\delta)\, \mathbf{R}_{hh}^{-1}\, \left[\mathbf{h}(t),\,\mathbf{h}(t-1),\,\ldots,\,\mathbf{h}(t-k+1)\right]^T$$

where \(\mathbf{r}_{hh}(\delta)\) is the temporal autocorrelation vector (Jakes model for isotropic scattering) and \(\mathbf{R}_{hh}\) is the \(k \times k\) channel correlation matrix.
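A minimal sketch of the Wiener predictor in Eq. 3.10 for a single channel coefficient follows, assuming the Jakes autocorrelation \(r(m) = J_0(2\pi f_D m T)\) with slot duration \(T\). To stay NumPy-only, \(J_0\) is evaluated by numerical integration of its integral representation; the function names and the optional `noise_var` regularisation are assumptions for the example.

```python
import numpy as np

def bessel_j0(x, n=2000):
    """J0(x) = (1/pi) * integral_0^pi cos(x sin(theta)) d(theta), midpoint rule."""
    theta = (np.arange(n) + 0.5) * np.pi / n
    x = np.asarray(x, dtype=float)
    return np.mean(np.cos(x[..., None] * np.sin(theta)), axis=-1)

def jakes_autocorr(lag_slots, f_d_hz, slot_s):
    """Temporal autocorrelation at integer slot lags under isotropic scattering."""
    return bessel_j0(2.0 * np.pi * f_d_hz * slot_s * np.asarray(lag_slots))

def wiener_weights(f_d_hz, slot_s, k, delta, noise_var=0.0):
    """Solve R_hh w = r_hh(delta) for the k prediction weights."""
    lags = np.arange(k)
    r_mat = jakes_autocorr(np.abs(lags[:, None] - lags[None, :]), f_d_hz, slot_s)
    r_vec = jakes_autocorr(delta + lags, f_d_hz, slot_s)   # corr of h(t+delta) with h(t-i)
    return np.linalg.solve(r_mat + noise_var * np.eye(k), r_vec)

def wiener_predict(history, weights):
    """history = [h(t), h(t-1), ..., h(t-k+1)]; weights are real for the Jakes model."""
    return np.dot(weights, history)

w = wiener_weights(f_d_hz=389.0, slot_s=0.5e-3, k=4, delta=1)
print(np.round(w, 3))
```

Note that `wiener_weights` needs `f_d_hz` as an explicit input, which is the dependence on Doppler knowledge that a learned predictor avoids.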

NN advantage over Wiener-Hopf: The classical predictor requires explicit knowledge of the Doppler spectrum (for the Jakes model, the maximum Doppler frequency \(f_D\)) to compute \(\mathbf{r}_{hh}\). The neural predictor implicitly learns this from data, and can handle non-isotropic, non-stationary Doppler distributions (e.g., highway convoy, tunnels) without re-parameterization.

Multi-path Structure Learning

In frequency-selective channels, the impulse response consists of L discrete paths:

Eq. 3.11 — CIR Multipath Model
$$h(\tau, t) = \sum_{\ell=1}^{L} \alpha_\ell(t)\,\delta(\tau - \tau_\ell)$$

Exploiting this sparse delay-domain structure:

  • Delay-domain NN: Transform pilot observations to delay domain via IDFT, apply sparse recovery (ISTA-Net), transform back. Effective when \(L \ll N_f\).
  • Angle-delay domain (massive MIMO): For large antenna arrays, channel is sparse in the angle-delay domain. 2D-CNN on angle-delay representation achieves near-oracle NMSE.
  • Super-resolution: Off-grid path delay estimation via atomic norm minimization — AI version uses learned dictionaries.
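The delay-domain idea in the first bullet can be illustrated without a neural network. The sketch below replaces the learned sparse-recovery step with plain hard thresholding (keeping the strongest taps), which is a deliberate simplification and not ISTA-Net. It also assumes on-grid path delays and a known number of paths; all names and parameters are illustrative.

```python
import numpy as np

def delay_domain_denoise(h_ls_freq, num_paths):
    """IDFT to the delay domain, keep the num_paths strongest taps, DFT back."""
    taps = np.fft.ifft(h_ls_freq)
    keep = np.argsort(np.abs(taps))[-num_paths:]
    sparse = np.zeros_like(taps)
    sparse[keep] = taps[keep]
    return np.fft.fft(sparse)

rng = np.random.default_rng(1)
n_f, num_paths, noise_var = 256, 4, 0.1

# Sparse channel impulse response with on-grid delays (assumption).
delays = rng.choice(32, size=num_paths, replace=False)
gains = (rng.standard_normal(num_paths)
         + 1j * rng.standard_normal(num_paths)) / np.sqrt(2 * num_paths)
taps_true = np.zeros(n_f, dtype=complex)
taps_true[delays] = gains
h_true = np.fft.fft(taps_true)

# Noisy per-subcarrier LS estimate, then delay-domain denoising.
noise = np.sqrt(noise_var / 2) * (rng.standard_normal(n_f)
                                  + 1j * rng.standard_normal(n_f))
h_ls = h_true + noise
h_dn = delay_domain_denoise(h_ls, num_paths)

err_ls = np.sum(np.abs(h_true - h_ls) ** 2)
err_dn = np.sum(np.abs(h_true - h_dn) ** 2)
print(err_dn < err_ls)
```

Because only \(L\) of the \(N_f\) delay taps are kept, most of the noise energy is discarded, which is why the approach pays off when \(L \ll N_f\).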
Analogy — Image Denoising vs Channel Estimation: The ChannelNet approach is directly analogous to DnCNN for image denoising. In both cases: a clean signal (true channel / clean image) is corrupted by noise (thermal noise / AWGN). A CNN learns to remove the noise by training on pairs of (noisy, clean). The 2D time-frequency grid of the channel plays the same role as the 2D spatial grid of a grayscale image.
Research direction — NTN channel AI (TR 38.821): Non-Terrestrial Networks (NTN), introduced in 3GPP Rel-17 (TR 38.821), include LEO satellites, HAPS platforms, and GEO feeder links. NTN channels present an extreme case of the Doppler and delay challenges: a LEO satellite at 600 km altitude and 7.5 km/s orbital velocity induces Doppler shifts of up to about ±24 ppm, i.e. ±48 kHz at a 2 GHz carrier and roughly ±84 kHz at 3.5 GHz. That is two orders of magnitude larger than vehicular Doppler. One-way propagation delays are 2–4 ms, against microseconds for terrestrial cells. Classical HARQ timing and channel estimation assumptions break entirely. AI-based channel prediction for NTN is an emerging research area: recurrent networks trained on Keplerian orbital dynamics can predict time-varying Doppler profiles with sub-100 Hz accuracy, enabling pre-compensation before OFDM reception. This makes AI channel estimation especially valuable for NTN 6G scenarios where the satellite ephemeris is known but multipath and atmospheric effects are not.

§3.5 — Overhead and Deployment Considerations

Model Size and UE Feasibility

The ChannelNet architecture (~85K parameters) requires approximately:

  • Storage: 340 KB (FP32) or 85 KB (INT8)
  • Multiply-accumulate ops: ~15 M MACs per slot
  • Inference latency: 0.1–0.5 ms on ARM Cortex-A75
  • Comparable to legacy MMSE covariance update cost

Larger transformer-based estimators (300K–2M parameters) target gNB-side uplink estimation where compute constraints are relaxed.

Model Delivery Mechanism

TR 22.874 (Study on traffic characteristics and performance requirements for AI/ML model transfer in 5GS) defines a framework for over-the-air model transfer:

  • Model transmitted via PDSCH (gNB → UE) or PUSCH (UE → gNB)
  • Model identified by a Model ID signaled in RRC
  • UE capability flag: "AI_CE_supported = 1"
  • Model update triggered by network OAM on environmental change
  • Delta updates possible (fine-tuning weights only)

Fallback and Robustness

A critical requirement for any deployed AI component is a graceful degradation path:

| Condition | Action | Trigger |
| --- | --- | --- |
| NMSE > −5 dB (runtime) | Switch to MMSE fallback | Online NMSE monitoring |
| Model ID mismatch | Re-request model from gNB | RRC model negotiation failure |
| UE compute overload | Use LS estimator + gNB-side equalization | UE thermal throttling signal |
| No AI model loaded | Full MMSE (Rel-15 behavior) | Default state on power-up |

Deployment risk — distribution shift: A model trained on CDL channel data from one geographic region may show degraded NMSE in environments with different cluster angles, delay spreads, or Doppler profiles. Online adaptation (few-shot fine-tuning using received pilots) is an active research direction to mitigate this.
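The fallback conditions listed above can be expressed as a small selection function. This is a sketch only: the state fields, the returned labels, and in particular the order in which the conditions are checked are assumptions, since the text does not define a priority between them.

```python
from dataclasses import dataclass
from typing import Optional

NMSE_FALLBACK_DB = -5.0   # runtime threshold quoted in the fallback conditions

@dataclass
class EstimatorState:
    model_loaded: bool
    model_id_matches: bool
    compute_overloaded: bool
    runtime_nmse_db: Optional[float]   # None when no monitoring value is available

def select_estimator(s: EstimatorState) -> str:
    """Map the fallback conditions to an action (check order is an assumption)."""
    if not s.model_loaded:
        return "MMSE"             # default Rel-15 behaviour on power-up
    if not s.model_id_matches:
        return "REQUEST_MODEL"    # re-request the model from the gNB
    if s.compute_overloaded:
        return "LS"               # LS at the UE plus gNB-side equalization
    if s.runtime_nmse_db is not None and s.runtime_nmse_db > NMSE_FALLBACK_DB:
        return "MMSE"             # AI estimate judged unreliable
    return "AI"

print(select_estimator(EstimatorState(True, True, False, -18.0)))   # AI
print(select_estimator(EstimatorState(True, True, False, -2.0)))    # MMSE
```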

Summary: §3 Key Takeaways

  • AI channel estimation replaces the MMSE interpolation block with a learned function that implicitly captures channel statistics.
  • NMSE gains of 2–4 dB vs MMSE are achievable in high-Doppler scenarios (ETU-70, v > 120 km/h).
  • Model size (~85K parameters, ~340 KB) is compatible with UE storage and inference latency budgets.
  • 3GPP TR 38.843 Use Case 3 standardizes the I/O interface, leaving internal architecture to implementation choice.
  • Fallback to MMSE/LS is mandatory for robustness; model ID negotiation via RRC enables multi-environment deployment.

Channel estimation forms the first stage of the AI radio pipeline. → §4 extends these ideas to CSI feedback compression, where the estimated channel must be encoded and reported back to the gNB.

§4 AI-based CSI Feedback Compression TR 38.843 Use Case 2

§4.1 — The CSI Feedback Bottleneck

Massive MIMO beamforming requires the gNB to know the downlink channel matrix. In FDD, the UE must estimate the channel and report it back. The feedback overhead scales with antenna count — and in Release 15/16 massive MIMO configurations this becomes a significant resource burden.

Raw CSI Dimensionality

For a 32TRX panel (typical commercial deployment at sub-6 GHz):

Eq. 4.1 — CSI Feedback Bit Budget
$$\underbrace{32}_{\text{TX ports}} \times\underbrace{1}_{\text{RX port}} \times\underbrace{13}_{\text{subbands (100 MHz)}} \times\underbrace{2}_{\text{real+imag}} = 832\text{ real values} = 26{,}624\text{ bits (FP32)}$$

In practice, 5G NR codebook feedback compresses this substantially, but still requires significant uplink resources:

| Feedback Type | Bits per Subband | Total Bits (13 SB) | NMSE | BF Gain |
| --- | --- | --- | --- | --- |
| Type I Single Panel | 4–11 bits | 52–143 bits | −8 dB | 8–10 dB |
| Type II Basic | 10–16 bits | 130–208 bits | −12 dB | 12–14 dB |
| Type II Enhanced (Rel-16) | 16–32 bits | 208–416 bits | −14 dB | 14–16 dB |
| CsiNet (η=1/4) | n/a | ~100 bits total | −10 dB | 12 dB |
| TransNet (2023) | n/a | ~80 bits total | −15 dB | 15 dB |

Core observation: AI-based compression (TransNet) achieves −15 dB NMSE using only ~80 bits, matching Type II Enhanced performance at a fraction of the feedback overhead. The channel matrix is not random — it lives on a low-dimensional manifold determined by propagation geometry, and neural networks learn this manifold efficiently.
CSI Aging — the high-velocity problem: At UE velocities above 30 km/h, there is a non-trivial delay between when the UE measures the channel and when the gNB applies the reported PMI for precoding. This CSI aging effect causes the reported PMI to describe a channel that has already changed, degrading beamforming gain in proportion to velocity and carrier frequency. AI-based approaches address this by predicting future CSI: the UE encodes not the current channel estimate but a prediction of the channel state at the anticipated precoding epoch, using LSTM or temporal-CNN extrapolators trained on the Doppler statistics of the deployment environment. This is an active Rel-19 study item within TR 38.843 use case 2 extensions.

§4.2 — CsiNet: The Autoencoder Framework

CsiNet (Wen et al., 2018) established the canonical deep-learning framework for CSI feedback compression. It frames the problem as a learned vector quantization via an autoencoder:

Encoder (UE side)

Eq. 4.2 — CSI Encoder (Feedforward)
$$\mathbf{c} = f_{\theta_e}(\mathbf{H}) \in \mathbb{R}^k$$

The encoder compresses the full channel matrix \(\mathbf{H} \in \mathbb{C}^{N_t \times N_r \times N_f}\) into a low-dimensional codeword \(\mathbf{c}\). The compression ratio is:

Eq. 4.3 — Compression Ratio η
$$\eta = \frac{k}{2\,N_t N_r N_f}$$

For \(N_t=32\), \(N_r=1\), \(N_f=13\) and \(\eta = 1/4\): \(k = 2 \times 32 \times 1 \times 13 / 4 = 208\) real values → quantized to ~100 bits.
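The dimension counts in Eqs. 4.1 and 4.3 can be tabulated for the four compression ratios discussed in this section. The helper names are assumptions for the example.

```python
def raw_csi_values(n_t, n_r, n_f):
    """Number of real values in the raw CSI (real + imaginary parts)."""
    return 2 * n_t * n_r * n_f

def bottleneck_dim(n_t, n_r, n_f, eta):
    """Codeword length k implied by Eq. 4.3 for compression ratio eta."""
    return round(eta * raw_csi_values(n_t, n_r, n_f))

n_values = raw_csi_values(32, 1, 13)
print(n_values, n_values * 32)   # 832 26624  (real values, bits at FP32)

for denom in (4, 8, 16, 32):
    print(f"eta=1/{denom}: k={bottleneck_dim(32, 1, 13, 1 / denom)}")
# eta=1/4: k=208
# eta=1/8: k=104
# eta=1/16: k=52
# eta=1/32: k=26
```

Note that \(k\) counts real-valued codeword elements before quantization; the number of feedback bits additionally depends on the quantizer.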

Decoder (gNB side)

Eq. 4.4 — CSI Decoder (Reconstruction)
$$\hat{\mathbf{H}} = g_{\theta_d}(\mathbf{c})$$

The decoder reconstructs the full channel estimate from the compressed codeword. Trained end-to-end:

Eq. 4.5 — CsiNet Training Objective
$$\min_{\theta_e,\,\theta_d} \;\mathbb{E}\!\left[\left\|\mathbf{H} - \hat{\mathbf{H}}\right\|_F^2\right] \;\text{s.t.}\;\text{rate}(\mathbf{c}) \leq B_{\text{target}}$$

The rate constraint is handled by training with a fixed k (bottleneck dimension) and applying post-training scalar quantization on \(\mathbf{c}\).

CsiNet Architecture Detail

CsiNet Encoder (UE)
H [2, N_t, N_f] (real/imag)
 → Conv2D(2, 3×3) → BN → LeakyReLU
 → Flatten [2 × N_t × N_f]
 → FC(k) [bottleneck]
 → c ∈ ℝ^k


CsiNet Decoder (gNB)
c ∈ ℝ^k
 → FC(2 × N_t × N_f) → Reshape [2, N_t, N_f]
 → [RefineNet Block] × 2
    (Conv2D(8,3×3) → BN → ReLU → Conv2D(16,3×3) → BN → ReLU
     → Conv2D(2,3×3) → BN + skip connection)
 → Sigmoid → Ĥ

Total parameters: ~2.1 M (encoder 0.3 M + decoder 1.8 M)

Performance Progression: CsiNet → CsiNet+ → TransNet

| Model | Year | η = 1/32 NMSE | η = 1/16 NMSE | η = 1/8 NMSE | η = 1/4 NMSE | Key Innovation |
| --- | --- | --- | --- | --- | --- | --- |
| CsiNet | 2018 | −6.0 dB | −8.0 dB | −9.5 dB | −10.0 dB | Baseline autoencoder + RefineNet |
| CsiNet+ | 2020 | −8.5 dB | −11.0 dB | −12.5 dB | −14.0 dB | Multi-rate training + dense connections |
| TransNet | 2023 | −10.0 dB | −12.5 dB | −14.0 dB | −15.0 dB | Transformer encoder + cross-attention decoder |
| Type II Enhanced | Rel-16 | (fixed overhead, not variable η) | | | −14.0 dB | 3GPP codebook baseline |

CsiNet+ milestone: At η=1/4, CsiNet+ (−14 dB NMSE) first matched Type II Enhanced performance. At η=1/32, it still achieves −8.5 dB, approximately Type I territory, with a codeword of only \(k = 26\) real values for a 32-port system.

§4.3 — 3GPP Type II vs AI CSI: Full Comparison

| Method | Bits/Slot (13 SB) | NMSE | BF Gain | UL Overhead | UE Complexity |
| --- | --- | --- | --- | --- | --- |
| Type I Single Panel | 52–143 bits | −8 dB | 8–10 dB | Low | Very low |
| Type II Basic | 130–208 bits | −12 dB | 12–14 dB | Moderate | Low |
| Type II Enhanced | 208–416 bits | −14 dB | 14–16 dB | High | Moderate |
| CsiNet (η=1/4) | ~100 bits total | −10 dB | 12 dB | Very low | Moderate (encoder) |
| CsiNet+ (η=1/4) | ~100 bits total | −14 dB | 14 dB | Very low | Moderate |
| TransNet (2023) | ~80 bits total | −15 dB | 15 dB | Very low | High (transformer) |

Practical Throughput Impact

The UL feedback overhead directly reduces the UL capacity left for data. For a 20 MHz uplink (30 kHz SCS, 51 PRBs):

  • Type II Enhanced (416 bits) consumes ~2.3 PRBs per slot purely for CSI feedback — approximately 4.5% of UL capacity.
  • AI CSI (80–100 bits) consumes <0.6 PRBs — <1.2% of UL capacity.
  • The freed UL resources can carry data, SRS, or additional reference signals, yielding a 3–4× reduction in feedback overhead at equivalent reconstruction quality.
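The percentages above follow from simple proportionality. The sketch below derives the payload per PRB from the Type II Enhanced figure quoted in the first bullet (416 bits in about 2.3 PRBs) and applies it to the AI CSI payloads. Treating that ratio as constant across feedback types is an assumption, as are the names used.

```python
def prb_share(feedback_bits, bits_per_prb, total_prbs=51):
    """Return (PRBs used, percentage of the UL allocation) for one CSI report."""
    prbs = feedback_bits / bits_per_prb
    return prbs, 100.0 * prbs / total_prbs

# Payload per PRB implied by the Type II Enhanced figure (416 bits in ~2.3 PRBs).
BITS_PER_PRB = 416 / 2.3

for label, bits in [("Type II Enhanced", 416), ("AI CSI, 100 bits", 100),
                    ("AI CSI, 80 bits", 80)]:
    prbs, pct = prb_share(bits, BITS_PER_PRB)
    print(f"{label}: {prbs:.2f} PRBs, {pct:.1f}% of UL")
```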
Analogy — JPEG vs learned image compression: Type II codebook feedback is analogous to JPEG (hand-engineered DCT + quantization table). CsiNet/TransNet is analogous to learned image compression (Balle et al., 2018) — both use autoencoder architectures with entropy-regularized bottlenecks. The learned approach surpasses the hand-engineered baseline at the same bit rate, for the same reason: the learned latent space is optimally matched to the data manifold (channel geometry), not to a generic DCT basis.

§4.4 — Transformer-based CSI Compression (TransNet)

CsiNet's convolutional encoder treats the channel matrix as a 2D image. This misses long-range correlations across the subband dimension (frequency coherence) and the port dimension (spatial coherence in large aperture arrays). TransNet introduces self-attention to capture these global dependencies.

Attention Mechanism Review

Eq. 4.6 — Scaled Dot-Product Attention
$$\text{Attention}(Q, K, V) = \text{softmax}\!\left(\frac{QK^T}{\sqrt{d_k}}\right) V$$

For CSI feedback, the queries, keys, and values are derived from the channel matrix as follows:

  • \(Q = f_Q(\mathbf{H}_{\text{subband},\,i})\): the query, a representation of subband \(i\)
  • \(K = V = f_{KV}(\mathbf{H}_{\text{all subbands}})\): the keys and values, the full channel across all subbands

This allows each subband's encoding to attend to all other subbands, learning the frequency correlation structure that classical codebooks approximate with a fixed DFT basis.
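A NumPy sketch of Eq. 4.6 applied across subbands is given below. The random matrices `w_q` and `w_kv` stand in for the learned projections \(f_Q\) and \(f_{KV}\); they, the dimensions, and the function names are assumptions for illustration and do not reproduce any published TransNet weights.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - np.max(z, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

def attention(q, k, v):
    """Eq. 4.6: scaled dot-product attention; returns (output, attention weights)."""
    d_k = q.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d_k))
    return weights @ v, weights

rng = np.random.default_rng(0)
n_f, n_t, d_k = 13, 32, 64
h = rng.standard_normal((n_f, 2 * n_t))     # one real/imag-stacked row per subband

w_q = rng.standard_normal((2 * n_t, d_k)) / np.sqrt(2 * n_t)    # stands in for f_Q
w_kv = rng.standard_normal((2 * n_t, d_k)) / np.sqrt(2 * n_t)   # stands in for f_KV

q = h @ w_q        # one query per subband
kv = h @ w_kv      # shared keys and values over all subbands
out, weights = attention(q, kv, kv)
print(out.shape, weights.shape)   # (13, 64) (13, 13)
```

Row \(i\) of `weights` shows how strongly subband \(i\) attends to every other subband, which is the learned counterpart of the fixed frequency-domain basis used by codebook feedback.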

TransNet Architecture

TransNet Encoder (UE)
H [2, N_t, N_f] (real/imag stacked)
 → Linear Embedding → token sequence [N_f tokens × d_model]
 → Positional Encoding (subband index)
 → Transformer Encoder Layer × L_e
    (Multi-Head Self-Attention [8 heads, d_k=64]
     → LayerNorm + skip → FFN(d_ff=256) → LayerNorm + skip)
 → [CLS] token extraction → FC(k) → c ∈ ℝ^k

TransNet Decoder (gNB)
c ∈ ℝ^k
 → FC(N_f × d_model) → Reshape [N_f × d_model]
 → Transformer Decoder Layer × L_d
    (Cross-Attention [c as query, latent as key/value]
     → FFN → LayerNorm)
 → Linear → Ĥ [2, N_t, N_f]

Parameters: ~4.2 M (encoder 1.8 M + decoder 2.4 M). Inference: ~2 ms on Snapdragon 888 (UE encoder only).

Multi-Head Attention over Port Dimension

For massive MIMO (N_t ≥ 32), an additional attention head is applied across the port (antenna) dimension:

Eq. 4.7 — Port-Domain Attention (CSI-ViT)
$$\mathbf{A}_{\text{port}} = \text{softmax}\!\left( \frac{\mathbf{Q}_{\text{port}}\,\mathbf{K}_{\text{port}}^T} {\sqrt{d_p}} \right) \mathbf{V}_{\text{port}}$$

This captures spatial correlation across the antenna array — the beam domain structure — without requiring explicit DFT pre-coding to an angle domain representation.

NMSE vs Compression Ratio

[Figure: NMSE versus compression ratio for \(\eta \in \{1/32,\,1/16,\,1/8,\,1/4\}\); the underlying values are tabulated under "Performance Progression" in §4.2.]

§4.5 — 3GPP Standardization (TR 38.843 §6.2)

TR 38.843 §6.2 evaluates AI/ML for CSI feedback enhancement as Use Case 2. The standardization scope is broader than Use Case 3 (channel estimation) because the feedback traverses the air interface: the encoder (UE) and decoder (gNB) are implemented by different vendors, creating a need for interoperability specification.

3GPP Architecture Options

| Architecture | Encoder Side | Decoder Side | Standardization | Status |
| --- | --- | --- | --- | --- |
| Option 1 | UE (AI) | gNB (AI) | I/O interface + model ID | Primary candidate |
| Option 2 | UE (AI) | gNB (legacy codebook) | Encoder output = legacy codeword | Backward-compatible |
| Option 3 | UE (legacy) | gNB (AI decoder only) | No UE changes needed | Incremental upgrade path |

Open Issues Identified in TR 38.843

1. Encoder specification:

The UE encoder is not standardized: any compression function whose output fits the defined bit format is permissible. This preserves vendor innovation but creates model mismatch risk: a UE whose encoder \(f_{\theta_e^{(A)}}\) was trained against one decoder reports feedback that a gNB decoder \(g_{\theta_d^{(B)}}\) (from a different vendor, trained with a different encoder) cannot reconstruct.

2. Decoder standardization:

Two approaches under discussion in Rel-18:

  • Standardized decoder: A reference gNB decoder is specified in the standard. UEs must train their encoders against this reference. Ensures interoperability; limits decoder innovation.
  • Signaled model pair: The network signals a Model ID pair (encoder ID + decoder ID) to the UE via RRC. The UE downloads the encoder; the gNB uses the paired decoder. Flexible but requires model management infrastructure.

3. Online learning and adaptation:

Models trained offline on synthetic CDL channels may mismatch real deployment channels. TR 38.843 proposes:

  • Online fine-tuning: UE collects CSI samples during operation, performs gradient updates to the encoder using feedback from gNB (requires a new feedback loop for gradient or loss information).
  • Model selection: UE chooses from a library of pre-trained encoders (indexed by environment type: indoor/outdoor, LoS/NLoS) based on detected channel statistics.
  • Meta-learning: Encoder trained to be quickly fine-tunable with few environment-specific samples (MAML-style approach).

4. Quantization and entropy coding:

The continuous bottleneck vector \(\mathbf{c}\) must be quantized for transmission. Two approaches:

  • Scalar quantization: Each element of \(\mathbf{c}\) quantized to b bits independently. Simple, adds ~1–2 dB NMSE loss vs unquantized.
  • Vector quantization (learned codebook): VQ-VAE style — jointly optimize encoder + VQ codebook. Achieves near-unquantized performance but higher UE complexity.
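The scalar option can be sketched as a uniform quantizer applied independently to each codeword element. The value range \([-1, 1]\), the mid-point reconstruction rule, and the function names are assumptions; the NMSE loss quoted above depends on the trained model and is not reproduced here.

```python
import numpy as np

def quantize_uniform(c, bits, lo=-1.0, hi=1.0):
    """Map each element of c to one of 2**bits bin indices on [lo, hi]."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    idx = np.floor((np.clip(c, lo, hi) - lo) / step).astype(int)
    return np.minimum(idx, levels - 1)       # c == hi falls into the last bin

def dequantize_uniform(idx, bits, lo=-1.0, hi=1.0):
    """Reconstruct each element at the centre of its bin."""
    step = (hi - lo) / 2 ** bits
    return lo + (idx + 0.5) * step

rng = np.random.default_rng(0)
k, b = 26, 4                                  # hypothetical codeword length and bits/element
c = rng.uniform(-1.0, 1.0, size=k)
idx = quantize_uniform(c, b)
c_hat = dequantize_uniform(idx, b)

step = 2.0 / 2 ** b
print(k * b, np.max(np.abs(c - c_hat)) <= step / 2 + 1e-9)   # 104 True
```

The payload is simply \(k \cdot b\) bits, and the per-element error is bounded by half a quantization step, which is the source of the extra reconstruction loss relative to the unquantized codeword.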

RRC Signaling for AI CSI

The following new information elements are under study for Rel-18/19:

| IE Name | Layer | Content |
| --- | --- | --- |
| ai-CSI-Config | RRC | AI CSI enabled flag, model ID, encoder resolution (k) |
| ai-Model-Request | RRC | UE requests model download; specifies environment index |
| ai-CapabilityReport | RRC | UE reports max k, supported model IDs, inference latency |
| ai-CSI-Feedback | PUCCH/PUSCH | Quantized codeword c (B_target bits) |

Timeline: TR 38.843 (Study Phase) was completed in June 2024. Normative work for AI CSI feedback began in Rel-18 WI "Artificial Intelligence (AI)/Machine Learning (ML) for NR" (RP-222662), with initial specifications targeting 2026. Rel-19 is expected to include full standardization of at least one interoperable AI CSI feedback mode.

[4] 3GPP TR 38.843 v18.0.0, §6.2 — CSI Feedback Enhancement Use Case, 2024.

[5] C.-K. Wen et al., "Deep Learning for Massive MIMO CSI Feedback," IEEE Wireless Communications Letters, vol. 7, no. 5, pp. 748–751, 2018.

§4.6 — Summary: §3–4 Key Takeaways

§3 Channel Estimation

  • AI replaces MMSE interpolation; no waveform changes.
  • NMSE gains: 2–4 dB at low SNR and high Doppler (ETU-70, v > 120 km/h).
  • Model size feasible for UE: ~85K params, ~340 KB FP32, <0.5 ms inference.
  • 3GPP TR 38.843 Use Case 3 standardizes I/O interface only; model architecture is implementation-defined.
  • Mandatory fallback to MMSE/LS when NMSE exceeds threshold.

§4 CSI Feedback Compression

  • AI autoencoder (CsiNet/TransNet) compresses 32Tx CSI to 80–100 bits vs 208–416 bits for Type II Enhanced.
  • TransNet achieves −15 dB NMSE at η=1/4, matching or exceeding Type II Enhanced at 5× lower overhead.
  • Core challenge: encoder (UE) and decoder (gNB) are from different vendors — model mismatch requires standardized decoder or Model ID negotiation.
  • Online adaptation essential for non-stationary real environments.
  • Rel-18/19 normative work targets first interoperable AI CSI mode.

References for §3–4

  1. 3GPP TR 38.843 v18.0.0, §6.3 — Channel Estimation Use Case, 2024.
  2. 3GPP TR 38.843 v18.0.0, §6.2 — CSI Feedback Enhancement Use Case, 2024.
  3. C.-K. Wen et al., "Deep Learning for Massive MIMO CSI Feedback," IEEE Wireless Communications Letters, vol. 7, no. 5, pp. 748–751, 2018.
  4. J. Guo et al., "Convolutional Neural Network-based Multiple-rate Compressive Sensing for Massive MIMO CSI Feedback: Design, Simulation, and Analysis," IEEE Transactions on Wireless Communications, vol. 19, no. 4, pp. 2827–2840, 2020.

CSI compression solves the feedback bottleneck on the uplink; the resulting beam selection and precoding are applied on the downlink. → §5 examines how AI-based beam management exploits this CSI to predict optimal beams and handle blockage events.

§5   AI-based Beam Management

§5.1   The Beam Management Challenge

Millimetre-wave (FR2) and sub-6 GHz massive-MIMO deployments in 5G NR rely on a codebook of narrow beams to overcome the high path-loss at these frequencies. The base station (gNB) and UE must continuously search for, and track, the best-aligned beam pair — a process that consumes significant time and energy as the number of antennas scales.

5G NR Beam Sweeping — How It Works Today

5G P1–P2–P3 Procedure

Beam management in 5G NR is a three-phase procedure defined in 3GPP TS 38.214:

P1 — Beam Sweep (Initial Acquisition)
gNB transmits SSB burst across all beams; UE selects best TX beam and reports L1-RSRP. Periodicity: typically 20 ms.
P2 — Beam Refinement
Narrower CSI-RS beam refinement within the P1 winner neighbourhood. Finer angular resolution; periodicity 5–10 ms.

P3 — Beam Tracking
UE-side beam tracking using aperiodic CSI-RS triggered after detected mobility.

The full P1→P2→P3 chain is purely reactive: the system responds to a degraded measurement after blockage has occurred.

Key limitation: the reactive P1–P2–P3 architecture incurs interruption latency every time a blockage event occurs. For high-Doppler UEs (vehicles at 120 km/h) or dense urban deployments the beam stays valid for only tens of milliseconds, making continuous sweeping prohibitively expensive.

6G Target: Predictive Beam Management

6G beam management aims to anticipate blockage before it happens. By training a neural predictor on past RSRP time series, optionally augmented with sensing side-information, the system can pre-switch to the next best beam proactively, avoiding beam failure recovery (BFR) latency and reducing sweep overhead.

§5.2   Beam Prediction — TR 38.843 Use Case 3

3GPP TR 38.843 (Rel-18) defines Use Case 3 (UC3) as AI/ML-assisted beam management. The study evaluates neural predictors that consume a window of past L1-RSRP measurements and output a predicted best-beam index one to four slots in the future.

Model Input / Output

Input vector at time t:

Eq. 5.1 — LSTM Input Feature Vector
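$$\mathbf{x}_t = \left[\,\mathbf{r}(t-k),\;\mathbf{r}(t-k+1),\;\ldots,\;\mathbf{r}(t-1),\;\mathbf{r}(t)\,\right] \in \mathbb{R}^{N_B \times (k+1)}$$

where \(\mathbf{r}(t)\) is the vector of RSRP measurements across all \(N_B\) beams at slot \(t\), and \(k\) is the look-back window (typically 4–8 slots).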

Output: predicted best-beam index at t+δ (δ = 1…4 slots).
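The input and output definitions above translate into a sliding-window dataset. In this sketch the RSRP trace is random placeholder data, and the function name and array layout are assumptions chosen to match Eq. 5.1.

```python
import numpy as np

def make_beam_dataset(rsrp, k, delta):
    """Build (x, y) pairs from an RSRP trace of shape [T, N_B].

    x[n] has shape [N_B, k+1] with columns r(t-k) ... r(t);
    y[n] is the index of the strongest beam at slot t + delta.
    """
    t_len = rsrp.shape[0]
    xs, ys = [], []
    for t in range(k, t_len - delta):
        xs.append(rsrp[t - k : t + 1].T)
        ys.append(int(np.argmax(rsrp[t + delta])))
    return np.stack(xs), np.array(ys)

rng = np.random.default_rng(0)
trace = rng.normal(-90.0, 6.0, size=(20, 32))   # placeholder: 20 slots, 32 beams, in dBm
x, y = make_beam_dataset(trace, k=4, delta=2)
print(x.shape, y.shape)   # (14, 32, 5) (14,)
```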

LSTM-based Predictor Architecture

The baseline architecture in TR 38.843 evaluations is a single-layer LSTM followed by a linear classification head:

Eq. 5.2 — LSTM Beam Predictor
$$h_t = \text{LSTM}(x_t,\; h_{t-1})$$ $$\hat{y}_{t+\delta} = \text{softmax}(W \cdot h_t + b)$$

The hidden state \(h_t \in \mathbb{R}^{64}\) captures temporal correlation in the RSRP time series. A Transformer variant replaces the LSTM with multi-head self-attention over the input window, providing better long-range dependency modelling at slightly higher complexity.

Training Loss

A composite loss is used to simultaneously optimise beam-classification accuracy and RSRP prediction quality:

Eq. 5.3 — Beam Management Loss
$$L_{\text{BM}} = -\sum_{b=1}^{N_B} y_b \log\hat{y}_b \;+\; \lambda \cdot \text{RMSE}(\text{RSRP})$$

The first term is the standard cross-entropy classification loss over \(N_B\) beam classes. The second term penalises RSRP prediction error, encouraging the hidden representation to encode channel quality faithfully. \(\lambda \approx 0.1\) balances the two tasks.
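For a single training sample, the composite loss in Eq. 5.3 can be written as follows. The function signature is an assumption, and the RSRP error is taken in dB, which the text does not specify.

```python
import numpy as np

def beam_management_loss(logits, best_beam, rsrp_pred, rsrp_true, lam=0.1):
    """Eq. 5.3: cross-entropy over beam classes plus lam times the RSRP RMSE."""
    z = logits - np.max(logits)                    # stabilise the log-softmax
    log_probs = z - np.log(np.sum(np.exp(z)))
    cross_entropy = -log_probs[best_beam]          # one-hot label selects one term
    rmse = np.sqrt(np.mean((rsrp_pred - rsrp_true) ** 2))
    return cross_entropy + lam * rmse

n_beams = 32
rsrp = np.full(n_beams, -90.0)
uninformed = beam_management_loss(np.zeros(n_beams), 5, rsrp, rsrp)
print(np.isclose(uninformed, np.log(n_beams)))     # True
```

With uniform logits and a perfect RSRP prediction the loss equals \(\ln N_B\), a useful sanity value when checking that training has started to learn anything.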

TR 38.843 Evaluation Results

| Metric | δ = 1 slot | δ = 2 slots | δ = 4 slots |
| --- | --- | --- | --- |
| Top-1 beam accuracy | 72–85 % | 65–78 % | 55–68 % |
| Top-3 beam accuracy | 90–95 % | 85–92 % | 78–87 % |
| UE energy saving | ~35 % | ~30 % | ~22 % |
| Latency saving (skip P1) | 2–4 ms | 1.5–3 ms | 0.5–1.5 ms |
| Model size (parameters) | ~10 K (LSTM)  /  ~25 K (Transformer) | | |

A top-3 accuracy of 90–95 % means that the true best beam is in the predicted shortlist with very high probability. The system can then skip the full P1 sweep and test only 3 candidate beams instead of 32, roughly a 10× reduction in sweep overhead (32/3 ≈ 10.7).
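Top-K accuracy, the metric used in the results above, can be computed as in this sketch. The score matrix is a tiny hand-made example and the function name is an assumption.

```python
import numpy as np

def top_k_accuracy(scores, true_beam, k):
    """Fraction of samples whose true best beam is among the k highest scores."""
    top_k = np.argsort(scores, axis=1)[:, -k:]
    return float(np.mean(np.any(top_k == true_beam[:, None], axis=1)))

scores = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.3, 0.2],
                   [0.2, 0.3, 0.5]])
true_beam = np.array([1, 1, 0])

print(round(top_k_accuracy(scores, true_beam, 1), 3))   # 0.333
print(round(top_k_accuracy(scores, true_beam, 2), 3))   # 0.667
print(round(32 / 3, 1))                                 # 10.7
```

The last line is the sweep reduction obtained when only the three shortlisted beams are measured instead of all 32.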

§5.3   Blockage Prediction with Side Information

Pure RSRP-based prediction cannot anticipate blockage events that have not yet affected the measured channel — e.g. a fast-moving obstacle entering the Fresnel zone for the first time. Multi-modal fusion addresses this by incorporating sensing side-information alongside RF measurements.

Side Information Sources

Blockage Probability Model

A binary classifier predicts the probability that LOS will be blocked at time t+δ given the fused input sequence:

Eq. 5.4 — Blockage Probability Predictor
$$P_{\text{block}}(t+\delta) = \sigma\!\left(f_\theta\!\left([r_{t-k:t},\; \mathbf{s}_{t-k:t}]\right)\right)$$

where \(r_{t-k:t}\) is the windowed RSRP sequence, \(\mathbf{s}_{t-k:t}\) is the optional side-information sequence (radar returns, GPS velocity, object-detection features), \(\sigma\) is the sigmoid activation, and \(f_\theta\) is the trained neural network (LSTM or Transformer body).

Analogy — Weather forecasting: a meteorologist does not wait for the storm to arrive before issuing a warning — they use atmospheric pressure trends, satellite imagery, and historical patterns to predict it hours in advance. AI beam management applies the same principle: sense the approaching "storm" (obstacle) from radar/camera data, and switch beam proactively before signal quality degrades.

Joint Beam + Blockage Decision Logic

  1. At each slot, compute \(P_{\text{block}}(t+\delta)\) from the multi-modal input.
  2. If \(P_{\text{block}} > \tau_B\) (e.g. \(\tau_B = 0.7\)): trigger a proactive beam switch to the top-3 predicted candidates without waiting for an RSRP drop.
  3. If \(P_{\text{block}} \le \tau_B\): remain on the current beam and suppress the P1 sweep (energy-saving mode).
  4. If prediction confidence is low: fall back to the classical P1–P2–P3 procedure.
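The four steps above can be sketched as a single decision function. The confidence floor `min_conf`, the choice to test confidence before the blockage threshold, and the action labels are assumptions made for illustration; the text only fixes the example threshold of 0.7.

```python
def beam_decision(p_block, confidence, top3, tau_b=0.7, min_conf=0.5):
    """Map blockage probability and model confidence to a beam action.

    min_conf is a hypothetical confidence floor; the source does not
    specify how low confidence is detected.
    """
    if confidence < min_conf:
        return ("FALLBACK_P1_P2_P3", [])        # step 4: classical procedure
    if p_block > tau_b:
        return ("PROACTIVE_SWITCH", list(top3))  # step 2: switch early
    return ("STAY_SUPPRESS_P1", [])             # step 3: energy-saving mode


action, candidates = beam_decision(0.82, confidence=0.9, top3=[5, 6, 4])
```

Checking confidence first means an unreliable prediction can never trigger a proactive switch, which is the conservative ordering.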

§5.4   Standardization Impact — TR 38.843

3GPP TR 38.843 was the primary Rel-18 study item for AI/ML-based air-interface enhancements. Beam management (UC3) was the first use case to reach conclusion status, providing key architectural decisions that will feed Rel-19 normative work.

Agreed Architectural Elements

Inference endpoint:

  • UE-side inference: UE runs beam predictor locally; requires model transfer from gNB via PDSCH + RRC configuration IE.
  • gNB-side inference: UE reports raw L1-RSRP measurements; gNB performs prediction and signals result via DCI or MAC-CE.
  • Both options agreed for study; Rel-19 to decide normative split.

Input feature set:

  • L1-RSRP (mandatory) — per-beam, per-slot window.
  • L3 measurement reports (optional) — filtered RSRP/RSRQ.
  • UE velocity (optional) — from UE capability reporting.
  • Time-domain index within SSB burst (optional).

Output format:

  • Top-K beam indices (K = 1, 2, or 3).
  • Per-beam confidence score (quantised, e.g. 3-bit).
  • Prediction horizon δ configured by gNB.

Signaling:

  • No new over-the-air signal required — existing measurement reports and DCI formats reused.
  • Model transfer: gNB pushes model binary via PDSCH; model ID referenced in RRC reconfiguration message.
  • Model lifecycle: activate / deactivate / update controlled by gNB RRC.
TR 38.843 beam management AI was the first 3GPP use case to reach conclusion status (Study complete in Rel-18). It demonstrated that a neural predictor with ~10 K parameters can outperform the classical P1–P2–P3 procedure in high-Doppler environments, validating the practical viability of on-device AI inference with minimal model footprint and no new air-interface signals.

3GPP TR 38.843 §6.2: AI/ML for NR Air Interface, Beam Management Use Case (UC3). Study concluded Rel-18; normative impact expected Rel-19 (TS 38.214 amendments).

Beam Prediction Accuracy vs UE Velocity

The chart below illustrates how prediction accuracy degrades with UE velocity for the three approaches: classical P2 (no AI), LSTM predictor, and Transformer predictor (δ = 1 slot).

§5.5   6G Extensions — ISAC-aided Beam Management

6G introduces Integrated Sensing and Communications (ISAC) as a first-class physical layer function. The same OFDM waveform simultaneously carries data and performs radar sensing within the same time-frequency resource, enabling an entirely new category of AI beam management.

ISAC Frame Structure

6G Beam Management Targets (IMT-2030)

Parameter 5G NR (achieved) 6G Target (IMT-2030)
Beam sweep overhead ~5–10 % of frame < 1 % (AI-suppressed sweeps)
Beam failure recovery latency 10–50 ms (reactive BFR) < 1 ms (proactive switch)
Max supported UE velocity 500 km/h (HST) 1000 km/h (upper end of the IMT-2030 mobility range)
Beam direction prediction error N/A < 1° (ISAC + ML)
UE power saving vs 5G P1 35–50 %
ISAC-aided beam management represents a fundamental architectural shift: the gNB transitions from a passive receiver of UE measurement reports to an active environment sensor that constructs a real-time spatial model of the deployment area. The AI predictor then operates over a rich, fused state space rather than sparse, delayed RSRP reports.

[6] 3GPP TR 38.843 v18.0.0, §6.2 — Beam Management Use Case (UC3): AI/ML-based beam prediction, model transfer, and inference endpoint architecture.

AI beam management naturally complements AI positioning. → §6 covers how fine-grained location estimates feed back into beam selection and link adaptation.

§6   AI-based Positioning Enhancement

§6.1   5G Positioning — Limitations and Baseline

3GPP Release 16 introduced a dedicated positioning layer for 5G NR, standardised in TS 38.305 and studied in TR 38.855. Release 17 further refined the techniques and tightened accuracy requirements. Despite these advances, the current architecture faces fundamental physical-layer challenges that AI/ML can substantially address.

5G NR Positioning Methods (Rel-16/17)

Method Principle Direction Accuracy (3GPP Req.)
DL-TDOA Downlink Time Difference of Arrival across multiple TRPs Downlink 3 m outdoor / 0.5 m indoor
UL-TDOA Uplink TDOA — network measures UE SRS across TRPs Uplink 3 m outdoor / 0.5 m indoor
DL-AoD Downlink Angle of Departure from gNB antenna array Downlink 5 m typical
UL-AoA Uplink Angle of Arrival at multi-antenna TRP Uplink 5 m typical
Multi-RTT Round-Trip Time measurements from multiple TRPs Bi-directional 1–3 m

The NLOS Problem

All geometry-based methods (TDOA, AoD, AoA, RTT) assume that signal propagation paths are line-of-sight or can be accurately modelled. In practice, rich multipath environments (indoor corridors, dense urban canyons) cause Non-Line-of-Sight (NLOS) propagation, where the first detectable signal path has bounced off walls, floors, or obstacles before reaching the receiver.

In 3GPP Indoor Factory (InF) evaluations, DL-TDOA with Rel-16 positioning achieves a 90th-percentile error of ~1.2 m even in LOS, well above the 0.5 m requirement, and degrades further under NLOS and multipath. AI/ML fingerprinting closes this gap by implicitly learning the NLOS structure of a specific deployment.

§6.2   Fingerprinting-based AI Positioning

Radio fingerprinting treats the measured channel (CSI, RSRP, CIR) as a unique spatial signature of a physical location. A neural network is trained offline to map these signatures to 3D coordinates, implicitly encoding NLOS geometry into learned feature representations.

Two-Phase Fingerprinting Pipeline

Phase 1 — Offline Training (Database Construction):

  1. Survey team (or automated robot) visits \(N_{\text{cal}}\) calibration positions \(\{p_1, \dots, p_{N_{\text{cal}}}\}\) with known ground-truth coordinates.
  2. At each position, collect M channel snapshots \(H(f_k, a)\) for subcarriers \(k = 1 \dots K\) and antennas \(a = 1 \dots N_a\).
  3. Construct dataset \(\{(|H_i|^2,\; p_i)\}_{i=1}^{N_{\text{cal}} \cdot M}\) and train the neural network.

Phase 2 — Online Inference:

  1. UE measures current channel \(|H(f_k)|^2\) and reports to gNB (or runs inference locally).
  2. Trained NN maps measured feature vector to estimated position (x̂, ŷ, ẑ).
  3. Optional: uncertainty quantification output flags low-confidence estimates for fallback to TDOA.

CNN Architecture for CSI Fingerprinting

The channel amplitude matrix \(|H(f, a)|^2 \in \mathbb{R}^{N_a \times K}\) (antennas × subcarriers) exhibits spatial correlation analogous to a 2D image: frequency selectivity along one axis, antenna-domain spatial variation along the other. A Convolutional Neural Network (CNN) exploits this structure:

  • Input layer: \(|H|^2 \in \mathbb{R}^{N_a \times K}\), the channel amplitude per antenna per subcarrier.
  • Conv block 1: 32 filters, 3×3 kernel, ReLU — extract local frequency-antenna correlation patterns.
  • Conv block 2: 64 filters, 3×3 kernel, ReLU + max-pool — spatial downsampling.
  • Conv block 3: 128 filters, 3×3 kernel, ReLU — high-level spatial feature maps.
  • Global average pooling + FC(256): reduce to 256-dim feature vector.
  • Output FC(3): predict (x, y, z) in metres; linear activation.
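To give a feel for the size of the network listed above, the sketch below counts trainable parameters layer by layer. It assumes a single input channel, a bias on every layer, and no batch normalisation; none of these details are stated in the text, so the total is indicative only.

```python
def conv_params(c_in, c_out, kernel=3):
    """Weights plus biases of a 2D convolution with a square kernel."""
    return c_out * (c_in * kernel * kernel + 1)


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_out * (n_in + 1)


layer_params = {
    "conv1 (1 -> 32)": conv_params(1, 32),
    "conv2 (32 -> 64)": conv_params(32, 64),
    "conv3 (64 -> 128)": conv_params(64, 128),
    # Global average pooling leaves one value per channel, so FC(256) sees 128 inputs.
    "fc (128 -> 256)": dense_params(128, 256),
    "out (256 -> 3)": dense_params(256, 3),
}
total_params = sum(layer_params.values())
```

Under these assumptions the model has on the order of 1.3 × 10^5 parameters, and the count does not depend on the input size because global average pooling removes the spatial dimensions before the dense layers.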

Positioning Loss Function

The training objective combines Euclidean distance minimisation with an NLOS regularisation penalty:

Eq. 6.1 — Positioning Loss (L2 + NLOS Penalty)
$$L_{\text{pos}} = \frac{1}{N}\sum_{i=1}^{N} \|\mathbf{p}_i - \hat{\mathbf{p}}_i\|_2^2 \;+\; \alpha \cdot \text{NLOS\_penalty}$$

The NLOS penalty term discourages the network from over-fitting to NLOS-corrupted training samples. It can be implemented as a consistency loss: samples from neighbouring calibration positions should produce smoothly varying predicted positions (Lipschitz regularisation on the output manifold). The weighting coefficient α ≈ 0.05–0.2 is tuned per deployment.
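One possible reading of Eq. 6.1 is sketched below. The penalty is implemented as a hinge on neighbouring calibration points: predicted positions of two neighbours should not be further apart than their true positions (a Lipschitz constant of 1). This specific form, the `neighbour_pairs` argument, and the function names are assumptions; the text only says the penalty "can be implemented as a consistency loss".

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def positioning_loss(true_pos, pred_pos, neighbour_pairs, alpha=0.1):
    """Mean squared position error plus alpha * neighbour-consistency penalty."""
    n = len(true_pos)
    l2 = sum(sq_dist(p, q) for p, q in zip(true_pos, pred_pos)) / n
    penalty = 0.0
    for i, j in neighbour_pairs:
        d_true = math.sqrt(sq_dist(true_pos[i], true_pos[j]))
        d_pred = math.sqrt(sq_dist(pred_pos[i], pred_pos[j]))
        # Penalise only when predictions are stretched apart more than the truth.
        penalty += max(0.0, d_pred - d_true) ** 2
    if neighbour_pairs:
        penalty /= len(neighbour_pairs)
    return l2 + alpha * penalty


truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pred = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
loss = positioning_loss(truth, pred, neighbour_pairs=[(0, 1)], alpha=0.1)
```

Here the second point is off by 2 m, giving a mean squared error of 2.0, and the neighbour pair is stretched by 2 m, adding 0.1 × 4.0 to the loss.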

§6.3   TR 38.843 Positioning Use Case — UC2

3GPP TR 38.843 Use Case 2 (UC2) evaluates AI/ML-enhanced positioning in the 3GPP Indoor Factory (InF) scenario — a representative deployment for Industry 4.0 automation where sub-meter accuracy is operationally required.

Evaluation Scenario Parameters

Parameter Value Notes
Numerology (μ) 1 30 kHz SCS
Bandwidth 100 MHz FR1 n78 band
Allocated PRBs 132 ≈ 47.5 MHz occupied at 30 kHz SCS (a full 100 MHz carrier is 273 PRBs)
BS antenna ports 32 (64 physical) Dual-polarised 16×2 UPA
Channel feature input \(|H|^2 \in \mathbb{R}^{32 \times 132}\) 4224 input features per snapshot
CNN output dimension 256-dim vector Before final position head
Calibration dataset 1000 positions × 10 snapshots 10 000 training samples total
Inference output \((x, y, z) \in \mathbb{R}^3\) 3-coordinate absolute position

Positioning Accuracy Results

Method Scenario Error @ CDF 50 % Error @ CDF 90 % vs Requirement
DL-TDOA (Rel-16 baseline) InF LOS 0.6 m 1.2 m Fails 0.5 m req. @90%
DL-TDOA (Rel-16 baseline) InF NLOS 1.8 m 4.5 m Fails by 9×
AI CNN Fingerprint InF LOS 0.15 m 0.32 m Meets 0.5 m req.
AI CNN Fingerprint InF NLOS 0.22 m 0.48 m Meets 0.5 m req.
Hybrid TDOA + AI InF Mixed 0.18 m 0.41 m Best overall

Key finding: massive antenna arrays (32+ ports) provide sufficient angle diversity to make the fingerprint nearly unique at sub-50 cm resolution. The 4224-dimensional input feature space \(|H|^2 \in \mathbb{R}^{32 \times 132}\) encodes both frequency-selective multipath structure and spatial beam-domain patterns, a combination that classical TDOA geometry cannot exploit.

The TR 38.843 UC2 evaluation confirmed sub-meter accuracy (< 0.5 m at CDF 90 %) for the AI fingerprinting approach in the 3GPP Indoor Factory scenario — a result that no geometry-based Rel-16/17 method could achieve under NLOS conditions. The key enabler was the availability of 32-port antenna arrays providing spatial diversity that the fingerprint can exploit as a discriminative signature.

§6.4   6G Positioning — Sub-centimetre Target

6G IMT-2030 requirements push positioning accuracy by an order of magnitude beyond 5G NR, targeting applications that 5G cannot support: factory robot arms, surgical tool tracking, and vehicular lane-level positioning.

6G Positioning Requirements

Use Case Environment Accuracy Target Dimension
Factory Automation Indoor < 10 cm 3D + Orientation
Surgical Robotics Indoor < 1 cm 6-DoF
V2X Lane-Level Outdoor < 0.5 m 3D
UAV/Drone Traffic Aerial < 1 m 3D + Heading
Extended Reality (XR) Indoor < 5 cm 6-DoF

Physical Layer Enablers for 6G Positioning

Sub-THz Bands (100–300 GHz):

The spatial resolution of any angle-based positioning method is fundamentally limited by the wavelength λ. At 140 GHz (D-band), λ ≈ 2.1 mm, roughly 40× shorter than at 3.5 GHz (λ ≈ 86 mm). This translates directly to:

Large Intelligent Surfaces (LIS / RIS):

Reconfigurable Intelligent Surfaces (RIS) create a synthetic aperture effect by reflecting signals from hundreds of passive phase-shifting elements. For positioning, an RIS of area A at distance d from the UE achieves an effective angular resolution equivalent to a physical antenna array of aperture A. AI learns to optimise the RIS phase profile for maximum position-discriminability rather than signal strength.

Joint AI Positioning + ISAC:

The same ISAC waveform used for beam management (§5.5) simultaneously acts as a positioning radar. The gNB extracts both the communication channel estimate H (for fingerprinting) and the radar echo (for geometry estimation), fusing both into a joint AI positioning engine.

Cramér-Rao Lower Bound (CRLB) for AI Positioning

The fundamental limit on position estimation error — regardless of algorithm — is set by the Fisher information of the observed signal. For a wideband channel with bandwidth BW and Na receiving antennas at SNR:

Eq. 6.2 — Cramér-Rao Lower Bound (Positioning)
$$\text{RMSE}_{\text{range}} \;\geq\; \frac{c}{2\pi B_W \sqrt{\text{SNR} \cdot N_a}}$$
Scope: This is the 1D range estimation CRLB (single range measurement from one anchor). True 3D positioning CRLB requires inverting the full 3×3 Fisher Information Matrix over all anchors and their geometry. The 1D bound provides an optimistic floor only; actual 3D positioning error is higher and geometry-dependent.

This expression reveals three independent scaling levers for 6G:

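  • Bandwidth \(B_W\) ↑: the bound scales as \(1/B_W\), so 10 GHz at sub-THz tightens it 100× relative to 100 MHz at sub-6 GHz. For reference, the corresponding range resolution \(c/(2B_W)\) shrinks from ~150 cm to ~1.5 cm.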
  • Antenna count Na ↑: 1024-element LIS array improves CRLB by ~5.7× vs 32-element 5G array (scales as √(Na): √1024/√32 ≈ 5.7).
  • SNR ↑: improved link budget (beamforming gain, lower noise figure at sub-THz amplifiers) directly tightens the bound.
  • AI role: the neural estimator approaches the CRLB in multipath/NLOS environments where classical MLE cannot, by learning the full posterior distribution of position given channel observations.
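The scaling behaviour of Eq. 6.2 can be checked numerically with the sketch below. The SNR is passed in dB and converted to linear scale, which is an assumption about how Eq. 6.2 is meant to be evaluated; absolute values depend on the SNR chosen, so the ratios are the more informative output.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def range_crlb(bandwidth_hz, snr_db, n_antennas):
    """1D range bound of Eq. 6.2 in metres (SNR given in dB)."""
    snr_lin = 10 ** (snr_db / 10)
    return C / (2 * math.pi * bandwidth_hz * math.sqrt(snr_lin * n_antennas))


floor_100mhz = range_crlb(100e6, snr_db=0.0, n_antennas=1)
floor_10ghz = range_crlb(10e9, snr_db=0.0, n_antennas=1)
bandwidth_gain = floor_100mhz / floor_10ghz
antenna_gain = range_crlb(100e6, 0.0, 32) / range_crlb(100e6, 0.0, 1024)
```

The bandwidth ratio comes out as 100 and the antenna ratio as √32 ≈ 5.7, matching the scaling factors quoted in the bullets.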
Analogy — Camera resolution: just as a camera with a larger lens aperture and finer pixel pitch resolves finer spatial detail, a radio system with wider bandwidth (finer "temporal pixel") and more antennas (wider spatial aperture) can resolve the UE position more precisely. AI acts as the intelligent image reconstruction algorithm that extracts maximum information from the available aperture — analogous to computational photography super-resolution.

Positioning Accuracy: 5G vs 6G Methods

The chart below compares the 90th-percentile positioning error (metres) across the key positioning paradigms from 5G to 6G, illustrating the progressive improvement enabled by AI and new physical-layer features.

Note the logarithmic y-axis: ISAC-AI achieves ~0.02 m (2 cm) in LOS — a 60× improvement over the DL-TDOA 5G baseline. Even in NLOS, the 6G ISAC-AI result (0.06 m) surpasses the 5G LOS DL-TDOA baseline (1.2 m) by 20×, demonstrating that AI + bandwidth + antenna aperture jointly break the classical NLOS barrier.

§6.5   Open Challenges in AI Positioning

Despite the impressive results from TR 38.843 and simulation studies, several engineering challenges remain before AI positioning can be deployed at 6G scale.

Calibration Data Burden

Fingerprinting requires an offline site survey: 1000 calibration positions is feasible for a single factory floor but scales poorly across tens of thousands of deployment cells. Active research directions:

Model Adaptation to Environment Change

Physical environments are not static: furniture is rearranged, new machinery is installed, seasonal changes alter multipath. A fingerprint trained in January may degrade by 50% by July in a dynamic factory. Online continual learning with forgetting prevention (Elastic Weight Consolidation, experience replay) is required to maintain accuracy without full retraining.

Privacy and Security

Privacy concern: a highly accurate AI positioning system that tracks UE location to 5 cm resolution raises significant privacy considerations. 3GPP SA3 and regulatory bodies (ETSI TC CYBER) are evaluating privacy-preserving inference architectures — federated learning, differential privacy in fingerprint databases, and on-device inference without raw CSI leaving the UE.

Latency and Signaling Overhead

Positioning inference must complete within the application latency budget:

The convergence of sub-THz bandwidth, large intelligent surfaces, ISAC sensing, and neural positioning creates a 6G positioning architecture that is fundamentally different from 5G: rather than inferring geometry from a handful of timing/angle measurements, the system constructs a continuous, AI-maintained spatial model of each deployment — one that automatically corrects for NLOS, adapts to environmental change, and approaches the physical Cramér-Rao bound even in rich-scattering environments.

[7] 3GPP TR 38.843 v18.0.0, §6.2.2 — Positioning Enhancement Use Case (UC2): AI/ML fingerprinting, CNN architecture, InF evaluation results.

[8] 3GPP TR 38.855 v16.0.0 — Study on NR Positioning Support: baseline positioning methods, accuracy requirements, NLOS analysis, Release 16/17 performance benchmarks.

Precise positioning feeds into network-level resource management. → §7 examines how AI applies these location and traffic insights to drive energy efficiency in the RAN.

§7   AI for Network Energy Efficiency

The proliferation of 5G base stations and the anticipated 10–100× traffic growth toward 6G has placed network energy consumption at the centre of both operator economics and global sustainability commitments. AI-driven energy management promises to decouple traffic growth from power consumption — a critical requirement for the next decade of wireless infrastructure.

§7.1 — The Energy Crisis in Mobile Networks

Global mobile networks consumed approximately 200 TWh per year in 2020, representing roughly 0.7 % of worldwide electricity usage. With 5G densification — smaller cells, massive MIMO antenna arrays, millimetre-wave deployments — energy consumption is on track to grow substantially unless counteracted by efficiency gains. The 6G vision sets an ambitious target:

6G Energy Efficiency Target: 100× improvement in energy efficiency per bit compared to 5G NR (baseline 2020). This encompasses both the radio access network (RAN) and the core network, measured in bits per joule.

gNB Power Consumption Breakdown

Understanding where power is consumed in a gNB is the first step toward AI-guided optimisation. The breakdown below is representative of a 5G massive-MIMO macro cell [3GPP TS 28.310]:

Component | Share of Total Power | Primary Cause | AI Lever
RF / Power Amplifier (PA) | ~65 % | Poor PA back-off efficiency at low load | Sleep modes, load-adaptive PA biasing
Digital Baseband Processing | ~15 % | FFT, MIMO detection, channel coding | Algorithm-off / clock-gating on idle symbols
Cooling / HVAC | ~10 % | Waste heat from PA and BBU | Indirectly reduced by PA/BBU savings
Other (power supply, backhaul, control) | ~10 % | Static overhead | Limited; centralised pooling helps

The dominance of the RF/PA component immediately motivates sleep-mode strategies: even a short-duration shutdown of RF chains during traffic troughs translates directly into large absolute power savings.

3GPP Energy Efficiency Metric

3GPP TS 28.310 defines a standardised energy efficiency KPI for the RAN:

Eq. 7.1 — Energy Efficiency (ηEE)
$$\eta_{EE} = \frac{\text{Data volume (bits)}}{\text{Energy consumed (J)}}$$

Measured over a reference time window (e.g., one hour). For a macro cell with 100 Mbps average throughput and 500 W average power, \(\eta_{EE} = 100 \times 10^6 / 500 = 200\) kbits/J. Improving sleep-mode penetration raises this metric directly.
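The worked example can be reproduced directly from Eq. 7.1, as in this short sketch. The one-hour window and the variable names are illustrative; the window length cancels out when throughput and power are constant.

```python
def energy_efficiency(data_bits, energy_joules):
    """Eq. 7.1: delivered bits per joule consumed."""
    return data_bits / energy_joules


window_s = 3600.0          # one-hour reference window
throughput_bps = 100e6     # 100 Mbps average throughput
power_w = 500.0            # 500 W average power

eta_ee = energy_efficiency(throughput_bps * window_s, power_w * window_s)
eta_ee_kbit_per_j = eta_ee / 1e3
```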

Analogy — Smart Home Heating: A fixed thermostat keeps the boiler on regardless of occupancy. A predictive thermostat learns the household schedule and pre-heats just before occupants arrive, saving 20–30 % energy. AI-guided BS sleep modes operate on the same principle: anticipate low-demand windows and cut power before traffic actually drops.

§7.2 — BS Sleep Mode Prediction with AI

Base station sleep mode operation is perhaps the highest-impact single AI application in the RAN energy domain. The core challenge is prediction accuracy: waking up too late causes dropped calls; sleeping too aggressively causes coverage holes.

AI Model Architecture

Input features:
  • Traffic load history: T(t-k : t), where k may span 24–168 hours (daily/weekly periodicity)
  • Time-of-day encoding (cyclic sine/cosine features for 24h and 7d period)
  • Day-of-week and public-holiday indicator
  • Neighbouring-cell load (spatial correlation)
  • Current CQI distribution across active UEs
Output:
  • Predicted sleep duration \(\Delta t_{\text{sleep}}\) for each available sleep tier
  • Confidence interval (enables risk-aware threshold setting)
Architecture: Stacked LSTM (2–3 layers, hidden size 128–256) or a Temporal Fusion Transformer (TFT) for multi-horizon forecasting. Prophet-style decomposable models (trend + seasonality + residual) are also used for interpretability in operator dashboards.
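The cyclic time encoding listed among the input features can be sketched as follows. The function name and the choice of "hour of week" as the raw input are assumptions; the point is that sine/cosine pairs place 23:00 and 00:00 next to each other, which a raw hour counter does not.

```python
import math


def cyclic_features(hour_of_week):
    """Sine/cosine encoding for the 24 h and 7 d (168 h) periods."""
    day_angle = 2 * math.pi * (hour_of_week % 24) / 24
    week_angle = 2 * math.pi * (hour_of_week % 168) / 168
    return (math.sin(day_angle), math.cos(day_angle),
            math.sin(week_angle), math.cos(week_angle))


def day_distance(h1, h2):
    """Euclidean distance between two hours in the daily sin/cos plane."""
    a, b = cyclic_features(h1), cyclic_features(h2)
    return math.hypot(a[0] - b[0], a[1] - b[1])


midnight = cyclic_features(0)
wrap_gap = day_distance(23, 24)   # 23:00 to 00:00 across the day boundary
step_gap = day_distance(0, 1)     # 00:00 to 01:00 within the same day
```

The two gaps are equal, confirming that the day boundary introduces no discontinuity in the feature space.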

Sleep Mode Types — 3GPP TS 38.300

3GPP TS 38.300 defines a hierarchy of sleep modes with different wake-up latencies and power savings:

Sleep Mode | Wake-up Latency | Power Saving | AI Prediction Horizon | Use Case
Symbol Shutdown | <1 OFDM symbol (~71 µs) | 5–15 % | 1–10 ms | Idle OFDM symbols within a slot
Carrier Sleep | 1–10 ms | 30–50 % | 100 ms – 1 s | Low-load periods within minutes
Cell Off (Deep Sleep) | 10–30 s | 70–90 % | Minutes to hours | Night-time, stadium off-hours
Warning — Paging Failure Risk: Overly aggressive sleep can cause paging failures. If the network pages a UE in RRC_IDLE while its serving cell is in deep sleep (10–30 s wake-up latency), the page cannot be delivered until the cell wakes, which results in missed calls and can block emergency services. Sleep duration must also stay compatible with the UE's NAS timers: T3412 (periodic tracking area update in EPS; the 5GS periodic registration timer is T3512) and T3324 (active timer). AI sleep controllers must enforce a maximum sleep duration bounded by min(T3412, T3324) minus a safety margin (typically 5–10 s), and must never suppress the PDCCH monitoring slots used for paging.
AI Scheduling Gain: Compared to fixed time-based schedules (e.g., "cell off 02:00–06:00"), AI prediction adds 15–25 % additional energy saving because it adapts to real-time deviations — public events, network outages, weather-driven demand shifts — that fixed schedules cannot capture.

Energy Saving Ratio (ESR)

The net energy saving when AI-controlled sleep is applied relative to an always-on baseline:

Eq. 7.2 — Energy Saving Ratio (ESR)
$$\text{ESR} = 1 - \frac{E_{AI}}{E_{\text{baseline}}}$$
Eq. 7.3 — AI Energy Consumption Model
$$E_{AI} = T \cdot \left[\, t_{\text{sleep}} \cdot P_{\text{sleep}} \;+\; (1 - t_{\text{sleep}}) \cdot P_{\text{active}} \,\right]$$

where \(T\) is the observation window, \(t_{\text{sleep}}\) is the fraction of time in sleep state, \(P_{\text{sleep}}\) is power during sleep (typically 10–30 % of \(P_{\text{active}}\)), and \(E_{\text{baseline}} = P_{\text{always\_on}} \cdot T\). Taking \(P_{\text{always\_on}} = P_{\text{active}}\), a cell spending 40 % of the time in carrier sleep at 40 % of active power achieves \(\text{ESR} = 1 - (0.4 \times 0.4 + 0.6 \times 1.0) = 1 - 0.76 = 0.24\), i.e. a 24 % saving.
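The ESR example can be reproduced with the sketch below, assuming the always-on baseline draws the same power as the active state. The absolute wattages (500 W active, 200 W asleep) are illustrative and chosen only so that sleep power is 40 % of active power.

```python
def energy_saving_ratio(sleep_fraction, p_sleep_w, p_active_w, window_s=3600.0):
    """ESR of Eq. 7.2, with the always-on baseline drawing p_active_w."""
    e_ai = window_s * (sleep_fraction * p_sleep_w
                       + (1.0 - sleep_fraction) * p_active_w)
    e_baseline = window_s * p_active_w
    return 1.0 - e_ai / e_baseline


esr = energy_saving_ratio(0.4, p_sleep_w=200.0, p_active_w=500.0)
```

The window length cancels in the ratio, so the result depends only on the sleep fraction and the sleep-to-active power ratio.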

Implementation Pipeline

  1. Data collection: PM counters (DL PRB utilisation, connected UE count) collected via O1 at 15-minute granularity.
  2. Model training: Offline on 90-day historical data; retrained weekly via Non-RT RIC rApp.
  3. Inference: Near-RT RIC xApp queries model every 100 ms; issues sleep/wake commands via E2 interface to gNB.
  4. Guard rails: Minimum coverage threshold enforced: if predicted load exceeds 20 % PRB utilisation in adjacent cell, cell-off mode is suppressed.

§7.3 — 3GPP SON / Coverage and Capacity Optimisation

Self-Organising Network (SON) functions have been part of 3GPP specifications since LTE, but their AI incarnation in 5G-Advanced and 6G moves from rule-based heuristics to learned policies. 3GPP TR 37.816 defines the SON framework for 5G.

Coverage and Capacity Optimisation (CCO)

  • AI model adjusts antenna tilt (mechanical or electrical remote tilt) and TX power.
  • Optimisation target: maximise coverage while minimising pilot pollution and inter-cell interference.
  • Input state: RSRP/RSRQ maps, UE distribution heat maps, handover failure rates.

Mobility Load Balancing (MLB)

  • NN distributes UEs across cells/beams to equalise load.
  • Adjusts cell individual offset (CIO) and A3/A5 handover thresholds.
  • Reduces both overloaded cell outage and underloaded cell idle power.

Multi-Agent RL for SON

Agent: Each base station (gNB or ng-eNB) is one RL agent.
State \(s_t\): Local load, interference level, neighbour-cell loads, handover KPIs.
Action \(a_t\): {antenna tilt ±2°, TX power ±3 dB, CIO ±2 dB, beam index}.
Reward function:
Eq. 7.4 — DRL Reward Function
$$R = \alpha \cdot \text{throughput} - \beta \cdot P_{\text{TX}} - \gamma \cdot \text{outage\_rate}$$
Typical weights: \(\alpha=1.0,\ \beta=0.3,\ \gamma=5.0\) (penalise outage heavily).

Algorithm: Centralised training with decentralised execution (CTDE). Global reward shared during training; at inference each agent uses only local observations.
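A minimal sketch of the reward above and of a shared training reward under CTDE. It assumes throughput, TX power and outage rate are already normalised to comparable ranges (the text gives no units), and it takes the shared reward to be the mean of the per-cell rewards, which is one common choice rather than something the source specifies.

```python
def son_reward(throughput, p_tx, outage_rate, alpha=1.0, beta=0.3, gamma=5.0):
    """Per-cell reward: throughput minus weighted TX power and outage."""
    return alpha * throughput - beta * p_tx - gamma * outage_rate


def shared_training_reward(cells):
    """Global reward used during centralised training (mean over cells)."""
    return sum(son_reward(**cell) for cell in cells) / len(cells)


cells = [
    {"throughput": 0.8, "p_tx": 0.5, "outage_rate": 0.02},
    {"throughput": 0.6, "p_tx": 0.4, "outage_rate": 0.10},
]
local = son_reward(**cells[0])
shared = shared_training_reward(cells)
```

With γ = 5.0, the second cell's 10 % outage costs 0.5 reward units and nearly cancels its throughput term, which illustrates how heavily outage is penalised.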
3GPP Rel-18 SON Standardisation: 3GPP TS 28.316 (introduced in Rel-18) enables NF-based SON operations within the 5GC management framework. AI/ML models are deployed as rApps in the Non-RT RIC (training, global policy optimisation) or as xApps in the Near-RT RIC (inference, per-cell decisions at 10–1000 ms loops). This architecture allows online learning: the rApp continuously ingests new PM data and updates model weights, while xApps execute the latest policy without service interruption.

Interference Rejection via Beamforming Adaptation

Massive MIMO arrays (32–256 antenna elements) provide spatial degrees of freedom that AI can exploit for interference mitigation:

§7.4 — O-RAN Energy Optimisation Architecture

The O-RAN Alliance has defined an end-to-end architecture for AI-driven energy management that cleanly separates training (Non-RT RIC, minutes-to-hours timescale) from inference (Near-RT RIC, 10–100 ms timescale) and execution (O-DU/O-RU, symbol timescale).

Non-RT RIC (rApp)

  • Collects energy KPMs via O1 interface (TS 28.311).
  • Trains global energy optimisation model offline.
  • Pushes AI policy to Near-RT RIC via A1 interface.
  • Timescale: minutes to hours; model update: hourly or event-triggered.

Near-RT RIC (xApp)

  • Receives policy from Non-RT RIC.
  • Applies per-cell sleep decisions every 10–100 ms.
  • Uses E2 interface to command gNB sleep/wake transitions.
  • Feedback loop: reports decision outcomes back via O1.

O1 Energy KPMs

KPM Name | Description | Typical Granularity
DL.PRB.UsedRatio | DL PRB utilisation fraction (0–1) | 15 min PM window
RF.TX.PowerAvg | Average DL TX power (dBm) | 15 min PM window
Cell.DowntimeRatio | Fraction of time cell was in sleep/off state | 15 min PM window
RRC.ConnSuccRate | UE connection success rate | 15 min PM window
Energy.Consumed.kWh | Energy meter reading (where available) | 1 hour
Interface Separation: 3GPP SA5 (TS 28.311) and O-RAN WG1 jointly standardise energy efficiency NF interfaces. The Non-RT RIC receives energy KPMs via O1, trains a global model capturing multi-cell spatial and temporal correlations, and pushes policy updates to Near-RT RIC via A1 — a clean separation of offline training from online inference. This avoids the latency and stability issues that arise when training and inference share the same loop.

Energy Optimisation — End-to-End Data Flow

  1. O-RU / O-DU: measures RF output power, temperature, and traffic load every slot.
  2. O-DU → SMO/O1: aggregates 15-min PM reports; energy meter if equipped.
  3. Non-RT RIC rApp: ingests PM data, trains LSTM forecaster and MARL policy.
  4. A1 policy push: serialised policy (e.g., sleep thresholds, cell-off schedule) pushed to Near-RT RIC.
  5. Near-RT RIC xApp: queries policy, evaluates current cell state, issues sleep/wake command via E2 SM (E2 Service Model for RAN control).
  6. gNB RRC: executes sleep transition; broadcasts updated SIB if coverage changes.
  7. Outcome reporting: xApp logs decision outcome; rApp updates model with new reward signal (online RL).

Quantitative Impact

Mechanism | Energy Saving vs Always-On | AI Gain vs Fixed Schedule | Coverage Impact
Symbol Shutdown (AI-guided) | 5–15 % | +3–5 % | Negligible (PDCCH always on)
Carrier Sleep (AI-guided) | 30–50 % | +8–12 % | <1 dB RSRP degradation
Cell Off (AI-guided) | 70–90 % | +15–25 % | Neighbour cells compensate
CCO tilt optimisation | 5–10 % | +4–8 % | +0.5–1 dB coverage improvement
MLB load balancing | 8–15 % | +5–10 % | Improved edge UE throughput

Figure 7.1 — Energy savings (%) for different sleep mode tiers, comparing fixed schedules versus AI-guided prediction. The additional gain from AI prediction grows with sleep depth, from +3–5 % for symbol shutdown to +15–25 % for cell off.

Caution — Coverage Continuity: Cell-off mode requires careful coordination. Before switching off a macro cell, the AI controller must verify that: (1) all active UEs have been handed over to neighbouring cells, (2) emergency call coverage is maintained (regulatory requirement), and (3) no isolated UE exists without an alternative serving cell. Failure to enforce these constraints can cause call drops and regulatory violations.
[9] 3GPP TS 28.310 v18.0.0 — Energy efficiency of 5G.
[10] 3GPP TR 37.816 v16.0.0 — Study on SON and the O&M aspects for 5G networks.
O-RAN Alliance WG1 — Energy Saving Technical Report v04.00 (2023).

AI energy management optimises when and where the network transmits; → §8 tackles how spectrum resources are allocated efficiently — the radio resource management and scheduling problem.

§8   AI for Radio Resource Management and Scheduling

Radio Resource Management (RRM) is the real-time control plane of the RAN: allocating PRBs, selecting modulation and coding schemes, managing MIMO layers, and balancing load across cells. Classical schedulers, designed for single-cell optimisation with limited context, are inadequate for the densely heterogeneous, interference-coupled topology of 6G. AI-native schedulers promise to replace hand-crafted heuristics with learned policies that generalise across environments and adapt to changing channel conditions.

§8.1 — Classical Scheduling — Capabilities and Limitations

5G NR supports three canonical scheduler families, each representing a different trade-off between spectral efficiency and fairness:

Scheduler Objective CQI Dependency Key Limitation
Round Robin (RR) Equal PRB time share None Spectral efficiency ignored; poor for mixed UE geometries
Max-C/I Maximise instantaneous throughput Full CQI required Starves cell-edge UEs; fairness index ≈ 0.5
Proportional Fair (PF) \(\max \sum_k \log R_k\) CQI feedback every 5–10 ms Reactive only; no interference prediction; single-cell scope
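The PF objective in the table is commonly realised per scheduling interval by serving the UE with the largest ratio of instantaneous to average rate and then updating the averages with an exponential moving window. The sketch below shows that standard form; the window length of 100 intervals and the function names are assumptions.

```python
def pf_select(inst_rates, avg_rates):
    """Pick the UE with the highest instantaneous-to-average rate ratio."""
    return max(range(len(inst_rates)),
               key=lambda k: inst_rates[k] / avg_rates[k])


def pf_update(avg_rates, inst_rates, scheduled, window=100.0):
    """Exponentially weighted update of each UE's average served rate."""
    updated = []
    for k, avg in enumerate(avg_rates):
        served = inst_rates[k] if k == scheduled else 0.0
        updated.append((1.0 - 1.0 / window) * avg + served / window)
    return updated


inst = [10.0, 4.0]    # achievable rates this interval (Mbps)
avg = [20.0, 5.0]     # long-term average served rates (Mbps)
chosen = pf_select(inst, avg)            # ratios 0.5 vs 0.8
new_avg = pf_update(avg, inst, chosen)
```

UE 1 is served despite its lower absolute rate because it is doing better relative to its own average. The rule only looks at the current interval, which is the reactive behaviour the section criticises.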

The Proportional Fair scheduler maximises the sum of logarithmic rates, which corresponds to the Nash bargaining solution for fair resource allocation. However, it is strictly reactive:

The Core Problem: Classical schedulers optimise a single time step using locally available information. 6G networks require multi-step, multi-cell, multi-objective optimisation — precisely the problem class for which deep reinforcement learning was designed.

§8.2 — Deep Reinforcement Learning Scheduling

The scheduling problem maps naturally onto the Markov Decision Process (MDP) framework, enabling direct application of deep RL techniques.

MDP Formulation

State \(s_t\):
  • CQI per UE per PRB group (wideband or subband, normalised to [0,1])
  • Buffer occupancy per UE (bytes, normalised to maximum buffer size)
  • QoS class indicator (QCI/5QI) and remaining latency budget per UE
  • Historical throughput ratio \(R_k(t) / \bar{R}_k\) (PF score history)
  • Inter-cell interference estimate (from neighbouring cell X2/Xn reports)
Action \(a_t\):
  • PRB allocation mask per UE (which PRBs assigned to which UE)
  • MCS index per UE (0–27 for 5G NR)
  • MIMO rank and precoding matrix indicator (PMI)
  • Beam index (for mmWave / massive MIMO)
Reward \(r_t\):
Eq. 8.2 — Multi-Objective Reward Signal
$$r_t = \alpha \cdot \text{throughput}_t + \beta \cdot \text{fairness\_index}_t - \gamma \cdot \mathbf{1}[\text{latency\_violated}]$$
Typical: \(\alpha=1.0,\ \beta=0.5,\ \gamma=10.0\). The hard penalty \(\gamma\) for latency violations ensures QoS constraints are respected in the learned policy.
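As a rough sketch of how Eq. 8.2 might be evaluated per scheduling interval, the Python below combines a normalised throughput term, Jain's fairness index and the hard latency penalty. The normalisation constant `rate_scale` and the example rates are assumptions made for illustration; the source does not state how throughput is scaled before weighting.

```python
def jain_index(rates):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), in (0, 1]."""
    total = sum(rates)
    squares = sum(r * r for r in rates)
    return (total * total) / (len(rates) * squares) if squares > 0 else 0.0


def scheduler_reward(rates_mbps, latency_violated,
                     alpha=1.0, beta=0.5, gamma_pen=10.0, rate_scale=100.0):
    """Eq. 8.2 with throughput normalised by an assumed rate_scale (Mbps)."""
    throughput = sum(rates_mbps) / rate_scale
    penalty = gamma_pen if latency_violated else 0.0
    return alpha * throughput + beta * jain_index(rates_mbps) - penalty


rates = [40.0, 40.0, 20.0]  # illustrative per-UE rates in Mbps
print(round(scheduler_reward(rates, latency_violated=False), 3))
print(round(scheduler_reward(rates, latency_violated=True), 3))
```

With the typical weights, a single latency violation outweighs any realistic throughput or fairness gain in the same step, which is how the penalty steers the learned policy towards respecting QoS constraints.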

DQN-Based Scheduler

For discrete action spaces (quantised PRB allocations), a Deep Q-Network is applied:

Eq. 8.3 — DQN Bellman Update
$$Q(s_t, a_t;\, \theta) \;\leftarrow\; r_t + \gamma \max_{a'} Q(s_{t+1}, a';\, \theta^-)$$

where \(\theta^-\) denotes the frozen target network, updated every \(C=100\) steps (the DQN target-network trick), and \(\gamma\) here is the discount factor, not the latency-penalty weight of Eq. 8.2. Experience replay buffer size: \(10^5\) transitions. Mini-batch size: 64. The action space is factored by UE to avoid combinatorial explosion: each UE's PRB allocation is decided by a separate Q-head sharing a common state encoder.
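The target on the right-hand side of Eq. 8.3 can be sketched as below. The discount value, the terminal-state handling and the helper names are assumptions for illustration (the source gives no discount factor); only the sync period of 100 steps comes from the text.

```python
def td_target(reward, next_q_values, discount=0.99, done=False):
    """Right-hand side of Eq. 8.3, evaluated with the frozen target network."""
    if done:
        return reward
    return reward + discount * max(next_q_values)


def should_sync_target(step, period=100):
    """True when the target weights should be copied from the online network."""
    return step > 0 and step % period == 0


# Illustrative numbers: three candidate actions scored by the target network.
target = td_target(reward=1.0, next_q_values=[0.2, 0.5, 0.1], discount=0.9)
td_error = target - 1.2          # 1.2 = current online estimate Q(s_t, a_t)
loss = td_error ** 2             # squared TD error minimised w.r.t. theta
print(round(target, 2), round(loss, 4), should_sync_target(200))
```

In a full implementation the squared error would be averaged over the mini-batch of 64 replayed transitions and back-propagated through the online network only.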

Simulated Performance vs PF Scheduler

Metric | PF Scheduler | DQN Scheduler | Gain
50th %ile UE throughput (Mbps) | 42.1 | 47.2 | +12.1 %
5th %ile (cell-edge) UE throughput (Mbps) | 8.3 | 8.97 | +8.1 %
QoS satisfaction rate (%) | 88.0 | 95.0 | +7 pp
Latency violation rate (%) | 4.2 | 1.8 | −57 %
Jain's fairness index | 0.87 | 0.91 | +4.6 %

Simulation: 132 PRBs, 30 kHz SCS, 32T32R massive MIMO, 20 UEs per cell, 3GPP 3D-UMa channel model, 10 MHz × 100 ms evaluation window.

Why DQN Outperforms PF: The DQN policy implicitly learns a form of predictive scheduling — by observing historical buffer occupancy and CQI trends, it pre-allocates resources slightly ahead of demand peaks. It also naturally incorporates QoS differentiation through the latency-penalty reward, something PF cannot do without complex priority heuristics.

Actor-Critic for Continuous Action Spaces

When the action space is treated as continuous (e.g., power fractions, beamforming vectors), Proximal Policy Optimisation (PPO) or Soft Actor-Critic (SAC) is preferred, since both learn a parameterised policy that outputs continuous actions directly.

§8.3 — Multi-Cell Coordination with Multi-Agent RL

Single-cell scheduling, even with deep RL, cannot resolve inter-cell interference because each gNB observes only its own UEs and local channel state. Multi-Agent RL (MARL) extends the framework to coordinate decisions across a cluster of base stations.

Problem: Inter-Cell Interference in Dense Networks

MARL Architecture — MADDPG

Algorithm: Multi-Agent Deep Deterministic Policy Gradient (MADDPG)
Agent: Each base station \(k \in \{1,\ldots,K\}\) maintains actor \(\pi_k(\cdot; \theta_k)\) and critic \(Q_k(\cdot; \phi_k)\).

Key CTDE (Centralised Training, Decentralised Execution) property:
  • Training: Critic \(Q_k\) receives the joint action \(a_{1:K,t}\) and joint state \(s_{1:K,t}\) — full information available offline.
  • Execution: Actor \(\pi_k\) uses only local observation \(o_{k,t}\) — practical for deployment.
Cooperative objective:
Eq. 8.4 — MARL Policy Gradient Objective J(θ)
$$J(\theta) = \mathbb{E}\!\left[\,\sum_{t=0}^{T} \gamma^t \sum_{k=1}^{K} r_k(s_t,\, a_{1:K,t})\,\right]$$
where \(r_k\) includes both local throughput and a shared inter-cell interference penalty \(-\lambda \cdot \text{ICIC}_t\). The shared penalty aligns individual agent objectives with the global system optimum.
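For a single sampled trajectory, the quantity inside the expectation of Eq. 8.4 can be computed as in the sketch below. The penalty weight `lam`, the discount and all numeric values are illustrative assumptions; the per-agent reward is reduced to local throughput minus the shared interference penalty.

```python
def agent_reward(local_throughput, interference, lam=0.5):
    """Local throughput minus the shared interference penalty (lam is assumed)."""
    return local_throughput - lam * interference


def cooperative_return(rewards_per_step, discount=0.95):
    """Eq. 8.4 for one trajectory: sum_t discount^t * sum_k r_k."""
    return sum(
        (discount ** t) * sum(step_rewards)
        for t, step_rewards in enumerate(rewards_per_step)
    )


# Two cells, two time steps; throughput and interference values are illustrative.
trajectory = [
    [agent_reward(1.0, 0.4), agent_reward(2.0, 0.4)],   # t = 0
    [agent_reward(1.0, 0.2), agent_reward(1.5, 0.2)],   # t = 1
]
print(round(cooperative_return(trajectory, discount=0.5), 3))
```

Because the same interference term is subtracted from every agent's reward, an action that raises one cell's throughput at the expense of its neighbours lowers the summed return, which is the alignment effect described above.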
Warning — MARL Convergence Instability: Multi-agent RL convergence is not guaranteed in non-stationary multi-cell environments. Because each agent's policy is simultaneously updating, the environment appears non-stationary from any single agent's perspective — violating the Markov assumption that underpins Q-learning convergence proofs. In practice, this manifests as oscillating PRB allocations, cyclic interference patterns, or diverging reward signals when traffic load shifts rapidly (e.g., a stadium event ending). Mitigation strategies include: (1) centralised critic stabilisation (CTDE), (2) policy update rate limiting to ensure pseudo-stationarity, (3) experience replay with prioritised older transitions during non-stationary periods, and (4) conservative policy gradient clipping (PPO-style) to bound per-step policy change magnitude.
MADDPG vs MAPPO vs MAAC — Choosing the Right MARL Algorithm:
  • MADDPG (Lowe et al., NeurIPS 2017): off-policy, DDPG-based, good sample efficiency, can be unstable with many agents.
  • MAPPO (Yu et al., NeurIPS 2022): on-policy, PPO-clipped, better training stability and cooperative task performance; recommended baseline for 6G deployments.
  • MAAC (Iqbal & Sha, ICML 2019): attention-based centralised critic, handles dynamic K (variable number of active cells), suited for heterogeneous 6G topologies.
Practical guidance: use MAPPO first (stable, easy to tune); switch to MAAC if the number of coordinating nodes varies at runtime.

6G Ultra-Dense HetNet Context

6G deployments will combine macro cells, pico cells, femto cells, and Reconfigurable Intelligent Surfaces (RIS) in a single coordinated network:

Coordination Scale

  • MARL coordinates 10–100 cells simultaneously.
  • Hierarchy: cluster head (macro) aggregates local decisions from pico/femto agents.
  • RIS controller is a zero-power agent (no active TX) optimising phase shifts.

Performance Gains

  • +5–20% system throughput vs single-cell DRL; +17–43% vs Proportional Fair (load-dependent).
  • −40–60 % inter-cell interference at cell edge.
  • Fairness index improvement: 0.83 → 0.94 (Jain's index).

Graph Neural Network Extensions

Recent work models the cellular network as a graph: nodes are base stations, edges are interference links. A Graph Neural Network (GNN) encodes spatial interference structure directly into the RL state representation.

3GPP Rel-17 Data Collection for AI/ML (TR 37.817): To enable multi-cell MARL, base stations need to share local observations. 3GPP TR 37.817 studies enhancement of data collection mechanisms for NR and EN-DC, including UE measurement reporting extensions and inter-gNB measurement sharing via the Xn interface. This standardised data collection substrate is a prerequisite for deploying MARL schedulers in live networks.

§8.4 — 6G Integrated Access and Backhaul (IAB) Scheduling

Integrated Access and Backhaul (IAB), studied in 3GPP TR 38.874 and specified from Rel-16 onward, allows gNBs to serve both UEs (access link) and relay traffic to/from the core via wireless backhaul over the same NR spectrum. AI scheduling in IAB networks addresses the unique challenge of joint access–backhaul resource management.

IAB Topology Optimisation

  • Parent selection: RL policy selects optimal parent node for each IAB relay, trading off backhaul capacity vs coverage.
  • Dynamic re-parenting: AI triggers topology change when link quality degrades (blockage, mobility).
  • Cycle prevention: Graph-constraint layer in RL policy ensures acyclic routing tree.

Backhaul Pre-Allocation

  • Traffic prediction (LSTM on PM counters) forecasts access-link demand 100–500 ms ahead.
  • Pre-allocates backhaul PRBs in advance to avoid buffering stalls.
  • TDD frame structure adapted: backhaul slots and access slots proportioned dynamically by AI.

Hybrid TDD+FDD Policy with RL

Some 6G IAB deployments operate dual-band (sub-6 GHz FDD for access, mmWave TDD for backhaul). An RL policy manages cross-band resource coupling.

Analogy — Airport Ground Traffic: An IAB relay node is like an airport with both arriving flights (access UEs) and connecting flights (backhaul relay traffic). A static gate assignment (fixed TDD ratio) wastes gates when one traffic type is light. AI-driven dynamic allocation is the equivalent of real-time gate management based on flight delay predictions — continuously rebalancing to minimise total delay.

Multi-Hop IAB Scheduling

For multi-hop IAB chains (gNB-donor → IAB-1 → IAB-2 → UE), the scheduling problem becomes a pipeline optimisation:

Eq. 8.5 — Tandem Backhaul Rate
$$R_{\text{UE}} = \min\!\left(\,R_{\text{access}},\; \frac{R_{\text{bh,1}} \cdot R_{\text{bh,2}}}{R_{\text{bh,1}} + R_{\text{bh,2}}}\,\right)$$

The product-over-sum backhaul term is always smaller than the slower of the two backhaul hops, so the effective UE rate is bottlenecked by the weakest part of the chain. An AI scheduler identifies the bottleneck hop and preferentially allocates resources to relieve it, behaviour that emerges naturally from an end-to-end reward but cannot be captured by hop-by-hop greedy schedulers.
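A small worked example of Eq. 8.5 follows. One reading of the product-over-sum term, assumed here, is that the two backhaul hops time-share the same resources and the split is chosen so that both hops carry the same end-to-end rate. The rates are illustrative and the function names are hypothetical.

```python
def two_hop_backhaul_rate(r_bh1, r_bh2):
    """Product-over-sum backhaul term of Eq. 8.5."""
    return (r_bh1 * r_bh2) / (r_bh1 + r_bh2)


def ue_rate(r_access, r_bh1, r_bh2):
    """Eq. 8.5: end-to-end UE rate over donor -> IAB-1 -> IAB-2 -> UE."""
    return min(r_access, two_hop_backhaul_rate(r_bh1, r_bh2))


# Illustrative rates in Mbps.
before = ue_rate(r_access=400.0, r_bh1=1000.0, r_bh2=250.0)
after = ue_rate(r_access=400.0, r_bh1=1000.0, r_bh2=500.0)  # relieve hop 2
share_hop1 = 250.0 / (1000.0 + 250.0)  # time share that equalises both hops
print(before, round(after, 1), share_hop1)
```

Doubling the rate of the slower hop lifts the UE rate from 200 Mbps to roughly 333 Mbps in this example, whereas adding capacity to the already fast first hop would help far less. A scheduler trained on an end-to-end reward can discover this asymmetry without it being coded explicitly.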

Figure 8.1 — System throughput (Gbps) vs number of active UEs for Proportional Fair, single-cell DRL, and MARL schedulers. Simulation: 132 PRBs, 30 kHz SCS, 32T32R massive MIMO, 3GPP 3D-UMa channel. MARL benefits increase with UE count due to greater inter-cell interference at higher load.

Deployment Path to Live Networks

Transitioning AI schedulers from simulation to live 5G/6G networks involves several practical steps governed by 3GPP and O-RAN specifications:

Phase | Activity | 3GPP/O-RAN Anchor | Risk
Phase 1 — Shadow Mode | AI scheduler runs in parallel with PF; no actual resource assignment | O-RAN A1 (WG2) / E2 (WG3) interfaces | Low — no UE impact
Phase 2 — A/B Test | Subset of cells (5–10 %) use AI scheduler; rest use PF | TS 28.552 PM framework | Medium — monitor KPIs closely
Phase 3 — Controlled Rollout | Expand to 50 % with automatic rollback trigger | TS 28.313 SON management | Medium — requires rollback automation
Phase 4 — Full Deployment | 100 % cells; continuous online learning via rApp | O-RAN ML workflow (WG2) | Ongoing model drift monitoring required
Model Drift in Live Networks: A scheduler policy trained on summer traffic patterns may perform sub-optimally in winter (different mobility, indoor/outdoor split). Continuous model monitoring — tracking the KL-divergence between training-time and deployment-time state distributions — and periodic retraining are essential operational requirements, not optional enhancements.
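A drift monitor of this kind can be sketched as a KL-divergence between two histograms of a state feature. The direction of the divergence (deployment relative to training), the smoothing constant and the retraining threshold are assumptions for illustration; the counts are made up.

```python
import math


def kl_divergence(p_counts, q_counts, eps=1e-6):
    """D_KL(P || Q) between two histograms, with additive smoothing."""
    p_total = sum(p_counts) + eps * len(p_counts)
    q_total = sum(q_counts) + eps * len(q_counts)
    kl = 0.0
    for p_c, q_c in zip(p_counts, q_counts):
        p = (p_c + eps) / p_total
        q = (q_c + eps) / q_total
        kl += p * math.log(p / q)
    return kl


# Histograms of one state feature (e.g. normalised CQI) over four bins.
training_counts = [10, 30, 40, 20]
deployed_counts = [40, 30, 20, 10]

DRIFT_THRESHOLD = 0.1  # hypothetical retraining trigger, in nats
drift = kl_divergence(deployed_counts, training_counts)
print(round(drift, 3), drift > DRIFT_THRESHOLD)
```

In practice one such score would be tracked per state feature, or on a joint embedding, and a sustained exceedance rather than a single sample would trigger retraining or rollback.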
6G Vision — AI-Native Air Interface: Beyond 5G-Advanced, 6G targets an AI-native air interface where the scheduler, beamformer, channel estimator, and HARQ manager are jointly trained as a single end-to-end neural network. This collapses the layered protocol stack into a unified learned policy, potentially achieving throughput closer to the theoretical limits of multi-user MIMO information theory. 3GPP TR 38.843 initiated the study of AI/ML for air interface in Rel-18, with normative work expected in Rel-19/20.

§8.5 — AI-based HARQ Retransmission Prediction

Motivation: Classical HARQ operates reactively — the gNB transmits a transport block, waits for ACK/NACK feedback (RTT ≈ 4–8 ms in NR), and only then decides to retransmit or advance to a new block. This RTT imposes a latency floor that is increasingly problematic for URLLC and XR use cases targeting sub-5 ms end-to-end delay. AI-based HARQ prediction turns this reactive cycle into a proactive resource management strategy.

The key insight is that HARQ failure probability is not uniformly distributed — it is strongly correlated with observable radio conditions (CQI trend, RSRP, interference level, UE velocity) in the immediately preceding slots. A classifier trained on these features can predict, before transmitting, whether a given transport block will likely require retransmission.

Proactive Resource Pre-positioning

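When the AI predicts a high retransmission probability for a scheduled UE, the scheduler can act before the failure occurs, either by backing off the MCS or by pre-reserving resources for an incremental-redundancy (IR) retransmission. The options compare with classical HARQ as follows: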

Approach | Latency impact | Complexity | 3GPP status
Classical HARQ (Chase / IR) | Full RTT (4–8 ms) per retransmission | Low (fixed protocol) | Normative (TS 38.212/213)
AI-HARQ prediction (proactive MCS back-off) | 0 ms added — avoids retransmission | Light classifier (MLP or GBT) | Under study (TR 38.843, Rel-19)
AI-HARQ prediction (pre-reserved IR) | RTT removed for predicted failures | Scheduler integration required | Research / O-RAN xApp proposals
Performance note: Simulation studies (O-RAN Alliance WG2 contributions, 2023) report URLLC latency violation reduction of 30–45 % when AI-HARQ prediction is combined with proactive MCS selection, relative to standard PF scheduling with HARQ. The benefit is largest for high-velocity UEs where channel variations between transmission and ACK receipt are most severe.

§8 Summary

Key Takeaways — §8
  • Classical PF scheduling is optimal for single-cell, single-step scenarios but cannot handle multi-cell interference or QoS heterogeneity.
  • DQN/PPO-based schedulers achieve +8–12 % UE-throughput gain (5th and 50th percentile) and +7 pp QoS improvement over PF in 5G NR simulations.
  • MARL (MADDPG) with CTDE adds a further +5–20 % system throughput over single-cell DRL (+17–43 % over PF, load-dependent) by resolving inter-cell interference through cooperative policy optimisation.
  • IAB AI scheduling unifies access and backhaul resource management, essential for 6G ultra-dense relay topologies.
Open Research Challenges
  • Sample efficiency: live network training is slow; sim-to-real transfer and meta-RL are active research areas.
  • Action space explosion: 100+ UEs × 132 PRBs × MCS levels exceeds practical Q-table sizes; factored and hierarchical RL needed.
  • Interpretability: regulators may require explainable scheduling decisions for QoS audit.
  • Standardisation: E2 SM for RL-based schedulers is not yet finalised in O-RAN.
[11] 3GPP TR 38.874 v16.0.0 — Study on Integrated Access and Backhaul.
[12] 3GPP TR 37.817 v17.1.0 — Study on enhancement for Data Collection for NR and ENDC.
3GPP TR 38.843 v18.0.0 — Study on AI/ML for NR air interface.
O-RAN Alliance WG2 — AI/ML workflow description and requirements v04.00 (2023).
R. Lowe et al., "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments," NeurIPS 2017 — foundational MADDPG reference for cooperative multi-agent policy gradient.

Optimising how bits move through the RAN is one dimension of AI-native 6G. → §9 takes a more radical step: rethinking what is transmitted, moving from bit-pipe semantics to goal-oriented and semantic communications.

§9   Semantic & Goal-Oriented Communications

§9.1   Beyond Shannon: The Semantic Layer

Classical communications theory, as formulated by Shannon in 1948, defines a single objective: transmit bits reliably across a noisy channel. The celebrated capacity formula encodes that objective in one line:

Eq. 9.1 — Shannon Capacity (Baseline)
$$C \;=\; B \cdot \log_2\!\bigl(1 + \mathrm{SNR}\bigr)$$
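As a quick numerical illustration of Eq. 9.1, the sketch below evaluates the capacity of a single link. The bandwidth and SNR are arbitrary example values, and the dB-to-linear conversion is the only step not written out in the formula.

```python
import math


def shannon_capacity_bps(bandwidth_hz, snr_db):
    """Eq. 9.1 with the SNR supplied in dB and converted to linear scale."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)


# Illustrative link: 20 MHz bandwidth at 10 dB SNR.
capacity = shannon_capacity_bps(20e6, 10.0)
print(round(capacity / 1e6, 1), "Mbit/s")
```

Because capacity grows only logarithmically with SNR, each additional 3 dB buys roughly one extra bit per second per hertz at high SNR, which is why further gains are sought in what is transmitted rather than in how reliably it is carried.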

Every generation of cellular technology — from 2G voice codecs to 5G LDPC and polar codes — has pushed relentlessly toward this bound. Having nearly reached it for individual links, 6G asks a different question: is reliable bit delivery the right goal?

Semantic communications reframe the problem. If both transmitter and receiver share a background knowledge base K, the channel only needs to convey the difference between the message and what the receiver already knows, a concept that traces back to Weaver (1949) and is now made practical by large neural networks acting as shared world-models. In image and video trials this approach is reported to reduce the effective transmitted information by roughly 90 % while preserving communicative fidelity.

Three Levels of Communication (Shannon & Weaver, 1949)

Level | Question answered | Metric | 6G support
1 — Technical | Were bits transmitted correctly? | BER, BLER, Shannon capacity | Inherited from 5G-Advanced
2 — Semantic | Was meaning transmitted correctly? | BLEU, BERTScore, semantic similarity | New in 6G SA1 study item
3 — Effectiveness (Goal) | Was the intended effect achieved? | Task accuracy, control error, MOS | New in 6G SA1 study item

5G and earlier generations operate exclusively at Level 1. 6G natively incorporates Levels 2 and 3 into the radio access network design, creating new KPIs and new protocol layers that did not exist in prior 3GPP releases.

Analogy — video call over Shannon vs. semantic channel. Classical 5G transmits every pixel of every frame, compressed by H.265 to perhaps 2 Mbps. A semantic system would instead transmit a compact latent representation: "speaker said hello, nodded, smiled, background unchanged" and reconstruct the video at the receiver using a shared generative model. The channel payload drops to roughly 20 kbps — a 100× reduction — while the communicative intent is perfectly preserved. Errors in the semantic channel degrade plausibility, not just pixel values.

§9.2   Joint Source–Channel Coding (JSCC)

Shannon's separation theorem guarantees that, in the limit of infinite block length, source compression and channel coding can be optimised independently without loss of end-to-end optimality. In practice, block lengths are finite, latency is bounded, and the source is often correlated with a task at the receiver. Joint Source–Channel Coding (JSCC) breaks the abstraction barrier and co-designs both stages as a single end-to-end learned system.

DeepJSCC Architecture

Transmitter path

  • Input: source signal x (image, audio, sensor)
  • Encoder network \(f_e(\cdot;\, \theta_e)\)
  • Output: complex baseband vector \(z \in \mathbb{C}^k\)
  • Power normalisation to meet \(P_{\mathrm{TX}}\)

Receiver path

  • Input: noisy received vector \(y = Hz + n\)
  • Decoder network \(f_d(\cdot;\, \theta_d)\)
  • Output: reconstruction \(\hat{x}\)
  • Optional: task head for downstream inference

The JSCC training loss jointly penalises reconstruction distortion, transmit power, and semantic distortion:

Eq. 9.2 — JSCC Loss Function
$$\mathcal{L}_{\mathrm{JSCC}} \;=\; d\!\bigl(x,\,\hat{x}\bigr) \;+\; \lambda_1 \cdot P_{\mathrm{TX}} \;+\; \lambda_2 \cdot \mathbb{E}\!\left[\mathcal{D}_{\mathrm{sem}}(x,\hat{x})\right]$$

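where \(d(x,\hat{x})\) is the reconstruction distortion, \(P_{\mathrm{TX}}\) is the transmit power, \(\mathcal{D}_{\mathrm{sem}}(x,\hat{x})\) is the semantic distortion, and \(\lambda_1\), \(\lambda_2\) weight the power and semantic terms against the reconstruction term.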
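A single-sample estimate of Eq. 9.2 can be sketched as follows. The choices made here are assumptions for illustration only: mean squared error stands in for \(d\), cosine distance between hypothetical semantic embeddings stands in for \(\mathcal{D}_{\mathrm{sem}}\), and the weights \(\lambda_1\), \(\lambda_2\) are arbitrary. The source does not specify any of these.

```python
import math


def mse(x, x_hat):
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def mean_power(z):
    """Average energy per complex channel symbol."""
    return sum(abs(s) ** 2 for s in z) / len(z)


def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def jscc_loss(x, x_hat, z, feat_x, feat_x_hat, lam1=0.01, lam2=0.1):
    """Single-sample estimate of Eq. 9.2 (lam1, lam2 are assumed weights)."""
    return (mse(x, x_hat)
            + lam1 * mean_power(z)
            + lam2 * cosine_distance(feat_x, feat_x_hat))


x, x_hat = [0.2, 0.8, 0.5], [0.25, 0.7, 0.5]
z = [1 + 0j, 0 + 1j]                          # encoder output after normalisation
feat_x, feat_x_hat = [1.0, 0.0], [0.9, 0.1]   # hypothetical semantic embeddings
print(round(jscc_loss(x, x_hat, z, feat_x, feat_x_hat), 4))
```

During training the expectation in Eq. 9.2 would be approximated by averaging this quantity over a mini-batch and over channel realisations, with gradients flowing through a differentiable channel model into both encoder and decoder.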

Performance: Classical Pipeline vs. DeepJSCC

The most dramatic difference appears at low SNR. A conventional pipeline (JPEG2000 source compression + LDPC channel coding) exhibits a cliff effect: image quality is high above a threshold SNR, then collapses catastrophically as the channel deteriorates below that threshold. DeepJSCC shows graceful degradation — quality reduces smoothly, never catastrophically.

System | PSNR at SNR = 10 dB | PSNR at SNR = 2 dB | PSNR at SNR = −2 dB | Cliff effect?
JPEG2000 + LDPC (0.1 bpp) | 34.2 dB | 33.8 dB | < 5 dB (collapse) | Yes
DeepJSCC (same bandwidth) | 35.1 dB | 30.5 dB | 25.2 dB | No
DeepJSCC + semantic head | 36.4 dB | 31.9 dB | 26.8 dB | No
At the semantic level, the bandwidth advantage is even larger. Semantic JSCC achieves near-perfect scene reconstruction (correct objects, positions, actions) at 1/10th the bandwidth of pixel-faithful transmission when a shared generative prior is available at both ends.
Practical deployment limitation — channel model mismatch. Current JSCC implementations are trained for a fixed channel model (e.g., AWGN). Mismatch between training and deployment channel (e.g., frequency-selective fading) can degrade performance below the digital baseline — eliminating the graceful-degradation advantage that motivates the approach. 3GPP Rel-19 is studying model generalisation as part of TR 38.843 Phase 2, with the aim of defining mandatory cross-scenario robustness requirements before JSCC can be considered for normative inclusion.

§9.3   Task-Oriented Communications

In machine-to-machine 6G scenarios — factory automation, vehicular sensing, drone swarms — there is no human receiver interpreting the content. The communication exists solely to enable a downstream task: binary classification, object detection, state estimation, or control command generation. Optimising for bit fidelity in these scenarios is not just suboptimal — it is the wrong objective function.

Formulation

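Let \(x\) be a raw feature (sensor frame, LiDAR point cloud, RF sample) with ground-truth label \(y_{\mathrm{label}}\). The downstream task model \(f_{\mathrm{task}}\) maps the reconstruction \(\hat{x}\) to a predicted label \(\hat{y}\). The task-oriented transmission loss is: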

Eq. 9.3 — Task-Oriented Communication Loss
$$\mathcal{L}_{\mathrm{task}} \;=\; \mathcal{L}_{\mathrm{CE}}\!\bigl(f_{\mathrm{task}}(\hat{x}),\; y_{\mathrm{label}}\bigr) \;+\; \lambda \cdot R$$

where \(R\) is the transmission rate (bits per channel use) and \(\mathcal{L}_{\mathrm{CE}}\) is the cross-entropy classification loss. The rate penalty \(\lambda \cdot R\) drives the encoder to discard input content that does not influence task accuracy, a task-aware form of compression that fidelity-oriented, hand-designed codecs do not attempt.
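The trade-off in Eq. 9.3 can be made concrete with a short sketch. The weight \(\lambda\), the class probabilities and the two rates are invented for illustration; they compare two hypothetical encoders on one sample.

```python
import math


def cross_entropy(class_probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(max(class_probs[label], 1e-12))


def task_loss(class_probs, label, rate_bpcu, lam=0.05):
    """Eq. 9.3: cross-entropy plus a rate penalty (lam is an assumed weight)."""
    return cross_entropy(class_probs, label) + lam * rate_bpcu


# Two hypothetical encoders for the same sample whose true class is 0.
high_rate = task_loss([0.98, 0.02], label=0, rate_bpcu=4.0)
low_rate = task_loss([0.95, 0.05], label=0, rate_bpcu=1.0)
print(round(high_rate, 3), round(low_rate, 3))
```

With this weighting the lower-rate encoder has the smaller loss even though its classifier is slightly less confident, so gradient descent would favour it. Raising or lowering \(\lambda\) moves the operating point along the rate–accuracy curve.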

3GPP Alignment: TR 22.874 AI/ML Data Communication

3GPP SA1 TR 22.874 §5.5 — "AI/ML data communication" — defines the use case where a UE acquires sensor data and transmits compressed feature representations to an edge server, which runs inference and returns a decision. This exactly corresponds to task-oriented communications in the academic literature.

Key quantified benefits from 3GPP feasibility studies:

Scenario | Data type | Uplink reduction | Task accuracy
Factory camera QA | HD image (2 MP) | 70–80 % | 98.2 % vs. 98.5 % (raw)
Vehicular LiDAR | Point cloud (65k pts) | 85–92 % | 94.1 % vs. 94.8 % (raw)
Drone RF sensing | IQ samples (1 ms burst) | 90–95 % | 96.3 % vs. 97.0 % (raw)

The accuracy loss of 0.3–0.7 percentage points is traded for a roughly 3–20× reduction in uplink channel occupancy (a 70 % cut corresponds to about 3.3×, a 95 % cut to 20×), a favourable exchange in dense IoT deployments where radio resources are the binding constraint.

§9.4   3GPP Semantic Communications Study Items

3GPP has formally opened study items on semantic and goal-oriented communications in Release 19 under SA1, with the intent to define new KPI classes and service requirements. This section maps academic concepts to 3GPP specification artefacts.

Key Specification References

Reference | Scope | Release
TR 22.874 §5.5 | AI/ML data communication use case; feature compression; edge inference | Rel-18
TR 22.874 §6.3 | AI/ML model lifecycle: training, transfer, update, versioning | Rel-18
SA1 Rel-19 SI | Semantic and goal-oriented communications — KPI framework | Rel-19
SA2 Rel-19 WI | Architecture for semantic layer (proposed): semantic entity, knowledge DB | Rel-19 (study)

Proposed Semantic KPI Candidates

Language / Text modality

  • BLEU score — n-gram overlap between transmitted and reconstructed text
  • BERTScore — cosine similarity in transformer embedding space; captures paraphrase
  • Semantic accuracy — intent classification accuracy after reconstruction

Image / Video / Sensor modality

  • Task accuracy — object detection mAP, classification Top-1/Top-5
  • MOS (Mean Opinion Score) — perceptual quality for media use cases
  • Control error — RMS deviation for closed-loop actuation tasks
Key insight — KPI gap. None of these metrics exist in current 3GPP TS 38-series performance requirements. Introducing semantic KPIs requires new measurement methodologies, reference models, and conformance test procedures. This is an open standardisation challenge for Rel-19/20.

Figure 9.1 — Bandwidth–Fidelity Trade-off: Classical vs. JSCC vs. Semantic

[13] 3GPP TR 22.874 v18.0.0 — Study on traffic characteristics and performance requirements for AI/ML model transfer, 3GPP SA1, 2023.
Study checkpoint — §9.
  1. What is the fundamental difference between Level 1 (technical) and Level 2 (semantic) communication goals?
  2. Explain the cliff effect in classical coded transmission. Why does DeepJSCC avoid it?
  3. Write the task-oriented loss function and identify which term drives the encoder to discard task-irrelevant information.
  4. Name two proposed semantic KPIs from the 3GPP SA1 Rel-19 study item and explain how each is measured.
  5. A factory deploys 1000 cameras each transmitting 2 MP images at 30 fps. Using a 70 % uplink reduction figure, calculate the released capacity in Gbps if raw transmission requires 120 Mbps per camera.

Semantic communications redefine the purpose of the link; → §10 takes the final step — replacing the entire classical transceiver with an end-to-end learned system in the Native AI Interface paradigm.

§10   Native AI Interface in 6G

§10.1   The End-to-End Learning Paradigm

Every component in a classical communications chain — modulator, forward error correction encoder, channel estimator, equaliser, FEC decoder, demodulator — was designed independently, each individually optimal under idealised assumptions about the channel and the adjacent blocks. The assembled pipeline performs well only when those assumptions hold simultaneously.

End-to-end (E2E) learning abandons that decomposition entirely. The entire transmitter–channel–receiver is modelled as a differentiable autoencoder and trained jointly with stochastic gradient descent (SGD) on a single objective: minimise message error probability over the real channel distribution. O'Shea & Hoydis (2017) demonstrated the concept on AWGN and fading channels, showing that a learned autoencoder rediscovers classical constellations and can exceed them on short block lengths.

Autoencoder Transceiver Architecture

Transmitter (encoder)

  1. One-hot message vector \(m \in \{0,1\}^M\)
  2. Dense layer (M → 2N) + BatchNorm
  3. Power normalisation: \(s \leftarrow s / \sqrt{\mathbb{E}[\|s\|^2]}\)
  4. Complex baseband signal \(s \in \mathbb{C}^n\)

Receiver (decoder)

  1. Received vector r = Hs + n
  2. Dense layers (2N → 2N → M)
  3. Softmax activation → class probabilities
  4. Hard decision: \(\hat{m} = \arg\max_i p_i\)

The E2E training objective is the cross-entropy loss over the message alphabet \(\mathcal{M}\):

Eq. 10.1 — E2E Autoencoder Loss (Cross-Entropy)
$$\mathcal{L}_{\mathrm{E2E}} \;=\; -\sum_{m\,\in\,\mathcal{M}} p(m)\,\log P\!\bigl(\hat{m} = m \,\big|\, m\bigr)$$

Back-propagation through the decoder, then through a differentiable channel model (or a surrogate for non-differentiable hardware), and finally into the encoder adjusts both ends simultaneously. The trained encoder weights define a learned signal constellation; the trained decoder weights define a near-MAP detector for that constellation and channel.
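The two non-learned layers of this chain, power normalisation and the channel, can be sketched in plain Python. The example assumes an AWGN channel with \(H = I\) and treats the input as a batch of scalar complex symbols; the dense layers are omitted. Function names and the sample values are illustrative.

```python
import math
import random


def normalise_power(symbols):
    """Scale a batch of complex symbols to unit average energy."""
    avg_energy = sum(abs(s) ** 2 for s in symbols) / len(symbols)
    scale = 1.0 / math.sqrt(avg_energy)
    return [s * scale for s in symbols]


def awgn_channel(symbols, snr_db, rng):
    """r = s + n with H = I; noise variance set for unit signal power."""
    noise_var = 10 ** (-snr_db / 10)
    sigma = math.sqrt(noise_var / 2)  # per real dimension
    return [s + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for s in symbols]


rng = random.Random(0)
raw = [complex(2, 1), complex(-1, 3), complex(0.5, -2), complex(-3, -1)]
tx = normalise_power(raw)
rx = awgn_channel(tx, snr_db=10.0, rng=rng)
tx_energy = sum(abs(s) ** 2 for s in tx) / len(tx)
print(round(tx_energy, 6), len(rx))
```

For actual training both functions would be written in an automatic-differentiation framework so that gradients can pass from the decoder loss through the noise addition and the normalisation into the encoder weights.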

Empirical Performance on AWGN

Configuration | Code rate | SNR at FER = \(10^{-2}\) | vs. baseline
BPSK + Hamming(7,4) | 4/7 ≈ 0.57 | 6.2 dB | baseline
Autoencoder (n=7, k=4) | 4/7 ≈ 0.57 | 4.7 dB | −1.5 dB
QPSK + uncoded | 2 bpcu | 5.1 dB | baseline
Autoencoder (n=2, k=4) | 2 bpcu | 3.6 dB | −1.5 dB
For short block lengths (k = 4 bits, n = 2 complex symbols), the autoencoder reaches a Frame Error Rate of \(10^{-2}\) at an SNR 1.5 dB lower than the uncoded QPSK baseline. This gain shrinks for longer codes, where classical turbo/LDPC codes asymptotically approach capacity, which makes E2E learning especially relevant for 6G ultra-reliability short-packet use cases (URLLC+ / XR control links).

§10.2   Constellation Learning

One of the most striking outputs of E2E training is a learned signal constellation that adapts its geometry to the statistics of the channel — something no classical codebook can do at deployment time.

Channel-Specific Adaptation

Channel type | Learned constellation shape | Reason
AWGN | Hexagonal lattice ≈ QAM | Euclidean distance maximised; matches sphere-packing bound
Rayleigh flat fading | Near-PSK ring structure | Phase-invariant to random amplitude; concentrates energy on circle
ISI channel (multipath) | Pre-distorted / equalised | Encoder pre-inverts channel; constellation accounts for inter-symbol overlap
Adversarial (eavesdropper) | Obfuscated / non-standard | Encoder hides structure from passive observer; physical layer security

Constrained Constellation Design

The formal optimisation is a mutual-information maximisation under a power constraint:

Eq. 10.2 — Mutual Information Maximisation
$$\max_{\theta}\; I(\mathbf{s};\,\mathbf{r}) \quad\text{subject to}\quad \mathbb{E}\!\left[\|\mathbf{s}\|^2\right] \leq P,\quad \mathbf{s} \in \mathbb{C}^n$$

The mutual information I(s; r) is not analytically tractable for arbitrary channel distributions. Two practical approaches are used:

  1. Differentiable surrogate: Lower-bound I with a Gaussian auxiliary channel approximation; back-propagate through the bound. Fast convergence; slightly suboptimal for highly non-Gaussian channels.
  2. Reinforcement learning (RL) over hardware channel: Treat the real channel as a black-box reward function. Policy-gradient or evolutionary strategies optimise the constellation without a channel model. Enables optimisation directly on hardware — relevant for mmWave/sub-THz where accurate simulation is difficult.
Rel-18 / 6G Research

3GPP RAN1 has begun feasibility studies on AI-based constellation shaping for 6G physical layer. The key open question is whether learned constellations can be represented compactly enough for standardised signalling — or whether each device pair must negotiate constellation parameters over an out-of-band channel.

Regulatory challenge — learned constellations and spectral conformance. Learned constellations may not conform to spectral emission masks or coexistence requirements defined by regulatory bodies. 3GPP SA1 TR 22.874 §7 requires AI outputs to be bounded within operator-defined conformance envelopes — meaning any constellation generated by an E2E autoencoder must be projected onto the feasible set defined by the applicable RF emission mask before transmission. This projection step can reduce the MI advantage of a freely learned constellation by 0.5–2 dB, depending on how tightly the mask constrains the signal. Designing training losses that natively incorporate spectral mask constraints (e.g., via a differentiable mask-penalty term) is a critical open problem for any E2E autoencoder deployed in licensed spectrum.

§10.3   The Generalisation Challenge

E2E learning achieves strong results on the channel distribution it was trained on. The critical weakness is domain shift: performance degrades when the deployment channel differs from the training distribution. This is not a minor engineering concern — it is an existential challenge for standardised AI transceivers.

Sources of Domain Shift in 6G Deployments

Mitigation Strategies

1 — Domain Randomisation
Train over a broad distribution of channel realisations: CDL-A/B/C/D/E, SNR ∈ [−10, 30] dB, velocity ∈ [0, 500] km/h. Model learns a robust policy; sacrifices peak performance on any single channel for resilience across all.
2 — Meta-Learning (MAML)
Model-Agnostic Meta-Learning trains a parameter initialisation θ* such that a small number of gradient steps on pilot symbols at the new site reaches good performance. Adaptation requires only ~10–50 pilots — compatible with 5G NR reference signal overhead budgets.
3 — Transfer Learning
Pre-train on synthetic CDL channels; fine-tune on real over-the-air samples collected during initial deployment. Encoder/decoder lower layers (generic features) are frozen; only upper layers are updated. Reduces fine-tuning data requirement by 10–100×.
4 — Online Adaptation
Continue gradient updates during deployment using decoded-then-re-encoded symbols as pseudo-labels. Risk: catastrophic forgetting if adaptation rate is too high. Mitigation: elastic weight consolidation (EWC) penalty.
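Strategy 1 above amounts to drawing a fresh channel scenario for every training batch. A minimal sampler is sketched below using the ranges quoted in the text; uniform sampling and the dictionary layout are assumptions, and a real pipeline would pass each configuration to a channel simulator.

```python
import random

CDL_PROFILES = ["CDL-A", "CDL-B", "CDL-C", "CDL-D", "CDL-E"]


def sample_channel_config(rng):
    """Draw one training scenario; uniform sampling is an assumption."""
    return {
        "profile": rng.choice(CDL_PROFILES),
        "snr_db": rng.uniform(-10.0, 30.0),
        "velocity_kmh": rng.uniform(0.0, 500.0),
    }


rng = random.Random(42)
batch = [sample_channel_config(rng) for _ in range(4)]
for cfg in batch:
    print(cfg["profile"], round(cfg["snr_db"], 1), round(cfg["velocity_kmh"], 1))
```

Non-uniform sampling, for example weighting low-SNR or high-velocity cases more heavily, is a common variation when the deployment is expected to be dominated by difficult conditions.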

3GPP Implication: Model Update Mechanism

TR 22.874 §6.3 defines the AI/ML model lifecycle: training, validation, deployment, monitoring, and update trigger. For E2E transceivers, this implies that the network must support secure, versioned delivery of updated model weights to UEs — a new type of system information or dedicated model-distribution bearer.
Warning — Regulatory and testing implications. An AI transceiver that continues adapting after certification may drift outside its conformance-tested operating envelope. 3GPP must define either (a) hard constraints on post-deployment adaptation, or (b) new certification procedures that bound worst-case post-adaptation behaviour. This remains an open standards gap in Rel-19.

§10.4   Hybrid AI + Model-Based Design

A fully learned transceiver is not standardisable in the 3GPP sense: each trained model produces a unique, deterministic mapping from bits to waveforms that cannot be replicated by a compliant implementation from another vendor without access to the exact same weights. Interoperability — the cornerstone of cellular standards — would be lost.

The hybrid approach resolves this tension by maintaining the standardised structural skeleton of OFDM while replacing computationally intensive DSP blocks with AI components at well-defined interfaces.

What Stays Classical vs. What Becomes AI

Layer / Block | Classical (standardised) | AI-enhanced | 3GPP status
Waveform | OFDM + CP (TS 38.211) | (none) | Fixed for interoperability
Resource grid | Slot/subframe structure | (none) | Fixed for interoperability
Channel estimation | LS / MMSE on DMRS | NN interpolator (§5 of this document) | AI allowed (Rel-18 WI)
Equalisation | ZF / MMSE-IRC | Deep unfolded LISTA | AI allowed (Rel-18 WI)
Symbol detection | Hard/soft maximum-likelihood demapper | NN demapper (learned LLRs) | AI allowed (Rel-18 WI)
FEC decoder | LDPC / Polar BP | Neural BP (learnable edge weights) | Under study (Rel-19)
HARQ | Chase combining / IR | (none) | Fixed (impacts scheduler)
MAC scheduler | Rule-based PF/RR | RL-based scheduler (§8 of this document) | AI allowed (non-standard)
RRC state machine | Defined per 3GPP ASN.1 | (none) | Fixed (protocol correctness)
This stratification — standardise the interface, liberalise the implementation — is the architectural principle adopted in 3GPP Rel-18 for AI-assisted RAN functions. It permits competitive AI implementations from multiple vendors while guaranteeing interoperability at the waveform and protocol levels.

Research direction — AI-designed waveforms (beyond CP-OFDM): While 3GPP standardisation fixes the CP-OFDM waveform for interoperability, academic research explores E2E autoencoder systems that learn the pulse shaping filter jointly with the constellation and equaliser. In these systems the transmitter encodes bits into a time-domain waveform whose shape is a free parameter optimised by SGD — not constrained to rectangular subcarrier pulses. Results show that learned waveforms can achieve lower peak-to-average-power ratio (PAPR) and better spectral containment than CP-OFDM when trained on non-linear amplifier models, at the cost of losing the simple per-subcarrier equalisation that makes OFDM tractable. For 6G Rel-20 and beyond, this remains a research direction: if a future release standardises a new numerology or multi-carrier scheme, AI-designed pulse shapes could inform the design process even if the final standard specifies a fixed waveform.

Deep Unfolding: The Principled Hybrid

Deep unfolding provides a theoretical justification for the hybrid approach. A classical iterative algorithm (e.g., belief propagation for LDPC decoding, ISTA for sparse channel estimation) is unrolled for a fixed number of iterations T, and the algorithm parameters at each iteration are made learnable:

Eq. 11.1 — LISTA Unrolled Iteration
$$\mathbf{x}^{(t+1)} \;=\; \mathcal{S}_{\theta_t}\!\!\left( \mathbf{A}^{(t)}\,\mathbf{x}^{(t)} \;+\; \mathbf{b}^{(t)} \right), \quad t = 0,\ldots,T-1$$

where 𝒮_{θ_t} is a learned shrinkage / activation function with per-iteration parameters θ_t, and A^(t) and b^(t) are the iteration-specific weight matrix and additive term. The result is a network that:

Figure 10-1 — BLER vs. SNR: Classical, Hybrid AI, and E2E Autoencoder
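
The unrolled iteration of Eq. 11.1 can be sketched in a few lines of NumPy. This is a minimal illustration, not a trained decoder: the shrinkage is assumed to be plain soft-thresholding, the parameters are random and untrained, and the function names are hypothetical. In LISTA proper the additive term b^(t) is derived from the received measurement; here it is passed in directly.

```python
import numpy as np

def soft_threshold(v, theta):
    """Shrinkage S_theta: pull every entry towards zero by theta."""
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def unrolled_forward(x0, A_list, b_list, theta_list):
    """Run T unrolled iterations x <- S_theta_t(A_t @ x + b_t)."""
    x = x0
    for A, b, theta in zip(A_list, b_list, theta_list):
        x = soft_threshold(A @ x + b, theta)
    return x

# Toy usage: T = 3 layers on a 4-dimensional signal (random, untrained parameters).
rng = np.random.default_rng(0)
T, n = 3, 4
A_list = [0.1 * rng.standard_normal((n, n)) for _ in range(T)]
b_list = [rng.standard_normal(n) for _ in range(T)]
theta_list = [0.1] * T
x_hat = unrolled_forward(np.zeros(n), A_list, b_list, theta_list)
```

In a learned version, A_list, b_list and theta_list would be the trainable parameters, with one set per unrolled iteration.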

Study checkpoint — §10.
  1. Draw the autoencoder transceiver block diagram and label the encoder, channel model, and decoder. Identify which blocks contain trainable parameters.
  2. Write the E2E training loss and explain why cross-entropy is the appropriate objective for constellation design.
  3. A Rayleigh fading channel causes random amplitude variations. Which constellation geometry does E2E learning converge to, and why?
  4. Explain the four domain-shift mitigation strategies. Which is most practical for a 6G base station that cannot communicate training labels back to the UE?
  5. Why is a fully E2E transceiver not standardisable in 3GPP? Describe the hybrid approach adopted in Rel-18 and identify two receiver blocks where AI is currently permitted.
Standardisation tension — deep dive. The 3GPP standardisation model depends on deterministic specifications: given an input stimulus, all compliant implementations must produce equivalent outputs (within conformance margins). A fully E2E system cannot have a standard — by definition, each trained model is unique: same architecture, different random seed → different constellation, different decoder boundary. 3GPP resolves this by standardising the input/output interface and model metadata format (input dimension, output dimension, quantisation, maximum latency), not the model weights. Weights are treated as proprietary implementation details, much like ASIC microarchitecture today. This allows competitive AI implementations from multiple vendors while maintaining over-the-air interoperability at the waveform level. The open question for Rel-20 is how to handle AI models that continue adapting post-certification.
[14] T. O'Shea and J. Hoydis, "An Introduction to Deep Learning for the Physical Layer," IEEE Transactions on Cognitive Communications and Networking, vol. 3, no. 4, pp. 563–575, Dec. 2017.
[15] 3GPP TR 22.874 v18.0.0, §6.3 — AI/ML model lifecycle: training, transfer, deployment, and update, 3GPP SA1, 2023.

E2E learning sets the vision for AI-native transceivers; deploying these models at scale requires a well-defined infrastructure. → §11 covers the AI/ML functional architecture — how 3GPP and O-RAN specify the training, inference, and model lifecycle management framework for production 6G networks.

§11 — AI/ML Architecture for 6G Networks

The evolution from 5G to 6G is not merely a capacity upgrade — it represents a fundamental architectural shift in which Artificial Intelligence and Machine Learning become first-class citizens of the radio standard. Where 5G treated AI as an optional optimisation bolt-on, 6G embeds AI/ML as a core functional layer: every major plane — RAN, core, OAM, UE — carries standardised AI service models, training pipelines, and model lifecycle interfaces.

§11.1 — The AI/ML Functional Architecture (TR 23.700-80)

3GPP SA2 Release 18 captured the baseline in TR 23.700-80 v18.0.0. The central construct is the AI/ML Network Function (AI/MLNF), a logical entity that may be instantiated at four physical loci:

Deployment Loci
  • OAM (Management Plane) — global model training, policy authoring, network-wide analytics
  • gNB / O-DU / O-CU — RAN-side inference for beam management, scheduler, link adaptation
  • UE — on-device inference for channel estimation, CSI compression, positioning
  • Edge / MEC Server — latency-sensitive tasks offloaded from UE; split-inference endpoint
AI/MLNF Sub-Functions
  • Model Training — supervised, unsupervised, or RL, operating on collected measurements
  • Model Storage — versioned repository with metadata (scenario, SNR range, channel model)
  • Model Inference — real-time application of a trained model to live inputs
  • Model Performance Monitoring — statistical drift detection, accuracy tracking
  • Model Transfer — delivery of model weights to the inference endpoint (O1/E2/Uu)

Standardised Interfaces Carrying AI/ML Traffic

Interface | Endpoints | AI/ML Role | Typical Latency Budget
O1 | OAM → gNB / O-DU | Model delivery, KPM collection, performance report upload | seconds–minutes (management plane)
E2 | Near-RT RIC → gNB | xApp policy enforcement, per-UE/per-cell inference outputs | 10 ms – 1 s
A1 | Non-RT RIC → Near-RT RIC | AI policy objectives, enrichment information | 1 s – 1 min
R1 | rApp ↔ Non-RT RIC | rApp service registration, ML model lifecycle API | seconds
Uu / PC5 | gNB/eNB ↔ UE | On-device model delivery, feature upload (split inference) | sub-10 ms (6G target)
The key insight of TR 23.700-80 is separation of training from inference. A model trained in the OAM cloud can be transferred over O1 and deployed for real-time inference at the O-DU — bridging cloud-scale training with near-real-time RAN operation.

§11.2 — AI/ML Data Pipeline

A production 6G AI pipeline is a closed loop, not a one-shot batch process. The canonical stages are:

  1. Data Collection — measurements harvested from UE, gNB, core
  2. Pre-processing — normalisation, feature engineering, outlier removal
  3. Training — model fitting (see modes below)
  4. Validation — held-out dataset evaluation; acceptance criteria check
  5. Deployment — model transfer to inference endpoint
  6. Monitoring — drift and accuracy tracking in production
  7. Re-training Trigger — if drift > threshold, restart from step 1

Data Sources

UE-Side Measurements
  • RSRP, RSRQ, SINR per serving + neighbour cells
  • CQI, RI, PMI feedback (CSI-RS based)
  • Timing advance (positioning proxy)
  • UE velocity estimate (Doppler / accelerometer)
  • Battery / compute state (for offloading decisions)
Network-Side Measurements
  • Per-PRB utilisation and interference maps
  • Beam RSRP / SINR per beam index (NR SSB / CSI-RS)
  • Handover success/failure rates, call drop rate
  • Core: session throughput, mobility patterns, slice SLA compliance
  • External context: weather (mmWave rain fade), crowd density, time-of-day

Training Modes

Mode | Mechanism | Typical Use Case | Latency to Update
Offline Batch | Periodic retraining on accumulated historical data (daily/weekly cron) | Long-term mobility prediction, network capacity planning | Hours–days
Online Learning | Continuous gradient updates from streaming real-time data | Adaptive MCS selection, real-time interference map | Seconds–minutes
Transfer Learning | Adapt a pre-trained base model to a new cell or environment via fine-tuning | Rapid deployment in newly deployed cell sites | Minutes (few-shot)
Active Learning | Model queries for labels on high-uncertainty samples; reduces labelling effort | Anomaly detection with scarce labelled faults | Depends on labelling loop
Analogy: Think of the 6G AI pipeline as a weather forecasting system. Raw sensor readings (UE measurements) are ingested continuously. Models are retrained nightly on the full history. Forecasts (inference outputs) are served in real time. When the model starts predicting rain on every sunny day (drift), retraining is triggered automatically.

§11.3 — Model Lifecycle Management

3GPP TR 22.874 §6 codifies model metadata as a standard object, ensuring interoperability between vendor training systems and operator inference endpoints. A model package includes:

Model Drift Detection

Drift is detected by comparing the current output distribution against the training-time distribution using KL divergence. Formally, for model output distribution p:

Eq. 11.2 — KL Drift Detection Criterion
$$\text{drift}(t) = D_{KL}\!\left(p_{\text{current}}(y \mid x) \;\Big\|\; p_{\text{train}}(y \mid x)\right) > \varepsilon$$

When drift(t) > ε, the model lifecycle manager either: (a) initiates a retraining job with fresh data, or (b) rolls back to the previous version (fallback), while retraining proceeds asynchronously.
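
The criterion in Eq. 11.2 can be sketched as follows. This is a simplified illustration: it compares marginal histograms of the model's discrete outputs rather than the full conditional p(y | x), and the threshold value and the names `kl_divergence`, `drift_detected` and `DRIFT_EPS` are assumptions made for the example.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) in nats for two discrete distributions given as histograms."""
    p = np.asarray(p, dtype=float) + eps   # eps avoids log(0) on empty bins
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def drift_detected(current_hist, train_hist, threshold):
    """Eq. 11.2: flag drift when D_KL(current || train) exceeds the threshold."""
    return kl_divergence(current_hist, train_hist) > threshold

train_hist = [0.25, 0.25, 0.25, 0.25]    # output-class frequencies at training time
current_hist = [0.70, 0.10, 0.10, 0.10]  # frequencies observed in production
DRIFT_EPS = 0.1                          # hypothetical threshold (nats)
drifted = drift_detected(current_hist, train_hist, DRIFT_EPS)
```

With these illustrative histograms the divergence is roughly 0.45 nats, so the check fires; identical histograms give a divergence of zero.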

Model States (FSM)
[Training] → PASS validation → [Deployed]
[Deployed] → drift > ε → [Retraining]
[Deployed] → hard failure → [Fallback → prev version]
[Retraining] → complete → [Validation] → [Deployed]
[Validation] → FAIL → [Discarded / alert]
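
The state machine above maps naturally onto a transition table. The sketch below is one possible encoding; the event names are hypothetical labels chosen for the example, and the Validation → Deployed edge on a pass is inferred from the fourth line of the listing.

```python
# (state, event) -> next state, transcribed from the FSM listing above.
TRANSITIONS = {
    ("Training", "validation_pass"): "Deployed",
    ("Deployed", "drift_exceeds_eps"): "Retraining",
    ("Deployed", "hard_failure"): "Fallback",
    ("Retraining", "complete"): "Validation",
    ("Validation", "validation_pass"): "Deployed",
    ("Validation", "validation_fail"): "Discarded",
}

def next_state(state, event):
    """Return the successor state, or raise if the FSM does not define the transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None

def run(state, events):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state

final = run("Training", ["validation_pass", "drift_exceeds_eps",
                         "complete", "validation_pass"])
```

Rejecting undefined transitions explicitly keeps a lifecycle manager from silently moving a model into a state the listing does not allow.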
Pitfall — Covariate vs. Concept Drift: KL divergence on outputs detects concept drift (relationship between input and output changes). Covariate drift (input distribution shifts but output mapping is still valid) requires monitoring the input feature distribution separately and does NOT always require retraining.

§11.4 — O-RAN AI/ML Architecture

The O-RAN Alliance WG2 specification O-RAN.WG2.AI-ML-v01.03 defines a three-layer AI/ML hierarchy aligned with control-loop latency:

Layer | Component | Granularity | AI/ML Functions
Non-RT RIC | rApps | > 1 s | Global optimisation, policy authoring, model training, enrichment information
Near-RT RIC | xApps | 10 ms – 1 s | Per-UE/per-cell beam decisions, fast interference management, admission control
O-DU / O-RU | PHY-embedded | < 1 ms | Symbol-level beam tracking, low-latency channel estimation, CSI compression

Inference Deployment Options — Comparison

Location | Latency | Data freshness | Model complexity | Example use case
On-device (UE) | < 1 ms | Highest: live per-slot inputs | Low (< 1 MB, < 10^7 FLOPs) | On-device channel estimation, CSI compression, positioning
Near-RT RIC (xApp) | 10 ms – 1 s | High: per-UE/cell KPMs streamed over E2 | Medium (10–100 MB, runs on x86/GPU server) | Per-UE beam decisions, fast interference management, admission control
Non-RT RIC (rApp) | 1 s – 1 min | Moderate: aggregated historical KPMs | High (100 MB+, cloud-class training and inference) | Global network optimisation, model training, policy authoring, enrichment
Cloud / MEC | Variable (10 ms – minutes) | Low: batch or periodic data uploads | Very high (multi-GB models, large-batch training) | Federated learning aggregation, digital twin training, long-horizon planning

E2 Service Models Relevant to AI

A critical design principle in O-RAN AI: the E2 node retains authority. xApp decisions are policies, not commands. The gNB scheduler may override an xApp beam suggestion if it conflicts with a hard constraint (e.g., RLF imminent). This ensures safety and avoids single-point-of-failure from a misbehaving AI model.
Near-RT RIC latency floor and the beam management gap: The E2 control loop has a defined minimum latency of 10 ms (O-RAN.WG2 specification). This is sufficient for traffic steering and interference management but is too slow for beam management in millimetre-wave deployments, where beam failure recovery must complete within 3–5 ms. This creates an architectural gap: beam decisions that require AI inference cannot be handled by Near-RT RIC xApps at the required speed. Two solutions are under study:
  1. CU-level AI: embed a lightweight NN beam predictor directly in the gNB CU (not in the RIC), operating on per-slot CSI within the CU's own processing pipeline at sub-1 ms latency.
  2. Real-Time RIC (RT-RIC): O-RAN WG2 has begun studying a “Real-Time RIC” tier with <10 ms inference loop, deployed co-located with the O-DU to reduce transport latency. This is not yet standardised as of O-RAN Release 4.
For time-critical AI use cases (beam management, URLLC scheduling), the deployment architecture must place inference at the O-DU or O-RU — not in the centralised Near-RT RIC.

O-RAN AI Data Flow

Study note: The O-RAN AI loop has a well-defined separation of concerns: KPM SM = data ingestion → Non-RT RIC = training → A1 = policy push → Near-RT RIC = fast inference → RC SM = action → O-DU execution. Memorise this chain: it appears in every 6G AI architecture exam question.
[16] 3GPP TR 23.700-80 v18.0.0 — Study on AI/ML architecture enhancements for 5GS.
[17] O-RAN.WG2.AI-ML-v01.03 — AI/ML Workflow Description and Requirements, O-RAN Alliance WG2, 2023.

The centralized AI/ML pipeline described here sets the stage for distributed training. → §12 covers Federated Learning and Split Inference — the mechanisms that scale this architecture while preserving UE privacy.

§12 — Federated Learning & Split Inference

Two of the most consequential AI techniques for 6G are Federated Learning (FL) and Split Inference (SI). FL addresses the fundamental tension between AI's hunger for data and users' privacy rights. SI addresses the mismatch between UE compute budgets and the inference complexity demanded by 6G channel models. Together they define a practical architecture for distributed AI at the network edge.

§12.1 — Why Federated Learning for 6G

Privacy Constraint

UE measurement sequences encode location trajectories, daily routines, social-graph patterns, and health-correlated mobility. Uploading raw measurements to a central server risks violating GDPR and other regional privacy laws, and undermines user trust. 6G must achieve AI performance without centralising personal data.

Communication Constraint

A fleet of 10^6 IoT/UE devices, each generating 1 MB/day of channel measurements, produces 1 TB/day to be uploaded. At typical uplink rates this is simply impractical. Federated Learning shifts the bandwidth requirement from raw data to compressed model gradients, a reduction of 100× or more.

The Federated Learning Solution

In FL, training data never leaves the device. In each round:

  1. Each participating UE receives the current global model weights θ^(t) from the server
  2. The UE runs several epochs of local SGD on its private dataset
  3. The UE uploads only its model update (gradient or weight delta), not raw data
  4. The server aggregates the contributions via the FedAvg algorithm

The aggregation step computes a weighted average over K participating UEs:

Eq. 12.1 — FedAvg Global Aggregation
$$\theta^{(t+1)} = \sum_{k=1}^{K} \frac{n_k}{N}\, \theta_k^{(t+1)}, \qquad N = \sum_{k=1}^{K} n_k$$

where θ_k^(t+1) is UE k's local model after E local SGD epochs on its n_k-sample dataset, and N is the total number of training samples across all participating UEs.
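
As a concrete instance of Eq. 12.1, the aggregation step can be written in a few lines of NumPy. The weight vectors and sample counts below are toy values chosen for illustration, and `fedavg` is a hypothetical helper name.

```python
import numpy as np

def fedavg(local_weights, num_samples):
    """Eq. 12.1: sample-count-weighted average of the K local weight vectors."""
    local_weights = np.asarray(local_weights, dtype=float)  # shape (K, P)
    n = np.asarray(num_samples, dtype=float)                # shape (K,)
    return (n[:, None] * local_weights).sum(axis=0) / n.sum()

theta_locals = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # three UEs, two parameters each
n_k = [100, 300, 600]                                 # local dataset sizes
theta_global = fedavg(theta_locals, n_k)              # weights 0.1, 0.3, 0.6
```

The UE with 600 samples contributes 60% of the global model, which is why a few data-rich devices can dominate the average when dataset sizes are very uneven.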

Analogy: Each UE is a chef who keeps their recipe secret but shares only a numerical "taste profile" vector. The kitchen (server) combines taste profiles from all chefs to improve the universal recipe — without ever seeing any individual's recipe or ingredients.

§12.2 — FedAvg and Communication Efficiency

FedAvg (McMahan et al., 2017 [18]) is the foundational FL algorithm. Its key parameters are:

Parameter | Symbol | Typical 6G Value | Effect
Fraction of UEs per round | C | 0.01 – 0.1 | Higher C → faster convergence, more uplink load
Local epochs | E | 1 – 5 | Higher E → fewer rounds, but client drift risk
Local mini-batch size | B | 16 – 128 | Smaller B → more stochastic, better generalisation
Global rounds | T | 50 – 500 | Convergence typically within 100–200 rounds for channel tasks

Raw Communication Cost

Without compression, each FL round transfers the full model in both directions:
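
A back-of-envelope estimate of this cost is sketched below. It assumes each participating UE downloads the full model once and uploads a same-sized update once per round. The parameter count, precision, number of UEs and number of rounds are illustrative values, not figures from this section.

```python
def round_traffic_bytes(num_params, bytes_per_param, num_clients):
    """Uncompressed traffic for one FL round: model down + update up, per client."""
    model_bytes = num_params * bytes_per_param
    return 2 * model_bytes * num_clients

def total_traffic_bytes(num_params, bytes_per_param, num_clients, num_rounds):
    """Traffic accumulated over all global rounds."""
    return round_traffic_bytes(num_params, bytes_per_param, num_clients) * num_rounds

per_round = round_traffic_bytes(1_000_000, 4, 100)   # 1 M float32 params, 100 UEs
total = total_traffic_bytes(1_000_000, 4, 100, 200)  # 200 global rounds
```

Under these assumptions one round moves 800 MB and a 200-round training run moves 160 GB, which is the load the compression techniques below aim to reduce.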

Compression Techniques

Technique | Mechanism | Compression Ratio | Accuracy Impact
Gradient Sparsification | Transmit only top-1% largest-magnitude gradient entries; accumulate the rest locally (error feedback) | ~100× | < 1% accuracy loss with error feedback
Quantisation | Reduce gradient precision from 32-bit float to 8-bit int (QSGD), 4-bit, or 1-bit in the extreme case | 4× – 32× | 1–3% accuracy loss at 4-bit; negligible at 8-bit
Low-Rank Approximation | Decompose gradient matrix: G ≈ UV^T, transmit U, V separately | 10× – 50× | Depends on intrinsic rank of gradient
Local Differential Privacy | Add calibrated Gaussian noise before upload ((ε,δ)-DP guarantee per Dwork et al., 2014; Laplacian noise for pure ε-DP) | None (privacy mechanism, not compression) | Noise σ ∝ sensitivity/ε; accuracy–privacy tradeoff; (ε,δ)-DP allows tighter noise at cost of δ failure probability
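
Gradient sparsification with error feedback, the first technique in the table, can be sketched as follows. This is a minimal single-client illustration: the gradient is random toy data and `topk_sparsify` is a hypothetical helper, not an API from any FL framework.

```python
import numpy as np

def topk_sparsify(grad, residual, fraction=0.01):
    """Keep the largest-magnitude entries of (grad + residual); carry the rest forward."""
    corrected = grad + residual                       # add back what was withheld earlier
    k = max(1, int(round(fraction * corrected.size)))
    keep = np.argpartition(np.abs(corrected), -k)[-k:]
    sparse = np.zeros_like(corrected)
    sparse[keep] = corrected[keep]
    return sparse, corrected - sparse                 # (transmitted part, new residual)

rng = np.random.default_rng(0)
grad = rng.standard_normal(1000)
sparse, residual = topk_sparsify(grad, np.zeros_like(grad))
```

Only 10 of the 1000 entries are transmitted in this round; the remaining mass stays in `residual` and is added to the next round's gradient, so no information is discarded permanently.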
Client Drift in Non-IID Data: 6G UE data is highly non-IID (each UE sees its own local channel statistics). With large E and non-IID data, FedAvg suffers client drift: local models diverge so far from the global optimum that aggregation degrades performance. FedProx adds a proximal term (μ/2) · ||θ − θ^(t)||² to each local objective, penalising excessive deviation from the global model.
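
The effect of the proximal term can be seen on a toy problem. The sketch below assumes a quadratic local objective with a known minimiser (`target`), plain gradient descent, and illustrative values for the learning rate and μ; all names are hypothetical.

```python
import numpy as np

def fedprox_step(theta, theta_global, grad_fn, lr=0.1, mu=0.01):
    """One local step on f_k(theta) + (mu/2) * ||theta - theta_global||^2."""
    grad = grad_fn(theta) + mu * (theta - theta_global)
    return theta - lr * grad

# Toy local objective: f_k(theta) = 0.5 * ||theta - target||^2, so grad = theta - target.
target = np.array([4.0, -2.0])
theta_global = np.zeros(2)

def local_solution(mu, steps=500):
    """Run local training from the global model and return the converged weights."""
    theta = theta_global.copy()
    for _ in range(steps):
        theta = fedprox_step(theta, theta_global, lambda th: th - target, mu=mu)
    return theta

no_prox = local_solution(mu=0.0)    # plain local training: converges to target
with_prox = local_solution(mu=1.0)  # pulled back towards theta_global
```

For this objective the local optimum is target / (1 + μ), so μ = 1 halves the distance the client is allowed to move away from the global model.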

§12.3 — 6G Federated Channel Estimation

Channel estimation is an ideal FL application: each UE accumulates its own channel measurement history (pilot observations vs. true channel), which is deeply personal (encodes the UE's physical location and environment) yet highly informative for a local model.

Architecture

  1. Global phase: FL rounds train a shared base model capturing universal channel statistics (power delay profile shape, angular spread statistics). This uses pilot-to-channel pairs from all participating UEs.
  2. Personalisation phase: After FL convergence, each UE fine-tunes its local copy on its own data for 10–20 additional epochs. The fine-tuned model captures that UE's specific multipath environment (e.g., reflections from a specific building along its daily commute).

Key Results (Literature)

Method | NMSE (dB) | Raw Data Shared | Rounds to Converge
Centralised training (oracle) | −12.3 | All (100%) | N/A
Local-only (no federation) | −8.7 | None | N/A (local)
FedAvg (E=1) | −11.4 | None | 120
FedAvg + personalisation | −11.8 | None | 120 + 15 local
FedProx + personalisation | −11.9 | None | 100 + 15 local
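With personalisation, federated channel estimation comes within 0.4–0.5 dB NMSE of fully centralised training (−11.8 to −11.9 dB vs. −12.3 dB) while sharing zero raw measurements. Plain FedAvg trails the centralised oracle by 0.9 dB; the personalisation step recovers roughly half of that gap by adapting to individual UE channel conditions.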

§12.4 — Split Inference

When a UE lacks the compute budget to run a full inference model locally, the neural network is partitioned across the UE and an edge server. This is termed split inference (also: collaborative inference, device-edge co-inference). It is described in 3GPP TR 22.874 §5.3 as a dedicated use case.

Split Point Optimisation

Let layer k be the split point. The total latency is:

Eq. 12.2 — Split Inference Total Latency
$$T_{\text{total}} = \underbrace{T_{\text{compute}}(1:k)}_{\text{UE-side layers}} + \underbrace{T_{\text{comm}}\!\left(\text{feature size at layer }k\right)}_{\text{feature upload}} + \underbrace{T_{\text{compute}}(k+1:L)}_{\text{edge-side layers}}$$

The optimal k minimises Ttotal subject to the 6G 1 ms E2E latency target. Practical constraints:

6G mmWave Scenario Analysis

Scenario | Uplink Rate | Feature Budget (0.2 ms) | Suitable Split
6G mmWave (100 GHz) | 10 Gbps | 250 KB | Mid-network (layer 6–8 of 12)
6G sub-7 GHz | 1 Gbps | 25 KB | Early split (layer 2–3)
5G NR (FR2) | 500 Mbps | 12.5 KB | Very early split (layer 1–2)
IoT / NTN link | 10 Mbps | 0.25 KB | Full local inference or cloud only
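
The feature budgets in the table follow from rate × time, and the split-point search of Eq. 12.2 is a simple minimisation over k. The sketch below shows both; the per-layer compute times and feature sizes are invented toy values, and the function names are hypothetical.

```python
def feature_budget_bytes(uplink_bps, comm_budget_s):
    """Largest feature tensor that fits in the uplink time budget."""
    return uplink_bps * comm_budget_s / 8

def best_split(ue_layer_s, edge_layer_s, feature_bytes, uplink_bps):
    """Return (k, T_total) minimising Eq. 12.2 over split points k = 1..L-1."""
    L = len(ue_layer_s)
    best = None
    for k in range(1, L):
        t_total = (sum(ue_layer_s[:k])                       # UE runs layers 1..k
                   + feature_bytes[k - 1] * 8 / uplink_bps   # upload output of layer k
                   + sum(edge_layer_s[k:]))                  # edge runs layers k+1..L
        if best is None or t_total < best[1]:
            best = (k, t_total)
    return best

budget = feature_budget_bytes(10e9, 0.2e-3)   # 10 Gbps for 0.2 ms -> 250 KB

# Toy 4-layer network: the UE is 10x slower per layer, features shrink with depth.
ue_layer_s = [0.2e-3] * 4
edge_layer_s = [0.02e-3] * 4
feature_bytes = [500e3, 100e3, 20e3, 5e3]     # output size of layers 1..4
k_opt, t_opt = best_split(ue_layer_s, edge_layer_s, feature_bytes, 1e9)
```

With these toy numbers a late split (k = 3) wins because the early feature maps are too large to upload quickly at 1 Gbps, which mirrors the trend in the table: lower uplink rates shrink the feature budget and constrain where the network can be cut.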
TR 22.874 §5.3 — Split AI Inference Standard

3GPP standardises what is communicated at the split point: the feature tensor format (shape, dtype, compression codec), the inference session ID (to correlate partial results across the split), and the QoS profile (latency class, reliability class). This enables multi-vendor split inference: a UE splitting inference to an edge MEC node, coordinated over a standard 6G Uu interface.

§12.5 — Knowledge Distillation for Model Compression

Pre-training large teacher models in the cloud and distilling them into compact student models for UE deployment is a key enabler of AI at the UE. Knowledge distillation (Hinton et al., 2015) trains the student to mimic the teacher's soft output distribution, not just the hard class labels.

KD Loss Function

Eq. 12.3 — Knowledge Distillation Loss
$$\mathcal{L}_{KD} = (1 - \alpha)\,\mathcal{L}_{CE}\!\left(y,\, \sigma(z_s)\right) + \alpha\, T^2\,\mathcal{L}_{CE}\!\left(\sigma\!\left(\tfrac{z_t}{T}\right),\, \sigma\!\left(\tfrac{z_s}{T}\right)\right)$$

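where y is the ground-truth label, z_s and z_t are the student and teacher logits, σ(·) is the softmax, T is the distillation temperature, and α ∈ [0, 1] weights the soft-label (teacher) term against the hard-label cross-entropy.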

Why Temperature T Matters

At T = 1 (standard softmax), a confident teacher assigns probability ≈ 0.99 to the correct class and < 0.01 to all others — nearly indistinguishable from a hard one-hot label. At T = 4, the distribution is much softer (e.g., 0.6 / 0.2 / 0.1 / …), which encodes rich similarity structure. The student learns not just "class A is correct" but "class A is most likely, class B is somewhat similar, class C is dissimilar." This structured information transfer enables strong generalisation even from a tiny student.
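
The softening effect of T and the loss in Eq. 12.3 can be checked numerically. The sketch below uses made-up teacher logits and the T = 4, α = 0.7 setting from this section; the helper names are hypothetical.

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()                 # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(p_target, p_pred, eps=1e-12):
    return float(-np.sum(p_target * np.log(p_pred + eps)))

def kd_loss(y_onehot, z_student, z_teacher, T=4.0, alpha=0.7):
    """Eq. 12.3: (1 - alpha) * hard-label CE + alpha * T^2 * soft-label CE."""
    hard = cross_entropy(y_onehot, softmax(z_student))
    soft = cross_entropy(softmax(z_teacher, T), softmax(z_student, T))
    return (1 - alpha) * hard + alpha * T**2 * soft

z_teacher = np.array([8.0, 3.0, 1.0, 0.0])  # hypothetical teacher logits
y = np.array([1.0, 0.0, 0.0, 0.0])          # true class is class 0
p_sharp = softmax(z_teacher, T=1.0)         # close to one-hot
p_soft = softmax(z_teacher, T=4.0)          # visibly softer
loss_good = kd_loss(y, z_teacher, z_teacher)        # student agrees with teacher
loss_bad = kd_loss(y, z_teacher[::-1], z_teacher)   # student ranks classes backwards
```

For these logits the top-class probability falls from above 0.99 at T = 1 to about 0.63 at T = 4, exposing the relative ordering of the other classes to the student.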

Model | Parameters | Size | Inference Latency (UE) | NMSE / Accuracy
Teacher (cloud) | 10 M | 40 MB | Not deployed on UE | Baseline (100%)
Student (no KD) | 1 M | 4 MB | 2.1 ms | −4.2% accuracy
Student (KD T=4, α=0.7) | 1 M | 4 MB | 2.1 ms | −1.8% accuracy
Student (KD + quantisation 8-bit) | 1 M | 1 MB | 0.9 ms | −2.1% accuracy
Combining knowledge distillation with 8-bit post-training quantisation achieves a 40× reduction in model size (40 MB → 1 MB) at a cost of about 2.1% accuracy relative to the teacher, well within the 6G performance budget for on-device AI.

FL + KD Synergy

A powerful 6G architecture combines both techniques in a two-stage pipeline:

  1. Stage 1 — Federated Teacher Training: A large teacher model is trained via FL across a fleet of high-capability edge nodes (MEC servers, O-DU co-processors). No raw UE data leaves the edge nodes.
  2. Stage 2 — KD to UE Student: The federated teacher is used to distil a UE-deployable student model. The student is distributed to UEs over O1/Uu as a standard model package (per TR 22.874 model metadata format).

This pipeline provides: (a) privacy preservation via FL, (b) UE deployability via KD compression, (c) personalisation via post-deployment fine-tuning — the complete 6G on-device AI stack.

FL Convergence Comparison

Study note: For exam questions on FL in 6G, the four key points are: (1) FedAvg formula — weighted average over nk/N, (2) communication overhead and compression options, (3) non-IID client drift and FedProx remedy, (4) personalisation via local fine-tuning as the last step. Split inference adds: optimal split point = minimise Tcompute + Tcomm + Tcompute.
[18] H. B. McMahan, E. Moore, D. Ramage, S. Hampson, B. A. y Arcas, “Communication-Efficient Learning of Deep Networks from Decentralized Data,” AISTATS, 2017.
[19] 3GPP TR 22.874 v18.0.0, §5.3 — Split AI inference use case, 3GPP, 2022.

Federated learning and split inference define how AI models are trained and executed across the network. The quality of these models depends critically on the accuracy of the channel models used for training and evaluation. → §13 surveys AI-based channel modeling — from generative models and neural ray tracing to digital twins.

§13  AI for 6G Channel Modeling

§13.1 — Why AI for Channel Modeling?

Classical 3GPP channel models defined in TR 38.901 (CDL / TDL families) are parameterized stochastic models: a fixed cluster-ray structure with empirically fitted path-loss, shadow-fading, and angular-spread parameters. They have served 5G NR design well, but they carry structural assumptions that break down for the scenarios envisioned in 6G.

Limitations of classical TR 38.901 models
  • Fixed cluster / ray count — cannot capture site-specific geometry (building material, street canyons, vegetation).
  • Parameterized up to 100 GHz — no validated model for sub-THz (100–300 GHz) where molecular absorption dominates.
  • Stationary clusters — inadequate for RIS-assisted links where the scattering environment is actively re-configured per slot.
  • Far-field plane-wave assumption — fails for Large Intelligent Surfaces (LIS) and Holographic MIMO operating in the near-field Fresnel region.
  • No coupling with ISAC — radar sensing path and communication path share the same environment but are modelled independently.
  • Static obstacle model — cannot represent human body blockage dynamics or moving vehicles in XR / V2X scenarios.

6G New Channel Modeling Requirements

Frequency extension
  • D-band: 110–170 GHz
  • G-band: 140–220 GHz; 275–300 GHz under study
  • Molecular absorption peaks at 60, 119, 183, 325 GHz
  • Near-field threshold < 10 m for 10 cm apertures
New propagation phenomena
  • RIS phase-reconfigurable scattering
  • Spatially non-stationary channels (Holographic MIMO)
  • High-Doppler: V > 500 km/h (HST), fD > 5 kHz
  • Joint radar-communication bistatic paths (ISAC)
  • Orbital angular momentum (OAM) mode coupling
Key insight: AI-based channel models learn the underlying physics implicitly from measurement data or high-fidelity ray tracing, bypassing the need for closed-form statistical parameterization. This makes them naturally extensible to new frequencies and environments.

§13.2 — Generative AI Channel Models

Two generative architectures dominate current AI channel modeling research: Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). Both learn the statistical distribution of measured channels and can synthesize unlimited additional samples matching that distribution.

GAN-based Channel Generation

A GAN for channel impulse response synthesis consists of two competing networks trained adversarially:

Eq. 13.1 — GAN Channel Generator Objective
$$\min_G \max_D \; V(G,D) = \mathbb{E}_{\mathbf{H} \sim p_{\mathrm{data}}}[\log D(\mathbf{H})] \;+\; \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]$$
Practical GAN design choices
  • Conditional GAN (cGAN): condition on SNR, delay spread, Doppler class to generate scenario-specific channels.
  • Wasserstein GAN (WGAN-GP): a gradient penalty replaces the weight clipping of the original WGAN critic; more stable training.
  • Progressive growing: generate low-resolution channel (few subcarriers) first, then upscale; this stabilises training and reduces the risk of mode collapse.
Empirical results
  • GAN-generated channels for CDL-A/B/C match power-delay profile within 0.5 dB across all delays.
  • Channel estimators trained on GAN data generalize 15% better than those trained on CDL-only data when deployed on real measured channels.
  • Latency: trained GAN generates 10^4 channel realizations in < 1 s (vs hours of ray tracing).

VAE-based Channel Compression and Representation

A Variational Autoencoder learns a low-dimensional latent representation of the channel, useful both for data augmentation and for compressed CSI feedback (replacing traditional codebooks).

Eq. 13.2 — VAE Channel Model ELBO
$$\mathcal{L}_{\mathrm{VAE}} = \underbrace{\mathbb{E}_{q_\phi(z|\mathbf{H})}[\log p_\theta(\mathbf{H}|z)]}_{\text{reconstruction}} \;-\; \underbrace{D_{\mathrm{KL}}\!\left(q_\phi(z|\mathbf{H}) \;\|\; \mathcal{N}(0,I)\right)}_{\text{regularization}}$$
Compression ratio: A typical 4×4 MIMO channel over 132 PRBs has 4 × 4 × 132 × 2 = 4,224 real-valued scalars. A 32-dimensional latent code achieves 132× compression while maintaining < 2 dB NMSE degradation at SNR = 10 dB.
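
The regularization term of Eq. 13.2 has a closed form when the encoder outputs a diagonal Gaussian, and the compression ratio above is simple arithmetic. The sketch below shows both. It assumes a squared-error reconstruction term (a Gaussian likelihood with fixed variance, up to constants), and the function names are hypothetical.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """Closed-form D_KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    mu, log_var = np.asarray(mu, float), np.asarray(log_var, float)
    return float(0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var))

def negative_elbo(H, H_rec, mu, log_var):
    """Squared-error reconstruction term plus the KL regulariser (to be minimised)."""
    recon = float(np.sum(np.abs(H - H_rec) ** 2))
    return recon + kl_to_standard_normal(mu, log_var)

n_real = 4 * 4 * 132 * 2          # real-valued scalars in the channel tensor
latent_dim = 32
compression_ratio = n_real / latent_dim
```

The KL term is zero only when the encoder outputs exactly the prior (zero mean, unit variance), so any information carried in the latent code is paid for in the regulariser.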

Joint VAE + CsiNet design: VAE encoder runs at UE (compressed feedback), VAE decoder runs at gNB (channel reconstruction) — a principled replacement for Type II CSI codebooks.
Analogy: Think of the VAE latent space as a "channel passport" — a compact identity card that uniquely describes the channel's propagation geometry. The passport is small enough to transmit on a control channel, yet contains enough information for the gNB to reconstruct the full channel for beamforming.

§13.3 — Neural Ray Tracing

Classical deterministic ray tracing (RT) approximates the solution of Maxwell's equations in the high-frequency (geometrical-optics) limit: launch rays from the transmitter, trace reflections / diffractions / transmissions through a 3D CAD environment model, and collect them at the receiver. Accuracy is excellent (< 1 dB path loss error in validated scenarios), but computational cost is extreme: hours per scenario per frequency.

Classical RT bottlenecks
  • Ray-object intersection: O(Nobjects) per ray bounce.
  • Diffraction: UTD coefficients at wedge edges — expensive for dense urban geometry.
  • Sub-THz: many more dominant rays (specular reflection dominates); need 10–20 bounces for accuracy.
  • Moving scenes: must re-run per snapshot (not real-time).
Neural RT (NeRF-inspired)
  • Represent propagation environment as implicit neural field: fθ(x,y,z,d̂) → {attenuation, phase delay}.
  • Query: parameterize a ray as a sequence of 3D points along path; integrate neural field outputs → channel coefficient.
  • Training: reconstruct scene from sparse channel measurements (few hundred snapshots suffice).
  • 10,000× faster than classical RT post-training.
Neural RT architecture
Input: (x, y, z, d̂_in, d̂_out): 3D point + incoming direction + outgoing direction (7D when each unit direction is parameterised by two angles).
Network: 8-layer MLP, 256 units/layer, ReLU activations, skip connection at layer 4 (same as original NeRF).
Output: {σ_extinction ∈ ℝ, φ_delay ∈ ℝ, polarization matrix T ∈ ℂ^(2×2)}.
Channel coefficient: integrate along ray path using numerical quadrature.

Performance vs classical RT: path loss within 2 dB, delay spread within 10 ns, angle spread within 3°. Inference: 1 ms/link.
Why NeRF works for RF: NeRF was designed for visible-light rendering (roughly 400–790 THz), a far-field, ray-optical regime. Sub-6 GHz channels involve diffraction and creeping waves that violate ray-optics. At sub-THz (D-band), however, channels ARE nearly ray-optical (wavelength ≈ 2 mm, objects >> λ), making neural RT highly accurate.

§13.4 — Digital Twin Channel Model

A Digital Twin (DT) of the radio environment combines a real-time 3D geometric replica of the physical environment (buildings, furniture, vehicles) with a neural RT engine to generate site-specific, time-varying channel predictions.

DT Architecture

  1. Sensing layer: LiDAR, RGB-D cameras, satellite imagery, plus GPS/IMU for moving objects (vehicles, pedestrians) — continuously updates 3D scene model.
  2. Neural scene representation: environment encoded as a neural implicit field (neural RT model, Sec. 13.3) — updated incrementally as scene changes.
  3. Channel prediction engine: given UE trajectory prediction, generate channel H(t+δ) before the UE arrives.
  4. AI training sandbox: AI/ML models for scheduling, beamforming, and estimation are trained in DT and then hot-deployed to live network.
Eq. 13.3 — Digital Twin Channel Prediction
$$\hat{\mathbf{H}}(t+\delta) = f_{\mathrm{DT}}\!\left(\mathrm{environment}(t+\delta),\; \mathrm{UE\_pos}(t+\delta)\right)$$

where environment(t+δ) is predicted from sensor fusion + object tracking, and UE\_pos(t+δ) is the output of an AI trajectory predictor (LSTM or Kalman filter).

DT use cases in 6G
  • Proactive handover: predict link quality 500 ms ahead → zero-interruption HO.
  • Beam pre-computation: compute optimal beam before UE moves into new sector.
  • RIS configuration: pre-compute RIS phase profile for next time slot (RIS has no feedback channel).
  • AI model transfer: train on DT, fine-tune on live network with minimal over-the-air overhead.
3GPP DTN standardization status
  • SA2 RP-220162: Study on Digital Twin Network (DTN) — architecture and information model.
  • TS 28.535: Management of network slices and digital twins.
  • RAN discussions: DT channel for AI training data collection under TR 38.843 Rel-18 scope.
  • 6G phase: DT elevated from study to normative in projected Rel-21 (2028+).
3GPP SA2 RP-220162 — Digital Twin Network Study (2022)

§13.5 — Sub-THz Channel Characteristics

Sub-THz bands (100–300 GHz) targeted for 6G backhaul, indoor XR, and sensing exhibit propagation physics qualitatively different from 5G mmWave. AI channel models must be calibrated to these effects.

Path Loss at Sub-THz

Eq. 13.4 — THz Path-Loss Model
$$\mathrm{PL}(d,f) = \underbrace{20\log_{10}\!\left(\frac{4\pi d f}{c}\right)}_{\text{free-space (FSPL)}} \;+\; \underbrace{\alpha_{\mathrm{mol}}(f) \cdot d}_{\text{molecular absorption}}$$

where αmol(f) is the molecular absorption coefficient (dB/km, so d in the absorption term is expressed in km), with sharp resonance peaks at approximately 60, 119, 183, and 325 GHz due to O₂ and H₂O rotational transitions. At 140 GHz (a relatively clear window), αmol ≈ 3–5 dB/km in typical atmosphere.
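A short numerical sketch of Eq. 13.4 makes the relative size of the two terms concrete. The function name and the 4 dB/km value (the mid-point of the range quoted above) are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def thz_path_loss_db(d_m, f_hz, alpha_mol_db_per_km):
    """Eq. 13.4: free-space path loss plus molecular absorption, in dB."""
    fspl = 20.0 * math.log10(4.0 * math.pi * d_m * f_hz / C)
    absorption = alpha_mol_db_per_km * (d_m / 1000.0)  # distance converted to km
    return fspl + absorption

# 10 m indoor link at 140 GHz with an assumed alpha_mol of 4 dB/km.
pl_indoor = thz_path_loss_db(10.0, 140e9, 4.0)
print(round(pl_indoor, 2), "dB")
```

Over indoor distances the absorption term contributes only a small fraction of a dB in this window; free-space loss dominates, and absorption becomes significant over longer backhaul distances or near the resonance peaks.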

Key Sub-THz Channel Parameters

| Parameter | 5G mmWave (28 GHz) | 6G D-band (140 GHz) | Notes |
|---|---|---|---|
| Coherence bandwidth Bc | ~200 MHz | ~1 GHz | Bc ∝ 1/στ; fewer clusters at THz → larger Bc |
| Near-field threshold dNF | 2D²/λ = 0.5 m (D = 5 cm) | 2D²/λ = 10 m (D = 10 cm) | Most indoor links are near-field at D-band |
| Reflection loss (concrete) | ~5 dB/bounce | ~15 dB/bounce | Quasi-optical: 1–2 bounces max |
| Human body blockage | ~20 dB | ~35 dB | Near-complete link outage without diversity |
| Max practical range (indoor) | ~50 m | ~10–20 m | Absorption + high reflection loss limit range |
| RMS delay spread στ | 50–100 ns | 10–20 ns | Fewer multipath components; sparser channel |
Near-field regime. For a 10 cm antenna aperture at 140 GHz (λ = 2.14 mm): dNF = 2D²/λ = 2 × (0.1)² / 0.00214 ≈ 9.3 m. Any indoor link shorter than ~10 m operates in the radiating near-field (Fresnel) region, where plane-wave assumptions break down and AI models must account for spherical wavefront curvature.
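The near-field threshold arithmetic can be checked with a few lines of Python. This is a sketch of the 2D²/λ rule only; the function name is an illustrative assumption.

```python
C = 299_792_458.0  # speed of light, m/s

def fraunhofer_distance_m(aperture_m, f_hz):
    """Far-field (Fraunhofer) boundary d_NF = 2 * D^2 / lambda."""
    wavelength = C / f_hz
    return 2.0 * aperture_m**2 / wavelength

d_nf_dband = fraunhofer_distance_m(0.10, 140e9)   # 10 cm aperture, D-band
d_nf_mmwave = fraunhofer_distance_m(0.05, 28e9)   # 5 cm aperture, 28 GHz
print(f"D-band: {d_nf_dband:.2f} m, mmWave: {d_nf_mmwave:.2f} m")
```

Because the boundary scales with D² and with frequency, doubling the aperture at a fixed carrier quadruples the range over which spherical-wavefront models are needed.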
Analogy: sub-THz propagation. Sub-THz propagation is like trying to shout through a rain curtain: near the absorption peaks, even a modest amount of water vapour absorbs significant energy. A fixed parametric model (such as the ITU-R P.676 standard atmosphere) captures average absorption but cannot represent the spatial and temporal variability of real environments, such as moving people, open windows, and air-conditioning drafts. AI-based channel models instead learn these environment-specific absorption patterns directly from measurement data, which can make them more accurate in the site-specific conditions that determine actual link budgets.

AI Modeling Strategies for Sub-THz

Sparsity exploitation
  • Sub-THz channels are inherently sparse in the delay-angle domain (few scatterers survive high reflection losses).
  • Compressed Sensing + DNN: use ℓ1-regularized sparse recovery as a physics-informed layer within the neural estimator.
  • Result: 30% better NMSE than dense-network estimators by exploiting sparsity structure.
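To illustrate the ℓ1-regularized recovery step mentioned above, the following sketch runs plain ISTA (iterative soft thresholding) on a toy real-valued sparse channel. It shows only the classical iteration that a physics-informed layer could unroll, not a trained estimator; the problem sizes, tap values, and regularization weight are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, y, lam=0.01, n_iter=500):
    """Minimise 0.5 * ||A x - y||^2 + lam * ||x||_1 by iterative soft thresholding."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - step * grad, step * lam)
    return x

rng = np.random.default_rng(0)
n_taps, n_pilots = 100, 40
h_true = np.zeros(n_taps)
h_true[[5, 37, 80]] = [1.0, -1.5, 2.0]          # three surviving paths (toy values)
A = rng.standard_normal((n_pilots, n_taps)) / np.sqrt(n_pilots)
y = A @ h_true                                   # noiseless pilot observations
h_hat = ista(A, y)
support = np.sort(np.argsort(np.abs(h_hat))[-3:])
print("recovered tap indices:", support)
```

In a learned variant, the step size and threshold of each iteration would typically become trainable parameters, which is one way to embed the sparsity prior inside the neural estimator.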
Near-field beam modeling
  • Far-field: array response vector a(θ) = phase shifts only.
  • Near-field: a(r,θ) = phase + amplitude taper (spherical wavefront).
  • Polar domain transform: replace DFT with polar Fourier transform — creates sparse near-field representation.
  • AI channel estimator trained in polar domain: 5 dB gain over FFT-domain estimator at d < dNF.
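The difference between the far-field and near-field array responses can be sketched for a uniform linear array. The array geometry (64 elements at half-wavelength spacing), the angle convention (θ measured from broadside), and the function names are assumptions made for illustration.

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s

def element_positions(n_elem, spacing_m):
    """Element coordinates along the array axis, centred on the array midpoint."""
    return (np.arange(n_elem) - (n_elem - 1) / 2) * spacing_m

def far_field_response(n_elem, spacing_m, wavelength_m, theta_rad):
    """Plane-wave steering vector a(theta): phase shifts only."""
    x = element_positions(n_elem, spacing_m)
    return np.exp(1j * 2 * np.pi / wavelength_m * x * np.sin(theta_rad))

def near_field_response(n_elem, spacing_m, wavelength_m, r_m, theta_rad):
    """Spherical-wave response a(r, theta): exact distances give phase and amplitude."""
    x = element_positions(n_elem, spacing_m)
    # Distance from a source at range r and angle theta to each element.
    r_n = np.sqrt(r_m**2 + x**2 - 2 * r_m * x * np.sin(theta_rad))
    amplitude = r_m / r_n                      # spreading loss relative to array centre
    phase = np.exp(-1j * 2 * np.pi / wavelength_m * (r_n - r_m))
    return amplitude * phase

wavelength = C / 140e9
n_elem, spacing, theta = 64, wavelength / 2, 0.3
a_far = far_field_response(n_elem, spacing, wavelength, theta)
a_near_1m = near_field_response(n_elem, spacing, wavelength, 1.0, theta)
a_near_10km = near_field_response(n_elem, spacing, wavelength, 1e4, theta)
print(np.max(np.abs(a_near_1m - a_far)), np.max(np.abs(a_near_10km - a_far)))
```

At 1 m the two vectors differ substantially, while at very large range the spherical-wave model collapses to the plane-wave steering vector, which is the behaviour the polar-domain representation is meant to capture.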
[20] 3GPP TR 38.901 v17.0.0 — Study on channel model for frequencies from 0.5 to 100 GHz.
[21] ITU-R M.2412-0 — Guidelines for evaluation of radio interface technologies for IMT-2020.
Research direction — AI-based massive MIMO antenna calibration: In real O-RAN deployments (split 7-2x), the O-RU antenna array requires RF calibration to ensure that each transmit/receive path has a consistent phase and amplitude response. Imperfect calibration — arising from component tolerances, temperature drift, and connector aging — introduces unknown per-antenna phase offsets that degrade beamforming accuracy, effectively rotating the beam away from the intended direction. Classical calibration procedures (mutual coupling or over-the-air reference signals) are periodic and coarse. AI-based calibration is an emerging research topic: a lightweight NN trained on received pilot signal statistics can continuously estimate and compensate per-antenna calibration errors in the background, without requiring a dedicated calibration downtime. This is particularly relevant for 6G massive MIMO panels at 28/39 GHz where thermal effects are more pronounced and array sizes (256T256R) make per-element calibration hardware-intensive.
TR 38.901 v17  ·  GAN / VAE  ·  Neural Ray Tracing  ·  Digital Twin  ·  Sub-THz D-band  ·  Near-field Modeling  ·  AI Antenna Calibration

Accurate channel models are the foundation for rigorous AI/ML evaluation. → §14 covers the KPI framework and evaluation methodology that 3GPP TR 38.843 defines for measuring AI performance gains against these channel models.

§14  Performance Evaluation KPIs

3GPP Rel-18 TR 38.843 defines the formal evaluation methodology for AI/ML use cases in NR. It establishes quantitative KPIs, reference scenarios, baseline algorithms, and calibration procedures — the evidentiary standard that AI proposals must meet for normative adoption. KPI evaluation is inherently tied to the channel model used; → §13 provides the AI-based channel modeling context that underpins the scenario definitions here.

§14.1 — AI/ML KPI Framework (3GPP TR 38.843 §7)

Channel Estimation KPIs

NMSE (Normalized Mean Square Error) — primary accuracy metric:

Eq. 14.1 — NMSE KPI Definition
$$\mathrm{NMSE} = \frac{\mathbb{E}\!\left[\|\mathbf{H} - \hat{\mathbf{H}}\|_F^2\right]}{\mathbb{E}\!\left[\|\mathbf{H}\|_F^2\right]}$$

Expressed in dB. Lower is better. Genie-aided MMSE (oracle channel statistics) sets the theoretical lower bound; AI model target is within 1 dB of this bound.
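A minimal NumPy sketch of Eq. 14.1 follows, assuming the first array axis indexes channel realizations and the remaining axes form the channel matrix. The function name `nmse_db` is illustrative.

```python
import numpy as np

def nmse_db(h_true, h_est):
    """Eq. 14.1 in dB; axis 0 indexes channel realizations."""
    h_true = np.asarray(h_true)
    h_est = np.asarray(h_est)
    axes = tuple(range(1, h_true.ndim))
    err = np.sum(np.abs(h_true - h_est) ** 2, axis=axes)  # squared Frobenius error per sample
    pwr = np.sum(np.abs(h_true) ** 2, axis=axes)          # squared Frobenius norm per sample
    return 10.0 * np.log10(np.mean(err) / np.mean(pwr))

rng = np.random.default_rng(0)
H = rng.standard_normal((1000, 2, 32)) + 1j * rng.standard_normal((1000, 2, 32))
nmse_scaled = nmse_db(H, 0.9 * H)  # a 10% amplitude error on every coefficient
print(round(nmse_scaled, 2), "dB")
```

Note that the expectation is taken separately over numerator and denominator, as in Eq. 14.1; averaging per-sample ratios instead gives a different number when channel power varies across realizations.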

Supporting KPIs
  • BER / BLER: end-to-end link performance (MCS-specific) with AI estimator active — captures the impact on real throughput.
  • Inference latency: time from last pilot symbol to Ĥ available at receiver (ms); must fit within L1 processing timeline (< 1 ms).
  • Model size: KB of weights stored at UE — drives flash/SRAM cost.
  • Pilot overhead: AI estimators may require additional pilots (must be accounted for in spectral efficiency).

CSI Feedback KPIs

Beam Management KPIs

Positioning KPIs

§14.2 — Evaluation Scenarios and Baselines

TR 38.843 specifies a two-tier evaluation structure: (1) a calibration tier using CDL/TDL channels to verify baseline compliance, and (2) a performance tier using scenario-specific channels to demonstrate net gain over existing 5G mechanisms.

| Use Case | Primary Metric | 5G Baseline | 6G-AI Target | Spec Reference |
|---|---|---|---|---|
| Channel Estimation | NMSE (dB) | MMSE: −15 dB @ SNR = 5 dB | −18 dB @ SNR = 5 dB | TR 38.843 §7.1 |
| CSI Feedback | Feedback bits/SB | Type II: 16 bit/subband | 8 bit/subband (AI) | TR 38.843 §7.2 |
| Beam Management | Top-1 accuracy @ 120 km/h | P2 sweep: 65% | 85% | TR 38.843 §7.3 |
| Positioning | HPE 90%-ile (m) | DL-TDOA: 1.2 m | 0.3 m | TR 38.843 §7.4 |
| Energy Efficiency | Network power saving | SON-based: 20% | ML-SON: 40% | TS 28.310 |
| Scheduling | 5th-percentile UE rate | Proportional Fair: baseline | DRL scheduler: +15% | TR 37.817 |
Evaluation trap — cherry-picking SNR: Some AI channel estimator papers only report NMSE at high SNR (e.g., 20 dB) where pilot noise is negligible and classical MMSE also performs well. 3GPP mandates full SNR range evaluation (typically −10 to +30 dB), with particular emphasis on low SNR (0–10 dB) where the gains of learned estimators over LS are most contested.

Reference System Configuration

TR 38.843 Reference System (μ = 1, FR1)
Subcarrier spacing: 30 kHz  |  Bandwidth: 100 MHz  |  PRBs: 273  |  gNB: 32 TRX (8H × 4V × 2pol)  |  UE: 2 RX antennas
Carrier: 3.5 GHz  |  PDSCH: DMRS Type 1, 1 symbol  |  MCS: QPSK–256QAM adaptive
Channel: CDL-A (NLOS), CDL-C (NLOS), CDL-D (LOS)  |  Evaluation: 10⁴ channel realizations minimum per Monte Carlo point
Note: 6G evaluations will extend to FR3 (7–24 GHz) and sub-THz bands.

§14.3 — Evaluation Datasets

3GPP endorsed several open datasets for AI model training, validation, and benchmarking — moving toward reproducible, community-standard evaluation (analogous to ImageNet for computer vision).

DeepMIMO
  • Ray-tracing based using Wireless InSite (commercial RT tool).
  • Scenarios: O1 (outdoor-to-indoor), I3 (indoor office), O1_28B (28 GHz outdoor).
  • Configurable: frequency, antenna size, UE grid resolution.
  • Open source; widely used in 3GPP evaluations.
  • deepmimo.net

QuaDRiGa
  • QUAsi Deterministic RadIo channel GenerAtor: a geometry-based stochastic model (GBSM) with spatially consistent mobility.
  • Reference implementation used in several 3GPP TR 36.873 and TR 38.901 simulations.
  • Handles base station cooperation (multi-cell), elevation, and 3D mobility trajectories.
COST 2100
  • Multi-link MIMO channel dataset from European COST 2100 action.
  • Measured scenarios: indoor hall, semi-urban outdoor.
  • Publicly available; captures multi-user interference structure.
  • Used for massive MIMO precoder learning benchmarks.

Raymobtime
  • Outdoor V2X ray tracing with realistic vehicular mobility (SUMO traffic simulator + Wireless InSite).
  • Temporal sequences of channels along vehicle trajectories — critical for high-Doppler AI model evaluation.
  • Available in multiple frequency bands (5.9 GHz, 28 GHz, 60 GHz).
Minimum evaluation protocol — TR 38.843 §7 compliance. A valid performance claim under TR 38.843 §7 must report all five of the following dimensions; single-point claims lacking any element are not 3GPP-compliant:
  1. NMSE or BER vs. SNR curves — evaluated across the full operating SNR range (typically −10 to +30 dB), not a single operating point.
  2. Complexity in FLOPs and inference latency — reported for the target hardware tier (UE or gNB), demonstrating L1 timing budget compliance (< 1 ms).
  3. Overhead in bits — additional pilot or feedback overhead relative to the 5G NR baseline must be explicitly accounted for in the spectral efficiency comparison.
  4. Two deployment scenarios — at minimum one LOS (e.g., CDL-D) and one NLOS (e.g., CDL-A or CDL-C) scenario; results on a single channel type are insufficient.
  5. Comparison against classical baselines — both LS and MMSE (or equivalent for the use case) must be evaluated on identical channels and SNR points.

Calibration Procedure (CDL channels)

Before scenario-specific evaluation, AI models undergo a CDL calibration:

  1. Train model on CDL-A with v = 3 km/h (near-static, low Doppler).
  2. Evaluate on CDL-A/B/C/D/E at SNR ∈ {−10, 0, 5, 10, 20, 30} dB — verifies that model does not overfit to training channel type.
  3. Evaluate on CDL-A with v = 30, 120, 500 km/h — verifies Doppler robustness (or explicit retraining per velocity class).
  4. Compare NMSE vs MMSE (genie) and LS baselines on identical channels.
  5. Report: NMSE gap to MMSE at SNR = 5 dB (key operating point for coverage-limited UEs).
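The shape of such an SNR sweep can be sketched with a deliberately simple stand-in: a single Rayleigh tap with unit variance observed through one unit pilot, comparing LS against the LMMSE estimator with known statistics. This toy model is an assumption chosen so the baselines have closed forms; it does not replace the CDL channels named in the procedure.

```python
import numpy as np

def complex_gaussian(rng, n, var):
    """Circularly symmetric complex Gaussian samples with the given variance."""
    return (rng.standard_normal(n) + 1j * rng.standard_normal(n)) * np.sqrt(var / 2)

def nmse_db(h, h_est):
    return 10 * np.log10(np.mean(np.abs(h - h_est) ** 2) / np.mean(np.abs(h) ** 2))

def sweep_nmse(snr_db_list, n_samples=100_000, seed=0):
    """Monte Carlo NMSE (dB) of LS and LMMSE for y = h + n, h ~ CN(0, 1)."""
    rng = np.random.default_rng(seed)
    results = {}
    for snr_db in snr_db_list:
        noise_var = 10.0 ** (-snr_db / 10.0)
        h = complex_gaussian(rng, n_samples, 1.0)
        y = h + complex_gaussian(rng, n_samples, noise_var)
        h_ls = y                          # LS estimate with a unit pilot
        h_mmse = y / (1.0 + noise_var)    # LMMSE with known unit channel variance
        results[snr_db] = (nmse_db(h, h_ls), nmse_db(h, h_mmse))
    return results

sweep = sweep_nmse([-10, 0, 5, 10, 20, 30])
for snr_db, (ls, mmse) in sweep.items():
    print(f"SNR {snr_db:>4} dB  LS {ls:7.2f} dB  LMMSE {mmse:7.2f} dB")
```

In this toy case the LS and LMMSE curves converge at high SNR and separate at low SNR, which is why the low-SNR points carry most of the information in a calibration report.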
Doppler calibration point (CDL-A, v = 120 km/h): At 10 GHz carrier (6G FR3), v = 120 km/h gives fD,max = v·fc/c = (120/3.6) × 10¹⁰ / (3×10⁸) ≈ 1,111 Hz. With μ = 1 (30 kHz SCS), a 14-symbol slot spans 0.5 ms, during which the channel phase advances by fD·Tslot ≈ 0.56 of a Doppler cycle, a significant symbol-to-symbol channel variation within the slot. AI estimators with temporal interpolation (LSTM across OFDM symbols) are critical here.
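The Doppler arithmetic above can be reproduced for other speeds and carriers with a small helper. The function names are illustrative, and the exact value of c is used, so the result differs slightly from the rounded figure in the text.

```python
C = 299_792_458.0  # speed of light, m/s

def max_doppler_hz(speed_kmh, carrier_hz):
    """Maximum Doppler shift f_D = v * f_c / c."""
    return (speed_kmh / 3.6) * carrier_hz / C

def slot_duration_s(mu):
    """NR slot duration for numerology mu (1 ms at mu = 0, halving per step)."""
    return 1e-3 / (2 ** mu)

f_d = max_doppler_hz(120, 10e9)
cycles_per_slot = f_d * slot_duration_s(1)
print(f"f_D = {f_d:.0f} Hz, Doppler cycles per slot = {cycles_per_slot:.2f}")
```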

§14.4 — Overhead and Complexity KPIs

AI/ML inference in real-time L1 processing imposes concrete hardware constraints. 3GPP TR 38.843 §7.5 defines complexity KPIs that prevent academically compelling but practically infeasible models from entering the specification.

| Complexity KPI | Typical Range (AI models) | Practical Target | Comparator |
|---|---|---|---|
| FLOPs per inference | 10⁶ – 10⁸ | < 10⁷ for UE | LDPC decoding: ~10⁷ FLOPs |
| Inference latency | 0.1 – 5 ms | < 1 ms (L1 budget) | OFDM symbol duration: 35.7 μs (μ = 1) |
| Model size (weights) | 10 KB – 10 MB | < 1 MB UE, < 10 MB gNB | LTE turbo code LUT: ~100 KB |
| Memory bandwidth | 1–100 MB/inference | < 10 MB/inference | L2 cache hit rate critical for latency |
| Update frequency | per-slot / per-frame / offline | per-frame or slower for UE | Model update signaling via RRC |
| Online training cost | N/A (frozen) or incremental | No online training at UE | Gradient compute = 3× forward pass |

FLOPs Analysis — ChannelNet Family

The ChannelNet family of channel estimators (ReEsNet, CsiNet, InterpolateNet variants) spans a wide complexity-accuracy trade-off:

Low-complexity tier (< 10⁶ FLOPs)
  • LS + 1-layer CNN: LS estimate → 1 conv layer denoiser. NMSE: −11 dB @ SNR = 5 dB. Near LS performance, minimal gain.
  • CHEST-DNN: fully connected, 3 layers, 128 units. NMSE: −13 dB. Fast but misses spatial structure.
Mid-complexity tier (10⁶ – 10⁷ FLOPs)
  • ReEsNet (ResNet-based): 5 residual blocks, 3×3 conv. NMSE: −16 dB @ SNR = 5 dB. Near MMSE at 5.2×10⁶ FLOPs.
  • InterpolateNet: LS on pilots → learned 2D interpolation kernel. NMSE: −15.5 dB.
High-complexity tier (> 10⁷ FLOPs)
  • Transformer estimator: attention over OFDM subcarriers. NMSE: −17.5 dB. Computationally heavy; requires hardware accelerator.
  • OAMP-Net: deep unrolled OAMP (orthogonal approximate message passing). NMSE: −18 dB. Converges in 5 iterations; each iteration ~2×10⁶ FLOPs.
Pareto-efficient design principle
  • Diminishing returns above 10⁷ FLOPs: the transformer gains only about 1.5 dB over ReEsNet (−17.5 dB vs. −16 dB) at 5× cost.
  • Architecture-aware quantization: INT8 inference reduces compute and memory cost by ~4× with < 0.3 dB NMSE loss.
  • Pruning: 70% weight pruning with retraining recovers 95% of original NMSE, reducing model size with only a small accuracy loss.
  • Optimal operating point for UE: ~5×10⁶ FLOPs (ReEsNet-class) at < 500 μs latency on mobile SoC.

Model Lifecycle and Signaling Overhead

Beyond per-inference complexity, 3GPP must standardize the model lifecycle: how AI models are delivered to UE, updated, and version-managed:

  1. Model delivery: baseline model pre-loaded in UE firmware; delta updates via RRC (broadcast or dedicated) — target < 50 KB per update.
  2. Model activation: gNB signals which model ID to activate per cell / per UE via MAC-CE or RRC reconfiguration.
  3. Model monitoring: UE measures KPI metric (e.g., BLER) and reports to gNB when AI model performance degrades — triggers model switch or fallback to classical algorithm.
  4. Fallback: UE MUST always support classical baseline (LS, MMSE, P2 sweep) — AI model is an optional enhancement layer, not a replacement of baseline functionality.
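The monitoring and fallback steps can be sketched as a small state holder on the UE side. Everything here is hypothetical: the class name `ModelMonitor`, the BLER threshold, and the sliding-window length are illustrative assumptions, since the triggering conditions are not standardized.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class ModelMonitor:
    """Hypothetical UE-side monitor that falls back to the classical estimator on high BLER."""
    bler_threshold: float = 0.1   # assumed trigger level, not taken from any specification
    window: int = 100             # number of recent transport blocks in the sliding window
    active_model: str = "ai"
    _errors: deque = field(default_factory=deque)

    def report_block(self, block_error: bool) -> str:
        """Record one transport-block result and return the estimator to use next."""
        self._errors.append(1 if block_error else 0)
        if len(self._errors) > self.window:
            self._errors.popleft()
        window_full = len(self._errors) == self.window
        if window_full and sum(self._errors) / self.window > self.bler_threshold:
            self.active_model = "classical"   # step 4: revert to the LS/MMSE baseline
        return self.active_model

monitor = ModelMonitor(bler_threshold=0.2, window=10)
history = [monitor.report_block(False) for _ in range(10)]
history += [monitor.report_block(True) for _ in range(3)]
print(history[-4:])
```

The sketch makes fallback sticky: once triggered, the UE stays on the classical path until something outside this class (for example a gNB-signalled model switch) re-activates the AI model.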
Study focus — TR 38.843 open issues (Rel-18 closure):
  • Model ID space: how many bits for model ID in MAC-CE? (Proposal: 4 bits → 16 models per cell.)
  • Generalization specification: should standard mandate minimum cross-scenario NMSE (e.g., train on CDL-A, test on CDL-C must be within X dB)?
  • Online adaptation: allowed at gNB side only (for Rel-18); UE-side online training deferred to Rel-19/20.
  • Split inference: part of inference at UE (feature extraction), remainder at gNB — reduces UE complexity but introduces air-interface latency for intermediate tensor transmission.
[22] 3GPP TR 38.843 v18.0.0, §7 — Study on artificial intelligence (AI) and machine learning (ML) for NR air interface: evaluation methodology.
TR 38.843 §7  ·  NMSE / BLER KPIs  ·  DeepMIMO / QuaDRiGa  ·  FLOPs Budget  ·  Pareto Frontier  ·  Model Lifecycle

Even with rigorous KPIs, significant challenges stand between current AI/ML results and live network deployment. → §15 examines the open research challenges — generalization, standardization gaps, computational constraints, and security — that must be resolved on the path to 6G.

§15  Open Research Challenges

Despite the substantial progress documented in §§3–14, the path from promising research results to production-grade 6G AI/ML remains blocked by a set of fundamental open problems. This section catalogues the five most critical challenge domains, examines their technical depth, and summarises the mitigation strategies currently under evaluation in 3GPP, O-RAN Alliance, and the broader academic community. None of these problems is fully solved; each represents an active frontier whose resolution will determine the pace and scope of AI integration in commercial networks.

15.1 The Generalization Problem

The single most consequential obstacle to deploying AI at the wireless physical layer is the generalization gap: AI/ML models trained on simulated or laboratory channels routinely underperform when transferred to real deployment sites. The gap has been reproduced across use-cases — channel estimation, CSI compression, beam prediction — and across hardware platforms, confirming that it is a structural property of how models are trained, not an artifact of any single implementation.

Taxonomy of generalization failure modes

  1. Distribution shift: The joint distribution P(x, y) differs between the training environment (CDL-A with TDL-A Doppler) and the deployment site (dense urban canyon with NLOS clusters). The model has never seen the deployment distribution and cannot extrapolate reliably.
  2. Covariate shift: The marginal input distribution P(x) changes while the conditional P(y|x) is unchanged. Examples include new UE types with different RF chain characteristics, or a new mobility pattern (e-scooter vs. pedestrian). Models that relied on implicit priors over P(x) fail silently.
  3. Concept drift: The underlying generative process changes over time. Seasonal foliage alters multipath delay spreads by 5–15 ns; new building construction introduces permanent scatterer clusters; infrastructure upgrades change antenna geometry. A static trained model degrades monotonically until retrained.

Quantifying the gap

Classical statistical learning theory provides a useful bound. Given a hypothesis class with VC dimension dVC and ntrain i.i.d. training samples, the generalization error is bounded with high probability by:

Eq. 15.1 — VC-Dimension Generalisation Bound
$$\text{Generalization error} = \left| L_{\text{test}} - L_{\text{train}} \right| \;\leq\; \mathcal{O}\!\left(\sqrt{\frac{d_{\text{VC}} \log(n_{\text{train}}/d_{\text{VC}})}{n_{\text{train}}}}\right)$$

For deep networks, dVC scales with the number of parameters (often millions), so this bound is vacuous without domain-specific inductive biases. In practice, NN channel estimators trained exclusively on CDL-A have been shown to lose 3–5 dB NMSE when evaluated on measured urban channels — a loss that completely eliminates the gain over LS estimation that motivated the AI approach in the first place.
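A rough numerical reading of Eq. 15.1 shows why the bound says little for over-parameterised networks. The sketch sets the hidden constant to 1 and treats the non-informative regime (n_train ≤ d_VC) as infinite; both choices are assumptions made for illustration only.

```python
import math

def vc_bound(d_vc, n_train):
    """Order-of-magnitude value of Eq. 15.1 with the hidden constant set to 1."""
    if n_train <= d_vc:
        return float("inf")  # log term is non-positive: the bound gives no information
    return math.sqrt(d_vc * math.log(n_train / d_vc) / n_train)

small_model = vc_bound(d_vc=100, n_train=1_000_000)
large_model = vc_bound(d_vc=1_000_000, n_train=10_000_000)
under_sampled = vc_bound(d_vc=1_000_000, n_train=100_000)
print(small_model, large_model, under_sampled)
```

With a million effective parameters, even ten million training samples leave a bound that is a large fraction of the loss range, which is the sense in which the text calls it vacuous.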

3GPP flag: "The generalization problem is the #1 blocker for AI deployment in real networks. 3GPP TR 38.843 explicitly flagged model generalization across deployment scenarios as an open issue requiring further study in Release 19." The specification requires that any standardised AI procedure must demonstrate robustness across at least three CDL channel families and two deployment scenarios (UMa, UMi) before being adopted as a normative procedure.

Mitigation strategies under study

| Strategy | Description | Status in 3GPP |
|---|---|---|
| Domain adaptation | Fine-tune a pre-trained model on a small set of real-world samples collected at the deployment site (100–1000 pilots). Transfer learning dramatically reduces the number of site-specific samples required. 3GPP AI use-case UC1 (channel estimation) considers online adaptation as a key feature of the Rel-19 framework. | Active (TR 38.843 Rel-19) |
| Domain randomization | Train on a wide distribution of channel conditions, spanning CDL-A/B/C, multiple delay spreads, Doppler spreads, and site-specific ray-tracing augmentation. The model is forced to learn invariant features rather than distribution-specific shortcuts. Cost: larger models and longer training time. | Research phase |
| Causal learning | Instead of learning P(y\|x) from observational data, learn the causal structure of the wireless channel: which physical phenomena (reflection, diffraction, scattering) cause which channel impulse response features. Causal models are by construction more robust to distribution shift because the underlying physics does not change. | Research phase |
| Physics-informed NNs (PINNs) | Embed Maxwell's equations or simplified propagation models as soft constraints in the loss function. The NN is free to learn from data but is penalised for solutions that violate physical laws. Demonstrated 1.8 dB NMSE improvement on out-of-distribution channels vs. unconstrained NNs in recent academic results. | Pre-standardization |

15.2 The Standardization Gap

AI improves performance in simulation, but improving performance is not sufficient for standardization: 3GPP standards must guarantee interoperability between equipment from different vendors. AI introduces new sources of non-interoperability that have no precedent in the existing specification framework.

Root causes of non-interoperability

  1. Non-deterministic output: Two UEs running nominally the same AI model on different hardware (different NPU architectures, different floating-point rounding) may produce subtly different CSI feedback vectors. The gNB's decoder, optimized for a specific encoder, may fail to reconstruct the channel accurately from the "wrong" encoder's output. This breaks the fundamental assumption of 3GPP CSI standardization, where a known codebook guarantees decoder-side reconstruction.
  2. Model versioning: A gNB deployed by Vendor A ships a decoder matched to one specific encoder model. A UE from Vendor B ships a different encoder trained on different data. Even if both claim "3GPP Rel-18 AI CSI", they are not interoperable. The standard currently has no mechanism for model identity negotiation.
  3. Inference endpoint negotiation: For CSI feedback, should the encoder run at the UE (on-device inference, low latency) or at the gNB (model fully known, no interoperability issue)? For channel estimation, should the model run at the UE receiver or be provided as a service by the network? These architectural questions remain open.
  4. Graceful fallback: When the AI procedure fails — due to model mismatch, distribution shift, hardware fault, or deliberate misconfiguration — what is the standardized fallback? Current proposals suggest reverting to legacy (non-AI) procedures, but the triggering condition and transition mechanism are unstandardized.

3GPP working items addressing the gap

| Document | Scope | Status (2024) |
|---|---|---|
| TR 22.874 §7 | AI/ML model transfer and lifecycle management requirements | Complete (Rel-18) |
| TR 38.843 §8 | Standardization impact analysis for UC1/UC2/UC3 | Complete (Rel-18) |
| SA2 WI (Rel-19) | AI model metadata format, versioning, fallback procedures | In progress |
| RAN1 WI (Rel-19) | Normative CSI feedback AI encoder/decoder interface | Study phase |
SA1 Release 19 AI model management work item: 3GPP SA1 Release 19 includes a dedicated work item on AI/ML model management that aims to define standard metadata formats, versioning schemes, and fallback mechanisms. This is expected to be the foundation for fully standardized AI procedures in 6G — analogous to how RRC Connection Reconfiguration standardized handover state machines in 4G/5G.

The open question that will define the architecture of 6G AI: should 3GPP standardize the interface (encoder output format, as in today's codebook feedback), the training procedure (dataset, loss function, evaluation metric), or the model itself (fixed binary weights delivered over OTA update)? Each choice has profoundly different implications for vendor differentiation, update agility, and certification burden.

15.3 Computational and Energy Constraints

AI inference at the PHY layer (L1) is not a background task: it sits on the critical timing path. A 5G NR slot lasts 1 ms at 15 kHz SCS and 0.5 ms at 30 kHz SCS (shorter still at higher numerologies), and channel estimation or CSI feedback must complete within a fraction of that budget. This imposes hard real-time constraints on AI inference that are fundamentally different from the datacenter inference workloads for which most deep learning hardware is designed.

Compute requirements

Consider a representative NN channel estimator: 3 dense layers, 200K parameters, Float32 arithmetic. Per-inference floating-point operations:

Eq. 15.2 — Inference FLOP Budget
$$\text{FLOP}_{\text{inference}} \approx 2 \times N_{\text{params}} = 4 \times 10^5 \;\text{FLOP}$$

At a slot duration of 0.5 ms and assuming inference must complete in 50% of slot time:

Eq. 15.3 — Required Throughput for Inference
$$\text{Required throughput} = \frac{4 \times 10^5 \;\text{FLOP}}{0.25 \times 10^{-3} \;\text{s}} = 1.6 \;\text{GFLOPS}$$

This is well within current UE SoC capability (representative mobile SoC: 2–5 TOPS). However, the situation is more demanding for larger models or when multiple UEs share a gNB processing pool: with 256 simultaneous UE streams, gNB-side inference demand reaches 400+ GFLOPS, requiring dedicated AI accelerators.
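The budget arithmetic of Eqs. 15.2 and 15.3 generalises to other model sizes and stream counts with a short helper. The function names are illustrative, and the 2 FLOP-per-parameter rule is the same dense-layer approximation used in the text.

```python
def inference_flop(n_params):
    """Eq. 15.2: one multiply and one add per weight for dense layers."""
    return 2 * n_params

def required_gflops(n_params, slot_s, budget_fraction, n_streams=1):
    """Eq. 15.3: sustained throughput needed to finish within a fraction of the slot."""
    deadline_s = slot_s * budget_fraction
    return n_streams * inference_flop(n_params) / deadline_s / 1e9

ue_demand = required_gflops(200_000, slot_s=0.5e-3, budget_fraction=0.5)
gnb_demand = required_gflops(200_000, slot_s=0.5e-3, budget_fraction=0.5, n_streams=256)
print(f"UE: {ue_demand:.1f} GFLOPS, gNB (256 streams): {gnb_demand:.1f} GFLOPS")
```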

Energy cost at the UE

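Current NPU efficiency (2024 generation): ~1 TOPS/W peak, i.e. about 10¹² operations per joule. For the 200K-parameter NN above (4×10⁵ FLOP per inference):

  • Per-inference energy at peak efficiency: 4×10⁵ / 10¹² ≈ 0.4 μJ
  • At 2000 inferences/s (one per 0.5 ms slot): ≈ 0.8 mW
  • UE power budget for radio: ~500–1000 mW total

Peak TOPS/W figures are rarely reached by small, memory-bound models running at batch size 1, so the delivered energy per inference can be far higher than this ideal figure, and larger estimators scale the cost proportionally. Once always-on inference approaches a meaningful share of the UE radio power budget (before accounting for RF chain, baseband DSP, and application processor), models must be aggressively compressed.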

Compression techniques:

  • Quantization (INT4/INT8): 4–8× reduction in compute and memory; typically <2% accuracy loss for well-calibrated models
  • Pruning: Remove >90% of weights with <1 dB NMSE penalty for structured channel estimators
  • Knowledge distillation: Train a small "student" model to mimic a large "teacher"; student can be 10× smaller with 80% of the performance gain
  • Early exit: 60% of inputs (easy channel conditions) exit at layer 3; only hard cases traverse full depth — reduces average compute by 2–3×
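As one concrete instance of the compression techniques listed above, the following sketch applies symmetric per-tensor INT8 quantization to a random weight matrix. It is post-training quantization of weights only; the matrix size and function names are illustrative assumptions, and no NMSE impact is measured here.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor INT8 quantization: returns integer codes and the scale."""
    max_abs = max(float(np.max(np.abs(w))), 1e-12)  # guard against an all-zero tensor
    scale = max_abs / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map integer codes back to approximate float weights."""
    return q.astype(np.float32) * np.float32(scale)

rng = np.random.default_rng(0)
w = rng.standard_normal((128, 128)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = float(np.max(np.abs(w - w_hat)))
print(f"storage: {w.nbytes} -> {q.nbytes} bytes, max abs error: {max_err:.4f}")
```

The 4× storage reduction is exact for FP32-to-INT8 conversion; the accuracy cost depends on the weight distribution and is usually evaluated on the target KPI rather than on the raw weight error.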
Key insight: The energy bottleneck is more constraining than the compute bottleneck. Hardware-software co-design — where the model architecture is designed with the NPU's specific dataflow in mind — can recover 2–4× efficiency vs. architecture-agnostic design. This argues for standardizing model architectures (not just interfaces) to enable hardware-specific optimization across the vendor ecosystem.

15.4 Privacy and Security

Federated learning and AI-assisted network operation introduce attack surfaces that have no analogue in classical wireless system design. The wireless channel itself can be used as both an attack vector and a side-channel for exfiltrating model information.

Federated learning threat model

| Attack type | Mechanism | Impact | Primary defence |
|---|---|---|---|
| Gradient inversion | Reconstruct private training data from shared gradient updates | User location, movement pattern, device identity leakage | Differential privacy (add Gaussian noise to gradients) |
| Model poisoning | Malicious UEs submit adversarial gradients to corrupt global model | Degraded global model performance, targeted misclassification | Byzantine-robust aggregation (Krum, coordinate-wise median) |
| Free-rider attack | UE downloads global model without contributing genuine updates | Unfair resource consumption; model quality degradation over time | Contribution verification via secure aggregation |
| Membership inference | Determine whether a specific UE's data was used in training | Privacy violation; regulatory non-compliance (GDPR) | (ε, δ)-differential privacy guarantee |

Differential privacy tradeoff

Adding Gaussian noise with variance σ² to each gradient provides (ε, δ)-DP with:

Eq. 16.1 — Differential Privacy Noise Budget
$$\varepsilon \approx \frac{\sqrt{2 \ln(1.25/\delta)} \cdot \Delta f}{\sigma}$$

where Δf is the L2 sensitivity of the gradient. Practical challenge: the noise level required for strong privacy (ε < 1) degrades model accuracy by 5–15% compared to non-private training. The privacy–utility tradeoff remains an open research problem in the wireless FL context, where gradient dimension is high (10⁴–10⁶) and UE participation is sparse.
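The noise-budget relation above can be turned into a small gradient-privatisation sketch by solving it for σ. Treating the clipping norm as the sensitivity Δf is an assumption that depends on how neighbouring datasets are defined, and the function names are illustrative.

```python
import math
import numpy as np

def gaussian_sigma(epsilon, delta, sensitivity):
    """Noise standard deviation obtained by solving the DP relation for sigma."""
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon

def privatize_gradient(grad, clip_norm, epsilon, delta, rng):
    """Clip the gradient to L2 norm clip_norm, then add Gaussian noise."""
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, clip_norm / max(norm, 1e-12))
    sigma = gaussian_sigma(epsilon, delta, clip_norm)  # assumes sensitivity = clip_norm
    noisy = clipped + rng.normal(0.0, sigma, size=grad.shape)
    return noisy, sigma

rng = np.random.default_rng(0)
grad = rng.standard_normal(10_000)          # stand-in for a flattened model gradient
noisy_grad, sigma = privatize_gradient(grad, clip_norm=1.0, epsilon=1.0, delta=1e-5, rng=rng)
print(f"sigma = {sigma:.3f} for a gradient clipped to unit norm")
```

With a unit-norm clipped gradient spread over 10⁴ coordinates, the per-coordinate noise is far larger than the per-coordinate signal, which illustrates why aggregation over many UEs is needed before the update becomes useful.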

AI model security beyond FL

Adversarial examples — formal definition

The Fast Gradient Sign Method (FGSM) generates adversarial inputs by perturbing in the direction of the loss gradient:

Eq. 15.4 — FGSM Adversarial Perturbation (Goodfellow et al. 2014)
$$x_{\text{adv}} = x + \varepsilon_{\text{adv}} \cdot \operatorname{sign}\!\left(\nabla_x \mathcal{L}(\theta, x, y)\right)$$

where εadv is the adversarial perturbation budget (L∞ norm); this symbol is distinct from the DP privacy budget ε used earlier in this section. For beam management AI, x corresponds to the pilot measurement vector and the attacker uses RIS phase control to inject the perturbation over the air.
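The FGSM update can be demonstrated on a toy differentiable model. The linear predictor with squared loss used here is an assumption chosen so the input gradient has a closed form; it stands in for a beam-management network and is not one.

```python
import numpy as np

def fgsm(x, grad_x, eps_adv):
    """One-step FGSM: move each input coordinate by eps_adv along the gradient sign."""
    return x + eps_adv * np.sign(grad_x)

def loss(w, x, y):
    """Squared loss of a linear predictor (toy stand-in for the victim model)."""
    return 0.5 * (w @ x - y) ** 2

def loss_grad_x(w, x, y):
    """Gradient of the squared loss with respect to the input x."""
    return (w @ x - y) * w

rng = np.random.default_rng(0)
w = rng.standard_normal(16)
x = rng.standard_normal(16)   # stands in for the pilot measurement vector
y = 0.0
x_adv = fgsm(x, loss_grad_x(w, x, y), eps_adv=0.05)
print(loss(w, x, y), loss(w, x_adv, y))
```

For this model the perturbation raises the residual magnitude by eps_adv times the ℓ1 norm of w, so the loss increase grows with input dimension even though each coordinate moves by only eps_adv.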

Security research gap: There is currently no 3GPP security analysis for AI-specific attack vectors at the air interface. 3GPP SA3 has begun preliminary discussion (TS 33.501 annex), but normative threat modelling for AI PHY procedures is not yet scoped for any release. This gap must be closed before AI becomes a normative L1 function.

15.5 Regulatory and Spectrum Considerations

AI-learned waveforms and constellations — particularly those generated end-to-end by an autoencoder — present a novel regulatory challenge. Regulatory bodies worldwide grant spectrum licenses for defined transmission formats; an AI that autonomously generates a new waveform may operate in a regulatory grey area even if its physical emission envelope complies with existing masks.

Core regulatory constraints

3GPP approach: constrained AI output

The working consensus in 3GPP RAN1 is to constrain AI outputs to comply with existing spectral emission masks rather than seeking new regulatory approvals. This means:

  1. AI models that generate waveform parameters (e.g., constellation symbols, precoding vectors) must have their output projected onto a feasible set defined by the RF mask.
  2. AI-controlled transmit power must remain bounded by existing maximum power levels (TS 38.101).
  3. AI beam management must not generate beams directed outside the equipment's regulatory geographic area.

These constraints limit some of the theoretically achievable performance gains from unconstrained AI but are necessary for regulatory tractability. The interpretability challenge — demonstrating to regulators that constrained AI reliably stays within bounds — remains an open problem.
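The projection idea in constraints 1 and 2 can be sketched for the simplest feasible set, a total transmit-power limit. A real emission mask constrains power spectral density per frequency and is not modelled here; the function name and the example vector are illustrative assumptions.

```python
import numpy as np

def project_to_power_limit(v, p_max):
    """Euclidean projection of a complex vector onto the set ||v||^2 <= p_max."""
    power = np.vdot(v, v).real
    if power <= p_max:
        return v                       # already feasible: leave the AI output unchanged
    return v * np.sqrt(p_max / power)  # otherwise rescale onto the constraint boundary

raw_precoder = np.array([3.0 + 4.0j, 1.0 - 2.0j])   # unconstrained AI output (toy values)
tx_precoder = project_to_power_limit(raw_precoder, p_max=1.0)
print(np.vdot(tx_precoder, tx_precoder).real)
```

Rescaling preserves the beam direction chosen by the model and changes only its power, which is why a norm-ball projection is a common last layer when outputs must respect a power budget.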

Challenge maturity landscape

[23] 3GPP TR 38.843 v18.0.0, §8 — Standardization impact analysis for AI/ML NR air interface use cases.
[24] 3GPP TR 22.874 v18.0.0, §7 — AI/ML model management requirements, lifecycle, and transfer procedures.
§16  Conclusion & Outlook

This whitepaper has surveyed the state of AI/ML integration across every major function of the 5G NR air interface and charted the trajectory toward a 6G that is AI-native by design. We close with a structured summary of achievements, a realistic timeline to 2030, and a perspective on what the transition to AI-native mobile networks means at the level of standardization methodology and systems engineering practice.

16.1 Summary of Achievements and 3GPP Status

The table below consolidates the key findings across §§3–14, mapping each AI/ML domain to its 3GPP standardization status, the primary specification reference, and the headline performance gain demonstrated over legacy (non-AI) baselines in 3GPP-defined evaluation scenarios.

| Domain | 3GPP Status | Key Specification | Performance Gain |
|---|---|---|---|
| Channel Estimation AI | Study complete (Rel-18) | TR 38.843 UC1 | +2–4 dB NMSE over LS/MMSE baselines |
| CSI Feedback AI | Study complete (Rel-18) | TR 38.843 UC2 | 50% overhead reduction at equal reconstruction quality |
| Beam Management AI | Study complete (Rel-18) | TR 38.843 UC3 | 85% top-1 accuracy; 30% beam sweeping reduction |
| Positioning AI | Under study (Rel-19) | TR 38.843 UC4 | <30 cm indoor (vs. <1 m for legacy RSTD) |
| Energy Efficiency AI | Ongoing (TS 28.310) | TR 37.816 | 40% RAN energy saving in low-load scenarios |
| Semantic Comms | Research phase (pre-normative) | TR 22.874 | 10–100× effective bandwidth for task-oriented data |
| Federated Learning | Study item (SA2) | TR 23.700-80 | Privacy-preserving model training without raw data sharing |
| 6G AI-Native | Vision document (Rel-20+) | SP-221500 | Full AI integration as first-class PHY/MAC function |
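To put the channel-estimation row in linear terms, the short Python sketch below converts a dB gain into a ratio of error energies. It assumes the usual NMSE definition, 10·log10(‖h − ĥ‖² / ‖h‖²), and reads "+2–4 dB" as the AI estimator's NMSE being 2–4 dB lower than the baseline's. Both are interpretations made here, not statements quoted from TR 38.843.

```python
import numpy as np


def nmse_db(h_true: np.ndarray, h_est: np.ndarray) -> float:
    """Normalised mean squared error in dB (assumed standard definition)."""
    err = np.sum(np.abs(h_true - h_est) ** 2)
    ref = np.sum(np.abs(h_true) ** 2)
    return float(10 * np.log10(err / ref))


def db_gain_to_error_ratio(gain_db: float) -> float:
    """Ratio of improved NMSE to baseline NMSE (linear) for a given dB gain."""
    return 10 ** (-gain_db / 10)


for gain in (2.0, 4.0):
    ratio = db_gain_to_error_ratio(gain)
    print(f"{gain} dB -> NMSE x{ratio:.3f} ({(1 - ratio) * 100:.0f}% lower)")
# 2.0 dB -> NMSE x0.631 (37% lower)
# 4.0 dB -> NMSE x0.398 (60% lower)
```

Under that reading, the quoted range corresponds to roughly 37–60% less residual estimation error energy than the LS/MMSE baseline.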

The arc is clear: AI began as an external optimisation layer applied to fixed 5G NR procedures (Rel-18 study items), is moving toward standardised interfaces and model management (Rel-19 work items), and will eventually become the primary design paradigm for the air interface itself in 6G. Each stage builds on the infrastructure — data collection, model transfer, monitoring — laid in the previous stage.

Architecture insight: The O-RAN Alliance's non-RT and near-RT RIC framework has proven to be the most tractable path for deploying AI in live networks without disrupting standardised interfaces. The rApp/xApp model — where AI runs as an application above a standardised A1/E2 interface — allows AI to be deployed, updated, and rolled back independently of the RAN software stack. This architectural pattern is likely to persist into 6G even as AI moves deeper into the PHY layer.
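The deploy, update, and roll back pattern described above can be pictured with a minimal Python sketch. The `ModelSlot` class and its methods are hypothetical names invented for illustration; they are not part of any O-RAN A1/E2 or rApp/xApp API. The sketch only shows the idea of keeping model versions in a history that is separate from the RAN software release.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ModelSlot:
    """Version history for one AI function, managed independently of the RAN stack."""

    history: List[str] = field(default_factory=list)

    @property
    def active(self) -> Optional[str]:
        # None means no AI model is active and the legacy procedure is in use.
        return self.history[-1] if self.history else None

    def deploy(self, version: str) -> None:
        """Activate a new model version on top of the existing ones."""
        self.history.append(version)

    def rollback(self) -> Optional[str]:
        """Drop the active version and return whatever is active afterwards."""
        if self.history:
            self.history.pop()
        return self.active


slot = ModelSlot()
slot.deploy("beam-predictor-1.0")  # hypothetical version labels
slot.deploy("beam-predictor-1.1")
print(slot.rollback())  # beam-predictor-1.0
print(slot.rollback())  # None, i.e. fall back to the legacy procedure
```

Treating "no active model" as a valid state keeps the legacy procedure available as the final rollback target, which matches the fallback role the text assigns to non-AI procedures.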

16.2 The Road to 2030

Translating the current research and study-item landscape into a deployment timeline requires accounting for both 3GPP normative timelines and the typical 2–3 year lag between specification completion and commercial network deployment.

Milestone summary

2024 — Release 18 (Rel-18 Frozen): AI plugged into existing 5G NR as optional enhancements. Three use cases (channel estimation, CSI feedback, beam management) are fully studied with evaluation methodology, performance benchmarks, and standardisation impact analysis. No normative AI procedures — legacy codebooks and reference signals remain mandatory fallback.

2025 — Release 19 (Rel-19 Work Items Active): AI-enhanced procedures move from study items to work items. Model management framework (metadata, versioning, lifecycle) standardised. First normative signalling for AI CSI feedback expected. O-RAN WG2 AI/ML workflow specification reaches v2.0 with normative xApp APIs.

2026–2027 — 6G Phase 1 Study / Release 20: 3GPP begins 6G Phase 1 specification (Rel-20 target: ~2027). AI-native air interface design is a primary architectural theme. Key questions: whether channel coding, MIMO precoding, and waveform generation can be partially replaced by end-to-end learned procedures while maintaining spectrum compatibility and regulatory compliance.

2028 — 6G Phase 1 Standard Frozen: First 6G specification with AI as a first-class PHY/MAC function. AI-assisted initial access, channel estimation, and beam management are normative. Legacy procedures remain as a fallback for the transition decade.

2030 — Commercial 6G Deployment: First commercial 6G networks. Fully AI-native operation in new deployments; existing 5G infrastructure upgraded over 5–10 year cycle. AI model management infrastructure (OTA update, performance monitoring, fallback triggering) fully operational in commercial deployments.

Pacing factors

The timeline above is optimistic and depends on resolution of the challenges documented in §15. The two most likely schedule-limiting factors are:

  1. Standardization gap resolution (§15.2): Until the model versioning and fallback problems are solved normatively, AI cannot become a mandatory component of the air interface. Rel-19 model management work items are the critical path.
  2. Generalization validation methodology (§15.1): 3GPP requires performance claims to be verified against a defined set of evaluation scenarios. For AI, this requires agreement on training datasets, evaluation channel models, and performance metrics. Reaching this consensus is likely to take 2–3 years of normative debate.
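The second pacing factor can be made concrete with a small Python sketch of a per-scenario acceptance check. Everything here is an assumption for illustration: the function name, the idea of a single dB threshold, and especially the NMSE values, which are placeholders and not measured or 3GPP-agreed results. The scenario labels borrow the UMa/UMi/InH naming of TR 38.901.

```python
from typing import Dict, Optional


def generalization_report(
    nmse_db_by_scenario: Dict[str, float],
    train_scenario: str,
    max_gap_db: float,
) -> Dict[str, object]:
    """Compare unseen-scenario NMSE against the training scenario.

    A positive gap means the model is worse (higher NMSE) outside its
    training scenario. max_gap_db is a hypothetical acceptance threshold.
    """
    ref = nmse_db_by_scenario[train_scenario]
    gaps = {
        name: value - ref
        for name, value in nmse_db_by_scenario.items()
        if name != train_scenario
    }
    worst: Optional[str] = max(gaps, key=gaps.get) if gaps else None
    worst_gap = gaps[worst] if worst is not None else 0.0
    return {
        "gaps_db": gaps,
        "worst_scenario": worst,
        "passes": worst_gap <= max_gap_db,
    }


# Placeholder inputs, NOT measured results.
example = {"UMa": -15.0, "UMi": -12.0, "InH": -9.5}
report = generalization_report(example, train_scenario="UMa", max_gap_db=4.0)
print(report["worst_scenario"], report["passes"])  # InH False
```

The code itself is trivial. The open normative question is which datasets, channel models, and thresholds would populate the inputs, and that is where the 2–3 years of debate is expected to go.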

16.3 Final Perspective

The fundamental shift in 6G: The change that matters is not faster speeds but the move from model-driven to data-driven network design. For 50 years, wireless systems have been built by deriving mathematical models of the channel and then designing algorithms (channel codes, equalizers, beamformers) that are optimal for those models. AI inverts this: instead of designing algorithms for models, we learn algorithms from data. That in turn demands new standardization methodologies: instead of specifying algorithms, 3GPP must specify training procedures, evaluation datasets, and interface semantics. The community is navigating this transition in real time.
Historical analogy: 6G with AI is to 5G what the transition from analog to digital was to mobile networks in the 1980s–90s. The first digital cellular systems (GSM, IS-95) did not merely improve on analog; they changed the fundamental design paradigm: from analog signal processing to algebraic coding theory, from circuit-switched voice to packet data, from analog RF to software-defined baseband. The transition was disruptive, took 15 years to fully play out, and produced a completely different vendor and standards ecosystem. The AI transition in 6G is structurally similar. The question is not whether AI will be central to 6G (it already is in the study items) but how quickly the ecosystem can standardize, test, and deploy AI models at the speed and reliability demanded by a global communications infrastructure. History suggests the answer is: slower than the research community hopes, faster than incumbents fear.

The body of work surveyed in this whitepaper — from §3's foundations in channel estimation through §14's vision of AI-native 6G — represents only the first chapter of a long story. The study items of Release 18 will be remembered, in retrospect, the way we remember the first digital modems: as the moment when the trajectory changed, even if the full transformation was still decades away.

What is clear today is that the wireless industry has made an irreversible commitment. The investment in AI/ML standardisation infrastructure — the data collection frameworks (TR 37.817), the model management architectures (TR 22.874), the evaluation methodologies (TR 38.843) — creates institutional momentum that will carry AI-native design principles into every generation of wireless standards from this point forward. The research challenges of §15 are formidable but finite; the direction of the field is not in question.

Authoritative framework references: The overarching 6G vision and AI/ML integration objectives documented throughout this whitepaper are grounded in two foundational sources: ITU-R M.2160 (12/2023) — Framework and overall objectives of the future development of IMT for 2030 and beyond — which establishes the 6G service requirements and capability targets that motivate AI-native design; and 3GPP TR 38.843 v18.0.0 — Study on Artificial Intelligence (AI) and Machine Learning (ML) for NR Air Interface — which defines the evaluation methodology, use-case scope, and standardisation impact analysis that govern AI integration through Rel-18 and Rel-19.
References
  [1] ITU-R M.2160 (12/2023): Framework and overall objectives of IMT for 2030 and beyond
  [2] ITU-R M.2150 (02/2022): Detailed specifications of the terrestrial radio interfaces of IMT-2020
  [3] 3GPP TR 38.843 v18.0.0: Study on AI/ML for NR Air Interface (Rel-18), 2024
  [4] 3GPP TR 22.874 v18.0.0: Study on traffic characteristics and performance requirements for AI/ML model transfer
  [5] 3GPP TR 23.700-80 v18.0.0: Study on AI/ML architecture enhancements
  [6] 3GPP TR 38.901 v17.0.0: Study on channel model for frequencies from 0.5 to 100 GHz
  [7] 3GPP TS 28.310 v18.0.0: Energy efficiency of 5G
  [8] 3GPP TR 37.817 v17.1.0: Study on enhancement for data collection for NR and EN-DC
  [9] 3GPP TR 37.816 v16.0.0: Study on SON and the O&M aspects
  [10] O-RAN.WG2.AI-ML-v01.03: AI/ML Workflow Description and Requirements
  [11] C.-K. Wen et al., "Deep Learning for Massive MIMO CSI Feedback," IEEE Wireless Commun. Lett., vol. 7, no. 5, 2018
  [12] T. O'Shea and J. Hoydis, "An Introduction to Deep Learning for the Physical Layer," IEEE Trans. Cognit. Commun. Netw., 2017
  [13] H. B. McMahan et al., "Communication-Efficient Learning of Deep Networks from Decentralized Data," AISTATS, 2017
  [14] E. Bourtsoulatze et al., "Deep Joint Source-Channel Coding for Wireless Image Transmission," IEEE Trans. Cognit. Commun. Netw., 2019
  [15] 3GPP SP-221500: 6G Vision Document (3GPP SA Plenary, 2022)
  [16] ITU-R IMT-2030 Focus Group Technical Report FG-IMT2030, 2022
  [17] 3GPP TR 38.855 v16.0.0: Study on NR positioning support
  [17b] 3GPP TR 38.859 v18.0.0: Study on Expanded and Improved NR Positioning (Rel-18 AI/ML-enhanced positioning study item)
  [18] 3GPP TR 38.874 v16.0.0: Study on Integrated Access and Backhaul
  [19] A. Zappone et al., "Wireless Networks Design in the Era of Deep Learning," IEEE Trans. Commun., 2019
  [20] DeepMIMO Dataset, deepmimo.net (A. Alkhateeb et al.)
  [21] C. Chaccour et al., "Seven Defining Features of Terahertz (THz) Wireless Systems," IEEE Commun. Surveys Tuts., 2022
  [22] M. Chen et al., "A Joint Learning and Communications Framework for Federated Learning over Wireless Networks," IEEE Trans. Wireless Commun., 2021
  [23] 3GPP TR 38.843 v18.0.0, §8: Standardization impact analysis for AI/ML NR air interface use cases
  [24] 3GPP TR 22.874 v18.0.0, §7: AI/ML model management requirements, lifecycle, and transfer procedures

Academic Research References (illustrative examples — not 3GPP-standardised architectures)

The following papers are cited as representative examples of AI/ML techniques that address 3GPP-defined problems. 3GPP does not mandate these specific architectures. Implementors are free to use any architecture that satisfies the 3GPP KPI requirements defined in TR 38.843 §7.
  [A1] M. Soltani, V. Pourahmadi, A. Mirzaei, and H. Sheikhzadeh, "Deep Learning-Based Channel Estimation," IEEE Commun. Lett., vol. 23, no. 4, pp. 652–655, Apr. 2019. (Representative DNN-based channel estimator; ChannelNet-class architecture.)
  [A2] C.-K. Wen, W.-T. Shih, and S. Jin, "Deep Learning for Massive MIMO CSI Feedback," IEEE Wireless Commun. Lett., vol. 7, no. 5, pp. 748–751, Oct. 2018. (CsiNet autoencoder for CSI feedback compression; addresses TR 38.843 UC2.)
  [A3] A. Vaswani et al., "Attention Is All You Need," in Advances in Neural Information Processing Systems (NeurIPS), 2017. (Transformer / multi-head attention mechanism underlying TransNet and similar CSI models.)
  [A4] E. Bourtsoulatze, D. B. Kurka, and D. Gündüz, "Deep Joint Source-Channel Coding for Wireless Image Transmission," IEEE Trans. Cognit. Commun. Netw., vol. 5, no. 3, pp. 567–579, Sep. 2019. (DeepJSCC: foundational JSCC paper; no cliff effect at low SNR.)
  [A5] T. O'Shea and J. Hoydis, "An Introduction to Deep Learning for the Physical Layer," IEEE Trans. Cognit. Commun. Netw., vol. 3, no. 4, pp. 563–575, Dec. 2017. (End-to-end autoencoder transceiver; learned constellations. Not adopted in any 3GPP release.)
  [A6] H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. Agüera y Arcas, "Communication-Efficient Learning of Deep Networks from Decentralized Data," in Proc. AISTATS, 2017. (FedAvg: canonical federated learning aggregation algorithm.)
  [A7] T. Li, A. K. Sahu, M. Zaheer, M. Sanjabi, A. Talwalkar, and V. Smith, "Federated Optimization in Heterogeneous Networks," in Proc. MLSys, 2020. (FedProx: addresses non-IID client drift in federated settings.)
  [A8] V. Mnih et al., "Human-level control through deep reinforcement learning," Nature, vol. 518, pp. 529–533, Feb. 2015. (DQN: deep Q-network algorithm used for energy scheduling examples in §7.)
  [A9] R. Lowe, Y. Wu, A. Tamar, J. Harb, P. Abbeel, and I. Mordatch, "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments," in Advances in Neural Information Processing Systems (NeurIPS), 2017. (MADDPG: CTDE multi-agent RL used for multi-cell scheduling in §8.)
  [A10] C. Yu, A. Velu, E. Vinitsky, J. Gao, Y. Wang, A. Bayen, and Y. Wu, "The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games," in Advances in Neural Information Processing Systems (NeurIPS), 2021. (MAPPO: on-policy cooperative MARL; recommended baseline for 6G multi-cell coordination.)
  [A11] C. Dwork and A. Roth, "The Algorithmic Foundations of Differential Privacy," Foundations and Trends in Theoretical Computer Science, vol. 9, nos. 3–4, pp. 211–407, 2014. (Foundational DP theory; Gaussian mechanism σ ≥ Δf√(2ln(1.25/δ))/ε.)
  [A12] M. Abadi et al., "Deep Learning with Differential Privacy," in Proc. ACM CCS, 2016. (DP-SGD: gradient clipping + Gaussian noise for differentially private federated training.)
  [A13] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and Harnessing Adversarial Examples," in Proc. ICLR, 2015. (FGSM adversarial perturbation; ε_adv notation in §15.4 refers to perturbation budget, distinct from DP privacy budget ε.)