Abstract
Most existing studies adopt deep reinforcement learning (DRL) based Q-learning because it converges quickly to a near-optimal solution, resulting in effective allocation of resources and power. However, a DRL-based Q-network discretizes the continuous power values, which results in poor performance. In addition, it is challenging to allocate resources effectively under the fast-varying channel conditions of dynamic vehicular environments. In this work, we propose two approaches to overcome these challenges. First, we present a DRL-based energy-efficient resource allocation approach in which a twin delayed deep deterministic policy gradient (TD3) scheme based on Thompson sampling solves the power and resource allocation problem. Second, we present a dynamic meta-transfer learning framework to enhance the policy's ability to adjust to new channel conditions. Simulation results show that the proposed TD3 approach based on Thompson sampling enhances the system performance. Moreover, the proposed DRL-based dynamic meta-transfer learning framework requires 80% fewer samples to adapt to a new environment.
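The abstract's point about discretized power can be illustrated with a minimal sketch. This is not the paper's method; the 23 dBm power cap, the four evenly spaced levels, and the function names are all hypothetical values chosen for illustration. The sketch snaps a continuous transmit power to the nearest discrete level, as a Q-network with a finite action set would have to, and measures the worst-case gap that a continuous-action method such as TD3 does not incur.

```python
# Illustrative only: P_MAX_DBM, NUM_LEVELS, and the helper names are assumptions,
# not values or functions taken from the paper.

def power_levels(p_max, num_levels):
    """Return num_levels evenly spaced power levels covering [0, p_max]."""
    step = p_max / (num_levels - 1)
    return [i * step for i in range(num_levels)]

def quantize_power(p, levels):
    """Snap a continuous power value to the nearest available discrete level."""
    return min(levels, key=lambda level: abs(level - p))

P_MAX_DBM = 23.0   # hypothetical maximum transmit power
NUM_LEVELS = 4     # hypothetical size of the discrete action set

levels = power_levels(P_MAX_DBM, NUM_LEVELS)
step = P_MAX_DBM / (NUM_LEVELS - 1)

# Sweep candidate "ideal" continuous powers and record the quantization gap.
grid = [P_MAX_DBM * i / 1000 for i in range(1001)]
worst_error = max(abs(p - quantize_power(p, levels)) for p in grid)

print(f"levels: {levels}")
print(f"worst-case quantization error: {worst_error:.3f} (bound: step/2 = {step / 2:.3f})")
```

Under these assumptions the error is bounded by half the level spacing, so shrinking it requires more levels, which in turn enlarges the Q-network's action space.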
| Original language | English |
|---|---|
| Pages (from-to) | 4343-4356 |
| Number of pages | 14 |
| Journal | IEEE Transactions on Network and Service Management |
| Volume | 21 |
| Issue number | 4 |
| State | Published - 2024 |
| Externally published | Yes |
Keywords
- DRL
- EE
- V2X
- meta-learning
- resource allocation
Fingerprint
Dive into the research topics of 'Energy Efficient Resource Allocation Framework Based on Dynamic Meta-Transfer Learning for V2X Communications'. Together they form a unique fingerprint.