Smart city planning envisages an independent, autonomous environment built on advanced technology and enabled by the optimal utilisation of resources to meet the short- and long-term needs of its citizens. Reducing energy consumption in multi-tier 5G Heterogeneous Networks (HetNets) is therefore a prominent area of research. This article focuses on energy consumption and the associated CO2 emissions of cellular networks in the context of smart cities. We use a Reinforcement Learning (RL) based vertical traffic offloading algorithm to optimize the energy consumption of Base Stations (BSs) and to reduce their carbon footprint by applying the widely accepted strategy of cell switching and traffic offloading. The algorithm relies on the traffic load information of a macro cell and multiple small cells to determine the most energy-efficient offloading strategy while maintaining quality of service demands and serving users' applications. Spatio-temporal simulations are performed to determine the cell switch on/off operation and the offloading strategy under varying traffic conditions in a control data separated architecture. Simulation results show that the proposed scheme achieves a reasonable percentage reduction in energy consumption and CO2 emissions. © 2019 IEEE.
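
As an illustration of the cell switching and offloading idea summarised above, the following is a minimal sketch and not the algorithm evaluated in the article. It assumes a linear load-dependent power model, a single traffic snapshot, and a stateless (bandit-style) simplification of Q-learning in which each action is one on/off pattern for the small cells. All constants, the offload factor, and the names `total_power` and `learn_switching` are hypothetical choices made for this example; macro cell overload stands in as a crude proxy for a quality of service violation.

```python
import itertools
import random

# Illustrative linear power model in watts. These values are assumptions
# made for this sketch; they are not taken from the article.
MACRO_IDLE_W, MACRO_SLOPE_W = 130.0, 94.0
SMALL_IDLE_W, SMALL_SLOPE_W, SMALL_SLEEP_W = 56.0, 16.0, 39.0
OFFLOAD_FACTOR = 0.5  # macro load added per unit of offloaded small-cell load (assumed)
PENALTY_W = 10_000.0  # stand-in cost when the macro cell would be overloaded


def total_power(macro_load, small_loads, on_flags):
    """Return network power in watts, or None if the macro cell is overloaded."""
    # Traffic of every sleeping small cell is offloaded to the macro cell.
    offloaded = sum(load for load, on in zip(small_loads, on_flags) if not on)
    new_macro_load = macro_load + OFFLOAD_FACTOR * offloaded
    if new_macro_load > 1.0:
        return None
    power = MACRO_IDLE_W + MACRO_SLOPE_W * new_macro_load
    for load, on in zip(small_loads, on_flags):
        power += (SMALL_IDLE_W + SMALL_SLOPE_W * load) if on else SMALL_SLEEP_W
    return power


def learn_switching(macro_load, small_loads, episodes=1000,
                    alpha=0.5, epsilon=0.1, seed=0):
    """Learn an on/off pattern with epsilon-greedy, one-step Q-value updates."""
    rng = random.Random(seed)
    actions = list(itertools.product([True, False], repeat=len(small_loads)))
    # Every reward is negative, so starting at 0.0 is an optimistic
    # initialisation that makes the greedy step try each action.
    q = {action: 0.0 for action in actions}
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(actions)      # explore
        else:
            action = max(actions, key=q.get)  # exploit
        power = total_power(macro_load, small_loads, action)
        reward = -(power if power is not None else PENALTY_W)
        q[action] += alpha * (reward - q[action])
    return max(actions, key=q.get)


best = learn_switching(0.5, [0.1, 0.4, 0.8])
print(best, total_power(0.5, [0.1, 0.4, 0.8], best))
```

Under these assumed numbers the learner puts the two lightly loaded small cells to sleep and keeps the heavily loaded one active, because offloading its traffic would cost the macro cell more power than the sleep mode saves. A fuller treatment would add the traffic load as a state variable and learn over time-varying traffic, as the simulations described in the abstract do.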