This paper presents the Dynamic Concatenation Approach (DCA), a new approach for improving the efficiency of large VLSI parallel systems. The basic idea of DCA is to concatenate bit-serial processing elements to construct bit-parallel processors of variable width. Concatenation increases the storage capacity of each processor, which can be used to keep a number of local variables within the processor registers. A model is developed for evaluating the performance speed-ups that could be achieved using DCA. The model takes into account arbitrary architectural characteristics as well as program characteristics. Speed-up estimates are obtained by considering the effect of DCA when the variables of three programs are allocated to the registers of the processing elements. It is shown that DCA significantly improves performance in systems where I/O bottlenecks exist. © 1992.