The size and complexity of hardware systems motivates the use of simulation for their study and analysis. Parallelization techniques are often employed to meet the memory and computational requirements for simulating large hardware designs. Furthermore, partitioning the design for parallel simulation is vital for achieving acceptable simulation throughput. This paper presents the design and implementation of a new partitioning algorithm based on a multilevel partitioning heuristic. The multilevel algorithm attempts to balance the load, maximize concurrency, and reduce inter-processor communication in three phases to improve performance of the parallel simulator. The paper presents the algorithm and reports results from of our empirical studies to compare performance of the multilevel algorithm with other partitioning techniques. A generic partitioning infrastructure was developed and integrated into the WARPED parallel simulation framework to ease design, implementation, and study of different partitioning algorithms. The design issues involved in the development of the generic partitioning infrastructure are discussed and the techniques employed to integrate several partitioning algorithms into this framework are also presented in the paper. The experimental results obtained from our benchmarks indicate that the multilevel algorithm yields better partitions than other partitioning algorithms included in the study.